class: center, middle

# Course Introduction

__CS291A__

Dr. Bryce Boe (Bryce, or bboe, is preferred)

September 27, 2018

---

# Welcome to CS291A

This slide can be found at:
https://cs291.com/slides/2018/01_course_introduction/#2

Please complete this form via your phone or computer:
https://goo.gl/forms/qIeWIvdbPYujNj8j2

If that does not work, please write on a piece of paper:

* Name and email address
* Your education level and year (e.g., 2nd year MS)
* What is your prior experience with Internet technology?
* What do you hope to gain from this course?

---

# About Me

* UCSB CS Alum (BS: 2008, Ph.D.: 2014)
* As a graduate student:
  * Taught 3 undergraduate CS courses
  * TA-ed for numerous other courses
  * Took this course in Winter 2009 (CS290N at the time)
* 4th time teaching this course (was CS290B in Fall 2015)
* Staff Software Engineer, Tech Lead at AppFolio
* 16+ years of web development and operations experience
* First web page using HTML in 1996
* Learned PHP and MySQL around 2002
* Member of [Order of the Overflow](https://www.oooverflow.io/) (current DEF CON CTF organizing team)

---

# My Teammates "Pear" Programming

![Jon and Adam 'Pear' Programming](pear_programming.png)

---

# Today's Agenda

* Course Overview
* Course Motivation
* Course Structure
* Course Grading
* Course Info
* The Life Cycle of a Web Request
* Group Exercise
* Review

---

# Pull requests, pull requests, pull requests!

.center[![Pull Requests! Pull Requests! Pull Requests!](developersdevelopers.gif)]

If you notice an issue with or wish to make an improvement to any of the course content (e.g., slides, web pages) please edit them and make a pull request.

Website source: https://github.com/scalableinternetservices/ucsb_website/

Slide source: https://github.com/scalableinternetservices/ucsb_website/tree/master/slides/2018/

---

# Questions and Feedback

At any point during this course:

Stop me to:

* ask a question
* ask for clarification
* provide an additional example

Communicate to me:

* how I can help you succeed in this course
* ideas for making the course more engaging
* any other feedback you may have

---

class: center inverse middle

# Course Motivation

---

class: center middle

# How do you find a place to rent?

---

# How do you find a place to rent?

* [Rent.com](http://www.rent.com/)
* [Apartment Guide](http://www.apartmentguide.com/)
* [Rental Houses](http://www.rentalhouses.com/search/Goleta-CA)
* [Zillow](http://www.zillow.com/homes/for_rent/)
* [Realtor](http://www.realtor.com/apartments/Goleta_CA)
* [Trulia](http://www.trulia.com/for_rent/Goleta,CA)
* [PadMapper](http://www.padmapper.com/)
* [Craigslist](https://santabarbara.craigslist.org/)

---

class: center middle

# How do you find your way in a new city?

---

# How do you find your way in a new city?

* [Google Maps](https://www.google.com/maps)
* [mapquest](http://www.mapquest.com/maps?city=Goleta&state=CA)
* [tripadvisor](http://www.tripadvisor.com/LocalMaps-g32438-Goleta-Area.html)
* [Yelp](http://www.yelp.com/c/goleta-ca-us/restaurants)
* [Uber](https://www.uber.com)
* [Lyft](https://www.lyft.com)

---

class: center middle

# How do you find someone to date?

---

# How do you find someone to date?

* [Tinder](https://www.tinder.com/)
* [OkCupid](https://www.okcupid.com/)
* [Coffee Meets Bagel](https://site.coffeemeetsbagel.com/)
* [EliteSingles](https://www.elitesingles.com/)
* [Match](http://www.match.com/)
* [Bumble](https://bumble.com/)
* [eHarmony](http://www.eharmony.com/)
* [Zoosk](https://www.zoosk.com/)
* [HER](https://weareher.com/)
* [Grindr](https://www.grindr.com/)
* [Jdate](https://www.jdate.com/en-us)
* [Christian Mingle](http://www.christianmingle.com/)
* [Facebook](https://www.wired.com/story/facebook-dating-how-it-works/)
* and many more!

---

# Internet (or Web) Services!

Each of the previous problems can be solved by a variety of Internet services.

Every day billions of people use various Internet services to solve such problems.

> What other everyday problems are solved by Internet services?

---

class: center inverse middle

# As an Internet service grows in popularity, supporting the increased amount of Internet traffic results in increased complexity of the Internet service

---

class: center middle

![Reddit downs North Korea's web sites](reddit_nk.png)

![Stephen Fry takes down websites with single tweet](stephen_fry_tweet.png)

---

# Internet Services, what's that?

.center[![Question Mark](question_mark.jpg)]

---

# Internet Services, what's that?

There are many application-level protocols that are used to build out Internet services.

For this class, Internet services will refer to HTTP-based services.

The interface to your web service may be the web browser (e.g., Chrome, Firefox), an API client (via REST, GraphQL, etc.), or both.

---

# What about mobile?

Many native mobile apps are backed by Internet services via an API.

![Mobile v. Desktop from 2010 through 2014](mobile_v_desktop.png)

High Performance Browser Networking details issues affecting mobile users and offers some optimizations designed for them ([HPBN chapters 5 through 8](https://hpbn.co/#toc)). However, these topics won't be covered in this class.

---

# Scalable, what does that mean?

.center[![Question Mark](question_mark.jpg)]

---

# Scalable, what does that mean?

An Internet service is scalable if increasing demands can be effectively met with increasing capacity.

Demands could be:

* Web traffic quantity (typical association)
* Dataset size

---

# Effectively meet demands: Explanation

* Internet service remains available
* Response time does not excessively degrade

## Think about it

Assume you have a web service designed to run on a single server with a plan to use a bigger server when it can no longer effectively meet demand.

__Is this scalable?__

---

# What you will do:

In this course you will learn and utilize some of the technologies behind building large-scale Internet services.

You will test and support to the best of your abilities:

* Exponential growth in the amount of traffic to your web service
* Exponential growth in the dataset your web service relies upon

---

# In Summary

This course won't teach you how to build a web application that obtains worldwide attention and usage.

However, this course will teach you how to build a web application __that can respond to__ worldwide attention and usage.

---

# Other Topics

In addition to scaling, we will learn about:

* Performance
* Security
* Agile software development
* Test driven development
* Web clients (e.g., web browsers)

---

class: center inverse middle

# Course Structure

---

# Lectures

* Tuesday and Thursday from 1 -- 2:50PM in Phelps 2510 (here)
* The lectures and associated reading will cover the concepts you are expected to learn in this class

---

# Labs

* Meets Tuesday from 4 -- 5:00PM in Phelps 3523 (upstairs)
* Labs are mandatory
* Labs will focus on the course project
* During the labs you will:
  * Plan out a week's worth of work
  * Discuss how to improve team processes
  * Work with your team
  * Demo your progress

---

# Course Project

You will apply your learnings from this course to your course project.

The project entails:

* Working in teams of four or five people
* Developing a sufficiently complex Internet service
* Deploying your service to Amazon EC2 via Elastic Beanstalk
* Iteratively measuring your service's performance and scalability
* Applying techniques presented in class to improve your service's performance and scalability
* Documenting these improvements via a detailed write-up
* Presenting the results at the end of the quarter (Monday, December 10 between 12PM and 3PM)

---

# Course Skills

__This course is fairly demanding, but is one of the most industry-applicable courses you can take.__

You will develop the following skills:

* Programming in __Ruby__
* Building web services using the __Rails__ framework
* Experience with Amazon Web Services (__AWS__): __EC2__, __S3__, __Elastic Beanstalk__
* Load testing Internet services via __Tsung__
* Test-Driven Development (__TDD__)
* __Agile__/__Scrum__ software development
* Development using __Git__ with a _feature-branch_ flow via __GitHub__ _pull requests_

---

# This course is _not_ a deep-dive into:

* Cloud Computing
* Distributed Systems
* Networking
* Relational Databases
* Security

But we will touch on all of the above.

---

# Industry Focused

The skills you should develop through this course are the same ones that I use every day at work.

The projects will all be open source, so if you're proud of your team's work (you should be), put a link to the project on your résumé.

Industry-related tools you will use:

* Git via GitHub (project source version control)
* Ruby on Rails (development stack)
* Travis CI (automated testing)
* New Relic (performance metrics)

---

# Why Ruby on Rails?

Ruby is an interpreted language, so it is neither terribly __fast__ nor very __memory efficient__.

However, it scales easily, and for most Internet services, developer time ($$$) is going to be much more significant than the efficiency of the service.

Being able to build Rails Internet services quickly with zero prior experience is what makes this class possible.

---

# Texts

## "High Performance Browser Networking"

* by Ilya Grigorik
* Available free online

## "The Ruby On Rails Tutorial"

* by Michael Hartl
* Available free online

---

class: center inverse middle

# Course Grading

---

# Project Grade (assigned to your group)

* 30% web service complexity
* 50% load testing and subsequent scaling (explained in presentation and write-up)
* 10% quality of presentation
* 10% quality of write-up

---

# Individual Grade

* 5% participation (in-class, on Piazza, slide/material corrections)
* 95% project grade (weighted by your relative group project involvement percentage)

## How is relative involvement computed?

* Privately, everyone has 100% to assign across the other members of their group
* The relative percentage for each individual is the sum of what their group-mates assign them (can go above 100%)

We will compute these scores __three__ times during the quarter. Only the last score will be used for your grade.

Any moderate deviations from near-equal grades will require discussion.

---

# Letter Grades

Final letter grades will be assigned as indicated at:

* https://cs291.com/#letter-grades

---

# Course Info

* https://cs291.com
* https://github.com/scalableinternetservices/
* Piazza
  * All out-of-class announcements will be made here
  * Set up email notifications
  * You are strongly encouraged to respond to questions and to improve upon the "student answer" by making edits.
  * For clarifications on existing questions, please make a comment on an existing post
  * For related but separate questions, please create a "new post"

---

# First Five Weeks

* The basics (HTTP and HTML)
* _Industrial_ software engineering: _Agile_, _TDD_, _Continuous Integration_ (CI), Pair Programming
* HTTP Application Server architectures
* High availability via load balancing: a shared-nothing web stack
* Client-side and server-side caching
* Relational databases with web applications: concurrency control and query analysis
* Scaling via:
  * Sharding
  * Service-Oriented Architecture (SOA)
  * Read-followers

---

# Later Course Topics

* Web security: _firewalls_, _HTTPS_, _XSS_, _CSRF_
* HTTP/2
* Content-delivery networks
* Non-relational data stores (NoSQL)
* _Serverless_ via AWS Lambda

---

# Guest Lectures

* Ricky Ramirez, Senior Systems Engineer @ Reddit
* At least two others TBD

---

# TO-DO

## By next class

* Join the class on [Piazza]()
* Read the list of project ideas: https://cs291.com/project_ideas/
* Post or comment on at least one idea on Piazza under the `project_idea` "folder"
* Read chapters 1 and 2 in [High Performance Browser Networking](https://hpbn.co/primer-on-latency-and-bandwidth/)

## Before Tuesday's Lab (shortly after Tuesday's class)

* Complete the [Learn Ruby Codecademy](https://www.codecademy.com/learn/learn-ruby) course
* Complete chapter 1 in the [Ruby on Rails Tutorial](https://www.railstutorial.org/book/beginning)

---

# Questions / Brief Break

.center[![Question Mark](question_mark.jpg)]

---

class: center inverse middle

# The Life Cycle of a Web Request

---

# The Two Endpoint Basics

A web browser is a process (at least one) that runs on an operating system. It:

* responds to user input
* renders the display
* utilizes the network

--

A web server is a process (at least one) that runs on an operating system.
It:

* responds to network requests
* loads resources that may come from the file system, a database, or other servers

---

# Web Request Life Cycle Group Exercise

Prompt: What _things_ (e.g., events, protocols, actions) (might) occur when someone types [https://www.reddit.com](https://www.reddit.com) in their web browser and presses return?

## Part 1 (~10 minutes)

Discuss in pairs, and write down, in order, the components you come up with. Start generic, and leave space to provide additional detail for sub-sequences.

## Part 2 (~10 minutes)

Merge your pair with a nearby pair. Start by comparing the lists you've come up with, and then write down your combined list.

(~10 minutes)

## Part 3

Collectively, we build a more definitive list using the whiteboard.

---

# Core Components of a Web Request

* Web server: Opens a TCP socket to listen for requests
* Browser: Makes a DNS query to obtain an IP address for www.reddit.com
* Browser: Initiates a TCP connection to the IP address
* Web server: Accepts the TCP connection
* Web server: Adds TLS context to the TCP connection
* Browser: Wraps the TCP connection in a TLS session
* Browser: Sends an HTTP request over the TLS session
* Web server: Parses the request, fetches and sends the requested resources

---

# What about scalability?

Let's add a load balancer and additional servers.

.center[![Load balancer and three web servers](load_balancer_simple.jpg)]

Source: [http://www.laymance.com/blog/apache-load-balancers-and-log-files/](http://www.laymance.com/blog/apache-load-balancers-and-log-files/)

---

# Questions

.center[![Question Mark](question_mark.jpg)]
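---

# Sketch: A Web Request in Code

The core components of a web request listed on the earlier slide can be tried out locally. What follows is a minimal sketch, not how a real browser or web server is built. It assumes a plain-HTTP exchange on `localhost`, so the two TLS steps are deliberately left out. The names `serve_once`, `fetch`, and `demo`, and the path `/r/programming`, are hypothetical and chosen only for illustration.

```python
import socket
import threading


def serve_once(listener):
    """Web server side: accept one TCP connection, parse the request, respond."""
    conn, _addr = listener.accept()  # accept the TCP connection
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:  # read until the end of the headers
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        # Parse the request line, e.g. "GET /r/programming HTTP/1.1"
        request_line = request.split(b"\r\n", 1)[0].decode("ascii")
        method, path, _version = request_line.split(" ")
        body = f"You asked for {path} via {method}\n".encode("ascii")
        head = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)  # send the requested resource


def fetch(host, port, path):
    """Browser side: resolve the name, connect, send a request, read the reply."""
    # Name resolution step: map the host name to an IPv4 address
    family, socktype, proto, _canon, address = socket.getaddrinfo(
        host, port, socket.AF_INET, socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(5)
        sock.connect(address)  # initiate the TCP connection
        # Assumption: no TLS here; an HTTPS client would wrap `sock` first.
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        response = b""
        while chunk := sock.recv(4096):  # read until the server closes
            response += chunk
    head, _, body = response.partition(b"\r\n\r\n")
    status = int(head.decode("ascii").split(" ", 2)[1])
    return status, body.decode("ascii")


def demo():
    """Run one request/response exchange between the two endpoints."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen()  # open a TCP socket to listen for requests
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
        server.start()
        result = fetch("localhost", port, "/r/programming")
        server.join(timeout=5)
    return result


if __name__ == "__main__":
    print(demo())
```

Each comment maps to one bullet on the core components slide. A load balancer would sit between `fetch` and several copies of `serve_once`.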