class: center, middle

# Course Introduction

## CS291A: Scalable Internet Services

---

# Zachary Walker (Zach)

* San Jose State University (MSCS in 2006)
  - BS in Computer Science from SJSU (2003)

* Software Engineer at LCOGT 2006 - 2015
  - Full Stack Software Engineer
  - Building global network of robotic telescopes

* Software Engineer at Appfolio 2015 - present
  - Full stack, infrastructure, data engineering

* First time teaching this course
  - Credit to Bryce Boe et al. for the materials for the class
  - Thanks to Shyr-Shea Chang (last year's instructor) for help in preparing

---

# Course web page and GitHub repo

Website: https://cs291.com/

Website source:

Slide source:

If you notice an issue with, or wish to make an improvement to, any of the course content (e.g., slides, web pages), please edit it and make a pull request.

---

# Questions and Feedback

At any point during this course:

Stop me to:

* ask a question
* ask for clarification
* provide an additional example

Communicate to me:

* how I can help you succeed in this course
* ideas for making the course more engaging
* any other feedback you may have

---

# Agenda

* Course Motivation
* Course Structure
* The Life Cycle of a Web Request

---

class: center inverse middle

# Course Motivation

---

class: center middle

# How do you find a place to rent?

---

# ...How do you find a place to rent?

* [Rent.com](http://www.rent.com/)
* [Apartment Guide](http://www.apartmentguide.com/)
* [Rentals.com](http://www.rentals.com/California/Goleta/)
* [Zillow](http://www.zillow.com/homes/for_rent/)
* [Realtor](http://www.realtor.com/apartments/Goleta_CA)
* [Trulia](http://www.trulia.com/for_rent/Goleta,CA)
* [PadMapper](http://www.padmapper.com/)
* [Craigslist](https://santabarbara.craigslist.org/)

---

class: center middle

# How do you find your way in a new city?

---

# ...How do you find your way in a new city?

* [Google Maps](https://www.google.com/maps)
* [mapquest](http://www.mapquest.com/maps?city=Goleta&state=CA)
* [tripadvisor](http://www.tripadvisor.com/LocalMaps-g32438-Goleta-Area.html)
* [Yelp](http://www.yelp.com/c/goleta-ca-us/restaurants)
* [Lyft](https://www.lyft.com)
* [Uber](https://www.uber.com)

---

class: center middle

# How do you ask questions or generate...

--

Emails

--

Code

--

Birthday cards for your mom

--

Reports

--

Love Letters

--

Resumes

---

# ...How do you ask questions or generate text?

* [ChatGPT](https://chat.openai.com/)
* [Gemini](https://gemini.google.com/)
* [Claude](https://claude.ai/)
* [LLaMA](https://ai.meta.com/llama/)
* [PaLM](https://ai.google/discover/palm2/)
* [Cohere](https://cohere.com/)
* [Mistral](https://mistral.ai)
* and many more!
---

# Internet (or Web) Services

Each of the previous problems can be solved by a variety of Internet services.

Every day billions of people use various Internet services to solve such problems.

> What other everyday problems are solved by Internet services?

> Do services use other services?

---

class: center inverse middle

# As an Internet service grows in popularity, supporting the increased amount of Internet traffic results in increased complexity of the Internet service

---

class: center middle

![Reddit downs North Korea's web sites](reddit_nk.png)

![Stephen Fry takes down websites with single tweet](stephen_fry_tweet.png)

---

class: center inverse middle

# Complex Services Can Also Fail

---

class: center middle

![Cloudflare November 2, 2020](cloudflare_1120.png)
---

class: center middle

![Slack May 12, 2020](slack_0520.png)

---

class: center middle

![Cloudflare Outage July 2, 2019](cloudflare_0719.png)

---

class: center middle

![Facebook Outage March 14, 2019](facebook_0319.png)
---

class: center middle

These are happening all the time...

[List of post mortems for service failures](https://github.com/danluu/post-mortems)

---

# Internet Services, what's that?

- What are the characteristics of an Internet service?

.center[![Question Mark](question_mark.jpg)]

---

# ...Internet Services, what's that?

There are many application-level protocols that are used to build out Internet services.

For this class, Internet services will refer to HTTP-based services.

The interface to your web service may be the web browser (e.g., Chrome, Firefox), an API client (via REST, GraphQL, etc.), or both.

---

# What about mobile?

- Many native mobile apps are backed by Internet services via an API.

- Concerns of mobile app/service development
  - In-app purchases

![Mobile v. Desktop from 2010 through 2014](mobile_v_desktop.png)

High Performance Browser Networking details issues with mobile users and offers some optimizations designed for them ([HPBN chapters 5 through 8](https://hpbn.co/#toc)). However, these topics won't be covered in this class.

---

# Scalable, what does that mean?

.center[![Question Mark](question_mark.jpg)]

---

# ...Scalable, what does that mean?

An Internet service is scalable if increasing demands can be effectively met with increasing capacity.

Demands could be:

* Web traffic quantity (typical association)
* Dataset size

---

# Effectively meet demands: Explanation

* Internet service remains available
* Response time does not excessively degrade

## Think about it

You have a web service designed to run on a single server. What do you do when you can no longer effectively meet demand?
__Is your solution scalable?__

---

class: center inverse middle

# Course Structure

---

# First Five Weeks

* The basics (HTTP and HTML)
* HTTP Application Server architectures
* High availability via load balancing
* Client-side and server-side caching
* Content-delivery networks
* Software engineering techniques: _Agile_, _TDD_, _Continuous Integration_ (CI), Pair Programming

---

# Later Course Topics

* Relational databases with web applications: concurrency control and query analysis
* Scaling via:
  * Sharding
  * Service-Oriented Architecture (SOA)
  * Read-followers
* Non-relational data stores (NoSQL)
* Web security: _firewalls_, _https_, _XSS_, _CSRF_
* Scalability of machine learning services

---

# Course Skills

__This course is fairly demanding, but is one of the most industry-applicable courses you can take.__

You will develop the following skills:

* Programming in __Ruby__
* Building web services using the __Rails__ framework
* Working with __Docker__
* Experience with Amazon Web Services (__AWS__): __EC2__, __Elastic Beanstalk__, __Lambda__, __S3__
* Load testing Internet services via __Tsung__ and __ab__
* __Agile__/__Scrum__ software development
* Development using __Git__ with a _feature-branch_ flow via __GitHub__ _pull requests_

---

# This course is _not_ a deep-dive into:

* Cloud Computing
* Distributed Systems
* Networking
* Relational Databases
* Security

But we will touch on all of the above.

---

# Industry Focused

The skills you should develop through this course are the same ones I use every day at work.

The projects will all be open source, so if you're proud of your team's work (you should be), put a link to the project on your résumé.

Industry-related tools you will use:

* Git via GitHub (project source version control)
* Ruby on Rails (development stack)
* GitHub Actions (automated testing)

---

# What you will do

In this course you will learn and utilize some of the technologies behind building large-scale Internet services.
You will test and support to the best of your abilities:

* Exponential growth in the amount of traffic to your web service
* Exponential growth in the dataset your web service relies upon

---

# In Summary

This course won't teach you how to build a web application that obtains worldwide attention and usage.

However, this course will teach you how to build a web application __that can respond to__ worldwide attention and usage.

---

# Why Ruby on Rails?

Ruby is an interpreted language, so it is not terribly __fast__, nor is it very __memory efficient__.

However, it is very easily scalable, and for most Internet services, developer time ($$$) is going to be much more significant than the efficiency of the service.

Building Rails Internet services quickly with zero prior experience makes this class possible.

Consistency in frameworks across teams allows us to better support each other.

---

class: center inverse middle

# The Life Cycle of a Web Request

---

# Two Endpoints

--

A __web browser__ is a process (at least one) that runs on an operating system. It:

* responds to user input
* renders the display
* utilizes the network

--

A __web server__ is a process (at least one) that runs on an operating system. It:

* responds to network requests
* loads resources that may come from the file system, a database, or other servers

---

# Web Request Life Cycle Group Exercise

Prompt: What _things_ (e.g., events, protocols, actions) (might) occur when someone types [https://www.reddit.com](https://www.reddit.com) in their web browser and presses return?

## Part 1: ~10 minutes

Discuss in groups of four, and write down in order the components you come up with. Start generic, and leave space to provide additional detail for sub-sequences.

## Part 2 (remaining time)

Collectively, we build a more definitive list using the whiteboard.
---

# Core Components of a Web Request

* __Web server__: Opens a TCP socket to listen for requests
* __Browser__: Makes a DNS query to obtain an IP address for www.reddit.com
* __Browser__: Initiates a TCP connection to the IP address
* __Web server__: Accepts the TCP connection
* __Web server__: Adds TLS context to the TCP connection
* __Browser__: Wraps the TCP connection in a TLS session
* __Browser__: Sends an HTTP request over the TLS session
* __Web server__: Parses the request, fetches and sends the requested resources

---

# Assignments

Due Monday (9/30) 2pm

- Piazza
- Intro survey

Due Monday (10/7) 2pm

- [Project 0](/project0)
  - Building a simple website using GitHub Pages
  - Will be introduced next week

Due Monday (10/14) 2pm

- [Project 1](/project1)
  - Introduction to the Ruby programming language
  - Feel free to start now to get ahead
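---

# ...Core Components of a Web Request

The TCP and HTTP steps from the "Core Components" list can be sketched with Ruby's standard `socket` library. This is a rough illustration under stated assumptions, not how a real browser or web server is built: both endpoints run in one process on the loopback address, the DNS and TLS steps are skipped entirely, and the method names (`serve_one_request`, `fetch`, `loopback_demo`), the `/hello` path, and the response body are all made up for the example.

```ruby
require "socket"

# Web server side (hypothetical helper): accept one TCP connection,
# parse the HTTP request, and send back a small plain-text response.
def serve_one_request(server)
  connection = server.accept               # Web server: accepts the TCP connection
  request_line = connection.gets           # e.g. "GET /hello HTTP/1.1\r\n"
  # Skip the request headers; a blank line marks the end of the header section.
  loop do
    header = connection.gets
    break if header.nil? || header == "\r\n"
  end
  verb, path, _version = request_line.split
  body = "#{verb} #{path}\n"               # made-up "resource": echo the request
  connection.write("HTTP/1.1 200 OK\r\n" \
                   "Content-Type: text/plain\r\n" \
                   "Content-Length: #{body.bytesize}\r\n" \
                   "Connection: close\r\n" \
                   "\r\n" \
                   "#{body}")
ensure
  connection&.close
end

# Browser side (hypothetical helper): connect, send a request, read the response.
# No DNS lookup or TLS here; the caller passes an IP address and a port.
def fetch(host, port, path)
  socket = TCPSocket.new(host, port)       # Browser: initiates a TCP connection
  socket.write("GET #{path} HTTP/1.1\r\nHost: #{host}\r\nConnection: close\r\n\r\n")
  response = socket.read                   # read until the server closes the connection
  head, body = response.split("\r\n\r\n", 2)
  [head.lines.first.chomp, body]           # [status line, response body]
ensure
  socket&.close
end

# Wire the two endpoints together on the loopback interface.
def loopback_demo
  server = TCPServer.new("127.0.0.1", 0)   # Web server: listens; port 0 = any free port
  port = server.addr[1]
  server_thread = Thread.new { serve_one_request(server) }
  result = fetch("127.0.0.1", port, "/hello")
  server_thread.join
  result
ensure
  server&.close
end

status_line, body = loopback_demo
puts status_line
puts body
```

Running the script prints the status line `HTTP/1.1 200 OK` followed by the echoed request, `GET /hello`. A real request to https://www.reddit.com would add the two skipped steps: a DNS lookup before the TCP connection, and a TLS session wrapped around it before any HTTP is sent.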