| <%= submission.title %> | 
      <%= submission.community.name %> | 
      <%= submission.comments.size %> comments | 
    
  <% end %>
<% end %>
```
---
# Choosing a Cache Key
```ruby
module SubmissionHelper
  def cache_key_for_submission(sub)
    "submission/#{sub.id}/#{sub.updated_at}/#{sub.comments.count}"
  end
end
```
The cache is automatically invalidated (a new key is generated) when:
* Submission is updated (`updated_at` changes)
* Number of comments changes
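In the view, the helper supplies the fragment-cache key explicitly. A minimal sketch, assuming a `submission` variable and a matching partial:
```erb
<% cache(cache_key_for_submission(submission)) do %>
  <%# Rendered HTML is stored under this key; a new key means a fresh render %>
  <%= render submission %>
<% end %>
```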
---
# Low-level Rails Caching
You can use the same built-in mechanisms to manually cache anything:
```ruby
class ConversationController < ApplicationController
  def index
    user_id = current_user.id # assumes an authenticated current_user helper
    most_recent_conversation_updated_at = Conversation.where(user_id: user_id).maximum(:updated_at)
    cache_key = "conversations/#{user_id}/#{most_recent_conversation_updated_at}"
    @conversations = Rails.cache.fetch(cache_key, expires_in: 1.hour) do
      Conversation.where(user_id: user_id).order(updated_at: :desc).map do |conversation|
        {
          id: conversation.id,
          title: conversation.title,
          updated_at: conversation.updated_at
        }
      end
    end
    render json: @conversations
  end
end
```
```ruby
class LLMPromptCache
  def self.fetch(prompt)
    Rails.cache.fetch("llm_prompts/#{prompt}", expires_in: 24.hours) do
      # assumes the official `openai` gem; identical prompts within 24 hours
      # are answered from the cache instead of a new API call
      OpenAI::Client.new.chat.completions.create(
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: prompt }],
        temperature: 0.7
      )
    end
  end
end
```
---
class: center inverse middle
# Separating Reads from Writes
---
# Database Operations

This graph shows significantly more _reads_ than _writes_. This may be the case
for your application.
???
* More reads than writes
---
# Database Horizontal Scaling Problem

Simple load balancing of writes doesn't work: multiple servers could have different views of data.
---
# Database Horizontal Scaling
.left-column[
In general, a relational database is hard to horizontally scale.
However, when limited to read-only copies, databases are very easy to
horizontally scale.
* Set up separate machines to act as __read replicas__
* Whenever any transaction commits to the _primary_ database, send a copy to
  each _replica_ and apply it
]
.right-column[]
???
* Horizontal scaling is easy for read-only copies
---
# Database Replication
.left-column40[]
.right-column60[
The sending of data from the primary to its replicas (replication) can happen
either synchronously or asynchronously.
## Synchronous
When a transaction is committed to the primary, the primary sends the
transaction to its replicas and waits until all have applied it before completing.
## Asynchronous
When a transaction is committed to the primary, the primary sends the
transaction to its replicas but does not wait to see if the transaction is
applied.]
???
* Wait until all replicas have applied the transaction
* Wait until some replicas have applied the transaction
* Don't wait for any replicas to apply the transaction
---
# Database Replication Trade-offs
.left-column[
> What are the advantages of waiting until writes are applied to all replicas?
]
.right-column[]
---
# Database Replication Trade-offs
.left-column[
> What are the advantages of waiting until writes are applied to all replicas?

Consistency. Subsequent read requests will see changes.
> What are the disadvantages of waiting until writes are applied to all
replicas?
]
.right-column[]
---
# Database Replication Trade-offs
.left-column[
> What are the advantages of waiting until writes are applied to all replicas?

Consistency. Subsequent read requests will see changes.

> What are the disadvantages of waiting until writes are applied to all
replicas?

Performance. There may be many read replicas to apply changes to.
]
.right-column[]
---
# Database Replication Levels
.left-column[]
.right-column[
## Statement-level
Similar to streaming the journal from the primary to its replicas.
## Block-level
Instead of sending the SQL statements to the replicas, send the consequences
of those statements.
> What are the advantages of each?
]
---
# Database Statement-level Replication
Statement-level is faster than block-level, with a catch.
An SQL statement is generally more compact than its consequences.
```sql
UPDATE txns SET amount=5;
```
The above statement acts on every row, so transmitting its consequences (every
changed row) could require far more data than the statement itself.
However, SQL statements must now be deterministic:
```sql
UPDATE txns SET amount=5, updated_at=NOW();
```
> What is the value of `NOW()`?

Such values must be communicated from the primary to its replicas.
---
# MySQL Replication
> MySQL replication by default is asynchronous. ... With asynchronous replication, if the source crashes, transactions that it has committed might not have been transmitted to any replica. Failover from source to replica in this case might result in failover to a server that is missing transactions relative to the source.
> ... Fully synchronous replication means failover from the source to any replica is possible at any time. The drawback of fully synchronous replication is that there might be a lot of delay to complete a transaction.
> Semisynchronous replication falls between asynchronous and fully synchronous replication. The source waits until at least one replica has received and logged the events (the required number of replicas is configurable), and then commits the transaction. The source does not wait for all replicas to acknowledge receipt, and it requires only an acknowledgement from the replicas, not that the events have been fully executed and committed on the replica side. Semisynchronous replication therefore guarantees that if the source crashes, all the transactions that it has committed have been transmitted to at least one replica.
https://dev.mysql.com/doc/refman/8.4/en/replication-semisync.html
???
* What is the use case for replication
Tradeoff between consistency and performance
* High availability with zero data loss?
  * Synchronous replication
* High availability with some data loss?
  * Asynchronous replication
---
# Rails Read Replica Support
Rails 6+ has first-class support for read replicas. "Automatic switching"
must be explicitly configured and enabled:
> Automatic switching allows the application to switch from the primary to
> replica or replica to primary based on the HTTP verb and whether there was a
> recent write.
--
> If the application is receiving a POST, PUT, DELETE, or PATCH request the
> application will automatically write to the primary. For the specified time
> after the write the application will read from the primary. For a GET or HEAD
> request the application will read from the replica unless there was a recent
> write.
--
> Rails guarantees "read your own write" and will send your GET or HEAD request
> to the primary if it's within the delay window. By default the delay is set
> to 2 seconds. You should change this based on your database
> infrastructure. Rails doesn't guarantee "read a recent write" for other users
> within the delay window and will send GET and HEAD requests to the replicas
> unless they wrote recently.
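A minimal sketch of that configuration (the `primary` / `primary_replica` database names are assumptions that must match `config/database.yml`):
```ruby
# app/models/application_record.rb
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  # Route reads to the replica and writes to the primary
  connects_to database: { writing: :primary, reading: :primary_replica }
end

# config/environments/production.rb -- enable automatic role switching
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
```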
---
# Trade-offs of Read Replicas
## Strengths
For applications with a high read-to-write ratio:
* the load on the primary database can be dramatically reduced.
* read replicas can be horizontally scaled (even with a load balancer)
--
## Weaknesses
The application developer needs to distinguish reads that feed into subsequent
writes from reads that do not; such dependent reads should occur in the same
transaction (against the primary) as the write.
---
# MySQL Replication
MySQL replication is **asynchronous by default**.
**Replication lag risk**: If the primary crashes and you **failover to a replica**, transactions that were committed on the primary but not yet replicated will be missing from the replica (even though they were successfully committed).
**Semi-synchronous replication** waits for confirmation that at least one replica has received the transaction before the primary acknowledges commit, reducing risk of data loss during failover at the cost of increased response time.
---
class: center inverse middle
# Sharding
---
# Sharding: Idea
Take a single database and __*split/partition/shard*__ it up into multiple smaller
databases such that everything still works.
* **Partitioning** is a more general term that can be used to describe any way of splitting data into smaller parts
* Generally, you might **shard** a database into multiple databases each with the same schema
* A table within a database (or shard) may be **partitioned** into multiple smaller tables
--
> How do we handle joins across partitioned data?
---
# Partitioning: Joins
Any particular database join connects a small part of your database. However,
transitively, database joins could connect everything together.
## E.g. Class Project
* Any user asking questions is only related to their own messages and conversations.
* An expert profile is only related to its own user.
* ExpertAssignments are only related to their own conversation and expert.
* Transitively a user asking questions can be joined to the profile of experts who have answered their questions through ExpertAssignments.
---
# Partitioning: Separating Data
.left-column40[
Find a separation of your data that ideally produces unrelated (not joined
across) _partitions_.
Once separated, your application cannot utilize the database to join across
partitions.
If you need to perform operations across sharded data, you will need to do it
at the application level.
Consider the performance trade-offs. Could you partition another way?
]
.right-column60[

]
---
# Partitioning: Similar Data
Partitioning involves splitting data of the same type (e.g., the rows of a
table).
For instance, if we wanted to split our `Messages` table into two partitions, we
could store messages belonging to half the conversations in _partition1_, and
those belonging to the other half in _partition2_.
--
> What is not partitioning?
--
Separating tables into their own databases is not partitioning. While this approach
may work for some applications, the ability to join across tables is lost.
---
# Finding the Data
Assume we have partitioned the data for our application.
> How can we find what partition our data is on?

We need some sort of mapping to determine where to find that data.
---
# Finding the Partition
## At the application server layer?
> How would we implement this?
## At the load balancer?
> How would we implement this?
## Across multiple load balancers?
> How would we implement this?
---
# At the App Server
.left-column40[
Each application server contains a configuration that informs it of where each
database is (IP address, DNS name) and how to map data to the database.
The mapping can be arbitrarily complex.
The mapping itself may even be stored in a database (a sketch follows below).
]
.right-column60[

]
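A minimal sketch of such a mapping at the app-server layer (the shard names and the modulo rule are illustrative assumptions):
```ruby
# The app server owns the mapping from a record's key to the database
# (shard) that holds it; real mappings may be config- or database-backed.
module ShardMap
  SHARDS = %i[shard_one shard_two].freeze

  def self.for_user(user_id)
    SHARDS[user_id % SHARDS.size]
  end
end

# Usage inside a controller or job:
ActiveRecord::Base.connected_to(shard: ShardMap.for_user(42), role: :reading) do
  Conversation.where(user_id: 42).to_a
end
```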
---
# At the load balancer
The load balancer could be configured to route requests to the app servers that
are configured to _talk_ to the right database.
Such mappings are limited to information the load balancer can inspect:
* Resource URI
* Headers
* Request Payload
.center[]
---
# Across Load Balancers
.left-column40[
Host names (DNS) can be configured to point to the correct load balancer for a
given request.
Examples:
* en.wikipedia.org vs. es.wikipedia.org (language based sharding)
* google.com vs. google.co.uk (location based sharding)
* na6.salesforce.com vs. naX.salesforce.com (customer based sharding)
__Note__: The above examples could involve only a single load balancer.
]
.right-column60[

]
---
# Finding Data: Trade-offs
The approaches we just described vary from providing more flexibility to
providing more scalability.
* App Server (most flexible)
* Load Balancer
* DNS (most scalable)
---
# Partitioning and Growth
Ideally, the number of partitions increases as the usage of your application
increases.
Example:
If each customer's data can be partitioned from the others, then doubling the
number of customers doubles the number of partitions.
---
# Email Example
The data representing one user's email conceptually requires no relation to
the data representing other users' email.
When a request arrives associated with a particular user, the server applies
some mapping function to determine which database the user's data are located
in.
Should the email provider need to take down a database, they can relocate the
partitioned data to another database, and update the mapping with little
disruption.
---
# Class Project Example
* Users can create and view conversations.
* Users can write messages in these conversations.
* An expert can be assigned to a conversation to answer the question.
---
# Partitioning Demo App
## By Question Asker User?
**Easy For** Viewing the user's initiated conversations and their messages.
**Harder For** Viewing all messages in a conversation, viewing conversations from other users in expert mode, ...
--
## By Conversation?
**Easy For** Viewing a single conversation and its messages.
**Harder For** Viewing a list of conversations, viewing all messages sent by a user to any conversation
--
## By Expert?
**Easy For** Viewing a single expert and their assigned conversations.
**Harder For** Re-assigning a conversation to a different expert, viewing all conversations initiated as a question asker
---
# User Partitioning
> How could we make user-based partitioning work?

What if we partitioned the data by the user initiating the conversation?
We can use information in the URL with any of the partitioning approaches.
* http://zwalker.classproject.com (user sub-domain)
* http://classproject.com/zwalker (user path)
Either the application server connects to the right database for the `zwalker`
user, or DNS/load balancer directs the request to an application server that
always talks to the database containing `zwalker`'s data (a sketch follows).
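A minimal sketch of the app-server variant, assuming each user sub-domain maps directly to a shard name configured in `config/database.yml`:
```ruby
class ApplicationController < ActionController::Base
  around_action :use_user_shard

  private

  # Pick the shard from the request's sub-domain (e.g. zwalker.classproject.com)
  def use_user_shard
    shard = request.subdomain.presence&.to_sym || :default
    ActiveRecord::Base.connected_to(shard: shard, role: :writing) { yield }
  end
end
```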
---
# User Partitioning Success
.left-column20[

]
.right-column[

]
.center.clear[

]
---
# User Partitioning Difficulty
* The expert list of conversations waiting to be claimed
> What can we do to resolve these issues?
.left-column40[

]
---
# Solving Partitioning Problems
* Modify the user interface such that the difficult-to-partition page does not
  exist.
    > Can you get by with only providing the list of claimable conversations on the user's shard?
* Alternatively, periodically run an expensive background job to keep a
  semi-up-to-date global conversation list aggregating results from across
  databases.
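For the second approach, a minimal sketch of such a job (the shard names, model, and "unclaimed" query are assumptions):
```ruby
class GlobalClaimableConversationsJob < ApplicationJob
  SHARDS = %i[shard_one shard_two].freeze

  def perform
    # Gather unclaimed conversations from every shard at the application level
    claimable = SHARDS.flat_map do |shard|
      ActiveRecord::Base.connected_to(shard: shard, role: :reading) do
        Conversation.where(expert_id: nil).pluck(:id, :title)
      end
    end
    # Serve the semi-up-to-date aggregate from the cache until the next run
    Rails.cache.write("claimable_conversations", claimable, expires_in: 10.minutes)
  end
end
```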
---
# Partitioning in Rails
Rails 6+ has built-in support for multiple databases and horizontal sharding:
https://guides.rubyonrails.org/active_record_multiple_databases.html
```ruby
def index
  ...
  # Rails 6.1+ API: route this block of queries to a specific shard
  ActiveRecord::Base.connected_to(shard: :customer1, role: :reading) do
    ...
  end
end
```
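The sketch above assumes matching shard definitions, roughly:
```ruby
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  # Shard keys (:default, :customer1) and database names are assumptions
  # that must line up with config/database.yml
  connects_to shards: {
    default:   { writing: :primary,           reading: :primary },
    customer1: { writing: :customer1_primary, reading: :customer1_primary }
  }
end
```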
---
# Sharding Trade-offs - Summary
## Strengths
* If you genuinely have zero relations across partitions, this scaling path is very
  powerful.
* Sharding works best when shards grow with usage.
--
## Weaknesses
* Sharding can inhibit feature development. That is, your application may be
  perfectly partitionable today, but future features may change that.
* Not easy to retroactively add sharding to an existing application.
* Transactions across shards do not exist.
* Consistent DB snapshots across shards do not exist.
---
class: center inverse middle
# RDBMS vs NoSQL
---
# When RDBMS Techniques Aren't Enough
We've looked at several techniques for scaling RDBMSes:
* Distinguishing Reads from Writes (Read Replicas)
* Server-Side Caching
* Query Optimization and Indexes
* Sharding
> What if these techniques aren't sufficient?
--
When relational databases fail to scale to our needs, we need to consider non-relational solutions.
---
# NoSQL Overview
Non-relational databases are often referred to as **NoSQL** databases.
This is an umbrella term for many types:
* **Key-value stores** (Redis)
* **Column-oriented data stores** (Cassandra)
* **Document-oriented stores** (MongoDB)
* **Graph databases**
---
# NoSQL: Horizontal Scaling
.center[



]
Most NoSQL solutions are good at horizontal scaling.
**In exchange for better horizontal scaling, NoSQL databases provide applications fewer guarantees.**
---
# CAP Theorem
.center[]
**Theorem**: You can have at most two of these properties for any shared-data system:
* **Consistency**: All nodes see the same data at the same time
* **Availability**: System remains operational
* **Partition Tolerance**: System continues despite network failures
---
# CAP Choices
**CP (Consistent + Partition Tolerant)**: Always consistent, can handle partitions, not always available
  * Example: Would not allow writes during partition
**AP (Available + Partition Tolerant)**: Always accessible, can handle partitions, not always consistent
  * Example: Would accept conflicting writes
**CA (Consistent + Available)**: Always accessible, always consistent, assumes no partitions (very limiting)
---
# Partition Tolerance Required
Assuming no partitions is very limiting:
* For high availability and latency, multiple data centers are desirable
* Even within a single datacenter, partitions occur
**As a result, scalable Internet services require partition tolerance, and thus choose between consistency or availability.**
---
# ACID vs. BASE
The BASE acronym describes NoSQL solutions that trade off consistency for availability.
| ACID        | BASE                  |
|:------------|:----------------------|
| Atomicity   | Basically Available |
| Consistency | Soft State |
| Isolation   | Eventually Consistent |
| Durability  | |
---
# SQL vs NoSQL Trade-offs
## Relational Databases (SQL)
* General-purpose persistence layer
* Offer more features (ACID, joins, transactions)
* Have limited ability to scale horizontally
* Best when: You need ACID guarantees, complex queries, relationships
## Non-relational Databases (NoSQL)
* Often more specialized
* Require more from the application layer
* Better at scaling horizontally
* Best when: High write throughput needed, simple access patterns, eventual consistency OK
---
# When to Use What?
**Use RDBMS when:**
* You need ACID properties
* Complex queries with joins
* Strong consistency requirements
* Your scaling needs fit within RDBMS capabilities
**Use NoSQL when:**
* Write throughput exceeds RDBMS capabilities
* Simple access patterns (key-value lookups)
* Eventual consistency is acceptable
* Horizontal scaling is the primary concern
---
# Summary
1. **Performance Optimizations**: Reduce queries, add indexes, use EXPLAIN
2. **Server-Side Caching**: Reduce database load with Memcached/Redis
3. **Read Replicas**: Scale reads horizontally for read-heavy applications
4. **Sharding**: Partition data when you can cleanly separate it
5. **RDBMS vs NoSQL**: Choose based on consistency needs vs scaling requirements
There is no silver bullet. Often you'll combine multiple techniques.
---
class: center middle
# Questions?