In this set of scenarios we will explore the difference between MongoDB sharding and replication, and explain when each is the most appropriate solution. Both replication and sharding are forms of horizontal scaling to create a high availability (HA) setup.
Use Case: Sharding vs Replication
Both replication and sharding can be used (individually or together) for horizontal scaling of a MongoDB installation.
Sharding is MongoDB's solution for meeting the demands of data growth. Sharding stores data records across multiple servers to provide faster throughput on read and write queries, particularly for very large data sets.
Any of the servers in the sharded cluster can respond to a read or write operation, which greatly speeds up query responses.
Replication is MongoDB's solution for providing stability, backup, and disaster recovery to a MongoDB installation. This process copies and synchronizes the replica data set across multiple servers. This prevents downtime if one server goes offline.
Any of the secondary servers can respond to read queries, but only the primary server will perform write operations. The results of the write operation will then be propagated out to the secondary servers.
Scenario 1: Fault-Tolerance
In this scenario, the user is storing billing data in a MongoDB installation. This data is mission-critical to the user's business, and needs to be available 24/7, even if a server crashes or is taken offline.
MongoDB replication is the best solution for this user. With replication, the entire data set is mirrored on multiple servers. If a server fails or is taken offline, the other servers in the cluster take over.
Scenario 2: High Performance
In this scenario, the user is running a social networking site which is run from a MongoDB database. As the social network grows, the MongoDB data set has grown along with it. The user is seeing query times and page loads increase beyond an acceptable point. It is critical that the user's MongoDB installation receives a major performance boost.
Setting up a sharded MongoDB cluster is the best solution for this user. The sharded cluster will break up the user's data set and store parts of it on separate secondary servers. Each secondary server can respond to read or write queries on its portion of the data, which greatly increases the installation's response time.