Monitoring Options for MongoDB
In this age of Big Data, databases often need to scale horizontally across many nodes using a technique referred to as sharding. When you “shard”, you distribute subsets of a large dataset across separate nodes, using a shard key that determines which host each piece of data belongs to. Queries and updates that include the shard key then locate data faster, since only a portion of the large dataset is accessed. This allows the dataset to scale, in both storage and performance, as needed, simply by adding more shard nodes.
That is all very nice, but how does one assess the health of all of these hosts? Gone are the days when you had a single massive database server and monitoring involved simply watching that one host. Today you could be responsible for managing dozens of hosts. This post discusses what to look for when monitoring MongoDB clusters and what monitoring options are available.
What to Monitor?
The following is a list of perhaps the most important performance measurements to monitor and what they indicate.
- File descriptor limit – The soft limit for this Linux operating system setting commonly defaults to 1,024, but MongoDB recommends 64,000
- Process limit – Like the file descriptor limit, this defaults to 1,024 on some Linux distributions, but MongoDB recommends 64,000
- Lock percentages – This is the percentage of time spent waiting for any particular lock
- Connection counts – Too many client connections can overwhelm the server
- Replication lag – This is the delay in applying updates to secondary nodes in a replica set
- Primary election events – Frequent elections could indicate networking problems, host hardware issues, or misconfiguration
- Total disk usage – An additional shard may be needed if disk usage is high
- Memory allocations – Ideally, the entire dataset for each node should fit into available memory
- Page faults – Indication that data was not available in memory and the disk was accessed
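The first two items in the list can be checked from a script. The following is a minimal Python sketch, assuming a Linux host; the names `limit_ok` and `check_limits` are made up for this illustration, and 64,000 is the recommendation quoted above. It inspects the limits of the process that runs it, so it only reflects mongod when run as the same user in the same environment.

```python
import resource

# Value recommended in the list above for both limits.
RECOMMENDED_LIMIT = 64000


def limit_ok(soft_limit, recommended=RECOMMENDED_LIMIT):
    """Return True if a soft limit is unlimited or meets the recommendation."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= recommended


def check_limits():
    """Map each limit's label to (soft limit, whether it is high enough).

    This reads the limits of the *current* process, so it only reflects
    mongod if it is run as the same user in the same environment.
    """
    checks = (
        ("open files", resource.RLIMIT_NOFILE),
        ("processes", resource.RLIMIT_NPROC),
    )
    results = {}
    for label, which in checks:
        soft, _hard = resource.getrlimit(which)
        results[label] = (soft, limit_ok(soft))
    return results


# Print a one-line report per limit.
for label, (soft, ok) in check_limits().items():
    shown = "unlimited" if soft == resource.RLIM_INFINITY else soft
    print(f"{label}: soft limit {shown} ({'ok' if ok else 'below recommendation'})")
```

A check like this could be run periodically on each host, or once after provisioning, to catch nodes that were brought up with default limits.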
Available Monitoring Options
The MongoDB Management Service (MMS) provides a consolidated view of a MongoDB cluster in a management portal. This option is available for free, hosted on mms.mongodb.com, if you agree to install a monitoring agent on a single node in your network. The agent collects statistics and sends them over the Internet to the management service. The MMS portal then lets you view your cluster nodes, their activity, and all of the performance measurements listed above, and more.
Of course, it may be undesirable to send this kind of data over the Internet. There is a new option available to Enterprise MongoDB customers (paid subscription). It is referred to as MMS “on-prem” or on-premises. This option allows for self-hosting MMS to keep all of the management data within the network.
However, given that one of MongoDB’s advantages is that it is open source and free to use, a support contract might not be a practical option. In that case, there are third-party open-source tools, as well as tools that ship with MongoDB. They require extra effort to use, but can provide, in raw form, the same information that MMS presents. The following self-hosted open-source monitoring tools provide a high-level view of health and performance:
- Ganglia (mongodb-ganglia plugin): Python script to report operations per second, memory usage, btree statistics, master/slave status and current connections.
- Munin (mongo-munin plugin): Retrieves server statistics.
- Zabbix (mikoomi-mongodb plugin): Monitors availability, resource utilization, health, performance and other important metrics.
Additionally, there are MongoDB server commands whose output can be used to monitor the health and performance of each node. These are typically run in the mongo shell. They can be used to complement the information provided by the open-source self-hosted tools above:
- db.serverStatus(): Reports disk usage, memory use, connections, journaling, and index access
- db.stats(): Shows the quantity of data contained in the database, and object, collection, and index counters
- db.collection.stats(): Statistics similar to db.stats(), but for an individual collection
- rs.status(): Provides info about the state and configuration of the replica set and statistics about its members
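As an illustration of turning this raw output into a metric, the following Python sketch derives replication lag from a document shaped like the output of rs.status(). It assumes the document has already been fetched (for example through a driver) and converted to a dict whose `members` entries carry `name`, `stateStr`, and `optimeDate`. The function name and the sample hosts are hypothetical.

```python
from datetime import datetime


def replication_lag_seconds(status):
    """Return {member name: lag in seconds} for each secondary.

    Lag is estimated as the primary's last applied operation time minus
    the secondary's, floored at zero.
    """
    members = status.get("members", [])
    primary = next((m for m in members if m.get("stateStr") == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no primary found; an election may be in progress")
    lags = {}
    for member in members:
        if member.get("stateStr") == "SECONDARY":
            delta = primary["optimeDate"] - member["optimeDate"]
            lags[member["name"]] = max(delta.total_seconds(), 0.0)
    return lags


# Hand-written sample data (hypothetical hosts), not real server output.
sample_status = {
    "set": "rs0",
    "members": [
        {"name": "db1.example.net:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2014, 7, 1, 12, 0, 30)},
        {"name": "db2.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2014, 7, 1, 12, 0, 30)},
        {"name": "db3.example.net:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2014, 7, 1, 12, 0, 18)},
        {"name": "db4.example.net:27017", "stateStr": "ARBITER"},
    ],
}

print(replication_lag_seconds(sample_status))
```

In the sample, the third member is 12 seconds behind the primary. The arbiter is skipped because it holds no data.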
One downside to running these commands is that they must be run on each node in the cluster, or in the case of rs.status(), on a node of each replica set. The output of these commands must then be aggregated in order to fully assess the health of a cluster of MongoDB shards or replica sets.
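The aggregation step could look something like the following Python sketch. It assumes the db.serverStatus() output of every node has already been collected into a dict keyed by host name, and that each document includes the `connections.current`, `connections.available`, and `extra_info.page_faults` fields. The function name, the 80% threshold, and the sample hosts are assumptions made for illustration.

```python
def summarize_cluster(server_statuses, max_connection_ratio=0.8):
    """Combine per-node serverStatus-style documents into one summary.

    A host is flagged when its share of used connections exceeds
    max_connection_ratio (an arbitrary threshold chosen for this sketch).
    """
    total_connections = 0
    total_page_faults = 0
    warnings = []
    for host, status in sorted(server_statuses.items()):
        connections = status.get("connections", {})
        current = connections.get("current", 0)
        available = connections.get("available", 0)
        total_connections += current
        total_page_faults += status.get("extra_info", {}).get("page_faults", 0)
        capacity = current + available
        if capacity and current / capacity > max_connection_ratio:
            warnings.append(f"{host}: {current} of {capacity} connections in use")
    return {
        "hosts": len(server_statuses),
        "connections": total_connections,
        "page_faults": total_page_faults,
        "warnings": warnings,
    }


# Hand-written sample data (hypothetical hosts), not real server output.
sample_statuses = {
    "shard1.example.net:27017": {
        "connections": {"current": 50, "available": 950},
        "extra_info": {"page_faults": 12},
    },
    "shard2.example.net:27017": {
        "connections": {"current": 900, "available": 100},
        "extra_info": {"page_faults": 340},
    },
}

summary = summarize_cluster(sample_statuses)
print(summary)
```

Keeping the collection step separate from the summary makes the summary easy to test with sample data and leaves the choice of driver or transport open.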
To recap, knowing how and what to monitor in MongoDB can be difficult. The MongoDB Management Service (MMS) is an effective way to monitor the key metrics of MongoDB clusters. For those on a strict budget, the open-source tools and server commands described above offer free, on-premises alternatives.
For more information about MongoDB, check out the video presentations from sessions at the MongoDB World Conference 2014 at https://www.mongodb.com/mongodb-world/presentations.
-Robert Manning, Software Engineer at CollabraSpace
“Black Box MongoDB — Running MongoDB at Scale at Parse”, MongoDB World Conference 2014,
“MMS MongoDB Management Service”, https://mms.mongodb.com/learn-more
“Self Hosted Monitoring Tools”, http://docs.mongodb.org/manual/administration/monitoring