How to reclaim storage space on two-node MongoDB replica sets

At mongodb.org they seem to assume we can create MongoDB replica sets using unlimited numbers of instances which have infinite amounts of storage. In practice, however, we often need to use replica sets with only two nodes (plus arbiter) which have limited storage. The problem then is that MongoDB has the tendency to use vast amounts of disk space without reclaiming the space from dropped data, so it consumes ever-increasing amounts of storage. It’s then hard to deal with this storage problem given the limited options available in a two-node replica set.

A solution to this is clearing all the data from each node in turn, which forces MongoDB to rebuild its data using only the disk space it needs. When performed on a regular basis, this stops the amount of storage which MongoDB is using from constantly increasing at an unacceptable rate.

To achieve this, I wrote the following script which can be run on the primary node via cron as the mongod user on a regular basis (e.g. once a week, or even once a day, depending on the seriousness of the problem). The script firstly clears then rebuilds data on the secondary, then temporarily promotes the secondary to primary whilst clearing and rebuilding data on the primary, then puts everything back to normal again.

N.B. Whilst I’ve built a lot of safety checks and backups into this script, be aware that it deletes all data on your MongoDB nodes so there is high potential for serious problems such as complete data loss if you’re not careful. So, read through the following points very carefully, and deal with these issues before you even think about running the script:

  • Only run this on a properly functioning, problem-free two-node system where you have an arbiter configured on a third machine.
  • Follow the instructions in the comments at the top and ensure that you have the mongod user, SSH and sudo set up properly before commencing.
  • For the latest version of the script you’ll need the timeout Unix command installed, so make sure that’s available on your systems before you start.
  • Get this working properly and safely in test environments before considering deployment in any production environments.
  • Before adding this to cron, run it manually so you can see what it’s doing and stop it if necessary to fix issues.
  • Always make sure you have recent data backups before running it, so that you can restore all your data in the event of a disaster.
  • I’ve run this in various environments with CentOS 5 and CentOS 6, but I haven’t tested it on Debian or Ubuntu, so you may need to make some changes to run it on those distributions.

If you choose to use this then you do so at your own risk, and after all those warnings I’m not going to take any responsibility if you lose data as a result!

2016-01-11: I’ve modified the script to use the timeout command in various places. This adds a level of safety to the script to stop it from unexpectedly doing dangerous things if it doesn’t run properly for some reason.

Change your environments and hostnames in the script as needed. You can get the script from GitHub or copy and paste it below:

#!/bin/bash

# Force MongoDB to only use as much storage as it needs
# instead of taking up more and more space without reclaiming it

# Make sure of the following:
#
# 1. The mongod user has its shell set to /bin/bash on both machines
# 2. The mongod user has SSH keys set up such that it can SSH from 
#    the primary to the secondary without prompt
# 3. The mongod user has the following permissions in /etc/sudoers:
#    mongod ALL=NOPASSWD: /sbin/service mongod status, /sbin/service mongod stop, /sbin/service mongod start
#    (modify accordingly if not using Red Hat/CentOS)
# 4. Make sure the requiretty option is off in /etc/sudoers

# Only run as mongod user
if [ "$(whoami)" != "mongod" ] ; then echo "Not mongod user" ; exit 1 ; fi

# Determine environment - change these as needed
case "$(hostname)" in
  primary.production.mydomain.com ) primary=primary.production.mydomain.com ; secondary=secondary.production.mydomain.com ;;
  primary.staging.mydomain.com ) primary=primary.staging.mydomain.com ; secondary=secondary.staging.mydomain.com ;;
  primary.development.mydomain.com ) primary=primary.development.mydomain.com ; secondary=secondary.development.mydomain.com ;;
  * ) echo "Unknown environment" ; exit 1 ;;
esac

# Check sudo and SSH
if ! sudo -n /sbin/service mongod status > /dev/null ; then
  echo "Problem with sudo on $primary" ; exit 1
elif ! ssh -q $secondary "sudo -n /sbin/service mongod status > /dev/null" ; then
  echo "Problem with SSH and/or sudo on $secondary" ; exit 1
fi

# Take backup on primary
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Taking backup /tmp/dump on $primary..."
cd /tmp ; rm -rf dump ; mongodump > /dev/null
if [ "$?" != "0" ] ; then echo " Problem taking backup on $primary" ; exit 1 ; fi
echo " done"

# Clear data on secondary
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Clearing data on $secondary..."
timeout 300 ssh -q $secondary "sudo -n /sbin/service mongod stop > /dev/null"
if [ "$?" != "0" ] ; then echo " Problem stopping mongod on $secondary" ; exit 1 ; fi
timeout 300 ssh -q $secondary "rm -rf /var/lib/mongo/*"
if [ "$?" != "0" ] ; then echo " Problem clearing /var/lib/mongo on $secondary" ; exit 1 ; fi
timeout 300 ssh -q $secondary "sudo -n /sbin/service mongod start > /dev/null"
if [ "$?" != "0" ] ; then echo " Problem starting mongod on $secondary" ; exit 1 ; fi
echo " done"

# Wait for secondary to come back up
issecondary=$(timeout 300 ssh -q $secondary "echo 'db.isMaster()' | mongo" | grep secondary | awk -F '[ ,]' '{print $3}')
if [ "$?" != "0" ] ; then echo " Problem getting isMaster status on $secondary" ; exit 1 ; fi
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Waiting for $secondary to come up..."
until [ "$issecondary" == "true" ] ; do
  sleep 5
  echo -n "."
  issecondary=$(timeout 300 ssh -q $secondary "echo 'db.isMaster()' | mongo" | grep secondary | awk -F '[ ,]' '{print $3}')
  if [ "$?" != "0" ] ; then echo " Problem getting isMaster status on $secondary" ; exit 1 ; fi
done
echo " done"

# Demote primary so secondary is master
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Demoting $primary..."
echo 'rs.stepDown()' | mongo --quiet > /dev/null
if [ "$?" != "0" ] ; then echo " Problem demoting $primary" ; exit 1 ; fi
echo " done"

# Wait for secondary to take over as master
issecondary=$(echo 'db.isMaster()' | mongo | grep secondary | awk -F '[ ,]' '{print $3}')
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Waiting for $secondary to become master..."
until [ "$issecondary" == "true" ] ; do
  sleep 5
  echo -n "."
  issecondary=$(echo 'db.isMaster()' | mongo | grep secondary | awk -F '[ ,]' '{print $3}')
done
echo " done"

# Clear data on primary
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Clearing data on $primary..."
sudo -n /sbin/service mongod stop > /dev/null
if [ "$?" != "0" ] ; then echo " Problem stopping mongod on $primary" ; exit 1 ; fi
rm -rf /var/lib/mongo/*
sudo -n /sbin/service mongod start > /dev/null
if [ "$?" != "0" ] ; then echo " Problem starting mongod on $primary" ; exit 1 ; fi
echo " done"

# Wait for primary to come up
issecondary=$(echo 'db.isMaster()' | mongo | grep secondary | awk -F '[ ,]' '{print $3}')
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Waiting for $primary to come up..."
until [ "$issecondary" == "true" ] ; do
  sleep 5
  echo -n "."
  issecondary=$(echo 'db.isMaster()' | mongo | grep secondary | awk -F '[ ,]' '{print $3}')
done
echo " done"

# Demote secondary so primary is master
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Demoting $secondary..."
timeout 300 ssh -q $secondary "echo 'rs.stepDown()' | mongo --quiet > /dev/null"
if [ "$?" != "0" ] ; then echo " Problem demoting $secondary" ; exit 1 ; fi
echo " done"

# Wait for primary to take over as master
isprimary=$(echo 'db.isMaster()' | mongo | grep ismaster | awk -F '[ ,]' '{print $3}')
echo -n "$(date +'%Y-%m-%d %H-%M-%S') Waiting for $primary to become master..."
until [ "$isprimary" == "true" ] ; do
  sleep 5
  echo -n "."
  isprimary=$(echo 'db.isMaster()' | mongo | grep ismaster | awk -F '[ ,]' '{print $3}')
done
echo " done"