Thursday, April 12, 2012

Extract Data from Master oplog and Restore it in another MongoDB server

Extract data from oplog in MongoDB and restore in another MongoDB server.

Recently I ran into a problem where we had to make a large number of modifications on a MongoDB server, and running them directly would have caused issues on the production database. We broke the replication between the master and the slave, ran the operation on the slave, and then brought the slave back up to date using the wordnik-oss tools.

Unfortunately we did not have a replica set, only a plain master-slave setup. While the updates were running on the slave, I needed to keep track of the writes on the master so that I could apply them to the slave afterwards. For this I used mongo-admin-utils, a module of wordnik-oss.

Required software:

  1. Java and Git:
                  yum install java-1.6.0-sun-devel java-1.6.0-sun git

  2. Maven:
        Recent versions of wordnik-oss require Maven 3. With the Maven 3.0.4 binary tarball downloaded into /usr/src, unpack it:

                  cd /usr/src
                  tar zxf apache-maven-3.0.4-bin.tar.gz
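Before building, it may be worth confirming that the unpacked Maven really reports version 3 or later. The following is a minimal sketch, assuming the tarball was unpacked to /usr/src/apache-maven-3.0.4 as above; the helper name maven_major is made up for this example.

```shell
#!/bin/sh
# Path assumed from the unpack step above; adjust if Maven lives elsewhere.
MVN=/usr/src/apache-maven-3.0.4/bin/mvn

# maven_major (hypothetical helper): read `mvn -version` output on stdin and
# print the major version taken from the "Apache Maven X.Y.Z" line.
maven_major() {
    sed -n 's/^Apache Maven \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

if [ -x "$MVN" ]; then
    major=$("$MVN" -version 2>/dev/null | maven_major)
    if [ "${major:-0}" -ge 3 ]; then
        echo "Maven $major found, good enough for wordnik-oss"
    else
        echo "Maven 3 or later is required" >&2
    fi
else
    echo "mvn not found at $MVN" >&2
fi
```

If the script reports that mvn was not found, the tarball was probably unpacked somewhere else and MVN needs to point there.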

Building wordnik:

  1. Download

                            git clone wordnik

  2. Compile and build
             In my case I only needed mongo-admin-utils, so I packaged only that module.

                         cd wordnik/modules/mongo-admin-utils
                         /usr/src/apache-maven-3.0.4/bin/mvn package

             Once the build completes, mongo-admin-utils is ready to use on the host.

Get Incremental oplog Backup from mongo master server

                cd wordnik/modules/mongo-admin-utils
                ./bin/ com.wordnik.system.mongodb.IncrementalBackupUtil -o /root/mongo -h mastermongodb

                /root/mongo => output directory where the oplog is stored.
                mastermongodb => MongoDB master host.

** This tool cannot be run against the slave, because in a master-slave setup the slave does not keep an oplog.
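Since the backup utility keeps running, a quick way to see whether it is making progress is to look at what it has written to the output directory. The following is a minimal sketch, assuming the file names that appear in the directory listing in the comments below (last_timestamp.txt, oplog.bson and rotated oplog.NNNN.bson files). The function name backup_status is made up for illustration, and the contents of last_timestamp.txt are not interpreted.

```shell
#!/bin/sh
# backup_status (hypothetical helper): summarise what the backup has written
# into the given output directory. Returns non-zero if nothing is there yet.
backup_status() {
    dir=$1
    if [ ! -f "$dir/last_timestamp.txt" ]; then
        echo "no last_timestamp.txt in $dir yet"
        return 1
    fi
    count=0
    bytes=0
    for f in "$dir"/oplog*.bson; do
        # The glob stays unexpanded when nothing matches, so test each entry.
        if [ -f "$f" ]; then
            count=$((count + 1))
            bytes=$((bytes + $(wc -c < "$f")))
        fi
    done
    echo "oplog files: $count, total bytes: $bytes"
}

# /root/mongo is the output directory used in the backup command above.
backup_status /root/mongo || echo "backup has not written anything yet"
```

Running it twice a few minutes apart and comparing the byte totals gives a rough idea of whether new oplog entries are still being captured.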

Replay the Data from the oplog to the database

          I had some problems restoring data from the backup, and had to apply the following settings before the restore would run cleanly. First, raise the open-file limit:

    ulimit -n 20000
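Note that ulimit only affects the current shell and the processes started from it, so it has to be run in the same session that launches the replay. The following is a minimal sketch that raises the limit only when needed and reports what happened; 20000 is the value used above, and the variable names are just for illustration.

```shell
#!/bin/sh
wanted=20000
current=$(ulimit -n)

if [ "$current" = "unlimited" ] || [ "$current" -ge "$wanted" ]; then
    echo "open-file limit is already $current"
elif ulimit -n "$wanted" 2>/dev/null; then
    echo "raised open-file limit from $current to $(ulimit -n)"
else
    # Going above the hard limit needs root or a change in
    # /etc/security/limits.conf followed by a fresh login.
    echo "could not raise the limit; soft=$current hard=$(ulimit -Hn)" >&2
fi
```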

           I also added the following Java options so that the replay does not fail with out-of-memory (OOM) errors.

JAVA_CONFIG_OPTIONS="-Xms5g -Xmx10g -XX:NewSize=2g -XX:MaxNewSize=2g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:PermSize=2g -XX:MaxPermSize=2g"
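These options allow up to 10 GB of heap plus 2 GB of permanent generation, so the host needs comfortably more than 12 GB of memory. The following is a rough check, as a sketch only: it assumes a Linux host with /proc/meminfo, and the variable names are hypothetical.

```shell
#!/bin/sh
heap_gb=10   # from -Xmx10g
perm_gb=2    # from -XX:MaxPermSize=2g
need_kb=$(( (heap_gb + perm_gb) * 1024 * 1024 ))

if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB.
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    if [ "${total_kb:-0}" -ge "$need_kb" ]; then
        echo "MemTotal ${total_kb} kB covers the ${need_kb} kB the JVM may claim"
    else
        echo "MemTotal ${total_kb:-0} kB is below ${need_kb} kB; lower -Xmx or expect swapping" >&2
    fi
else
    echo "/proc/meminfo not available; check memory another way" >&2
fi
```

On a smaller machine, scaling the heap and generation sizes down together is probably a safer starting point than keeping the values above.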

Replay Command:

            ./bin/ com.wordnik.system.mongodb.ReplayUtil -i /root/mongo -h localhost

            /root/mongo => input directory containing the oplog backup.
            localhost => the MongoDB server to which the data should be added.

         If you face any issues, you can file an issue with the wordnik-oss project. The developer is an awesome person and will help you sort it out.


  1. Thanks for the write-up, buddy. It helped me a lot. I encountered one nuisance though. I ran

    ./bin/ com.wordnik.system.mongodb.IncrementalBackupUtil -o /root/mongo -h mastermongodb

    with the path to store incremental data and the mastermongodb parameter replaced with my own. I started this command under nohup, but it had not completed even after 2 days. Is this a bug, or is it supposed to be like that?

    1. It can run for days without any issues.

      1. Do you get the following sort of information in nohup.out?
      oplog: 401,275 records, 40,123 req/sec, 4,953 skip

      2. Do you see data getting populated in /root/backups?

      ./bin/ com.wordnik.system.mongodb.IncrementalBackupUtil -o /root/backups -h mongomaster

      oplog: 401,275 records, 40,123 req/sec, 4,953 skips

      [root@mongomaster mongo-admin-utils]$ ls -al /root/backups/
      total 128976
      drwxr-xr-x 2 root root 4096 Jul 17 08:24 .
      drwxr-x--- 9 root root 4096 Jul 17 08:24 ..
      -rw-r--r-- 1 root root 12 Jul 17 08:24 last_timestamp.txt
      -rw-r--r-- 1 root root 104857709 Jul 17 08:24 oplog.0001.bson
      -rw-r--r-- 1 root root 27050958 Jul 17 08:25 oplog.bson
