Dockerize Mongo to Get Consistent Data Across Your Development Environments

Author
Jonathan Gluck
August 25, 2021

We have all lamented the proverbial "works on my machine" problem, which often comes down to data that differs from one development environment to the next.

Wouldn't it be ideal if we could deploy our MongoDB database state from a single repository on all development environments?

In this tutorial, we will be dockerizing a Mongo database with some seed data so we can do just that!

Prerequisites

  1. Install Docker Desktop
  2. Install MongoDB Compass (for testing)

Step-by-Step How-To

Step 1: Download and unzip the file TonicMongoDockerTutorial_start.zip

  • The unzipped directory contains the following entries, which serve these purposes:
    • docker-compose.yml will orchestrate our two containers: one holding the actual database, the other acting as a client 'seeder' to seed our data.
    • Dockerfile defines the image for our custom Mongo client 'seeder'.
    • start.sh will contain the majority of the work we will have our 'seeder' execute as part of its CMD instruction.
    • collections contains the JSON files representing the seed data we want in our Mongo container.

Step 2: We will start with docker-compose.yml. Edit it so that it has the following:
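
As a minimal sketch, the file described in the bullets below can be written from a terminal with a heredoc. The service key my_mongo, the version value, and the exact layout are assumptions; MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD are the variables the official mongo image reads for the root user, and the credentials match the ones used for the optional test.

```shell
# Write a docker-compose.yml with a single Mongo service (sketch, not the original listing).
cat > docker-compose.yml <<'EOF'
version: "3.8"   # assumption; recent Compose releases ignore this key
services:
  my_mongo:
    image: mongo
    ports:
      - "27019:27017"   # host port 27019 -> container port 27017
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: rootpassword
EOF
```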

  • This creates our main container, my_mongo, where our Mongo database will live:
    • image: mongo pulls the Mongo image from Docker Hub
    • ports: maps local port 27019 to port 27017 (Mongo's default port) in the container. We pick 27019 to avoid conflicting with any Mongo already running on the host machine.
    • environment: sets the environment variables that configure Mongo's root user credentials.
  • (optional) To test this:
    • Run docker-compose up -d in the TonicMongoDockerTutorial_start directory.
    • Connect to it using MongoDB Compass. Start a new connection. Click Fill in connection fields individually, select authentication: username/password, and fill in the following parameters:
      • hostname: localhost
      • port: 27019
      • username: root
      • password: rootpassword
    • You should see the default databases, admin, config, and local. Now let's set it up with some seed data!

Step 3: Let's create the Docker image for our seeder container. Edit Dockerfile so that it looks like:
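
The five instructions explained in the bullets below can be assembled into a Dockerfile roughly like this. It is a sketch written with a heredoc; the ordering and the comments are assumptions, while the instructions themselves are the ones this step describes.

```shell
# Write the seeder Dockerfile (sketch assembled from the instructions in this step).
cat > Dockerfile <<'EOF'
# Start from the official Mongo image so the Mongo client tools are available.
FROM mongo
# Copy the seed data into the image.
COPY collections/Restaurants.json /collections/Restaurants.json
# Add the seeding script and make it executable.
ADD start.sh /start.sh
RUN chmod +x /start.sh
# Run the seeding script when the container starts.
CMD ["/start.sh"]
EOF
```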

  • This creates our seeder container:
    • FROM mongo pulls the Mongo image from Docker Hub; this gives the container the ability to use a Mongo client to seed the data.
    • COPY collections/Restaurants.json /collections/Restaurants.json copies the collection JSON into the seeder container.
    • ADD start.sh /start.sh adds the shell script (detailed in Step 4) into the container.
    • RUN chmod +x /start.sh makes the script executable in the container.
    • CMD ["/start.sh"] runs the script when the container starts.

Step 4: Let's put the logic to seed the database in start.sh:
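
A start.sh along these lines would match the description below. Treat it as a sketch: the credential flags reuse the root user from Step 2, and the --collection name and the shebang are assumptions.

```shell
# Write the seeding script (sketch; credential flags and collection name are assumptions).
cat > start.sh <<'EOF'
#!/bin/bash
# Give my_mongo time to start accepting connections.
sleep 10

# Import the seed collection into tutorial_db on the my_mongo host.
mongoimport \
  --host my_mongo \
  --username root \
  --password rootpassword \
  --authenticationDatabase admin \
  --db tutorial_db \
  --collection Restaurants \
  --file /collections/Restaurants.json
EOF
```

If the JSON file holds a single top-level array rather than one document per line, mongoimport also needs the --jsonArray flag.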

  • This uses mongoimport to import the collection data into the my_mongo container. Most of this should be self-explanatory:
    • sleep 10 waits for 10 seconds to give my_mongo enough time to start up.
    • --host my_mongo specifies the host name of the my_mongo container. Docker Compose will make sure this name resolves!
    • --db tutorial_db names the database we want to put our collection in. Mongo creates the database automatically if it does not exist.

Step 5: Now let's tie it all together and add our seeder image to docker-compose.yml.

  • We'll add the additional service entry for our seeder container:
    • mongo_seeder is our container name.
    • image: mongo_seeder will be the image name for when we build our Dockerfile.
    • depends_on starts my_mongo before the seeder. Compose also places both services on a shared default network, which is what lets start.sh reach my_mongo by host name, as mentioned in Step 4.
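
With the seeder entry added, the complete docker-compose.yml might look like the sketch below. The service keys and the version value are assumptions; the image names, port mapping, and credentials come from the earlier steps.

```shell
# Write the two-service docker-compose.yml (sketch, not the original listing).
cat > docker-compose.yml <<'EOF'
version: "3.8"   # assumption; recent Compose releases ignore this key
services:
  my_mongo:
    image: mongo
    ports:
      - "27019:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: rootpassword
  mongo_seeder:
    image: mongo_seeder   # built from our Dockerfile in Step 6
    depends_on:
      - my_mongo          # start the database container first
EOF
```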

Step 6: Testing it all out from the TonicMongoDockerTutorial_start directory.

  • Build the Dockerfile: docker build -f Dockerfile -t mongo_seeder .
  • Run docker-compose: docker-compose up -d
  • Wait for the seeder to complete, and your restaurant collection should be accessible via MongoDB Compass! (see optional part of Step 2 for connection details)
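
For repeated runs, the two commands above can be wrapped in a small shell function. The function name rebuild_and_seed is hypothetical, and the final docker-compose logs call is an optional extra for checking the seeder's output.

```shell
# Hypothetical helper: rebuild the seeder image, start both containers,
# then print whatever the seeder has logged so far.
rebuild_and_seed() {
  docker build -f Dockerfile -t mongo_seeder . &&
    docker-compose up -d &&
    docker-compose logs mongo_seeder
}
```

Call rebuild_and_seed from the TonicMongoDockerTutorial_start directory. Because start.sh sleeps for 10 seconds first, the import output may only appear if you rerun docker-compose logs mongo_seeder a little later.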

Step 7 (Bonus): Configuring users and their permissions.

  • You may want to configure different users in your Mongo database with different permissions. We can add a JavaScript file to our Docker container that runs Mongo commands to set that up!
  • Create a file called setupUsers.js in the TonicMongoDockerTutorial_start directory with the following contents:
  • The script connects to our database and creates a role and user with only the listCollections, find, and listDatabases privilege actions on the Restaurants collection, and no privileges on the Users collection. This should give us just enough to view Restaurants documents in MongoDB Compass without being able to edit them.
  • Make sure setupUsers.js is copied into the container in Dockerfile:
  • Invoke setupUsers.js in start.sh:
  • mongosh invokes the Mongo shell with our setupUsers.js file.
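
Put together, the three changes in this step could look like the sketch below, which rewrites setupUsers.js, Dockerfile, and start.sh with heredocs. The role name restaurantsViewer, the target paths, and the mongosh connection flags are assumptions; the user name, password, and privilege actions are the ones named in this step.

```shell
# 1. The user setup script (sketch; the role name is hypothetical).
cat > setupUsers.js <<'EOF'
// Switch to the tutorial database; the role and user are created there,
// so tutorial_db is also the user's authentication database.
db = db.getSiblingDB('tutorial_db');

db.createRole({
  role: 'restaurantsViewer',
  privileges: [
    {
      resource: { db: 'tutorial_db', collection: 'Restaurants' },
      actions: ['listCollections', 'find', 'listDatabases']
    }
  ],
  roles: []
});

db.createUser({
  user: 'lowAccessUser',
  pwd: 'lowaccessuserpassword',
  roles: [{ role: 'restaurantsViewer', db: 'tutorial_db' }]
});
EOF

# 2. The Dockerfile from Step 3 with one extra COPY for setupUsers.js.
cat > Dockerfile <<'EOF'
FROM mongo
COPY collections/Restaurants.json /collections/Restaurants.json
COPY setupUsers.js /setupUsers.js
ADD start.sh /start.sh
RUN chmod +x /start.sh
CMD ["/start.sh"]
EOF

# 3. start.sh from Step 4 with a mongosh call appended after the import.
cat > start.sh <<'EOF'
#!/bin/bash
sleep 10

mongoimport --host my_mongo --username root --password rootpassword \
  --authenticationDatabase admin --db tutorial_db \
  --collection Restaurants --file /collections/Restaurants.json

# Run the user setup script as the root user.
mongosh --host my_mongo --username root --password rootpassword \
  --authenticationDatabase admin /setupUsers.js
EOF
```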

  • To test this, rerun Step 6, log in with MongoDB Compass as username: lowAccessUser, password: lowaccessuserpassword, and confirm that we can only view documents in the Restaurants collection.

  • setupUsers.js does not have to be just for setting up users. You can do anything there that you can do in the mongo shell.

Conclusion

There we have it! Now all you have to do to reproduce your Mongo state anywhere is pull something like this from a repo onto a machine with Docker and run the commands from Step 6. You can download the complete code here: TonicMongoDockerTutorial_final.zip.

Credit to https://gist.github.com/yoobi55/5d36f13e902a75225a39a8caa5556551 for the Restaurant JSON data.
