I’ve been meeting folks who are having a hard time understanding the practicality of a document database and dynamic schema. I often hear, “I’m just trying to wrap my head around the concept.”

Some of the main concerns are:

  • Shouldn’t we be normalizing?
  • What about joins?
  • What about ORM?

To help with the introduction and transition, we’ll need to do a little more unraveling than wrapping. Let’s kick off this short discussion with a literal example of why a document store (in this case MongoDB) works well.

Imagine for a moment that it’s time for your yearly visit to the doctor. You’ve been put on a few supplements to improve your health and the doctor is monitoring your progress. Time for a checkup.

You walk into the clinic and give your name. The secretary looks you up in their booking system and confirms your internal patient number. “There you are, Mr. Solutions (Patient #000-00-0001). You’re right on time; please have a seat,” says the secretary. Since they have more than one Mr. Solutions, it’s a good thing they have some sort of ID. The secretary goes to get your file as you wait patiently to be called in.

Let me clue you in on something that happens behind the scenes. Having worked in a hospital, I know the secretary goes into the back office (database) and searches for file #000-00-0001 (query). They only get one file. They don’t get a set of files that each have a primary key and then join them together. There are no joins. They don’t put your name or ID number into a mapper to look you up.

If you took a peek into the back office, you would see small files and big files. Some files have yellow sticky notes and others have blue. They are filed (indexed) by patient number. While a handful of files look very similar in structure (schema), others seem to be quite different. One way or another, there is just one file per patient.
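
If it helps to see the back office in code, here is a minimal sketch in Python. It is only a stand-in: a plain dict keyed by patient number plays the part of an indexed collection, and no MongoDB server or driver is involved. The second patient, the `sticky_notes` and `allergies` fields, and the `pull_file` helper are all made up for illustration.

```python
# Hypothetical in-memory stand-in for the clinic's back office.
# The dict key (patient number) plays the role of the index.
back_office = {
    "000-00-0001": {
        "name": "Mr. LightCube Solutions",
        "medications": [{"name": "Activase", "dose": "100mg Vial"}],
        "sticky_notes": ["yellow"],  # illustrative field, not from a real schema
    },
    "000-00-0002": {
        # A made-up second patient whose file has a different shape:
        # no medications, but an allergies list instead.
        "name": "Mr. Other Solutions",
        "allergies": ["penicillin"],
    },
}


def pull_file(patient_number):
    """Return the single file for a patient, or None if there is no such file."""
    return back_office.get(patient_number)


patient_file = pull_file("000-00-0001")
print(patient_file["name"])
```

One lookup by ID returns one complete file, and the two files do not have to share the same fields.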

What’s the point? Welcome to a document database that works.

Many medical professionals made the choice that a single file/folder/grouping of papers was sufficient to store your medical history. Some clinics are messy, others are neat, yet this simple system works for them. Back at the clinic it seems there is some information needed from a cardiologist that is in their files.

The secretary quickly calls and asks for a copy of the information. When the copy arrives, they place it in your file.

You probably got the point a few paragraphs ago.

The same concept holds true with a document database. More often than not (yes there are exceptions) you’ll find that storing a majority of the pertinent data in a single document works well. What do I mean by a single document?

Taking our medical example, this is what it could look like if your medical history was digitized using the document database – MongoDB.

{
	"_id" : ObjectId("4c537d4b82fd211170000000"),
	"name" : "Mr. LightCube Solutions",
	"date_created" : "Sat Jan 2 2010 21:32:58 GMT-0400 (EDT)",
	"DOB" : "Sat Jan 1 1970 00:01:01 GMT-0400 (EDT)",
	"billing" : {
		"description" : "Home",
		"telephone" : "",
		"address" : "43 Happy Lane",
		"address_2" : "",
		"city" : "Light Land",
		"state" : "NY",
		"zip" : "111222",
		"country" : "USA"
	},
	"medications" : [
		{
			"_id" : "4c537b3382fd21df6f040000",
			"name" : "Activase",
			"description" : "tissue plasminogen activator",
			"dose" : "100mg Vial",
			"date_started" : "Fri Jul 30 2010 21:32:58 GMT-0400 (EDT)"
		}
	],
	"appointments" : [
		{
			"date" : "Fri Jul 30 2010 21:32:58 GMT-0400 (EDT)",
			"summary" : "Started patient on Activase due to heart disease.",
			"scheduled_checkup" : "Mon Aug 30 2010 11:00:00 GMT-0400 (EDT)"
		}
	],
	"conditions" : [
		"High Blood Pressure",
		"Excessive Vitamin D"
	]
}

What would that document look like in a SQL database? I would guess 4 or 5 different tables requiring joins would be in place.
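
For comparison, here is a rough sketch of a normalized version using Python’s built-in `sqlite3` module. The table and column names are my own assumptions, trimmed to four tables and a handful of columns to keep it short; a real schema would differ.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical normalized schema: one table per kind of information.
conn.executescript("""
    CREATE TABLE patients    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses   (patient_id INTEGER, city TEXT, state TEXT);
    CREATE TABLE medications (patient_id INTEGER, name TEXT, dose TEXT);
    CREATE TABLE conditions  (patient_id INTEGER, name TEXT);

    INSERT INTO patients    VALUES (1, 'Mr. LightCube Solutions');
    INSERT INTO addresses   VALUES (1, 'Light Land', 'NY');
    INSERT INTO medications VALUES (1, 'Activase', '100mg Vial');
    INSERT INTO conditions  VALUES (1, 'High Blood Pressure');
    INSERT INTO conditions  VALUES (1, 'Excessive Vitamin D');
""")

# Reassembling one patient's record takes three joins.
rows = conn.execute(
    """
    SELECT p.name, a.city, m.name, c.name
    FROM patients p
    JOIN addresses   a ON a.patient_id = p.id
    JOIN medications m ON m.patient_id = p.id
    JOIN conditions  c ON c.patient_id = p.id
    WHERE p.id = ?
    ORDER BY c.name
    """,
    (1,),
).fetchall()

for row in rows:
    print(row)
```

Note that the single patient comes back as two rows, one per condition, so the application still has to fold the rows back into one record.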

Why normalize all that valuable information?

If it’s not practical to do this in reality, then why do it digitally?

Store your data in a way that is practical and dynamic. For instance, maybe Mr. Solutions will be put on some special program that needs to be monitored differently from previous trials. Why build a new database just for that? Just adapt the schema and embed.
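
As a small illustration of “adapt the schema and embed,” here is a sketch using a plain Python dict in place of a stored document. The `special_program` field, the program name, and the reading values are all hypothetical, invented only to show the shape of the change.

```python
# A trimmed-down copy of the patient document, held as a plain dict.
patient = {
    "name": "Mr. LightCube Solutions",
    "conditions": ["High Blood Pressure", "Excessive Vitamin D"],
}

# Embed the new program directly in the existing document.
# No new table or database is created; the document simply gains a field.
patient["special_program"] = {
    "name": "Example Monitoring Program",  # hypothetical program name
    "readings": [],
}

# Later visits append to the embedded list (illustrative values only).
patient["special_program"]["readings"].append(
    {"date": "2010-08-30", "systolic": 128, "diastolic": 82}
)

print(sorted(patient.keys()))
```

Other patients’ documents are untouched and do not need the new field. In MongoDB itself, a change like this is usually expressed with the `$set` and `$push` update operators rather than by editing a dict in memory.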

A document database offers the flexibility, speed and simplicity that we already live by in other systems.

Hopefully this brief trip to the doctor helped to clear things up. Now go take your MongoDB medicine and call me in the morning. :)

2 thoughts on “The Medicine of a Document Database”

  1. The biggest counter to your argument is that join speed is slow when your storage is a filing cabinet and your query engine is a medical secretary. So the counterargument is that computers make it possible to use the relational model instead of the document database model. That being said, the analogy is solid.

    I know a programmer who wrote a medical database in Postgres and discusses many interesting solutions to the interesting problems he had. If I were to design one and had to choose between an RDBMS and Mongo, I’d probably think long and hard about which to use.

  2. I’ve been telling people for years that normalization is bull, but nobody believed me. It really doesn’t matter how you store the information as long as you can get it back quickly and in order. Flexibility is key in some cases and not so key in others. I would store the schema in an RDBMS and the data in Mongo.

