The parallels between NoSQL and self-inflicted torture

It never ceases to amaze me how much pain and contortion people who use NoSQL databases need to go through.

Consider (for example) this blog post on the MongoHQ blog.

Here's the executive summary for you (in relational database speak).

Q: I have an entity that tracks documents and a document has an attribute called publication status and this includes such things as 'draft', 'waiting proof', 'approved', 'published' and so on. I would like to get a list of documents that are still unpublished and that includes documents that have statuses of 'draft', 'waiting proof' or 'approved' (but not the status 'published'). Write me a program that would get these.

A: Denormalize the schema and add a new column called "unpublished" and introduce logic into your application that will set the unpublished field (a boolean) to be TRUE when the document publication status is one of the various not published values and then write a query that would search the table with the restriction:


And dear reader, you do this because the "performance" is better.

Here's the SQL variant based on a simple one table structure.



Or maybe you need to enumerate the statuses for some reason such as that there are various 'published' statues.

SELECT ... FROM DOCUMENTS WHERE STATUS IN ( 'draft', 'waiting proof', 'approved' );

And you could create an index on STATUS and the select would be really quick. On a small 3 node ParElastic configuration running on Amazon AWS with 71 million documents, with each document record being approximately 850 bytes, I ran this query and sent the results off to a file in under 0.5s.

So here's the accounting for those of you who are keeping score at home.




1. Write the obvious NoSQL program to do this as described in the MongoDB blog post (the thing called "a slow query in Mon­goDB")

a while

0 min

2.  Find that the obvious NoSQL program is slow and attempt to debug it

a while

0 min

3. Find the MongoDB blog post on the subject

a while

0 min

4. Write the code to manage the new field

a while

0 min

5. Write the code to query the new field


0 min

6. Debug the program when someone else changes it and the magic field is not set right (database people call this anomalies when they discuss normalization)

a while

0 min

7. Write a simple query against a database


30 sec

8. Relaxing on the beach knowing that the program will perform properly …

0 min

a while

Tell me again, why is this NoSQL way of thinking a good idea?


You might have missed the point.

That hard part is the sort - as Chris' said, see my blog at for details.

Identifying when something is NoSQL

See for an explanation of how this problem and solution are exactly the same in an RDBMS and this isn't a NoSQL issue.

Add new comment

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
This question is for testing whether you are a human visitor and to prevent automated spam submissions.
Enter the characters shown in the image.