A few days ago, Microsoft Research (MSR) announced Trinity, a graph database that is powering some important initiatives within Microsoft such as Probase. This is the second serious initiative of MSR in the NOSQL space. Last year, MSR released a map reduce engine under the code name Dryad which is now available as part of MSDN Dev Labs.
Regardless of whether Trinity becomes an official Microsoft product or not, this announcement is a testament to the emerging influence of NOSQL databases for both on-premise and cloud solutions. There are many explanations for the increasing popularity of the NOSQL approach but, essentially, they all boil down to the fact that the world needs new alternatives for storing and indexing ever-increasing amounts of semi-structured and unstructured data.
We can't keep trying to model the entire world with rows and columns.
We can certainly find ways to store unstructured data in relational databases but, most of the time, we end up paying a price in terms of complexity and, to some extent, scalability.
Like many other evolutionary approaches in the software development industry in recent years, NOSQL databases have found a home within the open source community but (understandably so) have experienced some resistance from the big relational DB vendors such as Microsoft, Oracle or IBM. Despite that, the adoption of NOSQL databases has been growing exponentially, even within big enterprise companies. To cite a recent example, last December the US Federal Government publicly announced its commitment to Apache Cassandra for real-time analytics.
Types of NOSQL Databases
When thinking about adopting NOSQL databases, we should be aware that they vary in aspects such as the storage model, the APIs and the consistency model. Here is a list of some of the most popular models in the current market:
- Wide Column Store (Map-Reduce): This type of database specializes in large data computation and batch processing. Some of the most notable technologies in this space are Hadoop, Cassandra and Dryad.
- Document Stores: This type of database is optimized to store semi-structured data in the form of documents encoded in formats like JSON, BSON, YAML, etc. Some of the top technologies in this space include MongoDB, CouchDB and RavenDB.
- Key Value Stores: Key-value stores are schema-less data stores that let the application store its data under a key. The value is usually stored as a data type of a programming language or as an object, so there is no need for a fixed data model. Some great data stores in this space are Windows Server AppFabric Caching, MemcacheDB and Tokyo Cabinet.
- Eventually Consistent Key Value Stores: This type of database evolved from the eventually consistent transaction model. Some technologies in this space are Project Voldemort and Amazon Dynamo.
- Graph databases: This type of database uses graph structures with nodes, edges and properties to represent and store information. Some great technologies in this space include Neo4J, Infinite Graph and Microsoft Trinity.
A more exhaustive list can be found here.
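To make the differences between these storage models a bit more concrete, here is a minimal sketch that shapes the same made-up customer data four different ways using plain Python structures. It does not use any real database API; the keys, field names and the `neighbors` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical data: one customer and one purchase, shaped four different ways.

# Key-value store: an opaque value looked up by a single key.
key_value_store = {
    "customer:42": b'{"name": "Ada", "city": "London"}',
}

# Document store: one nested, self-describing document per entity.
document_store = {
    "customers": [
        {
            "_id": 42,
            "name": "Ada",
            "city": "London",
            "orders": [{"sku": "book-1", "qty": 2}],
        }
    ]
}

# Wide column store: a row key mapping to column families,
# each holding an arbitrary set of columns.
wide_column_store = {
    "customer:42": {
        "profile": {"name": "Ada", "city": "London"},
        "orders": {"order-1": "book-1"},
    }
}

# Graph database: nodes with properties, connected by labeled edges.
nodes = {
    "customer:42": {"name": "Ada"},
    "product:book-1": {"title": "Graph Theory"},
}
edges = [("customer:42", "BOUGHT", "product:book-1")]


def neighbors(node_id, label):
    """Return ids of nodes reachable from node_id over edges with this label."""
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]


print(neighbors("customer:42", "BOUGHT"))
```

Notice that none of these shapes requires a schema declared up front, and that the graph form answers relationship questions (who bought what) by following edges rather than joining tables.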
Can I use NOSQL databases in a .NET solution?
Not only can you, but I would encourage you to start thinking about it. Currently, there are many NOSQL databases that offer a native .NET interface or interoperable APIs such as HTTP. MongoDB, CouchDB, RavenDB and Riak are some of the NOSQL databases that are becoming increasingly popular among .NET enterprise customers.
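As a rough illustration of what an interoperable HTTP API looks like, the sketch below builds (but does not send) a CouchDB-style request that stores a JSON document with an HTTP PUT to a /database/document-id URL. The host, the `customers` database name, the document contents and the `build_put_document_request` helper are assumptions made up for this example. Since the request is plain HTTP plus JSON, the same call could be issued from any .NET HTTP client.

```python
import json
import urllib.request


def build_put_document_request(base_url, database, doc_id, document):
    """Build (but do not send) an HTTP PUT that stores a JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{database}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical local server and database name, for illustration only.
request = build_put_document_request(
    "http://localhost:5984", "customers", "42", {"name": "Ada", "city": "London"}
)
print(request.get_method(), request.full_url)
```

Actually sending the request (for example with `urllib.request.urlopen`) requires a running server, so that step is left out of the sketch.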
What about Windows Azure?
Azure offers a NOSQL database as a first-class citizen in the form of the Table Service, which lets you store large volumes of semi-structured data (still modeled as rows and columns, though). Additionally, it is possible to run many of the existing NOSQL databases as Azure worker or web roles, using Blob storage as the underlying persistence layer. With the right mechanism for partitioning and distributing the data within Blob storage, this approach can be as efficient as relying on the Table Service, and sometimes more so. It also lets you keep your on-premise and cloud solutions consistent on the Windows Azure platform.
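One possible way to partition and distribute data across blobs is to hash each record key into a fixed number of partitions and use the partition as a blob name prefix. The sketch below shows that idea only; it is not how any particular NOSQL database lays out its data on Azure, it does not call the Azure storage API, and the partition count, naming scheme and function names are all assumptions.

```python
import hashlib


def partition_for(key, partition_count):
    """Map a record key to a stable partition index using a hash of the key."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % partition_count


def blob_path(key, partition_count=16):
    """Build a hypothetical blob name of the form 'partition-NN/<key>.json'."""
    return f"partition-{partition_for(key, partition_count):02d}/{key}.json"


# Hypothetical record keys; each one always maps to the same partition.
for key in ["customer-1", "customer-2", "customer-3"]:
    print(key, "->", blob_path(key))
```

A cryptographic hash is used here only because it is stable across processes and machines, unlike Python's built-in `hash` for strings. Changing `partition_count` later remaps most keys, so a real design would need a plan for repartitioning.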
Nice theory! Have you done this before?
Absolutely! At Tellago, we are really passionate about adopting new technology trends that we believe can make a significant difference in our solutions for our customers. With that in mind, last year we started adopting and evangelizing NOSQL technologies with our .NET customers. As a result, we have implemented and deployed several solutions that rely on NOSQL infrastructures for various Fortune 500 companies. I will be sharing many of the lessons we learned from those implementations in future posts, but I can tell you that we got to experience many of the benefits of NOSQL databases in terms of simplicity, agility and scalability. We also use NOSQL databases heavily as part of our work at Tellago Studios.