What is NoSQL?
NoSQL databases, also referred to as non-relational databases, store and retrieve data using models other than the tables of relational databases. For example, a NoSQL database can store data in the form of documents.
NoSQL databases are commonly used for data such as user information, social graphs, geographic location data and other user-generated content.
Examples of NoSQL databases: MongoDB, Cassandra.
NoSQL encompasses a wide variety of database technologies that were developed in response to the demands of building modern applications:
Developers are working with applications that create massive volumes of new, rapidly changing data types — structured, semi-structured, unstructured and polymorphic data.
Long gone is the twelve-to-eighteen month waterfall development cycle. Now small teams work in agile sprints, iterating quickly and pushing code every week or two, some even multiple times every day.
Applications that once served a finite audience are now delivered as services that must be always-on, accessible from many different devices and scaled globally to millions of users.
Organizations are now turning to scale-out architectures using open source software, commodity servers and cloud computing instead of large monolithic servers and storage infrastructure.
Relational databases were not designed to cope with the scale and agility challenges that face modern applications, nor were they built to take advantage of the commodity storage and processing power available today.
Types of NoSQL Databases
The different types of NoSQL databases are as follows:
1. Key-value pair:
Every item in the database is stored as an attribute name (key) linked with a value. Conceptually, the store behaves like an associative array: a value is written and read back by its key.
2. Wide-column stores:
Instead of storing data in rows (relational tuples), these databases are designed to store data in groups of columns. Wide-column stores are highly scalable and can improve performance for queries over large data sets.
Examples: HBase, Bigtable, Hypertable.
3. Document Database:
Document databases build on the key-value model, but each value is a more complex data structure called a document. Each document is assigned a unique key, which is used to retrieve the document.
Examples: MongoDB, CouchDB.
4. Graph Database:
A graph database is based on graph theory and is designed for data elements that are interconnected in a network.
Examples: Neo4j, AllegroGraph.
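The key-value and document models in the list above can be sketched with plain Python dictionaries. This is only an illustration of the data models, not the API of any particular database; the names kv_store, users, find_by_key and find_where are made up for this example.

```python
import json

# Key-value model: a value is stored and fetched by its key.
kv_store = {}
kv_store["session:42"] = "alice"       # put
session_owner = kv_store["session:42"]  # get by key

# Document model: each unique key maps to a structured document.
# Documents in the same collection do not need identical fields.
users = {
    "u1": {"name": "Alice", "email": "alice@example.com"},
    "u2": {"name": "Bob", "location": {"lat": 51.5, "lon": -0.1}, "tags": ["admin"]},
}

def find_by_key(collection, key):
    """Return the document stored under key, or None if it is absent."""
    return collection.get(key)

def find_where(collection, field, value):
    """Return the keys of documents whose top-level field equals value."""
    return sorted(k for k, doc in collection.items() if doc.get(field) == value)

print(session_owner)
print(json.dumps(find_by_key(users, "u2"), indent=2))
print(find_where(users, "name", "Alice"))
```

Note that the two documents have different fields, which is what a flexible schema allows.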
The Benefits of NoSQL
Compared to relational databases, NoSQL databases can be more scalable and can perform better for certain workloads, and their data model addresses several issues that the relational model is not designed to address:
Large volumes of rapidly changing structured, semi-structured, and unstructured data
Agile sprints, quick schema iteration, and frequent code pushes
Object-oriented programming that is easy to use and flexible
Geographically distributed scale-out architecture instead of expensive, monolithic architecture
Advantages of NoSQL
>>NoSQL refers to non-relational databases; the name is also read as "Not only SQL".
>>NoSQL has a dynamic schema, so new kinds of data can be added without first changing a predefined schema.
>>It is highly scalable.
>>NoSQL supports distributed computing.
Disadvantages of NoSQL
>>NoSQL has no single defined standard; query languages and interfaces differ from one database to another.
>>NoSQL query capabilities are often more limited than those of SQL.
CAP Theorem for NoSQL
The CAP Theorem is also known as Brewer's Theorem. It states that a distributed data store can guarantee at most two of the following three properties at the same time:
1. Consistency:
The database should remain consistent after an operation executes, and all nodes in the network should see the same data at the same time.
2. Availability:
The system is always available, without any downtime, so every request receives a response.
3. Partition tolerance:
The system can continue to operate even if sudden partitioning occurs due to a network failure. For example, if a network failure makes one server unreachable, another server can still serve the request.
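The trade-off described by the theorem can be illustrated with a toy two-replica store. This is a simplified sketch under stated assumptions, not how any real database is implemented: the class names, the "CP" and "AP" mode labels and the partitioned flag are all invented for the example, and only writes are modeled as being refused.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

class TwoNodeStore:
    """Toy store with two replicas; mode is 'CP' or 'AP' (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica("a"), Replica("b")
        self.partitioned = False  # True simulates a network partition

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Prefer consistency: refuse the write instead of diverging.
                raise RuntimeError("unavailable during partition")
            # Prefer availability: accept locally; the other replica is stale.
            replica.data[key] = value
            return
        # No partition: the write reaches both replicas.
        replica.data[key] = value
        other.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)

ap = TwoNodeStore("AP")
ap.write(ap.a, "x", 1)
ap.partitioned = True
ap.write(ap.a, "x", 2)  # accepted, but replica b still holds the old value
print(ap.read(ap.a, "x"), ap.read(ap.b, "x"))
```

In "CP" mode the same second write raises an error instead, so the replicas never disagree but the system is not available for that request.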
Amazon DynamoDB
>>Amazon DynamoDB is a NoSQL database service that provides fast performance.
>>The DynamoDB data model consists of tables, items and attributes.
>>Each attribute in DynamoDB is represented as a name-value pair.
>>Amazon DynamoDB is a web service that uses HTTP or HTTPS as its transport protocol.
1. Primary key:
To create a table, a user needs to specify the primary key, which uniquely identifies each item in the table.
2. Partition key:
A primary key which has only one attribute is known as the partition key.
DynamoDB uses the partition key value as input to an internal hash function; the output determines the partition in which the item is stored.
3. Composite primary key:
A composite primary key has two attributes: the first attribute is the partition key and the second attribute is the sort key.
All items with the same partition key are stored together, sorted by sort key value.
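The roles of the two key attributes can be sketched in plain Python. This is a simulation for illustration only: DynamoDB's real hash function and partition count are internal and not shown here, so NUM_PARTITIONS, the SHA-256 based partition_for and the Table class are all assumptions made for this example.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical value, chosen only for this illustration

def partition_for(partition_key):
    """Map a partition key to a partition number using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

class Table:
    def __init__(self):
        # partition number -> {partition key -> list of (sort key, item)}
        self.partitions = {p: {} for p in range(NUM_PARTITIONS)}

    def put_item(self, partition_key, sort_key, item):
        bucket = self.partitions[partition_for(partition_key)]
        rows = bucket.setdefault(partition_key, [])
        # The composite key is unique: replace any item with the same sort key.
        rows[:] = [row for row in rows if row[0] != sort_key]
        rows.append((sort_key, item))
        rows.sort(key=lambda row: row[0])  # keep items ordered by sort key

    def query(self, partition_key):
        """Return (sort key, item) pairs for one partition key, in sort order."""
        bucket = self.partitions[partition_for(partition_key)]
        return list(bucket.get(partition_key, []))

table = Table()
table.put_item("user#1", "2024-03-01", {"action": "logout"})
table.put_item("user#1", "2024-01-15", {"action": "login"})
table.put_item("user#2", "2024-02-10", {"action": "login"})
print(table.query("user#1"))
```

Items for "user#1" come back ordered by their sort key regardless of insertion order, because they share a partition key and are kept together.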
Data Indexing in Amazon DynamoDB
Amazon DynamoDB creates and maintains indexes for the primary key attributes, which allows efficient access to and retrieval of the data in the table.
In addition to the primary key, a table can have one or more secondary indexes.
The two types of secondary indexes are:
1. Global secondary index:
It is an index with a partition key and a sort key that can be different from those of the table.
2. Local secondary index:
It is an index that has the same partition key as the table but a different sort key.
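The difference between the two index types can be sketched with in-memory data. This is a conceptual illustration, not the DynamoDB API: the orders data, the attribute names and the build_index helper are invented for the example.

```python
from collections import defaultdict

orders = [
    {"customer": "c1", "order_date": "2024-01-05", "status": "shipped", "total": 40},
    {"customer": "c1", "order_date": "2024-02-11", "status": "open", "total": 15},
    {"customer": "c2", "order_date": "2024-01-20", "status": "open", "total": 99},
]

def build_index(items, partition_attr, sort_attr):
    """Group items by partition_attr and order each group by sort_attr."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda it: it[sort_attr])
    return index

# Base table: partition key 'customer', sort key 'order_date'.
table = build_index(orders, "customer", "order_date")
# Local secondary index: same partition key, different sort key.
lsi_by_total = build_index(orders, "customer", "total")
# Global secondary index: a different partition key (and sort key).
gsi_by_status = build_index(orders, "status", "order_date")

print([o["total"] for o in lsi_by_total["c1"]])
print([o["customer"] for o in gsi_by_status["open"]])
```

The local index still answers questions per customer, only in a different order, while the global index allows lookups by an attribute that is not part of the table's primary key.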
Wide-column stores organize data tables as columns instead of as rows. Wide-column stores can be found both in SQL and NoSQL databases. For queries that touch only a few columns, wide-column stores can scan large data volumes faster than conventional row-oriented relational databases. A wide-column data store can be used for recommendation engines, catalogs, fraud detection and other types of data processing. Google Bigtable, Cassandra and HBase are examples of wide-column stores.
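The idea of organizing a table by column rather than by row can be sketched as follows. This is a deliberately simplified illustration with made-up data and helper names; production wide-column stores add features such as column families and sparse rows that are not modeled here.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"item": "laptop", "price": 900, "category": "tech"},
    {"item": "novel", "price": 12, "category": "books"},
    {"item": "phone", "price": 600, "category": "tech"},
]

def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    return {name: [row[name] for row in rows] for name in rows[0]}

def average(values):
    return sum(values) / len(values)

# Column-oriented layout: all values of one column are stored together.
columns = to_columns(rows)

# An aggregate over one column reads only that column's values.
average_price = average(columns["price"])
print(average_price)
```

With the column layout, computing the average price does not need to touch the item or category values at all.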
Graph data stores organize data as nodes, which are like records in a relational database, and edges, which represent connections between nodes. Because the graph system stores the relationship between nodes, it can support richer representations of data relationships. Also, unlike relational models reliant on strict schemas, the graph data model can evolve over time and use. Graph databases are applied in systems that must map relationships, such as reservation systems or customer relationship management. Examples of graph databases include AllegroGraph, IBM Graph, Neo4j and Titan.
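A minimal sketch of nodes and edges, using only the Python standard library, shows how stored relationships can be followed directly. The node names, edge labels and the neighbors and reachable helpers are invented for this example and do not correspond to the query language of any graph database.

```python
from collections import deque

# Nodes carry properties, like records in a relational database.
nodes = {
    "alice": {"type": "customer"},
    "bob": {"type": "customer"},
    "carol": {"type": "customer"},
    "flight42": {"type": "reservation"},
}

# Edges are (source, label, destination) and represent connections.
edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("alice", "BOOKED", "flight42"),
]

def neighbors(node, label):
    """Return nodes reached from node by one edge with the given label."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]

def reachable(start, label):
    """Breadth-first traversal along edges with the given label."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current, label):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

print(neighbors("alice", "BOOKED"))
print(reachable("alice", "KNOWS"))
```

The traversal finds "carol" through "bob" without any join, because the relationship itself is part of the stored data.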
Evolution of NoSQL
Berkeley DB was an influential system in the early evolution of NoSQL database usage. Developed at the University of California, Berkeley, beginning in the 1990s, Berkeley DB was widely described as an embedded database that closely supported specific applications’ storage needs. This open source software provided a simple key-value store. Berkeley DB was commercially released by Sleepycat Software in 1999. The company was later acquired by Oracle in 2006. Oracle has continued to support open source Berkeley DB.
Other NoSQL databases that have gained prominence include cloud-hosted NoSQL databases such as Amazon DynamoDB and Google Bigtable, as well as Apache Cassandra and MongoDB.
The basic NoSQL database classifications are only guides. Over time, vendors have mixed and matched elements from different NoSQL database family trees to achieve more generally useful systems. That evolution is seen, for example, in MarkLogic, which has added a graph store and other elements to its original document databases. Couchbase Server supports both key-value and document approaches. Cassandra has combined key-value elements with a wide-column store and a graph database. Sometimes NoSQL elements are mixed with SQL elements, creating a variety of databases that are referred to as multimodel databases.
Implementing a NoSQL Database
Organizations often begin with a small-scale trial of a NoSQL database, which makes it possible to develop an understanding of the technology in a low-stakes way. Many NoSQL databases are also open source, meaning that they can be downloaded, implemented and scaled at little cost. Where development cycles become faster as a result, organizations can also innovate more quickly and deliver a better customer experience at a lower cost.
As you consider alternatives to legacy infrastructures, you may have several motivations: to scale or perform beyond the capabilities of your existing system, identify viable alternatives to expensive proprietary software, or increase the speed and agility of development. When selecting the right database for your business and application, there are five important dimensions to consider.