What is a Graph Database?

A graph database is a database that uses a graph model, made up of nodes and the relationships between them, to represent and store data.

The graph database model is an alternative to the relational model.

In a relational database, data is stored in tables using a rigid structure with a predefined schema.

In a graph database, there is no predefined schema as such. Rather, any schema is simply a reflection of the data that has been entered. As more varied data is entered, the schema grows accordingly.
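To make the idea of a schema that simply reflects the data more concrete, here is a minimal Python sketch. It is not how any particular graph DBMS works internally; the PropertyGraph class, its method names, and the sample data are all hypothetical and exist only for illustration. The schema() method declares nothing up front; it just reports whatever labels, properties, and relationship patterns are present at the time.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory property graph (hypothetical); no schema is declared up front."""

    def __init__(self):
        self.nodes = {}  # node id -> (label, properties dict)
        self.edges = []  # (source id, relationship type, target id)

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = (label, properties)

    def add_edge(self, source, rel_type, target):
        self.edges.append((source, rel_type, target))

    def schema(self):
        """Derive the 'schema' from whatever data is currently stored."""
        properties_by_label = defaultdict(set)
        for label, properties in self.nodes.values():
            properties_by_label[label].update(properties)  # adds the property names
        patterns = {
            (self.nodes[s][0], rel, self.nodes[t][0]) for s, rel, t in self.edges
        }
        return dict(properties_by_label), patterns


graph = PropertyGraph()
graph.add_node("p1", "Person", name="Tom Hanks")
graph.add_node("m1", "Movie", title="Cast Away")
graph.add_edge("p1", "ACTED_IN", "m1")
labels_before, patterns_before = graph.schema()

# Entering more varied data grows the derived schema; nothing is migrated.
graph.add_node("m2", "Movie", title="That Thing You Do", released=1996)
graph.add_edge("p1", "DIRECTED", "m2")
labels_after, patterns_after = graph.schema()

print(labels_after)
print(patterns_after)
```

After the second batch of data is entered, the Movie label has gained a released property and a new DIRECTED pattern has appeared, without any table or column having been altered.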

What does a Graph Database Look Like?

Here’s an example of a simple graph database.

Example of a simple graph database.

The blue and green circles are nodes. The arrows represent relationships.

You can immediately see that the relationship Tom Hanks has with all those movies (i.e. the green circles) is that he’s acted in them. But he has also directed one of the movies he acted in, so he has two relationships with that particular movie.

Actually, this is not the full database. It’s just the results of a query from the sample Movie database supplied with the Neo4j graph database management system.

Do Graph Databases use SQL?

Most graph database management systems are NoSQL systems, meaning that they either don’t support SQL, or they support it alongside other query languages.

NoSQL originally meant “Non SQL” but this is sometimes expanded to mean “Not only SQL”.

In most cases, SQL doesn’t make sense with the graph architecture. Graph databases are typically structured very differently from the relational model that SQL was designed for.

Many graph database management systems use their own proprietary query language.

The graph in the previous example was generated from the following query:

MATCH (tom:Person {name: "Tom Hanks"})-[:ACTED_IN]->(tomHanksMovies) RETURN tom,tomHanksMovies

That code is written in Cypher, the query language created for Neo4j.
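If you haven't seen a pattern-matching query before, the following Python sketch shows roughly what the query above asks for: start at a Person node named "Tom Hanks", follow each outgoing ACTED_IN relationship, and return what is at the other end. It is only an illustration of the pattern's meaning, not of how Neo4j executes it. The match_outgoing function and the handful of nodes and edges are stand-ins invented for this example.

```python
# Illustrative stand-in data, not an export of the real sample database.
nodes = {
    1: {"label": "Person", "name": "Tom Hanks"},
    2: {"label": "Movie", "title": "Cast Away"},
    3: {"label": "Movie", "title": "That Thing You Do"},
    4: {"label": "Person", "name": "Meg Ryan"},
    5: {"label": "Movie", "title": "Sleepless in Seattle"},
}
# Each edge is (source node id, relationship type, target node id).
edges = [
    (1, "ACTED_IN", 2),
    (1, "ACTED_IN", 3),
    (1, "DIRECTED", 3),
    (1, "ACTED_IN", 5),
    (4, "ACTED_IN", 5),
]


def match_outgoing(nodes, edges, label, name, rel_type):
    """Mimic (start:label {name: name})-[:rel_type]->(target); hypothetical helper."""
    rows = []
    for source, rel, target in edges:
        start = nodes[source]
        if start["label"] == label and start.get("name") == name and rel == rel_type:
            rows.append((start, nodes[target]))
    return rows


acted_rows = match_outgoing(nodes, edges, "Person", "Tom Hanks", "ACTED_IN")
acted_titles = [movie["title"] for _, movie in acted_rows]

directed_rows = match_outgoing(nodes, edges, "Person", "Tom Hanks", "DIRECTED")
directed_titles = [movie["title"] for _, movie in directed_rows]

print(acted_titles)
print(directed_titles)
```

Changing the relationship type from ACTED_IN to DIRECTED is all it takes to ask a different question of the same data.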

Some graph database systems support other languages, such as JavaScript, JSON, XQuery, and SPARQL.

Graph Database Management Systems (GDBMS)

Graph databases are created and managed using a database management system (DBMS) specifically designed for graph databases. Such a system can be referred to as a Graph Database Management System (GDBMS) or a Graph DBMS.

Some GDBMSs use a relational storage engine, while the NoSQL systems typically use a completely different architecture for their storage engine, for example a key-value store or document-oriented database.

Many NoSQL database systems use tags or properties to define relationships between nodes. This can help return large amounts of related data without the need for joins across many tables, which the relational model would require when queried with SQL.
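Here is a small Python sketch of that difference, using the sqlite3 module from the standard library for the relational side and a plain dictionary as a stand-in for the graph side. The table names, column names, and data are assumptions made up for this example, and a real graph DBMS stores its relationships very differently from a Python dictionary.

```python
import sqlite3

# Relational side: people, movies, and a link table that connects them.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE acted_in (person_id INTEGER, movie_id INTEGER);
    INSERT INTO person VALUES (1, 'Tom Hanks');
    INSERT INTO movie VALUES (10, 'Cast Away'), (11, 'Apollo 13');
    INSERT INTO acted_in VALUES (1, 10), (1, 11);
    """
)

# Getting from a person to their movie titles takes two joins.
sql_titles = [
    row[0]
    for row in conn.execute(
        """
        SELECT movie.title
        FROM person
        JOIN acted_in ON acted_in.person_id = person.id
        JOIN movie ON movie.id = acted_in.movie_id
        WHERE person.name = ?
        ORDER BY movie.id
        """,
        ("Tom Hanks",),
    )
]
conn.close()

# Graph-style side: the relationship is kept with the node and followed directly.
graph = {"Tom Hanks": {"ACTED_IN": ["Cast Away", "Apollo 13"]}}
graph_titles = graph["Tom Hanks"]["ACTED_IN"]

print(sql_titles)
print(graph_titles)
```

Both approaches return the same titles. The difference is in how you get there, and it becomes more noticeable as the number of hops (and therefore joins) increases.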

Example of a Graph DBMS

Here’s another example of some query results using the Neo4j graph database management system, this time showing the user interface.

Example of query results in the Neo4j graph database. The results show the nodes (blue and green circles) and the relationships (arrows) between them.

The graphical representation of data in a graph database is in contrast to the tabular structure presented in the commonly used relational database model.

Example of a Relational DBMS

Just for comparison, here’s an example of a query in MySQL (a relational DBMS). Note that the results are displayed in a tabular format (as opposed to the nodes and arrows that the graph model uses).

A query in MySQL Workbench. The query results are provided in a tabular format.

Benefits of Graph Databases

Graph databases can have many benefits over other types of databases, and in particular, relational databases.

Performance

Graph databases can have major performance benefits over relational databases, particularly when it comes to large queries across related data.

Relational database applications commonly have queries that join many tables – perhaps 20, 30, or more. Queries like this can run extremely slowly, especially as more and more records are entered into the database. Recursive queries can be a particular issue, especially if they run many levels deep.

In many cases, if that same data were stored in a graph database, the queries would be much simpler and would run much more quickly. This is because, in a graph database, queries are localised to a portion of the graph. This means that the execution time for each query is proportional only to the size of the part of the graph traversed to satisfy that query, rather than the size of the overall graph.
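To illustrate the localisation idea, here is a minimal Python sketch of a breadth-first traversal that stops after a fixed number of hops. It assumes the graph is held in memory as an adjacency dictionary, which is a simplification; the neighbours_within function and the data are hypothetical. The point is only that the work done depends on the neighbourhood being explored, not on how many other nodes exist.

```python
from collections import deque


def neighbours_within(adjacency, start, max_hops):
    """Return every node reachable from start in at most max_hops hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes at the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen


# Hypothetical data: a small cluster around "alice" ...
adjacency = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": [],
    "dave": ["erin"],
}
# ... plus a long chain of unrelated nodes that the query never touches.
for i in range(10_000):
    adjacency[f"n{i}"] = [f"n{i + 1}"]

reached = neighbours_within(adjacency, "alice", max_hops=2)
print(sorted(reached))
```

The traversal only ever looks at alice, bob, carol, and dave. The 10,000 unrelated entries add nothing to the work, whereas a query that had to scan or join across every row would grow with them.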

Of course, this all depends on the data. The graph model is not necessarily the ideal model for all cases. Some data is more suited to the tabular structure of relational databases. However, the graph model is well suited to querying large associative data sets.

Flexibility

Because there’s no need to define a set schema, you have complete flexibility over how the database grows.

With the relational model, you typically need to map out the requirements in full detail before creating the database. You need to foresee potential changes in business requirements and try to build a solution that caters for all possible future scenarios. This is not always possible. If the business requirements grow or change significantly, the structure of the database may also need to change significantly. It may even have to be completely redone.

With the graph model, as the business grows, you can always add new types of relationships, new nodes, and new subgraphs to any existing database without disturbing existing queries and application functionality.
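The following Python sketch shows the general idea under simplified assumptions. The graph is just a list of (source, relationship type, target) tuples, and movies_acted_in stands in for an existing application query. The REVIEWED and FOLLOWS relationship types and the "Reviewer A" node are hypothetical additions representing a new business requirement.

```python
# Existing data: only ACTED_IN relationships.
edges = [
    ("Tom Hanks", "ACTED_IN", "Cast Away"),
    ("Tom Hanks", "ACTED_IN", "Apollo 13"),
]


def movies_acted_in(edges, person):
    """Existing query: titles the person has an ACTED_IN relationship with."""
    return [
        target
        for source, rel, target in edges
        if source == person and rel == "ACTED_IN"
    ]


before = movies_acted_in(edges, "Tom Hanks")

# New requirement: add a new node and new relationship types (hypothetical).
edges.append(("Reviewer A", "REVIEWED", "Cast Away"))
edges.append(("Reviewer A", "FOLLOWS", "Tom Hanks"))

after = movies_acted_in(edges, "Tom Hanks")


def reviewers_of(edges, title):
    """New query written for the new requirement."""
    return [
        source
        for source, rel, target in edges
        if rel == "REVIEWED" and target == title
    ]


print(before == after)
print(reviewers_of(edges, "Cast Away"))
```

The existing query returns the same result before and after the new data is added, because it only follows the relationship type it was written for.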

Development and Maintenance

Software development has evolved to a point where it’s standard practice to release updates incrementally and iteratively. This is in contrast to the major release cycle that often took many months or even years.

Graph databases are well suited to this incremental/iterative release practice, because no underlying structural changes need to take place before any changes can occur on the application.

What can a Graph Database be used for?

Graph databases can be used in a wide variety of applications. Some popular uses for graph databases include:

  • Social networks
  • Real-time product recommendations
  • Network diagrams
  • Fraud detection
  • Access management
  • Graph-based search of digital assets
  • Master data management

Examples of Graph Databases

Examples of graph databases include Neo4j, Blazegraph, and OrientDB.

Check out this list of over 40 graph DBMSs.