graphd is the back-end database that powers Freebase.com. It is an in-house graph database, or tuple store, written in C that runs on Unix-like machines. It processes graph query language (GQL) queries, which are translated from the MQL queries submitted through the Freebase API. The best available overview is an April 2008 blog post by the graphd lead, Scott Meyer. For more technical details, see the team's SIGMOD '10 paper.

Graphd differs from a relational database in two main ways. Relational databases store data in tables, whereas graphd stores data as a graph of nodes and the relationships between those nodes. Relational databases use the SQL query language and accept queries and return results over a specialized network protocol; Metaweb uses the MQL query language and communicates via standard HTTP requests and responses.

Once written, primitives are read-only: graphd is a log-structured, or append-only, store. To “modify” a primitive, for example by changing its value, you write a new primitive carrying the modification and use its prev field to indicate that it replaces the “modified” primitive. To delete a primitive, you write a new primitive that marks the primitive you wish to delete as deleted. Deleted or superseded (versioned) primitives are weeded out during query execution. In addition to many implementation advantages, a log-structured database makes it easy to run queries “as of” a certain date.

In April 2008, Metaweb reported sustained performance of about 3300 "simple" queries/sec on a single AMD64 core for a database of 121 million primitives (aka "facts"). See the blog post for more details.

Metaweb has no announced plans to open-source graphd or to sell it as a commercial product, and much of it is kept secret.
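The append-only model described above (new primitives that point at what they replace, delete markers instead of in-place removal, and "as of" reads) can be illustrated with a small Python sketch. This is not graphd's actual data layout or API: the `Primitive` fields other than `prev`, the `AppendOnlyStore` class, and the integer logical clock standing in for dates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a primitive is read-only once written
class Primitive:
    guid: int                   # hypothetical id: position in the log
    timestamp: int              # hypothetical logical clock, stands in for a date
    value: Optional[str]
    prev: Optional[int] = None  # guid of the primitive this one replaces
    deleted: bool = False       # True if this primitive marks `prev` as deleted


class AppendOnlyStore:
    """Toy log-structured store: the log only ever grows."""

    def __init__(self) -> None:
        self._log: List[Primitive] = []

    def write(self, value: Optional[str], prev: Optional[int] = None,
              deleted: bool = False) -> int:
        guid = len(self._log)
        self._log.append(Primitive(guid, guid + 1, value, prev, deleted))
        return guid

    def modify(self, guid: int, new_value: str) -> int:
        # "Modify" by appending a new primitive whose prev names the old one.
        return self.write(new_value, prev=guid)

    def delete(self, guid: int) -> int:
        # "Delete" by appending a marker primitive that replaces the target.
        return self.write(None, prev=guid, deleted=True)

    def live(self, as_of: Optional[int] = None) -> List[Primitive]:
        # Only consider primitives written at or before the as-of time.
        visible = [p for p in self._log
                   if as_of is None or p.timestamp <= as_of]
        # Anything named by a visible primitive's prev has been superseded.
        superseded = {p.prev for p in visible if p.prev is not None}
        # Weed out superseded primitives and the delete markers themselves.
        return [p for p in visible
                if p.guid not in superseded and not p.deleted]
```

In this sketch, writing a value, modifying it, and then deleting it leaves three entries in the log. Calling `live()` with no argument returns nothing, while `live(as_of=1)` and `live(as_of=2)` return the original and the modified value respectively, because later log entries are simply ignored when reading in the past.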



« Graphd - freebase graph database »


A quote saved on May 15, 2013.

#relational-database
#queries
#blog-posts

