Polyglot persistence, InfiniteGraph and Cassandra, solve a problem

Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. InfiniteGraph is a powerful graph database used to quickly discover relationships and connections in large amounts of distributed data.

Blogger Todd Stavish recently posted a great example of polyglot persistence and use of multiple NoSQL technologies. He demonstrated an integration between a graph database (InfiniteGraph) and a structured key-value store (Cassandra), to perform social network analysis and identify friend of a friend (FOAF) relationships.

This is a nice, simplified example of how data can be extracted from Cassandra, modeled as a graph and processed in InfiniteGraph. InfiniteGraph is very efficient at path analysis and perfectly suited for this type of data and relationship analytics.

When analyzing social network data, identifying the “influencers” would be useful in a variety of situations.  The “hubs” could be used to reach a broader group of similar people for targeted marketing;  Identifying patterns within a group would enable predicting future behavior and targeting messages based on the predictions.  It can also be used for link hunting to identify criminal behavior, track fraudulent transactions that might be separated from each other through many degrees.

InfiniteGraph is a generic, multipurpose graph database suitable for solving a wide range of graph data requirements. It is highly suitable for examining the interactions not just between people, but any kind of entity.  InfiniteGraph can be used to run analytics on web data, utility power grid data, communications data or any data set that lends itself to a graph structure. For example, in pharmacology, the interactions between various drugs can be studied to identify previously unknown drug interactions, dependencies and side effects. In health sciences, it can be used to understand disease outbreaks and the pattern of spread for infectious diseases.  Graph analysis can also be used to understand cellular interactions of complex metabolic and genetic networks.

InfiniteGraph has the capability to handle complex graphs like the once mentioned above.  It can store and analyze not just the entities but also the relationships between entities, so that, link hunting, path analysis, become trivial when compared to other database mechanisms.  It provides an API specifically designed for the types of operations that would be performed on a graph such as node search, traversal and iterations.

Post a Comment

Required fields are marked *
*
*