Formal Language Theory Meets Graph Theory
A CFG For Graphs
Consider the following example where we take a graph or its hypergraph equivalent, our approach transforms the original graph into its clique tree. Using the properties of clique trees we expand and focus on sepsets and derive a set of production rules.
Using these rules we can grow exact and approximate graphs. Exact graph generation produces isomophic graphs. On the other hand, using stochastic graph generation we end up with approximate graphs that have many of the global properties observed in the original graph. Evaluating the performance of HRG against leading graph generators, like Kronerker Product, Exponential Random, and Chung-Lu graph models HRG outperforms these others. Especially when we compare the original graph to sets of generated graphs from the various generators using the connonical method of graph comparison: Graphlet Correlation Distance.
Here we grow graphs by applying the rules in the order that they were derived. If the rules are instead applied stochastically, we generate approximate graphs. Computing the network properties like node degree, hop-plot, eigenvector centrality, graphlet correlation distance (GCD), and others allows us to compare the performance of the generator.
Testing Robustness of Graph Generators
When we study complex systems such as communications networks, or networks of the brain we derive a graphical representation to work with. Often times during the validation of a new method—we are interested in similar networks (network that exhibit similar network properties). To address this, we can pick one of several well studied graph generators that learn a model given graph, for example: Kronecker Product or Chung-Lu. Using the learned model parameters we can generate synthetic graphs that on the surface have many of the properties observed in the original graph. In some cases we are interested in similar graphs that are an order or magnitude larger in size. But how well do these models capture the intricacies of the interactions to be able to scale. How do we shed light on the biases and assumptions that are at the core of these models? To do this, we use a simple method we call infinity mirror which takes the newly created synthetic graph and we feed it back to the graph generator to see how well it can relearn a set of model parameters. At each step, a new synthetic graph is generated, we compute a battery of network properties and compare these to the signature of the original graph. This way we can see what network properties are consistently preserved and which are not.
Human Navigation in Information Networks
Salvador Aguinaga and Tim Weninger. The Infinity Mirror Test for Analyzing the Robustness of Graph Generators. ACM International Conference on Knowledge Discovery in Data Mining (SIGKDD) Workshop on Mining and Learning with Graphs, San Francisco, CA, August 2016.
In order to efficiently reason about knowledge and information, humans have evolved efficient strategies for organizing complex concepts in order to form connections between and recall information. This behavior can be observed and codified when people search for objects within digital information networks. Current models of search behavior exhibit unnecessary or extraneous complexity. Minimal or simple modifications to well established algorithms yield valid models of human navigation by exploring hierarchical information inherent in networks. We explore and validate a new model of how humans navigate an information networks. To that end, we present a new path finding algorithm that approximates human navigation by leveraging the categorical classification of the nodes within the network. We compare our new model, CatPath, to existing graph distance measures when possible and show that the category paths are largely correlated with traces of human navigation.