Many of you use Star-P, Matlab, Python, R, among other languages for numerical
computing. Often, in many scientific applications, the behavior of individual
elements of a system is explained with simple rules. However, when assembled
into a system, they collectively exhibit complex behavior. For example, Newton
wrote down the equation for the gravitational attraction between two particles,
and it is deceptively simple. But, interesting things happen when
a
whole bunch of these interact with each other. Feel free to post your own
examples in the comments.
This blog entry is not about n-body simulations, but about combinatorial
computing augmenting numerical computing. At a basic level, graphs are a simple
way to capture relationships between elements in a system. In a dynamic system,
this graph may also evolve over time.
Most languages for numerical computing (including Star-P) support sparse
matrices as first class citizens. It turns out, that sparse matrices and graphs
are one and the same. So, its quite convenient to represent graphs as sparse
matrices. This approach has many benefits; you don't have to write new data
structures to store graphs, and algorithms to operate on graphs. Sparse matrix
operations can be used to implement graph algorithms !
Lets say, we have a collection of edges (u, v, w) - There is an edge of weight w
between nodes u and v. One way to store this graph is to just store the tuples
(u, v, w) in dense vectors. However, not much can be done with a graph in such a
data structure. To convert this into a sparse matrix, we use the sparse()
command:
>> G = sparse (u, v, w)
An interesting thing to do immediately at this point is to visualize the graph
with the spy command. Lets create a random graph for the purpose of
illustration.
>> n =
1000;
% Use n = 1000*p in Star-P
>> U = ceil(n*rand(10*n,1)); % from nodes
>> V = ceil(n*rand(10*n,1)); % to nodes
>> G = sparse(U, V, 1, n, n); % n-node graph with ~10n edges
So far, we have created a graph with 1,000 nodes and 10,000 random edges between
them. We use spy and spyy for visualization. spy is a Matlab built-in which
displays every nonzero in a sparse matrix (every edge in a graph) as a dot.
Clearly, when the graphs are very large, with tens of millions of nodes and
edges, such visualization becomes meaningless.
spyy
is a 2D histogram of nonzero/edge densities.
>> spy
(G)
% in Matlab

>>
spyy
(G)
% in Matlab or Star-P
Now, here's why all this is so cool. You can create distributed graphs in Star-P
from edge tuples stored in distributed dense vectors. In other words, u, v, and
w can be distributed dense vectors, and G will be a distributed sparse matrix,
or a distributed graph. This idea can be developed further to do lots of
interesting things. If this is something interesting, email me, or leave
comments, and I will post future blog entries on the subject.