Home » cncp » Erdős-Rényi networks

Erdős-Rényi networks

Let’s leave epidemic spreading for a little while to return to the static structure of networks, and look at perhaps the most common class of network, the Erdős-Rényi or ER network. These networks are incredibly simple to describe, incredibly simple to build and simulate – and incredibly useful as at least first-attempt network models for a range of processes. We’ll also run up against the limitations of animation for presenting networks.

(This is a chapter from Complex networks, complex processes.)

ER networks

Let's now look at the best-understood complex network. If there's a poster child for network science, it's the "random graph", or more properly, the Erdős-Rényi or ER network. We mentioned Erdős and Rényi in the introduction as the mathematicians who first gave shape to the idea that large networks with essentially random structure might still show some useful statistical properties that made them more comprehensible. In this chapter we'll see what these regularities are. The ER networks are complex enough to allow us to demonstrate techniques that will apply in other circumstances, but are simple and well-behaved enough to make this analysis fairly straightforward.

We'll explore the ER network in some detail both through simulation and through mathematical analysis. We'll do it this way for a good reason: in the real world, networks often cannot be guaranteed to have exactly the properties that the mathematical techniques require, but computer simulation really needs to be driven by an understanding of what's going on in network at a fundamental level and how the mathematical features contribute to this behaviour. For these reasons, it's not safe to only understand how to simulate networks: you need to be able at least to follow the mathematical analysis as well. Conversely, understanding real networks and applications requires the techniques of simulation as well as analysis.

We'll start by building ER networks using networkx and explore some of the properties that we developed earlier. We'll then look at the same properties (and more) from a more mathematical perspective, and relate the code to the maths to show how the two views interrelate.

Building an ER network

To build an Erdős-Rényi (or ER) network with $N$ vertices, we proceed as follows:

  1. Build a graph $G = (V, E)$ with $N$ vertices and no edges, so $|V| = N$ and $E = \emptyset$
  2. For each pair of vertices $v_1, v_2 \in V$ with $v_1 \neq v_2$, add an edge $(v_1, v_2)$ to $E$ with probability $\phi$

That's it! – a very simple process for constructing what turns out ot be a very interesting class of networks. There are a four things to notice here, all of whch turn out to be very important for what follows.

Firstly, the ER model has two parameters: the number of nodes in the network $N$, and the probability $\phi$ of an edge occurring between any given pair of nodes. The combination of these two parameters defines a class of networks, depending on exactly which pairs of nodes are connected at random at the connection stage.

Secondly, the probability of an edge appearing between any pair of nodes is an independent event: it doesn't matter whether a node is already heavily connected or not, the chances of its being linked to any other node is just $\phi$ – and this probability doesn't change over time.

Thirdly, we disallow both self-loops and parallel edges, thereby creating a simple network.

Fourthly, we build the network "all at once", with all its nodes and all its edges in place before we do any further analysis.

To build such a network, we need to turn the description into code. We can do this in two ways using networkx:

  1. by implementing the construction process ourselves; or
  2. by using the built-in generator function

The latter is clearly entirely adequate in practice, but for demonstration purposes, we'll do both.

In [1]:
import networkx
import math
import numpy

import matplotlib as mpl
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cm as cm
import seaborn

from JSAnimation import IPython_display
from matplotlib import animation
/Users/sd/programming/cncp/cncp/lib/python2.7/site-packages/matplotlib/__init__.py:878: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))

To define the construction process we take the two parameters that define an ER network – the number of nodes and the probaility that there will be an edge between any given pair – and add an edge between pairs with the given probability:

In [2]:
def erdos_renyi_graph_from_scratch( n, pEdge ):
    """Build the graph with n nodes and a probability pEdge of there
    being an edge between any pair of nodes.
    
    :param n: number of nodes in the network
    :param pEdge: probability that there is an edge between any pair of nodes
    :returns: a network"""
    g = networkx.empty_graph(n)
    
    # run through all the possible edges
    ne = 0
    for i in xrange(n):
        for j in xrange(i + 1, n):
            if numpy.random.random() <= pEdge:
                ne = ne + 1
                g.add_edge(i, j, { 'added': ne })
    
    return g

(We use n for $N$ and pEdge for $\phi$.) Notice the way we run through the pairs of nodes so that we only try to generate an edge once between each pair. This works because the graph we're building is undirected and we want at most one edge between each pair of nodes in either order. (There are also directed ER networks: to build on of those we'd want to try each pair in each order to allow for directionality.)

The key networkx method here is add_edge, which adds an edge between a pair of nodes. It's optional third parameter is a dictionary of attribute/value pairs that are associated with the edge, and we use this to record the order in which the edge was added so we can visualise the growth of the network below.

We can then use this function to build an ER network, for example with 5000 nodes and a 5% probability of there being an edge between any pair of nodes:

In [3]:
g_from_scratch = erdos_renyi_graph_from_scratch(5000, 0.05)

Alternatively, we could use networkn's "generator" function for ER networks built-in that we can use to build a graph with the same properties as above:

In [4]:
g_from_generator = networkx.erdos_renyi_graph(5000, 0.05)

The claim is that both g_from_scratch and g_from_generator are instances of the class of ER networks. They aren't the same network, though, even though they have the same parameters, because they've been created by stochastic processes and so will have different connections between their nodes. However, they will both share certain statistical characteristics that we'll come back to after we look at the growth processes ion more detail.

How the graph grows

It can sometimes be useful to see how these graphs grow, by means of animation. We can use matplotlib to draw a graph progressively, one node at a time, and show how the edge set grows too. We can then use the JSAnimation plug-in to generate an in-line animation, or save the animation to a file and link to it.

matplotlib's animation functions are quite involved. The core is a function that creates a figure for each frame of the animation, which matplotlib then links together like the pages of a flick-book. There's quite a lot of set-up involved too, though: the following code is heavily commented to (hopefully) show what's going on.

In [26]:
def animate_growing_graph( g, edges, fig, ax = None, pos = None, cmap = None, **kwords ):
    """Animate the growth of a network, showing how edges are added and
    how node degrees evolve. Slow if done for a large graph. Returns a
    matplotlib animation object that can be saved to a file for later
    or shown in-line in a notebook.
    
    :param g: the network
    :param edges: the edges, in the order they were added
    :param fig: the figure to draw into
    :param ax: (optional) the axes to draw into (defaults to main figure axes)
    :param pos: (optional) layout for the network (default is to use the spring layout)
    :returns: an animation object"""

    # fill in the defaults
    if ax is None:
        # figure main axes
        ax = fig.gca()
    if pos is None:
        # layout the network using the spring layout
        pos = networkx.spring_layout(g, iterations = 100, k = 2/math.sqrt(g.order()))
    if cmap is None:
        cmap = cm.hot
    if ('frames' not in kwords.keys()) or (kwords['frames'] is None):
        # animate at one second per edge
        kwords['frames'] = int(len(edges) * (1.0 / kwords['interval']))
        
    # manipulate the axes, since this isn't a data plot
    ax.set_xlim([-0.2, 1.2])      # axes bounded around 1
    ax.set_ylim([-0.2, 1.2])
    ax.grid(False)                # no grid
    ax.get_xaxis().set_ticks([])  # no ticks on the axes
    ax.get_yaxis().set_ticks([])

    # work out the colour map for the degrees of the network, picking
    # colours linearly from the length of the colour map 
    ds = g.degree().values()
    max_degree = max(ds)
    min_degree = min(ds)
    norm = colors.Normalize(vmin = min_degree, vmax = max_degree)
    mappable = cm.ScalarMappable(norm, cmap)
    
    # We now create all the graphical elements we need for the animation as matplotlib
    # lines and patches. Essentially this defines what's in the final frame of the animation.
    # We'll then make everything invisible and, as the animation progresses, make the elements
    # appear in the right order. It's a lot faster to do it this way rather than re-building
    # each frame from nothing as we go -- although that works too.
    
    # generate node markers based on positions
    nodeMarkers = dict()
    nodeDegrees = dict()
    for v in g.nodes_iter():
        circ = plt.Circle(pos[v], radius = 0.02, zorder = 2)   # place node markers at the top of the z-order
        ax.add_patch(circ)
        nodeMarkers[v] = circ
        nodeDegrees[v] = 0
        
    # build the list of edges as they were added
    edgeMarkers = []
    edgeEndpoints = []
    for (i, j) in edges:
        xs = [ pos[i][0], pos[j][0] ]
        ys = [ pos[i][1], pos[j][1] ]
        line = plt.Line2D(xs, ys, zorder = 1)   # place edge markers down the z-order
        ax.add_line(line)
        edgeMarkers.append(line)
        edgeEndpoints.append((i, j))
        
    # work out the "time shape" of the animation
    nFrames = kwords['frames']                          # frames in the animation
    framesPerEdge = max(int(nFrames / len(edges)), 1)   # frames per edge
    
    # add colourbar for node degree
    kmax = max(g.degree().values())
    cax = fig.add_axes([ 0.9, 0.125, 0.05, 0.775 ])
    norm = mpl.colors.Normalize(0, kmax)
    cb = mpl.colorbar.ColorbarBase(cax, cmap = cmap,
                                   norm = norm,
                                   orientation = 'vertical',
                                   ticks = range(kmax + 1))
    
    # initialisation function hides all the edges, colours all nodes
    # as having degree zero
    def init():
        x = 1
        for em in edgeMarkers:
            em.set(alpha = 0)
        for vm in nodeMarkers.values():
            vm.set(color = mappable.to_rgba(0))

    # per-frame drawing for animation
    def frame( f ):
        # frame number boundaries for various transitions in the animation "shape"
        atEdge = int((f + 0.0) / framesPerEdge)   # the edge we've reached with this frame
        if framesPerEdge == 1:
            a = 1
        else:
            a = ((f + 0.0) % framesPerEdge) / framesPerEdge

        if atEdge < len(edgeMarkers):
            edgeMarkers[atEdge].set(alpha = a)
            if(a == 1):
                (i, j) = edgeEndpoints[atEdge]
                nodeDegrees[i] = nodeDegrees[i] + 1
                nodeMarkers[i].set(color = mappable.to_rgba(nodeDegrees[i]))
                nodeDegrees[j] = nodeDegrees[j] + 1
                nodeMarkers[j].set(color = mappable.to_rgba(nodeDegrees[j]))

    # return the animation with the functions etc set up
    return animation.FuncAnimation(fig, frame, init_func = init, **kwords)

This function generates an animation that we can play to see how things evolve. This is quite a slow process as we have to render each frame of the animation and, if we want to see anything, need to run fairly slowly. We'll use a small ER network to show this process:

In [6]:
# build the network, which annotates the edges with their order of addition
er = erdos_renyi_graph_from_scratch(100, 0.03)

# pull the edges as a dict from edge to order
er_edges_dict = networkx.get_edge_attributes(er, 'added')

# return a list of edges in order of addition
er_edges = sorted(er_edges_dict.keys(),
                  key = (lambda e: er_edges_dict[e]))

We can then generate and show the animation:

In [29]:
fig = plt.figure(figsize = (8, 6))
anim = animate_growing_graph(er, er_edges, fig, frames = 100)
IPython_display.display_animation(anim, default_mode = 'once')
Out[29]:


Once Loop Reflect