In 2013 I did a summer project on using the Arduino as a platform for “citizen sensing”. This rapidly became an exploration of how to create hardware and software that can do sensing while operating in a very low-power regime, such as one would need for an environmental sensor.

There were several results from this project — one of which *wasn’t* an actual solution to the motivating problem I’d come up with. However, it did generate a lot of notes about low-power Arduino programming, both for hardware and software, and a software library that embodies some of them.

I recently decommissioned the web site I was using to host this content, so I’ve ported it onto my main blog as a collection of research and development notes in case it’s still of interest to anyone.


**CALL FOR PAPERS**

3rd Workshop on Engineering Collective Adaptive Systems – eCAS 2018

In conjunction with FAS* 2018

September 7th, 2018

Fondazione Bruno Kessler (FBK) & University of Trento – Trento, Italy

Web site: ecas2018.apice.unibo.it

**AIMS AND MOTIVATION**

Modern software systems are becoming more and more collective, composed of many distributed and heterogeneous entities. These systems operate under continuous perturbations making manual adjustments infeasible. For a collective system to be resilient, its adaptation must also be collective, in the sense that multiple entities must adapt in a way that addresses critical runtime conditions while preserving the benefits of the collaborative interdependencies. Decision-making in such systems is distributed and possibly highly dispersed, and interaction between the entities may lead to the emergence of unexpected phenomena.

In such systems, a new approach for adaptation is needed to allow (i) multiple entities to collectively adapt with (ii) negotiations to decide which collective changes are best. Collective adaptation also raises a second important challenge: which parts of the system (things, services, people) should be engaged in an adaptation? This is not trivial, since multiple solutions to the same problem may be generated at different levels. The challenge here is to understand these levels and create mechanisms to decide the right scope for an adaptation for a given problem.

This workshop solicits papers that address new methodologies, theories and principles that can be used in order to develop a better understanding of the fundamental factors underpinning the operation of such systems, so that we can better design, build, and analyze them, as well as case studies and applications showing such approaches in action. Interdisciplinary work is particularly welcomed.

**Suggested Topics (but not limited to):**

- Novel theories relating to operating principles of CAS
- Novel design principles for building CAS systems
- Insights into the short and long-term adaptation of CAS systems
- Insights into emergent properties of CAS
- Insights into general properties of large scale, distributed CAS
- Decision-making approaches in CAS
- Methodologies for studying, analyzing, and building CAS
- Frameworks for analyzing or developing CAS case studies
- Languages, platforms, APIs and other tools for CAS
- Scenarios, case studies, and experience reports of CAS in different contexts (e.g., Smart Mobility, Smart Energy/Smart Grid, Smart Buildings, traffic management, emergency response, etc.)

**SCOPE**

The workshop is expected to attract participants from many disciplines, including Autonomic Computing, Biology, Game Theory, Evolutionary Computing, Network Science, Self-Organizing Systems, Pervasive Computing, and to be of interest to anyone working in the domain of large-scale self-adaptive systems. In addition, the European Commission has funded seven scientific projects and a Coordination Action in this area, with projects starting at the beginning of 2013. Thus, the workshop provides a natural base for the projects to meet and share ideas, even if it is in no way limited to this audience, and is likely to have broad appeal to a wide range of researchers. Potential audience members might work in application areas relating to large-scale distributed systems, or may come from any of the many disciplines that can provide insights into the operation and design of such systems.

**IMPORTANT DATES**

- Abstract submission: June 4, 2018
- Workshop paper submission: June 11, 2018
- Workshop paper notification: July 9, 2018
- Camera-Ready Version: July 15, 2018
- Workshop: September 3, 2018

**SUBMISSION INSTRUCTIONS and REVIEW CRITERIA**

Workshop papers may not exceed 6 pages including references, and must follow the IEEE Computer Society Press proceedings style guide.

All papers should be submitted in PDF format. You can submit the paper through EasyChair using this link:

https://easychair.org/conferences/?conf=ecas2018

By submitting a paper, the authors confirm that in case of acceptance, at least one author will attend the workshop to present the work.

Papers will be peer reviewed on the basis of originality, readability, relevance to themes, soundness, and overall quality. Workshop proceedings will be published on IEEE Xplore in parallel with the main conference proceedings.

**WORKSHOP ORGANISERS**

- Simon Dobson, University of St Andrews, UK, simon.dobson@st-andrews.ac.uk
- Martina De Sanctis, Fondazione Bruno Kessler, Trento, Italy, msanctis@fbk.eu
- Giacomo Cabri, Università di Modena e Reggio Emilia, Italy, giacomo.cabri@unimore.it

**STEERING COMMITTEE**

- Jacob Beal, Raytheon BBN Technologies, USA
- Giacomo Cabri, University of Modena And Reggio Emilia, Italy
- Nicola Capodieci, University of Modena and Reggio Emilia, Italy
- Emma Hart, Edinburgh Napier University, U.K
- Jane Hillston, University of Edinburgh, U.K
- Mirko Viroli, University of Bologna, Italy

**WORKSHOP PROGRAM COMMITTEE (PROVISIONAL)**

- Gerrit Anders, Augsburg University
- Franco Bagnoli, Università di Firenze
- Ezio Bartocci, TU Wien
- Luca Bortolussi, University of Trieste
- Johann Bourcier, IRISA/INRIA-Universite de Rennes 1
- Javier Camara, Carnegie Mellon University
- Siobhan Clarke, Trinity College Dublin
- Daniel Coore, University of the West Indies
- Ferruccio Damiani, Dipartimento di Informatica, Università di Torino
- Rocco De Nicola, IMT – School for Advanced Studies Lucca
- Giovanna Di Marzo Serugendo, University of Geneva
- Schahram Dustdar, TU Wien
- Jane Hillston, University of Edinburgh
- Paola Inverardi, University of L’Aquila
- Eva Kühn, TU Wien
- Peter Lewis, Aston University
- Nicolas Markey, LSV, CNRS & ENS Cachan
- Annapaola Marconi, Fondazione Bruno Kessler, Trento
- Hernan Melgratti, University of Buenos Aires, Argentina
- Monjur Mourshed, Cardiff University, UK
- Mirco Musolesi, University College London
- Carlo Pinciroli, École Polytechnique de Montreal
- Alexander Schiendorfer, University Augsburg
- Bradley Schmerl, Carnegie Mellon University
- Antoine Spicher, LACL University Paris Est Creteil
- Katia Sycara, Carnegie Mellon University
- Christof Teuscher, Portland State University
- Mirko Viroli, Università di Bologna
- Martin Wirsing, Ludwig-Maximilians-Universitaet Muenchen

**REGISTRATION**

All attendees at the workshop must register for SASO through the conference website: https://saso2018.fbk.eu/index.php/registration/


Salary: Grade 6/7/8; £28,098 – £31,604 / £34,520 – £38,833 / £42,418 – £49,149 per annum.

The School of Computing Science at the University of Glasgow invites applications for a Post Doctoral Research Associate or Fellow position in the leading-edge research project Science of Sensor System Software (S4)

Research is focused on delivering new principles and techniques for the development and deployment of verifiable, reliable, autonomous sensor-based systems that operate in uncertain, multiple and multi-scale environments. The S4 programme grant is a collaboration between four universities and you will be expected to work closely with researchers across the four universities.

This position offers an exciting opportunity to gain first-hand insights into the development of sensor-based systems and to develop and apply novel modelling and reasoning techniques that contribute to the goals of verifying reliability, robustness, security, etc. Depending on your experience, the role offers considerable intellectual freedom and opportunities for you to take significant initiative, leadership, and responsibility.

The job requires expert knowledge in one or more of: formal modelling and specification, stochastic and temporal logics, automated reasoning, sensor networks, run-time verification, real-world applications. Experience of bigraphs and model checking would be an advantage. You must have started to build up a strong publication record, have excellent programming and modelling skills, and be able to quickly integrate software, e.g. for model-checking, simulation, and verification. You should be competent to undertake hands-on work related to modelling and verification of chosen, real-life case-studies.

You hold, or expect to hold, a PhD in Computer Science or in a closely related field; alternatively, you have a first degree in one of the above-mentioned subjects and substantial experience in a research role in industry. Fresh PhD graduates are also encouraged to apply.

For appointment at Grade 8 you will need to meet the additional criteria as per the Job Description.

This post is offered as an open-ended contract with funding available for up to 30 months.

It is anticipated that interviews will be held in March or April 2018.

For further information and to discuss details please contact the Principal Investigator, Professor Muffy Calder (email: muffy.calder@glasgow.ac.uk).


3-7 September 2018, Trento, Italy

The aim of the Self-Adaptive and Self-Organizing Systems conference series (SASO) is to provide a forum for the presentation and discussion of research on the foundations of engineered systems that self-adapt and self-organize. The complexity of current and emerging networks, software, and services can be characterized by issues such as scale, heterogeneity, openness, and dynamics in the environment. This has led the software engineering, distributed systems, and management communities to look for inspiration in diverse fields (e.g., complex systems, control theory, artificial intelligence, chemistry, psychology, sociology, and biology) to find new ways of designing and managing such computing systems in a principled way. In this endeavor, self-organization and self-adaptation have emerged as two promising interrelated approaches. They form the basis of many other so-called self-* properties, such as self-configuration, self-healing, or self-optimization. SASO aims to be an interdisciplinary meeting, where contributions from participants with different backgrounds lead to a cross-pollination of ideas, and where innovative theories, frameworks, methodologies, tools, and applications can emerge.

The twelfth edition of the SASO conference embraces this interdisciplinary nature, and welcomes novel contributions to both the foundational and application-focused dimensions of self-adaptive and self-organizing systems research. We are looking for contributions that present new fundamental understanding of self-adaptive and self-organizing systems and how they can be engineered and used. The topics of interest include, but are not limited to:

- **Self-* Systems theory**: nature-inspired and socially-inspired paradigms and heuristics; inter-operation of self-* mechanisms; theoretical frameworks and models; control theory
- **Self-* System properties**: robustness; resilience; stability; anti-fragility; diversity; self-reference and reflection; emergent behavior; computational awareness and self-awareness
- **Self-* Systems engineering**: reusable mechanisms and algorithms; design patterns; architectures; methodologies; software and middleware development frameworks and methods; platforms and toolkits; multi-agent systems
- **Theory and practice of self-organization**: self-governance, change management, electronic institutions, distributed consensus, commons, knowledge management, and the general use of rules, policies, etc. in self-* systems
- **Theory and practice of self-adaptation**: mechanisms for adaptation, including evolution, logic, learning; adaptability, plasticity, flexibility
- **Socio-technical self-* systems**: human and social factors; visualization; crowdsourcing and collective awareness; humans-in-the-loop; ethics and humanities in self-* systems
- **Data-driven approaches to self-* systems**: data mining; machine learning; data science and other statistical techniques to analyze, understand, and manage the behavior of complex systems
- **Self-adaptive and self-organizing hardware**: self-* materials; self-construction; reconfigurable hardware
- **Self-* Systems Education**: experience reports; curricula; innovative course concepts; methodological aspects of self-* systems education
- **Applications and experiences with self-* systems**: smart grid, smart cities, smart homes, adaptive industrial plants, cyber-physical systems; autonomous vehicles and robotics; traffic management; self-adaptive cyber-security; Internet of Things; fog/edge computing; etc.

- Abstract submission: April 16, 2018
- Paper submission: April 23, 2018
- Notification: June 4, 2018
- Camera ready copy due: July 2, 2018
- Conference: September 3-7, 2018

Submissions can have up to 10 pages formatted according to the standard IEEE Computer Society Press proceedings style guide (see templates here). Please submit your papers electronically in PDF format using the SASO 2018 conference management system:

https://easychair.org/conferences/?conf=saso2018.

The proceedings will be published by IEEE Computer Society Press, and made available as a part of the IEEE Digital Library. Note that a separate Call for Poster and Demo Submissions will also be issued. As per the standard IEEE policies, all submissions should be original, i.e., they should not have been previously published in any conference proceedings, book, or journal and should not currently be under review for another archival conference. We would like to also highlight IEEE’s policies regarding plagiarism and self-plagiarism (http://www.ieee.org/publications_standards/publications/rights/ID_Plagiarism.html).

Where relevant and appropriate, accepted papers will also be encouraged to participate in the Demo or Poster Sessions.

Papers should present novel ideas in the cross-disciplinary research context described in this call, motivated by problems from current practice or applied research. Both theoretical and empirical contributions should be highlighted, substantiated by formal analysis, simulation, experimental evaluations, or comparative studies, etc. Appropriate references must be made to related work. Due to the cross-disciplinary nature of the SASO conference, we encourage papers to be intelligible and relevant to researchers who are not members of the same specialized sub-field.

Authors are also encouraged to submit papers describing applications. Application papers should provide an indication of the real-world relevance of the problem that is solved, including a description of the domain, and an evaluation of performance, usability, or comparison to alternative approaches. Experience papers are also welcome, especially if they highlight insights into any aspect of design, implementation or management of self-* systems that would be of benefit to practitioners and the SASO community. All submissions will be rigorously peer reviewed and evaluated based on the quality of their technical contribution, originality, soundness, significance, presentation, understanding of the state of the art, and overall quality.

- Antonio Bucchiarone, Fondazione Bruno Kessler, IT
- Alberto Montresor, University of Trento, IT
- Jake Beal, Raytheon BBN Technologies, USA
- Nelly Bencomo, Aston University, UK
- Jean Botev, University of Luxembourg, LU

(This is a chapter from Complex networks, complex processes.)

We can improve both the performance and statistical properties of simulations by changing the simulation approach we use. We *won't* try to optimise or improve the performance of synchronous dynamics, although there's certainly scope to do so: instead, we'll *replace* the synchronous approach with another technique that (it turns out) is better suited to accurately simulating large systems.

The technique we use is sometimes called *Gillespie's stochastic simulation algorithm* or simply *Gillespie simulation*. It was developed initially [Gil76][Gil77] to perform *ab initio* chemical simulations, where a lot of molecules react according to a set of simple chemical rules – a situation that's very similar to a process over a network. Cao *et alia* [CGP06, section II] provide a very accessible introduction to the basic mathematics of the technique, which we'll develop in a network context below.

The essence of Gillespie simulation is the observation that we can manipulate the probabilities governing events. Instead of testing in every discrete timestep which of the available events can occur (for example from susceptible to infected in SIR), we predict the instant of time at which the next event will occur – skipping the intermediate time when nothing happens. To put this another way, we convert the probabilities of individual events in *space* into aggregate probability distributions of events over *time*. If the simulation is such that a lot of "empty" timesteps occur, then this approach will avoid the costs of simulating them. It has the additional advantage of operating in continuous time with only a single event happening at each instant, which solves the problem of events affecting each other within a timestep.
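To make the saving concrete, here's a small illustrative experiment (not from the chapter's code, and with an illustrative per-timestep probability) comparing the two approaches for a single recurring event:

```
import math
import numpy

numpy.random.seed(42)
a = 0.01          # per-timestep event probability (illustrative)
trials = 20000

def synchronous_wait():
    # synchronous approach: test every discrete timestep until the event fires
    t = 0
    while numpy.random.random() >= a:
        t = t + 1
    return t

def gillespie_wait():
    # event-driven approach: jump straight to the event time by drawing
    # from the exponential distribution with rate a
    return numpy.random.exponential(1.0 / a)

sync_mean = numpy.mean([synchronous_wait() for _ in range(trials)])
gill_mean = numpy.mean([gillespie_wait() for _ in range(trials)])
```

Both means come out close to $1/a = 100$ timesteps, but the event-driven version uses one random draw per event instead of around a hundred tests, most of which are "empty".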

Unfortunately these benefits come at the cost of some fairly subtle mathematics needed to manipulate the probability distributions into the required form. We'll deal with this first, and then encode the result as a new simulation dynamics that we can use to simulate epidemics using the *same* compartmented process models as we used for the synchronous case.

In the synchronous simulation in the previous chapter we took all the places at which an event could occur and probabilistically chose some of them for firing. Infection happens along SI edges. (We can also identify SS, SR, II, and RR edges, and these play important rôles in some epidemic models, although not in SIR.) SIR assumes that the dynamics occurs at these loci independently. If we denote the probability of an SI edge transmitting an infection as $\beta$ as usual, then the rate at which edges in the network transmit infection is given by $\beta [SI]$ where $[SI]$ denotes the number of SI edges in the network (the size of the locus, in other words). $[SI]$ is of course a function of time, since the population of SI edges is changed by the infection event. Similarly if infected nodes are removed with probability $\alpha$ the rate of recovery is given by $\alpha [I]$. In a sense the values of $[SI]$ and $[I]$ constitute the "state" of the dynamical system. Each infection event will decrease $[SI]$ by one and increase $[I]$ by a value that depends on the degree of the newly-infected node and how many of those adjacent nodes are susceptible. This indicates that the dynamics entwines three distinct features:

- the probabilities of different events;
- the number of places at which these events can occur; and
- the topology of the network that controls how the populations of different loci evolve.

It is this third feature that distinguishes the network formulation from the differential equation formulation, since it allows heterogeneity of evolution in both space and time.
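As an illustration of these loci (using plain `networkx` and a hand-built compartment map, not epydemic's own data structures), we can count the SI edges and infected nodes of a small network directly:

```
import networkx

# a small illustrative network with compartments assigned by hand
g = networkx.Graph([(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)])
compartment = {0: 'S', 1: 'I', 2: 'S', 3: 'R'}

# [SI]: edges with one susceptible and one infected endpoint
SI = [(u, v) for (u, v) in g.edges()
      if {compartment[u], compartment[v]} == {'S', 'I'}]

# [I]: the infected nodes
I = [n for n in g.nodes() if compartment[n] == 'I']

beta, alpha = 0.2, 0.1
infection_rate = beta * len(SI)   # rate of infection events, beta * [SI]
recovery_rate = alpha * len(I)    # rate of recovery events, alpha * [I]
```

Each event changes these loci: firing an infection removes an SI edge and may create several new ones, depending on the newly-infected node's neighbourhood.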

Let us re-formulate the above in a way that's more explicitly continuous in nature. The probability that some SI edge will transmit infection in a small time $dt$ is given by $a_I \, dt = \beta [SI] \, dt$, and recovery similarly by $a_R \, dt = \alpha [I] \, dt$. We can now ask two questions: given the state of the network,

- when will the next event occur?, and
- what event will it be?

Clearly these are probabilistic questions, so the answers will be formulated as probability distributions. Let's define a probability distribution $P(\tau, e) \, d\tau$ as the probability that an event will happen in the interval $(t + \tau, t + \tau + d\tau)$ *and* that that event will be of type $e$, which for SIR will be either an infection ($I$) or a recovery ($R$) event. So at time $t$ we're looking at the distribution of the times $\tau$ between $t$ and the next event, and the identity of that event. This is a joint probability density function on the space of $\tau$ and $e$, where $\tau$ is a continuous random variable and $e$ is a discrete random variable. We can then draw a pair of values $(\tau, e)$ from this distribution to give us the time to the next event and its identity.

Note also that the value of $\tau$ answers the first question above, while the value of $e$ answers the second.

What do we expect from this distribution? Intuitively, a system where there are lots of places where events can occur should give rise to a high likelihood of drawing a small value of $\tau$ from the distribution: the events happen close together in time. Conversely, as the number of places available decreases, it becomes more likely that we'll draw a larger value of $\tau$.

We now need a way to specify $P(\tau, e)$ and to draw values from it.

Let's think about $P(\tau, e) \, d\tau$ a little more. We're looking for a value of $\tau$ at which the next event happens, and the identity of that event. Equivalently, we could say that we want the probability that *no* event happens in the interval $[t, t + \tau]$, *and* that an $e$ event happens in the interval $[t + \tau, t + \tau + d\tau]$. The use of the word "and" here suggests that we'll be multiplying together the probabilities of the two components. We defined the probability of a particular event happening above, so we can then re-phrase $P(\tau, e) \, d\tau$ a little differently:

$$ P(\tau, e) \, d\tau = P_0(\tau) \, a_e \, d\tau $$

where $P_0(\tau)$ is the probability of no event happening in $(t, t + \tau)$ and $a_e$ is the probability of an event $e$ happening in an interval $d\tau$. Since we already know the values of $a_e$ from the model parameters $\alpha$ and $\beta$ and the size of the appropriate loci $[SI]$ and $[I]$, we just need an expression for $P_0(\tau)$. Let $a \, d\tau' = \sum_e a_e \, d\tau'$ be the probability that *some* event happens in an interval $d\tau'$, simply by summing up the component probabilities of the different events. We then have:

$$ P_0(\tau + d\tau') = P_0(\tau) \, (1 - a \, d\tau') $$

which is the probability that no event occurs in the interval $(t, t + \tau)$ *and then* that none occurs in the following interval $d\tau'$. This is a differential equation, the solution of which is:

$$ P_0(\tau) = e^{-a \tau} $$

Substituting back into the above we therefore have:

\begin{align*} P(\tau, e) &= P_0(\tau) \, a_e \\ &= a_e \, e^{-a \tau} \end{align*}

This is our joint probability distribution for the events defined by the various values of $a_e$. These values are *rates*, not probabilities: they are defined in terms of the number of places at which each event $e$ can occur.

To conduct simulation, we need to be able to draw a pair $(\tau, e)$ from our distribution. However, we can't simply choose $\tau$ and $e$ independently of each other: the value of $P(\tau, e)$ depends on *all* the possible events $e$ through the presence of $a$, the sum of all event rates, in its definition. That means that the time to the next event depends on the number of events that could occur.

In other words, $P(\tau, e)$ is a **joint probability distribution** from which we need to draw a pair. Any joint probability distribution $P(a, b)$ can be re-written as $P(a, b) = P(a) \, P(b | a)$: the prior (independent) probability of $a$ occurring multiplied by the probability of $b$ occurring *given that* $a$ has occurred. In our case,

$$ P(\tau, e) = P(\tau) \, P(e | \tau) $$

where $P(\tau)$ is the probability that *some* event will occur on the interval $(t, t + \tau)$ and $P(e | \tau)$ is the probability that this event will be of type $e$ *given that* it occurs on this interval. Clearly $P(\tau)$ is simply the sum of the probabilities for all the events that may occur,

$$ P(\tau) = \sum_{e'} P(\tau, e') $$

and therefore:

$$ P(e | \tau) = \frac{P(\tau, e)}{\sum_{e'} P(\tau, e')} $$

These two equations are both single-variable probability distributions (over $\tau$ and $e$ respectively) expressed in terms of the joint probability distribution $P(\tau, e)$, and if we substitute for $P(\tau, e)$ from above we get:

\begin{align*} P(\tau) &= \sum_e a_e e^{-a \tau} \\ &= a \, e^{-a \tau} \\ \\ P(e | \tau) &= \frac{P(\tau, e)}{\sum_{e'} P(\tau, e')} \\ &= \frac{a_e e^{-a \tau}}{a \, e^{-a \tau} } \\ &= \frac{a_e}{a} \end{align*}

Note that $P(e | \tau)$ is in this case independent of $\tau$, since the event probabilities are constants.

Let's briefly return to the network scenario we're interested in. The value $\tau$ is the interval of time until the next event occurs in the network, whether that is the infection of the S node attached to an SI edge or the recovery of an I node. Which of these events happens is determined by $e$. The pair $(\tau, e)$ therefore fully defines the time and identity of the next event in the simulation. It remains to see how we choose these two values, and how the network evolves in response to the selected event.

In order to make use of $P(\tau, e)$ we have to be able to draw $\tau$ and $e$ from the joint distribution. We saw above that we can do this by drawing values from $P(\tau)$ and $P(e | \tau)$ individually, with the latter distribution actually being independent of time in our current case.

It may not be obvious how to draw from such distributions, but we can manipulate the probabilities to make it possible using only a source of uniformly-distributed random numbers on the range $(0, 1)$, which Python certainly has: `numpy.random.random()`. The trick is to observe that, for any probability density function $P(a)$, the value $P(a) \, da$ represents the probability that a value drawn from the distribution will lie between $a$ and $(a + da)$. From this we can construct a cumulative distribution function,

$$ F(x_0) = \int_{-\infty}^{x_0} P(a) \, da $$

where $F(x_0)$ represents the probability that a value drawn from $P(a)$ is less than or equal to $x_0$, also denoted $P(a \le x_0)$. If we now draw a value $r$ from a uniform distribution on $(0, 1)$ we can compute $x = F^{-1}(r)$ where $F^{-1}$ is the inverse of the cumulative distribution function and $x$ will be distributed according to $P(a)$. This means we can convert a uniformly-distributed value into a value drawn from any probability distribution for which we can construct (and invert) a cumulative distribution function.
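As a quick sanity check of this inverse-transform trick, here's an illustrative fragment (not part of the chapter's library) that pushes uniform draws through the inverse cdf of the exponential distribution we'll need below:

```
import numpy

numpy.random.seed(1)
a = 2.0    # rate of the target exponential distribution (illustrative)

# for P(x) = a e^{-a x} the cdf is F(x) = 1 - e^{-a x}, so
# F^{-1}(r) = -ln(1 - r) / a: uniform draws in, exponential samples out
rs = numpy.random.random(100000)
xs = -numpy.log(1.0 - rs) / a

# the sample mean comes out close to the true mean 1/a = 0.5
```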

In our case we have that $P(\tau) = a \, e^{-a \tau}$. Remember that $a$ is a constant, and that intervals can't be negative. This means that

\begin{align*} F(\tau) &= \int_{-\infty}^{\tau} a \, e^{-a \tau'} \, d\tau' \\ &= \int_0^{\tau} a \, e^{-a \tau'} \, d\tau' \\ &= -e^{-a \tau'} \, \bigg|_0^\tau \\ &= -e^{-a \tau} -(-e^0) \\ &= 1 - e^{-a \tau} \end{align*}

This is an awkward expression to manipulate, but we can observe that, if a number $r_1$ is uniformly distributed, then so by definition is $1 - r_1$, so if we set $F(\tau) = 1 - r_1$ we can cancel out the constant 1s and get a simpler expression overall. We then have:

\begin{align*} 1 - r_1 &= F(\tau) \\ &= 1 - e^{-a \tau} \\ r_1 &= e^{-a \tau} \\ &= \frac{1}{e^{a \tau}} \\ e^{a \tau} &= \frac{1}{r_1} \\ a \tau &= \ln \frac{1}{r_1} \\ \tau &= \frac{1}{a} \, \ln \frac{1}{r_1} \end{align*}

The discrete case works similarly. If we draw a value $r_2$ on $(0, 1)$, then the value of $e$ we require is given by $\sum_{e' = 0}^{e - 1} a_{e'} \leq r_2 a \leq \sum_{e' = 0}^{e} a_{e'}$: the smallest $e$ such that the cumulative sum of the $a_{e'}$ up to and including $e$ reaches $r_2 a$.
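Putting the two draws together, here is a minimal sketch of the selection step [Gil76]. The rates are illustrative stand-ins for $\beta [SI]$ and $\alpha [I]$, and `next_event` is a hypothetical helper, not part of any library used in this chapter:

```
import math
import numpy

numpy.random.seed(2)

rates = [0.4, 0.1]   # illustrative event rates, e.g. beta*[SI] and alpha*[I]
a = sum(rates)

def next_event(rates, a):
    '''Draw the time to the next event and its index from P(tau, e).'''
    # time to the next event: exponential with rate a
    r1 = numpy.random.random()
    tau = (1.0 / a) * math.log(1.0 / r1)

    # event identity: walk the cumulative rate sum until it passes r2 * a
    xc = numpy.random.random() * a
    xs = 0.0
    for (e, ae) in enumerate(rates):
        xs = xs + ae
        if xs >= xc:
            return (tau, e)

draws = [next_event(rates, a) for _ in range(20000)]
```

Over many draws the mean of $\tau$ approaches $1/a = 2$ and the first event is chosen a fraction $a_0 / a = 0.8$ of the time, as the distributions above require.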

The upshot of all this probability theory is that we can choose a time to the next event $\tau$ and the identity of the next event $e$ from the distribution induced by the individual event probabilities and the size of the loci for the various events in the network, by drawing two uniformly-distributed numbers and performing two simple calculations [Gil76].

We can now encode this as a new simulation dynamics built on `epydemic.Dynamics`, exactly as we previously did for discrete-time synchronous dynamics.


```
import cncp
import networkx
import math
import numpy
import pickle
import epyc
import epydemic
import pandas as pd
import matplotlib
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
import seaborn
```

The new dynamics slots into `epydemic`'s simulation class hierarchy.

We'll start with the stochastic dynamics class itself:


```
class StochasticDynamics(epydemic.Dynamics):
    '''A dynamics that runs stochastically in :term:`continuous time`. This is a
    very efficient and statistically exact approach, but requires that the
    statistical properties of the events making up the process are known.'''

    def __init__( self, g = None ):
        '''Create a dynamics, optionally initialised to run on the given network.

        :param g: prototype network to run the dynamics over (optional)'''
        super(StochasticDynamics, self).__init__(g)

    def eventRateDistribution( self, t ):
        '''Return the event distribution, a sequence of (l, r, f) triples
        where l is the locus where the event occurs, r is the rate at
        which an event occurs, and f is the event function called to
        make it happen.

        Note that it's a rate we want, not a probability: the former can be
        obtained from the latter simply by multiplying the event probability
        by the number of times it's possible in the current network, which
        is the population of nodes or edges in a given state.

        It is perfectly fine for an event to have a zero rate. The process
        is assumed to have reached equilibrium if all events have zero rates.

        :param t: current time
        :returns: the event rate distribution'''
        dist = self.eventDistribution(t)
        return [ (l, p * len(l), f) for (l, p, f) in dist ]

    def do( self, params ):
        '''Run the simulation using Gillespie dynamics. The process terminates
        when either there are no events with non-zero rates or when
        :meth:`at_equilibrium` returns True.

        :param params: the experimental parameters
        :returns: the experimental results dict'''

        # run the dynamics
        g = self.network()
        t = 0
        events = 0
        while not self.at_equilibrium(t):
            # pull the transition dynamics at this timestep
            transitions = self.eventRateDistribution(t)

            # compute the total rate of transitions for the entire network
            a = 0.0
            for (_, r, _) in transitions:
                a = a + r
            if a == 0:
                break    # no events with non-zero rates

            # calculate the timestep delta
            r1 = numpy.random.random()
            dt = (1.0 / a) * math.log(1.0 / r1)

            # calculate which event happens
            if len(transitions) == 1:
                # if there's only one, that's the one that happens
                (l, _, ef) = transitions[0]
            else:
                # otherwise, choose one at random based on the rates
                r2 = numpy.random.random()
                xc = r2 * a
                k = 0
                (l, xs, ef) = transitions[k]
                while xs < xc:
                    k = k + 1
                    (l, xsp, ef) = transitions[k]
                    xs = xs + xsp

            # increment the time
            t = t + dt

            # draw a random element from the chosen locus
            e = l.draw()

            # perform the event by calling the event function,
            # passing the dynamics, event time, network, and element
            ef(self, t, g, e)

            # increment the event counter
            events = events + 1

        # run any events posted for before the maximum simulation time
        self.runPendingEvents(self._maxTime)

        # add some more metadata
        (self.metadata())[self.TIME] = t
        (self.metadata())[self.EVENTS] = events

        # report results
        rc = self.experimentalResults()
        return rc
```

We need an event rate distribution rather than an event probability distribution, so we provide that as a method `eventRateDistribution()` that takes the probability distribution returned by `eventDistribution()` and, for each event, multiplies the probability of that event happening by the number of places the event can happen.
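As a toy illustration of that conversion, with plain Python lists standing in for `epydemic`'s locus objects (which additionally support `draw()`) and hypothetical no-op event functions:

```
# plain lists standing in for loci; lengths give the locus populations
si_edges = [(0, 1), (1, 2)]        # locus of SI edges, so [SI] = 2
i_nodes = [1]                      # locus of infected nodes, so [I] = 1

def infect(dyn, t, g, e): pass     # hypothetical no-op event functions
def recover(dyn, t, g, e): pass

pInfect, pRecover = 0.2, 0.1
dist = [(si_edges, pInfect, infect), (i_nodes, pRecover, recover)]

# the probability-to-rate conversion performed by eventRateDistribution():
# each event's probability is scaled by the size of its locus
rates = [(l, p * len(l), f) for (l, p, f) in dist]
```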

The important part of the class is the `do()`

method, which implements the mechanism for drawing the $(\tau, e)$ pair as described above. In the code, `dt`

is the interval to the next event ($\tau$), while `xc`

is used to choose the event that occurs.
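
The draw itself can be sketched in isolation. The helper below is hypothetical (it is not part of `epydemic`), but it mirrors the logic of `do()`: the interval to the next event is exponentially distributed with rate equal to the total event rate, and the event is chosen with probability proportional to its own rate.

```python
import math
import random

def gillespie_draw(rates, rng=random.random):
    '''Draw the (tau, event) pair for a list of event rates: tau is the
    exponentially-distributed interval to the next event, and the event
    index is chosen with probability proportional to its rate. A
    hypothetical standalone helper mirroring the logic of do() above.'''
    a = sum(rates)
    if a == 0:
        return (None, None)          # no events with non-zero rates
    # interval to the next event
    tau = (1.0 / a) * math.log(1.0 / rng())
    # select which event happens, weighted by rate
    xc = rng() * a
    xs = 0.0
    for (k, r) in enumerate(rates):
        xs = xs + r
        if xc < xs:
            return (tau, k)
    return (tau, len(rates) - 1)     # guard against rounding at the boundary
```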

Then we need to bridge between this general framework and compartmented models, just as before:

In [4]:

```
class CompartmentedStochasticDynamics(StochasticDynamics):
    '''A :term:`stochastic dynamics` running a compartmented model. The
    behaviour of the simulation is completely described within the model
    rather than here.'''

    def __init__( self, m, g = None ):
        '''Create a dynamics over the given process model, optionally
        initialised to run on the given network.

        :param m: the compartmented model for the disease process
        :param g: prototype network to run the dynamics over (optional)'''
        super(CompartmentedStochasticDynamics, self).__init__(g)
        self._model = m

    def setUp( self, params ):
        '''Set up the experiment for a run. This performs the default action
        of copying the prototype network and then builds the model and
        uses it to initialise the nodes into the various compartments
        according to the parameters.

        :param params: the experimental parameters'''

        # perform the default setup
        super(CompartmentedStochasticDynamics, self).setUp(params)

        # build the model
        self._model.reset()
        self._model.build(params)

        # initialise the network from the model
        g = self.network()
        self._model.setUp(self, g, params)

    def eventDistribution( self, t ):
        '''Return the model's event distribution.

        :param t: current time
        :returns: the event distribution'''
        return self._model.eventDistribution(t)

    def experimentalResults( self ):
        '''Report the model's experimental results.

        :returns: the results as seen by the model'''
        return self._model.results(self.network())
```

We can now take the same parameters as we used in the synchronous case:

In [5]:

```
# ER network parameters
N = 5000
kmean = 5
pEdge = (kmean + 0.0) / N
# SIR parameters
pInfected = 0.01
pInfect = 0.2
pRemove = 0.1
# create a parameters dict containing the disease parameters we want
params = dict()
params[epydemic.SIR.P_INFECTED] = pInfected
params[epydemic.SIR.P_INFECT] = pInfect
params[epydemic.SIR.P_REMOVE] = pRemove
```

Plugging these parameters into our new simulation class, we get:

In [6]:

```
g = networkx.erdos_renyi_graph(N, pEdge)
m = epydemic.SIR()
sim = CompartmentedStochasticDynamics(m, g)
sto = sim.set(params).run()
with open('sto.pickle', 'wb') as handle:
    pickle.dump(sto, handle)
```

In [7]:

```
print "Epidemic covered {percent:.2f}% of the network".format(percent = ((sto['results']['compartments']['R'] + 0.0)/ N) * 100)
```

The two simulations are *supposed* to behave the same, for a suitably stochastic definition of "the same".

We can of course dig into the results in more detail. There are a lot of potentially interesting things to explore, and we'll just pick two of the most important: is one method faster than the other? And do they look like they generate a similar train of events?

First we load both datasets:

In [8]:

```
with open('sync.pickle', 'rb') as handle:
    syn = pickle.load(handle)
with open('sto.pickle', 'rb') as handle:
    sto = pickle.load(handle)
```

In [9]:

```
print "Elapsed simulation times:"
print "Synchronous {elapsed:.2f}s".format(elapsed = syn[epyc.Experiment.METADATA]['elapsed_time'])
print "Stochastic {elapsed:.2f}s".format(elapsed = sto[epyc.Experiment.METADATA]['elapsed_time'])
```

But a performance benefit is only useful if the results are correct: there's no point in doing the wrong things faster, after all. So we need to convince ourselves that, at the very least, the two simulations conducted for the same parameters produce plausibly comparable results – even while we accept that statistical variations might occur.

We can start by looking at the populations of the different compartments at equilibrium:

In [10]:

```
print "Node type sub-populations:"
print "Synchronous:", syn[epyc.Experiment.RESULTS]['compartments']
print "Stochastic:", sto[epyc.Experiment.RESULTS]['compartments']
```

Although we are using two different simulation techniques, we claim that they are "the same" in the sense of simulating the same process dynamics. One way to test this is to look at the distance between successive events. If the events are happening with similar distributions, we would expect the inter-event time distributions to be similar too.
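
In particular, under Gillespie dynamics the interval to the next event, for a total event rate $a$, is drawn from an exponential distribution with mean $1/a$. (The total rate changes as the epidemic progresses, but each individual gap is exponential.) A quick standalone sanity check of the sampling formula the simulator uses:

```python
import math
import random

random.seed(42)

def sample_gaps(a, n):
    '''Sample n inter-event intervals for a fixed total event rate a,
    using the same inverse-transform formula as the simulator above.'''
    return [(1.0 / a) * math.log(1.0 / random.random()) for _ in range(n)]

gaps = sample_gaps(2.0, 100000)
mean = sum(gaps) / len(gaps)
# the sample mean should be close to 1/a = 0.5
```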

To do this we need to capture when (in simulation time) each event occurs. We can do this quite simply, either by extending the simulation dynamics classes, or – more straightforwardly – by defining a new compartmented model whose results include the simulation times for events:

In [11]:

```
class SIR_EventDistribution(epydemic.SIR):
    '''An SIR model that also captures the times of all events.'''

    def __init__( self ):
        super(SIR_EventDistribution, self).__init__()

        # create a place to store the sequence of event times
        self._eventDistribution = []

    def reset( self ):
        super(SIR_EventDistribution, self).reset()
        self._eventDistribution = []

    def results( self, g ):
        rc = super(SIR_EventDistribution, self).results(g)

        # add the event times to the results
        rc['event_times'] = self._eventDistribution
        return rc

    def infect( self, dyn, t, g, (n, m) ):
        # perform the base event
        super(SIR_EventDistribution, self).infect(dyn, t, g, (n, m))

        # record the event time
        self._eventDistribution.append(t)

    def remove( self, dyn, t, g, n ):
        # perform the base event
        super(SIR_EventDistribution, self).remove(dyn, t, g, n)

        # record the event time
        self._eventDistribution.append(t)
```

In [31]:

```
# epidemic parameters
params = dict()
params[epydemic.SIR.P_INFECTED] = pInfected
params[epydemic.SIR.P_INFECT] = 0.05
params[epydemic.SIR.P_REMOVE] = 0.01
m = SIR_EventDistribution()
# run process over a larger ER network
g = networkx.erdos_renyi_graph(30000, 5.0 / 30000)
# synchronous dynamics
sim = epydemic.CompartmentedSynchronousDynamics(m, g)
syn_res = sim.set(params).run()
syn_events = syn_res[epyc.Experiment.RESULTS]['event_times']
# stochastic dynamics
sim = CompartmentedStochasticDynamics(m, g)
sto_res = sim.set(params).run()
sto_events = sto_res[epyc.Experiment.RESULTS]['event_times']
```

In [40]:

```
fig = plt.figure(figsize = (8, 5))
plt.title('Distribution of inter-event times')
plt.xlabel('Inter-event time')
plt.ylabel('$log(\mathrm{events})$')

# work out inter-event times
l = 0
syn_inter = []
for i in xrange(1, len(syn_events) - 1):
    syn_inter.append(syn_events[i] - l)
    l = syn_events[i]
sto_inter = []
l = 0
for i in xrange(1, len(sto_events) - 1):
    sto_inter.append(sto_events[i] - l)
    l = sto_events[i]

# plot the histogram of the distribution
plt.hist([sto_inter, syn_inter],
         bins = range(10),
         log = True,
         label = ['stochastic', 'synchronous'])
plt.legend()
_ = plt.show()
```

The two distributions look *similar*, both dropping off exponentially as we'd expect. They don't follow exactly the same distribution, but that could just be the result of the stochastic nature of the process: we ran the two dynamics over the same network, but from different initial (random) seedings of nodes. Or it could be because the synchronous approach is less exact because of interactions between events. If we wanted a closer look, we'd have to perform some repetitions to see whether we got different results repeatedly or whether things evened out – but that's something for another time.

[CGP06] Ying Cao, Daniel Gillespie and Linda Petzold. Efficient step size selection for the tau-leaping simulation method. Journal of Chemical Physics **124**. 2006.

[Gil76] Daniel Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of Computational Physics **22**, pages 403–434. 1976.

[Gil77] Daniel Gillespie. Exact stochastic simulation of coupled chemical reactions. Journal of Physical Chemistry **81**(25), pages 2340–2361. 1977.

(This is a chapter from Complex networks, complex processes.)

In this chapter we'll build a discrete-time simulation of an epidemic using the `epydemic` library, making use of the compartmented model we coded earlier. And we'll discuss some of the advantages of this approach – but also its limitations, which lead us into continuous-time simulation of the same model.

Recall from our earlier discussion that discrete-event simulators have to make three key decisions:

*when* (in simulation time) does the next event occur?, *where* in the network does it occur?, and *which* event is it that occurs?

A discrete-time simulation performs these decisions in a simulation loop that looks roughly as follows. At each timestep, the simulation collects all the places in which an event *might* occur (the "where" question). It then, for each of these places, decides *whether* the event occurs or not ("when") and, if it decides that it does, executes the event ("which"). It then moves to the next moment and repeats. Executing an event will typically change the places where future events can occur.
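
The loop body for a single timestep can be sketched in isolation. The `places`, `p`, and `event` names here are hypothetical stand-ins for a locus, an event probability, and an event function, not part of any library:

```python
import random

def synchronous_step(places, p, event, rng=random.random):
    '''Perform one discrete timestep: visit every place where the event
    might occur ("where") and fire it there with probability p ("when"),
    executing the event function when it does ("which"). A minimal sketch
    of the simulation loop body described above.'''
    fired = 0
    for e in list(places):    # iterate over a copy, since events may change the places
        if rng() <= p:
            event(e)          # execute the event at this place
            fired = fired + 1
    return fired
```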

A discrete-time simulation is sometimes referred to as a **synchronous** simulation, because all the events in a given moment are performed in a batch.

Let's now build the code we need to create a synchronous simulation of an epidemic. We'll be making use of the `epydemic` library, and specifically its descriptions of compartmented disease models. Before we do that, however, we need to construct a general simulation framework that we can then specialise to perform the functions we need.

`epydemic` represents synchronous simulation using a small class hierarchy, and in this chapter we'll fill out the part outlined in red in the following UML diagram:

(Actually what we'll describe is a slightly simpler version of `epydemic` for ease of explanation. But it captures all the main points, and we'll come back to the code when we need the more advanced features.)

The decomposition of the three classes is as follows. `epydemic.Dynamics` defines the basic functionality of a discrete-event simulation, mainly concerning the way we get events to execute. `epydemic.SynchronousDynamics` specialises this framework to run in synchronous time, collecting together all the events for a given timestep, but without specifying exactly where the events come from. `epydemic.CompartmentedSynchronousDynamics` then binds the source of events to a compartmented model. (We describe *why* we do it this way below.)

In [1]:

```
import networkx
import epydemic
import epyc
import math
import numpy
import pickle
from copy import copy
import pandas as pd
import matplotlib
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
import seaborn
```

Let's begin with the basic discrete-event dynamics:

In [2]:

```
class Dynamics(epyc.Experiment, object):
    '''A dynamical process over a network. This is the abstract base class
    for implementing different kinds of dynamics as computational experiments
    suitable for running under `epyc`. Sub-classes provide synchronous and
    stochastic (Gillespie) simulation dynamics.'''

    # Additional metadata elements
    TIME = 'simulation_time'      #: Metadata element holding the logical simulation end-time.
    EVENTS = 'simulation_events'  #: Metadata element holding the number of events that happened.

    # the default maximum simulation time
    DEFAULT_MAX_TIME = 20000      #: Default maximum simulation time.

    def __init__( self, g = None ):
        '''Create a dynamics, optionally initialised to run on the given network.
        The network (if provided) is treated as a prototype that is copied before
        each individual simulation experiment.

        :param g: prototype network (optional)'''
        super(Dynamics, self).__init__()
        self._graphPrototype = g                 # prototype copied for each run
        self._graph = None                       # working copy of prototype
        self._maxTime = self.DEFAULT_MAX_TIME    # time allowed until equilibrium

    def network( self ):
        '''Return the network this dynamics is running over.

        :returns: the network'''
        return self._graph

    def setNetworkPrototype( self, g ):
        '''Set the network the dynamics will run over. This will be
        copied for each run of an individual experiment.

        :param g: the network'''
        self._graphPrototype = g

    def setMaximumTime( self, t ):
        '''Set the maximum default simulation time. The default is given
        by :attr:`DEFAULT_MAX_TIME`.

        :param t: the maximum time'''
        self._maxTime = t

    def at_equilibrium( self, t ):
        '''Test whether the model is at equilibrium. Override this method to provide
        alternative and/or faster simulations.

        :param t: the current simulation timestep
        :returns: True if we're done'''
        return (t >= self._maxTime)

    def setUp( self, params ):
        '''Before each experiment, create a working copy of the prototype network.

        :param params: parameters of the experiment'''

        # perform the default setup
        super(Dynamics, self).setUp(params)

        # make a copy of the network prototype
        self._graph = self._graphPrototype.copy()

    def tearDown( self ):
        '''At the end of each experiment, throw away the copy.'''

        # perform the default tear-down
        super(Dynamics, self).tearDown()

        # throw away the worked-on model
        self._graph = None

    def eventDistribution( self, t ):
        '''Return the event distribution, a sequence of (l, p, f) triples
        where l is the :term:`locus` of the event, p is the probability of an
        event occurring, and f is the :term:`event function` called to make it
        happen. This method must be overridden in sub-classes.

        It is perfectly fine for an event to have a zero probability.

        :param t: current time
        :returns: the event distribution'''
        raise NotImplementedError('eventDistribution()')
```

We make the dynamics class a sub-class of `epyc.Experiment`. We haven't discussed `epyc` yet – and there's no need to right now – but it provides functions for running lots of repetitions of simulations with a single command. We'll make extensive use of this later when we scale up simulations.

An epidemic simulation takes place over a network. We can provide a network either to the constructor or by calling `setNetworkPrototype()`. This network is referred to as the *prototype* network. Every time we run the simulation, the prototype is copied into a *working* network that we then run the epidemic process over. This means we can repeatedly use the *same* network for *different* instances of the *same* process. The `setUp()` and `tearDown()` methods create and destroy the working copy.

We need to know when we should stop the simulation, and the most general answer to this is to have a maximum simulation time: that way we know we'll stop at some point. `setMaximumTime()` can be used to change this from the default value of 20000 timesteps; `at_equilibrium()` returns true if we have exceeded that time. Clearly we will often be able to do a better job of deciding whether a simulation has ended, in which case we should override this method.
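
For an epidemic, for example, we can stop as soon as no node remains infected, since no further events can then occur. The core test such an override might perform looks like this (written standalone; the names are illustrative rather than part of `epydemic`):

```python
def epidemic_at_equilibrium(t, maxTime, infected):
    '''Return True when an epidemic simulation should stop: either the
    maximum simulation time has been reached, or the number of infected
    nodes has fallen to zero so no further events are possible.'''
    if t >= maxTime:
        return True
    return infected == 0
```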

Finally, we need a source of events. We get these in terms of a probability distribution that consists of a list of triples, each containing a locus (the places where an event can occur in the network), the probability of that event happening at any given place, and the event function that we call when the event occurs. The `eventDistribution()` method returns the distribution for the given time, and for the moment is left undefined.
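
Concretely, a distribution of this shape might look as follows, with plain lists standing in for `epydemic`'s locus objects and the event functions just recording what they were asked to do (all names here are illustrative):

```python
log = []

def infect(dyn, t, g, e):
    '''Toy infection event: record the edge it was fired on.'''
    log.append(('infect', e))

def remove(dyn, t, g, n):
    '''Toy removal event: record the node it was fired on.'''
    log.append(('remove', n))

si_edges = [(1, 2), (3, 4)]    # places where an infection event can occur
infected = [2, 4]              # places where a removal event can occur

# each entry is an (l, p, f) triple: locus, probability, event function
distribution = [
    (si_edges, 0.2, infect),
    (infected, 0.1, remove),
]

# a dynamics would draw a place from a locus and call its event function:
(l, p, ef) = distribution[0]
ef(None, 0, None, l[0])
```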

We should note what else this class *doesn't* provide: any way of actually selecting and executing events drawn from the distribution. For that we need to define a specific dynamics.

Any simulation dynamics has to answer the three questions we posed earlier: *when* does an event happen?, *where* in the network?, and *which* action is taken? Synchronous dynamics has simple answers to these questions. At each discrete timestep (*when*) it looks for all the places in the network where an event *could* occur (*where*), and chooses whether or not an event occurs at each place according to the probabilities given to the events by the probability distribution (*which*).

Providing this dynamics is simply a matter of turning this into code:

In [3]:

```
class SynchronousDynamics(Dynamics):
    '''A dynamics that runs synchronously in discrete time, applying local
    rules to each node in the network. These are simple to understand and
    simple to code for many cases, but can be statistically inexact and slow
    for large systems.'''

    # additional metadata
    TIMESTEPS_WITH_EVENTS = 'timesteps_with_events'  #: Metadata element holding the number of timesteps that actually had events occur within them

    def __init__( self, g = None ):
        '''Create a dynamics, optionally initialised to run on the given prototype
        network.

        :param g: prototype network to run over (optional)'''
        super(SynchronousDynamics, self).__init__(g)

    def do( self, params ):
        '''Synchronous dynamics.

        :param params: the parameters of the simulation
        :returns: a dict of experimental results'''

        # run the dynamics
        g = self.network()
        t = 0
        events = 0
        timestepEvents = 0
        while not self.at_equilibrium(t):
            # retrieve all the events, their loci, probabilities, and event functions
            dist = self.eventDistribution(t)

            # run through all the events in the distribution
            nev = 0
            for (l, p, ef) in dist:
                if p > 0.0:
                    # run through every possible element on which this event may occur
                    for e in copy(l.elements()):
                        # test for occurrence of the event on this element
                        if numpy.random.random() <= p:
                            # yes, perform the event
                            ef(self, t, g, e)

                            # update the event count
                            nev = nev + 1

            # add the events to the count
            events = events + nev
            if nev > 0:
                # we had events happen in this timestep
                timestepEvents = timestepEvents + 1

            # advance to the next timestep
            t = t + 1

        # add some more metadata
        (self.metadata())[self.TIME] = t
        (self.metadata())[self.EVENTS] = events
        (self.metadata())[self.TIMESTEPS_WITH_EVENTS] = timestepEvents

        # report results
        rc = self.experimentalResults()
        return rc
```

That's it! – one method called `do()` that codes up the simulation loop. While the simulation is not at equilibrium (as defined by the `at_equilibrium()` method inherited from `Dynamics`) we retrieve the event distribution. For each entry we run through all the possible places for an event and select randomly whether the event actually happens. We do this by using the `numpy.random.random()` function, which returns a random number uniformly distributed over the range $[0, 1]$. If this random number is less than the probability associated with the event, then we "fire" the event by calling the associated event function, passing it the dynamics, the current simulation time, the network over which the process is running, and the place where the event occurs (a node or an edge in the network). We keep track of the number of events we fire, and also keep track of the number of timesteps in which events are fired, which we'll use later when we think about the efficiency of this kind of simulation.

At the end of `do()` we package up a short summary of the experiment as **metadata**: data about the way the simulation occurred. We store this in a dict that we inherit from `epyc.Experiment`, accessed by the `metadata()` method. Finally we return our `experimentalResults()`, which is another method inherited from `epyc.Experiment` that we'll come back to in a moment.

We're still missing some details, though: `SynchronousDynamics` doesn't give us an event distribution, and doesn't give us any events.

In [4]:

```
class CompartmentedSynchronousDynamics(SynchronousDynamics):
    '''A :term:`synchronous dynamics` running a compartmented model. The
    behaviour of the simulation is completely described within the model
    rather than here.'''

    def __init__( self, m, g = None ):
        '''Create a dynamics over the given disease model, optionally
        initialised to run on the given prototype network.

        :param m: the model
        :param g: prototype network to run over (optional)'''
        super(CompartmentedSynchronousDynamics, self).__init__(g)
        self._model = m

    def setUp( self, params ):
        '''Set up the experiment for a run. This performs the default action
        of copying the prototype network and then builds the model and
        uses it to initialise the nodes into the various compartments
        according to the parameters.

        :param params: the experimental parameters'''

        # perform the default setup
        super(CompartmentedSynchronousDynamics, self).setUp(params)

        # build the model
        self._model.reset()
        self._model.build(params)

        # initialise the network from the model
        g = self.network()
        self._model.setUp(self, g, params)

    def eventDistribution( self, t ):
        '''Return the model's event distribution.

        :param t: current time
        :returns: the event distribution'''
        return self._model.eventDistribution(t)

    def experimentalResults( self ):
        '''Report the model's experimental results.

        :returns: the results as seen by the model'''
        return self._model.results(self.network())
```

The `setUp()` method does the standard behaviour of building a copy of the network prototype, and then resets and builds the model and passes the working network to the model's `setUp()` method. `eventDistribution()` returns what the model says is the event distribution, which will also include implementations of the events. Finally, `experimentalResults()` returns a dict of the model's definition of what constitutes the important features of running that particular model.

That's quite a lot of code, so let's pause and assess what we've built.

First of all we defined the basic structure of an epidemic process on a network: basically the ability to generate a working copy of a network several times, some definition of termination, and an abstract method for getting the event distribution. We then specialised this to provide discrete-time synchronous simulation dynamics which took the distribution and applied it to all possible places where events could occur according to their probabilities. Rather than then specifying the event distributions and events by sub-classing, we instead bound the missing elements to an object defining a compartmented model of disease, allowing that to provide the details.

Why this way? – why not just sub-class `SynchronousDynamics` to provide, for example, the events of SIR and their distribution? The answer is that SIR is a process that can run on several *different* simulation regimes as well as this one, notably the stochastic dynamics we'll look at later. If we defined SIR by sub-classing `SynchronousDynamics`, we'd then need to re-define it if we introduced another simulation dynamics: two definitions of the same process, which is an invitation to mistakes.

It's far better to define a single process in a single class and then re-use it, and this is what we've done in defining the `CompartmentedModel` class and sub-classing it to define SIR. This makes the simulation framework easier to use, but trickier to implement: the astute reader will have noticed that we didn't explain how `CompartmentedModel` works inside, and that's because it's a bit complicated. But it's also largely irrelevant in practice: you don't need to know how this particular piece of code works in order to use it for network science experiments. (If you're interested, you can look at the code in `epydemic`'s github repo. But don't say you weren't warned.)

The message here is not that some simulation code is complicated, but rather that it's possible to *localise* that complexity where it can't do any harm. This keeps the user interface simpler and also means that we can now concentrate on the epidemics, not the code we use to simulate them.

Finally, at long last, let's run some code.

We have a compartmented model of SIR, and a synchronous discrete-time simulation framework, so let's run the former in the latter. We first need to define the parameters of our simulation, and for this experiment we'll use a small-ish ER network and some fairly nondescript SIR parameters:

In [5]:

```
# ER network parameters
N = 5000
kmean = 5
pEdge = (kmean + 0.0) / N
# SIR parameters
pInfected = 0.01
pInfect = 0.2
pRemove = 0.1
```

We can then create the network and the model, and bind them together with the simulation dynamics:

In [6]:

```
g = networkx.erdos_renyi_graph(N, pEdge)
m = epydemic.SIR()
sim = CompartmentedSynchronousDynamics(m, g)
```

In [7]:

```
# create a parameters dict containing the disease parameters we want
params = dict()
params[epydemic.SIR.P_INFECTED] = pInfected
params[epydemic.SIR.P_INFECT] = pInfect
params[epydemic.SIR.P_REMOVE] = pRemove
# run the simulation
sync = sim.set(params).run()
# save the results for later
with open('sync.pickle', 'wb') as handle:
    pickle.dump(sync, handle)
```

Running the simulation returns a **results dict**. It's structured in a very particular way, with three top-level keys:

In [8]:

```
sync.keys()
```

Out[8]:

The `results` key contains a dict of the experimental results that the simulation returned: its "real" results, if you like:

In [9]:

```
sync['results']
```

Out[9]:

In this case the results are a dict of compartments and their sizes, and a dict of loci and their sizes. We can see that in this case there are no infected nodes left, and therefore no SI edges – and therefore no way the simulation can infect any more nodes.

The `parameters` key contains a dict of the parameters we passed to the simulation:

In [10]:

```
sync['parameters']
```

Out[10]:

So we have the experimental results and the simulation parameters that gave rise to them immediately to hand. Note that this isn't *quite* all the information we might need, as it doesn't include the size or link probability of the underlying network prototype we passed to the simulation.

Finally, the `metadata` key contains a dict of useful information about how the simulation progressed:

In [11]:

```
sync['metadata']
```

Out[11]:

These values might be important in assessing how the simulation worked. For the time being, let's just draw attention to the difference between two values: the overall simulation time (20000 timesteps, the default), and the number of timesteps in which events actually occurred. The former is *way* larger than the latter, suggesting that the simulation did an awful lot of ... well, nothing.

We can easily check whether we had an epidemic by checking the size of the largest outbreak, which in the case of an epidemic should scale linearly with `N`, the size of the network:

In [12]:

```
print "Epidemic covered {percent:.2f}% of the network".format(percent = ((sync['results']['compartments']['R'] + 0.0)/ N) * 100)
```

The synchronous dynamics we encoded above works by evaluating the process dynamics at each discrete timestep. This is an obvious approach, but one that raises two questions: how expensive is it to evaluate the dynamics at each step?; and, what proportion of timesteps do we evaluate the dynamics with no effect, because nothing changes?

To answer the first question we can look at the `do()` method on `SynchronousDynamics`. At each timestep it retrieves all the places where an event might occur, which we know from our definition of SIR is any SI edge (for infection events) and any infected node (for removal events). For each place, it draws a random number and then possibly calls an event function. The amount of work therefore depends on the sizes of the two loci for events, which will presumably swell as the epidemic progresses: we might assume that in an average timestep about half the nodes are infected, and some smaller proportion of the edges are SI: we can't say much more without a lot more information about the structure of the network. The loci change as events occur, which means that `CompartmentedModel` will have to ensure that it can efficiently track these changes (and indeed a lot of the code complexity addresses exactly this).
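
One plausible way to track a locus efficiently is a combined list-and-dict structure giving O(1) addition, removal, and uniform random draw. The sketch below is illustrative of the kind of bookkeeping required, not `epydemic`'s actual implementation:

```python
import random

class Locus(object):
    '''A sketch of a locus supporting O(1) add, discard, and uniform random
    draw, of the kind a compartmented model needs to keep its event loci up
    to date as events fire. (Illustrative, not epydemic's own code.)'''

    def __init__(self):
        self._elements = []    # elements in arbitrary order
        self._index = {}       # element -> its position in the list

    def __len__(self):
        return len(self._elements)

    def add(self, e):
        if e not in self._index:
            self._index[e] = len(self._elements)
            self._elements.append(e)

    def discard(self, e):
        i = self._index.pop(e, None)
        if i is not None:
            last = self._elements.pop()
            if i < len(self._elements):
                self._elements[i] = last    # move the last element into the hole
                self._index[last] = i

    def draw(self):
        return random.choice(self._elements)
```

The swap-with-last trick in `discard()` is what keeps removal constant-time: deleting from the middle of a list would otherwise cost O(n) per event.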

We alluded earlier to the answer to the second question. The result dict includes metadata that defines the number of timesteps and the number in which at least one event actually occurred. We can use these to determine the percentage of timesteps in which anything actually happened – and therefore calculate the "wasted" timesteps:

In [13]:

```
print "Of {n} cycles simulated, {no} ({percent:.2f}%) had no events".format(n = sync['metadata']['simulation_time'],
                                                                            no = sync['metadata']['simulation_time'] - sync['metadata']['timesteps_with_events'],
                                                                            percent = (sync['metadata']['simulation_time'] - sync['metadata']['timesteps_with_events']) / (0.0 + sync['metadata']['simulation_time']) * 100)
```

A slightly more significant problem is one of statistical exactness: the extent to which the simulation actually performs according to the probabilities. We won't dig into this in too much detail, but the basic problem is simple to explain. In the `do()` method, for each possible event, we collect the possible places the event can happen and then decide whether the event actually happens there. There's a hidden assumption here that all these choices are independent of one another, but that's not quite the case. For example, if two infected nodes are connected to the same susceptible node – so there are two SI edges in the locus for infection events – then we have two chances to infect the susceptible node in the same timestep. If the first happens to result in infection then the second one can't (by definition), making the actual rate of infections in a timestep vary slightly from the expected value. Similarly, we may happen to run the removal events before the infection events, and so nodes infected in the timestep don't have any possibility of recovering in that same timestep – even if the probability of recovery were set very high.
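
We can make the first effect concrete with a little arithmetic. A susceptible node with $k$ infected neighbours gets $k$ chances to be infected in a timestep, but can be infected at most once, so its per-timestep infection probability is $1 - (1 - p)^k$ rather than the $kp$ a naive reading of the rates would suggest:

```python
def infection_probability(p, k):
    '''Probability that a susceptible node with k infected neighbours is
    infected in one synchronous timestep, given per-edge probability p.'''
    return 1.0 - (1.0 - p) ** k

# one SI edge: the two calculations agree exactly
# two SI edges: 1 - 0.8**2 = 0.36, slightly less than the naive 2 * 0.2 = 0.4
```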

If these sound like trivial issues, they may well be. But they may *not* be, depending on the exact combinations of parameters and network structures we encounter. That's a risk it'd be better not to take, as it introduces patterns into the simulation results that aren't there in the model descriptions, or indeed in any mathematical analysis we might make of them.

It might be that we're willing to accept these issues in the interests of simplicity: synchronous simulation is very easy to program and understand. But both the performance and the statistical exactness issues are caused by the same basic decision to use discrete time, and it turns out that we can address both by using a different simulation dynamics, one that works directly from the probability distributions in continuous time.

(This is a chapter from Complex networks, complex processes.)

Simulation is an enormous topic in computer science, with a long and distinguished history. It's easy to see why it's so important: whenever we use computers to study natural processes (or indeed man-made or engineered processes) we're taking a physical system, abstracting it into a computer model, and then building software that runs the model *as if* it were the real system running in the real world.

The process of model abstraction – of which compartmented models of disease are a prime example – is a process of simplification, of leaving out details in order to get to the essentials of the process we're interested in. This reduction of detail is sometimes criticised by those outside the scientific community: if you leave out the details, how do you know that your model is really saying anything about the real-world phenomenon? And that's a fair point. But simplification is essential if we're to understand the core behaviour of processes and not be distracted by all the details.

How do we know if a model says anything meaningful? We need to **verify** and **validate** it – two software engineering concepts that are sometimes summarised as "did we build it right?" and "did we build the right thing?".

By **verification** we mean examining the model to ensure that, to the best of our ability, the mathematics and code are faithful to the way we think the process operates. This examination might take any number of forms, from inspection of the code and maths by others, through the development of test suites to exercise the code and check it against known situations, to the use of the mathematically-based techniques of computer science formal methods. It's easy to get code wrong, and incorrect code tells us less than nothing about the phenomena we're interested in: never skimp on debugging, and never assume things are finally working completely correctly.

By **validation** we mean deciding whether the model does actually reflect the real world. This might take the form of creating a simple real-world experiment and performing it physically as well as in simulation, to see if the results match. Of course they never will match exactly, because in the process of simplification we'll have removed some of the details that affect the physical process. In simulation, a pendulum on a friction-free mount will swing forever; in reality, it never will, because the mount will never actually *be* friction-free.

Since simulation has so much history, it's unsurprising that there is a myriad of approaches to conducting simulations. Each choice has subtly different implications for the experiments and results obtained – often only really understood by those who've spent a lifetime with the given techniques.

In network science the simulations we typically use fall under the broad rubric of **discrete-event simulation**. What this means is that, to a simulator, the world is treated as a sequence of individually-identifiable events that happen in a sequence through time. In the case of disease models, the events are individual nodes being infected or recovering: individual, discrete "happenings" described individually and executed independently. Of course one event affects subsequent ones – you can't recover if you've never been infected in the first place – but that's about the *possible sequences* of events that can occur, not a relationship between one event and another at the coding level.

You can see the sequencing of events at work in the compartmented model. An infection event happens at SI edges and has a local effect: change the susceptible node's compartment, which in turn might generate more SI edges at which further infection events can occur. A recovery event happens at infected nodes, meaning it can *only* happen *if* an infection event previously happened there (or if the node was initially infected). The sequencing is implicit in the definition of the event loci, and of the events' effects – even though there's no explicit encoding within the events themselves of how they'll be sequenced.

A simulation occurs in **simulated time**, which is to say the time in the simulated world. This is typically different to real-world or **wallclock time**, which is how long the simulation takes to run on a computer. These two notions of time differ substantially. It's easy to see why: most biological and physical processes take an eternity from the perspective of a modern computer. The progression of a disease in an individual might take days, and we seldom want to wait that long for results. Simulation time often therefore passes more quickly than wallclock time.

We might need some way to relate simulation time to "real world" time, for example to see how many days an epidemic will last. In that case we'll need to develop ways to translate between simulated time and the "real" time of the phenomenon being studied. But often we don't care about this level of realism, and are happy to work in a more abstract world.

There's still another thing to consider, which is the issue of temporal resolution. Time, at least at the macro scale, is a continuous quantity, represented as a real number. A **continuous time** simulation represents time in this way, and also typically assumes that only one event happens at each (simulated) moment. This may sound restrictive, but the idea is that events happen instantaneously, so two events never need happen at *exactly* the same time: we can always put some infinitesimal gap between them. In SIR, this means people go from being susceptible to being infected instantaneously; if two people are infected, one of them is always infected before the other.
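We can sketch this idea of instantaneous events separated by real-valued gaps directly. The snippet below draws successive inter-event gaps from an exponential distribution, which is the distribution commonly used for continuous-time simulation; the event rate here is an arbitrary illustrative value, not taken from any model in this chapter.

```
import random

random.seed(42)   # reproducible run

# In continuous time, events occur at real-valued instants. A common
# choice is to draw the gap to the next event from an exponential
# distribution, so that no two events ever coincide exactly.
rate = 2.0        # hypothetical total event rate (events per unit time)
t = 0.0
times = []
for _ in range(5):
    t += random.expovariate(rate)   # gap to the next event
    times.append(t)

print(times)
```

Since every gap is positive, the event times are strictly increasing: each event has a well-defined predecessor and successor.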

Another way to view time is to think of it as divided up into discrete chunks: seconds, for example. Instead of modelling a continuous stream of events, each occurring at a different instant, we think about blocks of time in which a set of events occur. This is a **discrete time** simulation.

Which of these approaches is "right"? Neither – and that's anyway the wrong question. They are both approximations of reality that we use to perform computational experiments. There are sometimes reasons to prefer one over the other, but often the choice is a matter of intellectual preference or coding convenience. At the risk of massive stereotyping, people with computer science backgrounds are often (at least initially) more comfortable with a discrete-time view, while people with a classical science background often find it easier to think about continuous time. (One reason for this may be that the mathematics taught in computer science programmes is typically overwhelmingly discrete and tends not to emphasise modelling with differential equations, which is where the continuous ideas come from.) There are good mathematical reasons to prefer continuous-time over discrete-time simulation, but both are available to you.

When working with random networks and stochastic processes there are additional complications due to the use of randomness. It is entirely possible that, just by a chance interaction, a disease on a network will die out. Run the *same* experiment on the *same* network with the *same* parameters – and you might get a disease that *doesn't* die out, because the chance interaction didn't happen this time.

Does this mean that such experiments aren't repeatable? No! – but it *does* mean that we need to be careful, perform repetitions, and be sure that we understand the implications of the various random factors that affect the outcome of each experiment. We'll have a lot more to say on this topic later.
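To make the point concrete, here's a toy experiment (deliberately not a network simulation): a small branching process in which an outbreak sometimes dies out purely by chance. The transmission probability and the "two contacts per case" structure are made up for illustration. Seeding the random number generator makes the whole experiment repeatable, and many repetitions give a stable estimate of the extinction probability even though any single run is unpredictable.

```
import random

def outbreak_dies_out(p_transmit, rng, max_generations=10):
    '''Toy branching process: one initial case, each case making two
    contacts per generation and transmitting along each independently
    with probability p_transmit. Returns True if the outbreak goes
    extinct within max_generations.'''
    infected = 1
    for _ in range(max_generations):
        if infected == 0:
            return True
        if infected > 1000:          # clearly established: call it non-extinct
            return False
        infected = sum(1 for _ in range(2 * infected)
                       if rng.random() < p_transmit)
    return infected == 0

rng = random.Random(1234)            # fixed seed: the experiment is repeatable
runs = [outbreak_dies_out(0.6, rng) for _ in range(2000)]
extinction_fraction = sum(runs) / len(runs)
print(extinction_fraction)
```

Individual runs differ wildly – some die out after one generation, some take off – but the fraction of extinctions across 2000 repetitions is stable from experiment to experiment.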

What this discussion is getting at is that we need to be careful in going from the models we develop, their realisation in code, and their execution in simulation, to conclusions about the real world. We need to be sure that the conclusions we draw are supported by the simulations we've done, and that they match, to an appropriate degree, observations we can make about the real-world process we're simulating.

Let's look in overview at the process of discrete-event simulation, before we get into the coding details.

The basic process of simulation involves repeatedly deciding three things:

- *when* (in simulation time) does the next event occur?
- *where* in the network does it occur? and
- *which* event is it that occurs?

The event is then executed, and the process repeats – forever in principle, and in practice until some **termination condition** occurs. In network science we often use a termination condition of **equilibrium**, where the network has in some sense "stabilised" so we can look at its overall state. In SIR this might be when there are no infected nodes left in the network, since no further events are then possible.
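As a minimal sketch of this loop – deliberately far simpler than anything `epydemic` does – here is a synchronous discrete-time SIR simulation on a hand-built five-node network, running until the equilibrium condition of no infected nodes remaining. The network, the probabilities, and the seed are all illustrative.

```
import random

random.seed(5)

# a small illustrative contact network as an adjacency list
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

compartment = {n: 'S' for n in g}
compartment[0] = 'I'                 # seed the epidemic at node 0
pInfect, pRemove = 0.4, 0.2          # illustrative probabilities

t = 0
while any(c == 'I' for c in compartment.values()):   # equilibrium: no infecteds
    t += 1                                           # "when": the next timestep
    infections, removals = [], []
    for n, c in compartment.items():                 # "where": scan the network
        if c == 'I':
            if random.random() < pRemove:            # "which": a removal event...
                removals.append(n)
            for m in g[n]:
                if compartment[m] == 'S' and random.random() < pInfect:
                    infections.append(m)             # ...or an infection event
    for n in infections:                             # apply all events together
        compartment[n] = 'I'
    for n in removals:
        compartment[n] = 'R'

print(t, compartment)
```

Note that events are collected during the scan and applied afterwards, so all the events of one timestep see the same network state – that's what makes this dynamics synchronous.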

How are these three decisions made? The details are what differentiate the different methods of simulation. For our purposes, `epydemic` provides a small framework for simulating epidemics on networks, with the decision-making either being coded directly or – more conveniently – being offloaded to a software encoding of a compartmented model. It's this framework we'll turn to next.

(This is a chapter from Complex networks, complex processes.)

Having developed a discrete compartmented model of disease, we now have to turn it into code. Most epidemic processes share a common form and can be simulated using a small set of common techniques. It therefore makes sense to capture the form of an epidemic process in code, and then use that code to drive a simulator. In this way we can focus on the epidemic process rather than on the process of simulation.

We make use of a Python library, `epydemic`, written to provide a framework within which to conduct simulations of epidemic processes. `epydemic` provides three main elements:

- A base class for describing epidemic processes quickly and cleanly;
- A small library of common epidemic processes that can be used as a starting point for defining additional processes; and
- Implementations of the two most common simulation regimes.

As well as providing the small-scale features we introduce in this chapter, `epydemic` has features for performing large-scale simulations on parallel compute clusters, integrating cleanly with the `epyc` simulation library. We'll discuss this integration in more detail later. You can also read the API documentation for a full description of `epydemic` and its capabilities.

As we saw earlier, an epidemic simulation consists of two main components:

- A **model** of the disease process that describes how nodes in the network are infected, recover, and so forth, typically using either probabilities or fixed elapsed times; and
- A **dynamics** that applies the model to a network over the timespan of the simulation.

The former describes the way nodes evolve as the disease progresses; the latter describes how this evolution occurs in time. For the moment we'll focus on the model, which `epydemic` represents by the class `epydemic.CompartmentedModel`. We sub-class this class to create different compartmented disease models.

An instance of a sub-class of `epydemic.CompartmentedModel` basically encodes exactly the kind of discrete model we developed earlier. Each node in the network resides in a **compartment**, a box representing the disease state of the node. We are typically interested in how the sizes of the compartments change over time. A **locus** is a place in the network where an **event** can occur, where an event typically changes the compartment of one or more nodes around the locus. An example event in SIR would be an infection event, whose locus is the set of SI edges and which causes the S end to become I and any edges to adjacent S nodes to be classified as SI (i.e., be added to the locus for possible future infection).

The significance of loci is that `epydemic` keeps track of the nodes and edges in each locus at each stage of the simulation. In our SIR example, after every simulation event `epydemic` checks whether any nodes should be removed from the infected locus and whether any edges should be added to the SI locus – and does so automatically in a way that is optimised to only check as little of the network as necessary. This both makes simulation more efficient and simplifies the epidemic process description.

An `epydemic` event is simply a Python function. As such it can do anything Python can do – but typically will perform only some simple transitions of the compartments of nodes. `epydemic.CompartmentedModel` provides two methods that perform these operations. `changeCompartment()` changes the compartment of a node, making sure that this change is reflected in the process' loci. `markOccupied()` marks an edge as having been used to spread the disease, which can be useful when exploring how the epidemic spread.

Events might want to do other things, for example keeping track of the simulation time at which the epidemic crossed a particular edge, which might be useful for doing animations. About the only restriction on event code is that it should use `changeCompartment()` to change nodes' compartments, as this ensures that the loci are updated.

As an example, let's define SIR in `epydemic`. This isn't actually necessary, as `epydemic` already *has* an implementation of SIR (and indeed other compartmented models). But SIR is conceptually the simplest compartmented model, and demonstrates the approaches we'll use later.


```
import epydemic
import networkx
```

Let's first define a model for our disease. We know that SIR consists of three compartments: Susceptible, Infected, and Removed. There are two loci for disease and two corresponding events: infected nodes (which can be subject to recovery events), and SI edges (which can undergo infection events). We also know that it requires two dynamical parameters: the probability of infection along an edge, and the probability of recovery. We also require an initial seeding of the network in which nodes become infected with a given probability.

Let's see how this is coded in `epydemic`

:


```
class SIR(epydemic.CompartmentedModel):
    '''The Susceptible-Infected-Removed compartmented model of disease.
    Susceptible nodes are infected by infected neighbours, and are removed
    when they are no longer infectious.'''

    # the model parameters
    P_INFECTED = 'pInfected'  #: Parameter for probability of initially being infected.
    P_INFECT = 'pInfect'      #: Parameter for probability of infection on contact.
    P_REMOVE = 'pRemove'      #: Parameter for probability of removal.

    # the possible dynamics states of a node for SIR dynamics
    SUSCEPTIBLE = 'S'         #: Compartment for nodes susceptible to infection.
    INFECTED = 'I'            #: Compartment for nodes infected.
    REMOVED = 'R'             #: Compartment for nodes recovered/removed.

    # the locus for infection events
    SI = 'SI'                 #: Edge able to transmit infection.

    def __init__(self):
        super(SIR, self).__init__()

    def build(self, params):
        '''Build the SIR model.

        :param params: the model parameters'''
        pInfected = params[self.P_INFECTED]   # probability of a node being initially infected
        pInfect = params[self.P_INFECT]       # probability of infection
        pRemove = params[self.P_REMOVE]       # probability of removal

        self.addCompartment(self.SUSCEPTIBLE, 1 - pInfected)
        self.addCompartment(self.INFECTED, pInfected)
        self.addCompartment(self.REMOVED, 0.0)

        self.addLocus(self.INFECTED)
        self.addLocus(self.SUSCEPTIBLE, self.INFECTED, name = self.SI)

        self.addEvent(self.INFECTED, pRemove, lambda d, t, g, e: self.remove(d, t, g, e))
        self.addEvent(self.SI, pInfect, lambda d, t, g, e: self.infect(d, t, g, e))

    def remove(self, dyn, t, g, n):
        '''Perform a removal event. This changes the compartment of
        the node to :attr:`REMOVED`.

        :param dyn: the dynamics
        :param t: the simulation time (unused)
        :param g: the network
        :param n: the node'''
        self.changeCompartment(g, n, self.REMOVED)

    def infect(self, dyn, t, g, e):
        '''Perform an infection event. This changes the compartment of
        the susceptible-end node to :attr:`INFECTED`. It also marks the edge
        traversed as occupied.

        :param dyn: the dynamics
        :param t: the simulation time (unused)
        :param g: the network
        :param e: the edge transmitting the infection, susceptible-infected'''
        (n, m) = e    # unpack the edge (Python 3 doesn't allow tuple parameters)
        self.changeCompartment(g, n, self.INFECTED)
        self.markOccupied(g, e)
```

Let's look at the `build()` method first. This is called to construct the epidemic model. It first extracts the three parameters for the simulation from the hash of parameters. It then declares the three compartments of SIR using the `addCompartment()` method. The second parameter is the probability of a node being initially assigned to this compartment. (There are no initially-removed nodes.)

We then add the two loci using `addLocus()`. Loci come in two flavours in `epydemic`. **Node loci** capture nodes in a given compartment, while **edge loci** are edges linking nodes in two particular compartments. In this case, we have a node locus for infected nodes and an edge locus for SI edges (which we name for later).

Finally we bind events to each locus using `addEvent()`. Events happen at a given locus with a given probability. An event is a function that takes four parameters: the simulation dynamics, the current simulation time, the `networkx` network, and an element from the locus to which the event is bound (either a node or an edge). Since we represent events by methods on the model object, we need to wrap them in lambda expressions (Python closures) so that, when the event is triggered, it calls the correct method on the right model. We then bind these events to the correct loci. A locus may have several events associated with it if desired, and conversely the same event might occur at several loci.

The above code completely specifies the structure of the epidemic. We now need to specify what happens at each event. For a `remove()` event, we are passed a node and change its compartment using `changeCompartment()`. For an `infect()` event we are passed an SI edge, with the edge being aligned so that the compartments of its endpoints match the way we specified in defining the corresponding locus. We change the susceptible end's compartment to be infected, and mark the edge itself as "occupied", since the infection spread along it.

So far so good, but we still don't have anything to actually *run*. What we *do* have is the static description of a disease model that describes the probabilities of a node moving between different disease stages – together with code for the events that will occur as we progress through each stage.

What we still need is a way of deciding when the different progressions happen for the different nodes. This is the issue of simulation dynamics. There are many ways in which we can perform simulations, but the important point is that the model we described can be applied under *any* of these different dynamics – and that's generally true for most models developed using `epydemic`. We next need to explore the simulation under different dynamics to see how they differ.

(This is a chapter from Complex networks, complex processes.)

The differential-equation description of a disease is a **continuous** model where the population sizes are assumed to be real numbers. This makes a certain amount of sense if we think of compartments as fractions of an overall population. However, from another perspective it's clear that only whole numbers of people become sick, leading to a **discrete** model that places an integer number of individuals into each compartment. How do we reconcile these two views?

The continuous model is best thought of as modelling the large-scale, **macroscopic** behaviour of the epidemic in which we don't really care about the exact numbers of individuals concerned. Also, for a large population, considering the relative sizes of compartments to a few decimal places of accuracy will still yield something close to a whole number of individuals per compartment when the compartment fractions are scaled-up to the size of the overall population.

But we can also ask what happens at the **microscopic** scale, for individuals. In that case we want to know how the disease might evolve in a *single person*. Another way to think of this is that a compartmented model allows each individual person to traverse the compartments according to the probabilities associated with each transition.

Clearly the macroscopic and microscopic descriptions are related: we assume that, if we let a disease run through a population, then the ways in which individuals' infections evolve will integrate to reflect the macroscopic description in terms of fractions of the entire population.

As well as being continuous, however, there's another assumption implicit in the continuous description. Let's re-visit the equations describing SIR:

$$ \frac{ds}{dt} = -\beta s(t) i(t) \hspace{1in} \frac{di}{dt} = \beta s(t) i(t) - \alpha i(t) \hspace{1in} \frac{dr}{dt} = \alpha i(t) $$

Here $i(t)$ denotes the fraction of the population who are infected at time $t$. The rate of change in this population, $\frac{di}{dt}$, has two terms: a growth term $\beta s(t) i(t)$, and a reducing term $\alpha i(t)$. The growth term says that the infected population grows at a rate that is proportional to the total number of (susceptible, infected) pairs in the population, which is simply the product of the two population sizes: in each unit of time, all these people meet each other and a fraction $\beta$ of the susceptibles become infected.

The assumption, clearly, is that all these pairs of people *do actually meet*, and this is a strong assumption. It's called the assumption of **well-mixing**, or alternatively of a **homogeneous** population. We discussed this earlier when we talked about attack rates and reproduction numbers. In "small" populations, well-mixing isn't a totally unreasonable assumption – although it *is* still an approximation of reality (even the people in my small village don't all meet each other every day). If we were to consider a population the size of Scotland, it's clearly implausible.
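For concreteness, the macroscopic behaviour these equations describe is easy to recover numerically. The sketch below integrates the three rate equations with a forward-Euler step; the rates $\alpha$ and $\beta$, the initial conditions, and the step size are all arbitrary illustrative choices, not values for any real disease.

```
# Forward-Euler integration of the three SIR rate equations. All the
# parameter values here are illustrative, not taken from a real disease.
alpha, beta = 1.0, 3.0       # recovery and infection rates (hypothetical)
s, i, r = 0.99, 0.01, 0.0    # initial compartment fractions
dt = 0.001

for step in range(10000):    # integrate up to t = 10
    ds = -beta * s * i
    di = beta * s * i - alpha * i
    dr = alpha * i
    s, i, r = s + ds * dt, i + di * dt, r + dr * dt

print(s, i, r)
```

Because the three rates sum to zero, $s + i + r$ stays (numerically) equal to 1 throughout: the continuous model conserves the population, just as the discrete events we develop later must.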

That doesn't mean we should throw the model away. The statistician George Box is quoted as saying, "*There is no need to ask the question 'Is the model true?'. If 'truth' is to be the 'whole truth' the answer must be 'No'. The only question of interest is 'Is the model illuminating and useful?'*" But the simplification of SIR to three differential equations does smear-out some structure that might be important – and, it turns out, *is* important in the sense that there are disease phenomena that occur in nature that don't occur in this system. Putting SIR onto a network is one way of addressing this.

So in moving to diseases on networks we're trying to address two issues:

- that populations exhibit structure and so are not well-mixed; and
- that diseases occur in individuals, not simply in populations.

To address the first issue, we use a network to represent individuals and their interactions, with the connection structure of the network providing the opportunity for different kinds of inhomogeneity. For the second issue, we develop a discrete description of SIR, consistent with the continuous version, that we can apply to the individual nodes of the network. We can then study how different network structures affect the properties of an epidemic.

The first step is conceptually the easier, but has some subtleties. The natural way to treat a population as a network is to have one node per individual in the population. Edges between nodes represent social interactions that are opportunities for infection. If a susceptible person is connected by an edge to an infected person, then there is an opportunity for the latter person to infect the former. Conversely, if there is no such edge, then the susceptible person cannot be infected by the infected individual, since there exists no social contact between them.

How might we construct this network? The simplest approach is undoubtedly to create a random network of some kind: perhaps an ER network, in which case we will obtain a "social network" for $N$ individuals who interact in a random way with a well-defined mean number of others. Simulating an epidemic will then involve running our to-be-designed discrete disease process over this network, and examining the results.

A moment's thought will show several problems with this approach. Firstly, not all contacts are created equal, as we saw when we discussed secondary attack rates: people in close contact (such as children in a nursery, or people in a care home) are more likely to infect one another than people in weaker contact (such as workers in a factory). We could address this issue, perhaps, by **weighting** the edges between people to capture the fact that "some edges are more infectious than others". Alternatively, we might argue that these factors will even-out over a suitably heterogeneous population, and so if we focus on the probability of infection for an "average social contact" we can still extract meaningful information from any simulation.

Secondly, how are individuals to be connected in the infection network? Are their connections random? Do they exhibit a more clustered structure? Are there dense packets of highly connected individuals, separated by sparse connections? These are questions of network degree, connectivity, and so forth – of network topology in general – and intuitively it seems clear that the choice may make a difference. We might, for example, expect a disease in a well-connected, high-mean-degree network to spread differently to the same disease on a network with lower connectivity.

Thirdly, we have described a **static** network whose connections don't change over time. Relating this back to the context we're considering, that doesn't seem appropriate. People might be expected to avoid individuals who are sick, or the sick individuals might be quarantined to preclude social contact. Either of these behaviours would be expected to remove social contacts – edges – from the part of the network around an infected individual.

(When I was growing up in England in the 1970s, parents actually demonstrated exactly the opposite behaviour. If a child got measles, for example, mothers all brought their children round for a play date with the explicit intention of getting them infected too – the logic being that exposing a child to the disease early was good for their immune systems, got the one-off infection "out of the way", and generally improved herd immunity. None of those arguments are at all wrong, but this approach to parenting seems to have gone out of fashion.)

In either case, we might think that it is more appropriate to adopt an approach that changes the structure of the network in response to infection, perhaps reducing the number of edges when a node is infected. In this case we have a **dynamic** or **adaptive** network structure, where the network responds to the progress of the process running over it. Again, we might decide that these effects will even-out and can be ignored to give an "average" result.
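One way to picture such an adaptive network is a sketch in which, when a node becomes infected, each of its social contacts is dropped with some probability. This isn't a feature of any particular library – the function name and the adjacency-list representation are purely illustrative.

```
import random

def distance_from(g, n, p_drop, rng=random):
    '''Remove each edge incident on newly-infected node n with
    probability p_drop, modelling contacts avoiding a sick individual.
    g is an adjacency-list dict with edges stored in both directions.'''
    for m in list(g[n]):          # copy, since we mutate the list
        if rng.random() < p_drop:
            g[n].remove(m)
            g[m].remove(n)

rng = random.Random(99)
g = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
distance_from(g, 0, 1.0, rng=rng)   # p_drop = 1: full quarantine
print(g)
```

With `p_drop` of 1 this is quarantine: node 0 loses all its contacts and can no longer transmit anything. Intermediate values give the partial social-distancing behaviour discussed above.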

The upshot of this discussion is that we can take a simple representation – a static, random network with unweighted links – and then add more features if we think they might be relevant. As we do so we make the model more realistic – but also more complicated, and we add to the number of possible degrees of freedom.

Adding more factors in pursuit of realism may sound attractive, but we have to bear in mind that it also gives us a freedom we may not be able to use effectively. Consider the case where we reduce the number of edges to an infected individual. How many edges do we remove, and how do we select them? Will these choices make a critical difference, and how do they interact with the existing parameters of the model? In adding a new freedom we also add a considerable burden of analysis and simulation to check what effects our new freedom has. Might it be better to stick with the simplest case?

This argument might sound bogus to you: a cop-out just to reduce the amount of work we have to do. And if your primary interest is in the dynamics of a *particular* disease, about which you want to make accurate predictions – as would be the case for planning a clinical response to an outbreak – then of course you may strive to build *the most realistic model possible* and accept the associated extra work. On the other hand, if your primary interest is epidemic processes in general, you might be happy to stick to simpler models to see whether they *always* exhibit certain features which can then be generalised (with care) to *all* diseases. We'll see an example of this later in the case of epidemic thresholds, where certain combinations of infectiousness and recovery *necessarily* lead to epidemics pretty much regardless of everything else.

Now let's return to the second issue we identified above: moving from a continuous to a discrete description of the disease process.

Compartmented models of disease represent diseases as a collection of compartments. We notionally consider each individual in the population to be "in" a particular compartment at a given time. As their disease progresses, they move "from" one compartment "to" another, typically according to some stochastic process where their re-location happens with some probability. In addition, this probability may be affected by other factors, for example the presence of individuals in other compartments as neighbours. When looking at the overall disease behaviour (the macroscopic view) we are typically interested in how the relative sizes of the compartments change. When looking at the disease's progress (the microscopic view) we additionally need to know about the compartments of neighbouring individuals. It is precisely this microscopic behaviour that is missing from the continuous-process description of compartmented models.

How then do we describe interactions at the scale of individual nodes?

Let's look again (not for the last time) at the differential equations for SIR:

$$ \frac{ds}{dt} = -\beta s(t) i(t) \hspace{1in} \frac{di}{dt} = \beta s(t) i(t) - \alpha i(t) \hspace{1in} \frac{dr}{dt} = \alpha i(t) $$

There are three compartments, and the three equations (one per compartment) tell us how their populations change. Looking at the last equation, we see that $r(t)$ increases at a rate proportional to $i(t)$, the size of the infected compartment. Similarly, looking at the first equation, $s(t)$ decreases at a rate proportional to the number of susceptible-infected pairs. In the second equation, these two effects both appear inverted – understandably, since individuals pass through infection to recovery, and rates have to balance out if we are to keep the population constant.

So much for the compartments: what does this mean for an individual?

We know that we are representing the interactions between individuals as network edges. Suppose that at some time we have a given susceptible individual. That individual cannot become infected spontaneously, but only through interaction with an individual who is infected at the same time and with whom she has some social contact, represented by an edge. So to determine whether the susceptible individual is infected, we need to know whether she has any edges that lead to individuals who are infected. We refer to such edges as **SI edges**: they connect a susceptible node to an infected node.

Suppose we have found an SI edge linking our susceptible node to an infected neighbour. The infection "passes along" this edge with a probability $\beta$, turning our susceptible node into an infected node, decreasing the population of the susceptible compartment by one and increasing the population in the infected compartment by one.

But there is also another effect. The edge down which the infection travelled is no longer an SI edge, since it now connects an infected node to *another* infected node. Furthermore any other SI edges that connected our formerly-susceptible node to infected nodes are also no longer SI edges. And finally, the fact that our formerly-susceptible node is now an infected node means that there may be new SI edges created, where there are edges between our node and a neighbouring susceptible node.

This is quite a bit more complicated than the equations suggest at first glance. It is perhaps simpler to think of it slightly differently. It is the population of SI edges, *not* the population of susceptible or infected individuals alone, which determines the rate of infection: that much is clear from the infection term. The infection dynamics happens, not at individual nodes, but at SI edges. We can think of the SI edges as a **locus** for the infection dynamics: a place at which infection possibly occurs. The edges in that locus are potentially changed by every infection **event**: every time an SI edge actually results in an infection.
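The bookkeeping implied here can be made concrete. The naive sketch below recomputes the SI locus from scratch after each change of compartment – a real simulator would update the locus incrementally instead, for efficiency – and the small line network is purely illustrative.

```
def si_edges(g, compartment):
    '''Return the set of (susceptible, infected) edges: the locus at
    which infection events can occur. g is an adjacency-list dict.'''
    return {(n, m) for n in g for m in g[n]
            if compartment[n] == 'S' and compartment[m] == 'I'}

# a line network 0-1-2-3 with node 1 infected
g = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
compartment = {0: 'S', 1: 'I', 2: 'S', 3: 'S'}
print(si_edges(g, compartment))    # SI edges (0, 1) and (2, 1)

# infecting node 2 destroys the (2, 1) SI edge and creates (3, 2)
compartment[2] = 'I'
print(si_edges(g, compartment))    # SI edges (0, 1) and (3, 2)
```

A single infection event both removes an edge from the locus (the one it travelled down) and may add new ones (from the newly-infected node's susceptible neighbours), exactly as described above.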

Removal is simpler: a removal event affects only a single infected node, but *once it has happened* it has an impact on the SI edges – and therefore, indirectly, on future infection events. The locus for removal events is therefore the population of nodes in the infected compartment, any of which may spontaneously be removed.

Summing-up the above, we can now formulate a discrete description of SIR.

The model consists of three compartments: susceptible (S), infected (I), and removed (R). Each node resides in exactly one compartment at any time. There are two loci for the dynamics: SI edges, and infected nodes. There are two events: infection happens at the SI locus with probability $\beta$, while removal happens at the I locus with probability $\alpha$. The infection event moves the S node into the I compartment; the removal event moves the I node into the R compartment. Removal therefore affects the contents of the I locus, and both events may affect the contents of the SI locus. If we compare this description to the three equations above it is hopefully easy to see the derivation.

What we've done is quite significant, though. We've moved from a description consisting of three continuous rates of change (the three differential equations) to a description consisting of two discrete events, each happening at a different locus. The events can be applied to individual nodes or edges in our network model, in which we would need to track exactly which nodes are in which compartments, and which edges are in the SI locus we're interested in. It's worth noting that we really don't care about removed nodes: they don't appear in either locus, and therefore can't affect the dynamics, other than by the fact that nodes that are removed are by definition *not* susceptible or infected.
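
The discrete description above can be sketched directly in code. Here is a minimal synchronous round of the dynamics over a network represented as an edge list (the function name, the edge-list representation, and the compartment labels are our own choices here; the loci are also recomputed from scratch each round rather than maintained incrementally, which a real simulator would avoid):

```python
import random

def sir_step(edges, compartments, pInfect, pRemove):
    """Apply one synchronous round of the SIR dynamics to a network
    given as an edge list. compartments maps each node to 'S', 'I',
    or 'R', and is updated in place."""
    # the locus for infection: the current SI edges
    si = [ (n, m) for (n, m) in edges
           if { compartments[n], compartments[m] } == { 'S', 'I' } ]

    # the locus for removal: the currently-infected nodes
    infecteds = [ n for n in compartments if compartments[n] == 'I' ]

    # infection events: each SI edge fires with probability pInfect,
    # moving its susceptible endpoint into the I compartment
    for (n, m) in si:
        if random.random() < pInfect:
            s = n if compartments[n] == 'S' else m
            compartments[s] = 'I'

    # removal events: each infected node fires with probability pRemove,
    # moving it into the R compartment
    for n in infecteds:
        if random.random() < pRemove:
            compartments[n] = 'R'
```

With $\beta$ as `pInfect` and $\alpha$ as `pRemove`, this mirrors the event description: infection happens at the SI locus, removal at the I locus, and each event changes the compartments against which the loci are judged.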

The process description is an essential step along the way to simulation, but we're not quite there yet. We need to be able to express the above model in a computational form suitable to be executed. We need to be able to keep track of the populations in the different loci of the dynamics. And we need to choose where, and at what times, the different events occur.

(This is a chapter from *Complex networks, complex processes*.)

When we created ER networks earlier, we started with an empty network of $N$ nodes and then added edges between pairs of nodes with a given probability $\phi$. We know that this will eventually lead to a network with mean degree $N\phi$ (approximately, for large $N$). But let's look at the process from a slightly different perspective: what happens *as we add the edges*? Specifically, how do the nodes become connected as we add edges?
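
(As a quick sanity check of the mean-degree claim, here is a minimal from-scratch sketch of the ER process; the seed and sizes are arbitrary choices for illustration:)

```python
import random

random.seed(42)     # fixed seed for reproducibility

n, phi = 1000, 0.01

# build the ER edge list: each possible pair of nodes gets an edge
# with probability phi
es = [ (i, j) for i in range(n) for j in range(i + 1, n)
       if random.random() < phi ]

# each edge contributes to the degree of both its endpoints,
# so the mean degree should come out close to n * phi = 10
kmean = 2.0 * len(es) / n
```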

Intuitively we can argue as follows. We start with an empty network. Adding an edge necessarily builds a 2-node component. Adding another edge is (for a large network, anyway) overwhelmingly likely to pick two other nodes not in the first component, forming a second. We can continue like this for some time, but gradually it will become more likely that one of the nodes we choose to connect is not isolated but rather part of a larger cluster: indeed, *both* nodes may be parts of *different* clusters, which thereby become joined into a single one. As we continue to add edges, it becomes increasingly likely that the edges will be placed between increasingly large components, thereby connecting them. And as a component becomes larger, there are more ways to connect to it (since there are more nodes to choose as endpoints), so we might expect that large components grow at the expense of small ones. Eventually the network may become one large component, but even before this we might expect that there will be one or more components that are large relative to the others and to the size of the network as a whole.

This is indeed what happens. As we add edges to the initially-empty network according to the ER process, we create a large number of small components that over time connect to each other. Because large components are easier to connect to they grow faster, which leads to the formation of a component that contains a large fraction of the nodes: the **giant component**.

Does the giant component necessarily form? A moment's thought will suggest not: if we only add a small number of edges, then clearly there won't be enough for a giant component to form.

Let's denote the size of the largest component in a network by $N_G$. How does $N_G$ vary as we add edges?

Starting from an empty network, we have $N_G = 1$ since every node is its own small cluster. The ratio of the size of the "giant" component to the size of the network, $\frac{N_G}{N} \rightarrow 0$ as $N \rightarrow \infty$: the giant component is an insignificant fraction of the nodes. As we add edges, we expect $N_G$ to increase. If we were to set $\phi = 1$ and add *all* possible edges, then at the end of the process we would have $\frac{N_G}{N} = 1$, the giant component containing all the nodes. We can think of $\frac{N_G}{N}$ as the probability that a node chosen at random will be in the giant component. Let's refer to this probability as $S$.

How does a node $i$ end up outside the giant component? It means that, for every other node $j$ in the network,

- either $i$ is not connected to $j$; or
- $i$ is connected to $j$ but $j$ is itself not in the giant component.

For a particular node $j$, the probability of the first case is $(1 - \phi)$ (since the probability of there being an edge added is $\phi$); the probability of the second case is $\phi (1 - S)$, there being an edge between $i$ and $j$ (which is $\phi$) *and* $j$ not being in the giant component (which is $(1 - S)$). If we multiply this probability over every $j$, the probability we are looking for is given by the recurrence equation $1 - S = ((1 - \phi) + \phi (1 - S))^{N - 1}$. If we re-arrange this slightly,

$$1 - S = (1 - \phi S)^{N - 1} = \left(1 - \frac{\langle k \rangle}{N} S\right)^{N - 1} \approx \left(1 - \frac{\langle k \rangle}{N} S\right)^{N}$$

where we used $\phi = \frac{\langle k \rangle}{N}$ and approximated $N - 1$ by $N$ for large networks. Taking logs on both sides,

\begin{align*} \ln (1 - S) &= N \, \ln \left(1 - \frac{\langle k \rangle}{N} S\right) \\ &\approx -N \frac{\langle k \rangle}{N} S \\ &= - \langle k \rangle S \end{align*}

using the approximation $\ln (1 - x) \approx -x$ for small $x$. Then we can take exponentials on each side, leading to:

\begin{align*} 1 - S &= e^{- \langle k \rangle S} \\ S &= 1 - e^{- \langle k \rangle S} \end{align*}

This is still an awkward recurrence equation: $S$ appears on both sides. Situations like this often have no closed-form solution, but there's a trick to make progress, which is to make use of a graphical method.

In [1]:

```
import networkx
import math
import numpy
import matplotlib
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
import seaborn
```

In [2]:

```
fig = plt.figure(figsize = (5, 5))

# create a set of points for S, evenly spaced over the interval [0.0, 1.0]
ss = numpy.linspace(0.0, 1.0)

# different kmeans and their associated line types
kmeans = [ 0.5, 1, 1.5, 2 ]
lines = [ 'r-', 'g-', 'b-', 'y-' ]

# build a function parameterised by kmean to run over S
def make_S( kmean ):
    return (lambda S: 1.0 - math.exp(-kmean * S))

# plot S against S
plt.plot(ss, ss, 'k--')

# plot the exponential curves for the different selected kmeans
for i in range(len(kmeans)):
    kmean = kmeans[i]
    line = lines[i]

    # map the appropriate function across S
    ys = [ make_S(kmean)(s) for s in ss ]

    # plot the curve
    plt.plot(ss, ys, line, label = '$\\langle k \\rangle = {k}$'.format(k = kmean))
plt.xlabel('$S$')
plt.title('Solutions for $S = 1 - e^{-\\langle k \\rangle S}$ for different values of $\\langle k \\rangle$')
plt.legend(loc = 'upper left')
_ = plt.show()
```

A solution occurs wherever one of the curves crosses the dashed line $y = S$. So by inspection, for $\langle k \rangle = 1.5$ there is a non-zero solution at approximately $S = 0.58$, while for $\langle k \rangle = 2$ there is a solution at approximately $S = 0.8$ – 80% of the nodes in the network are in the giant component.
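
We can also check these read-off values numerically. Since the right-hand side of the recurrence is well-behaved near the non-zero solution, simple fixed-point iteration converges if we start away from the trivial solution $S = 0$ (a sketch; the function name is ours, not part of any library):

```python
import math

def giant_component_S(kmean, tol = 1e-9):
    """Solve S = 1 - e^(-kmean S) by fixed-point iteration, starting
    from S = 1 to avoid converging trivially onto S = 0."""
    S = 1.0
    while True:
        Snew = 1.0 - math.exp(-kmean * S)
        if abs(Snew - S) < tol:
            return Snew
        S = Snew
```

For $\langle k \rangle = 1.5$ this gives $S \approx 0.583$, and for $\langle k \rangle = 2$ it gives $S \approx 0.797$, matching what we read off the plot.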

Looking at the lines for the different values of $\langle k \rangle$, notice that as $\langle k \rangle$ increases the corresponding curve starts out steeper. Shallow curves never intersect $y = S$ (other than at $S = 0$), meaning no giant component emerges; as the curves get steeper, a solution appears at low values of $S$ and gradually moves towards $S = 1$. The separator between these two regimes occurs when the initial gradient of the curve matches that of $y = S$, when the curve and the line are tangent to each other at $S = 0$. This separator is referred to as a **critical transition** or a **critical threshold**, because it's the critical value at which behaviour abruptly changes. It happens when:

$$ \frac{d}{dS} \left( 1 - e^{-\langle k \rangle S} \right) = 1 $$

and so:

$$ \langle k \rangle e^{-\langle k \rangle S} = 1 $$

At $S = 0$ the exponential is equal to 1, and we discover that the critical threshold is $\langle k \rangle_c = 1$.

We can of course also relate $\langle k \rangle_c$ back to $\phi$, the probability of adding an edge, and discover that there is a critical threshold probability $\phi_c$ below which the giant component doesn't form, but above which it does (a point we explore a little more below). For $\langle k \rangle_c = 1$ we have $\phi_c = \frac{1}{N}$.

Let these two results sink in for a minute. Firstly, a mean degree of 1 – every node attached to on average one neighbour – is enough to start forming a giant component and therefore, by implication, to take the network towards being connected. Secondly, for a large ER network even a vanishingly small edge probability will result in the formation of a giant component – and that probability gets smaller as the network gets bigger! This all suggests that giant components will be common, so a lot of the networks we encounter in applications will have one.

Alternatively we can observe that, while it's hard to find $S$ in terms of $\langle k \rangle$, it is easy to find $\langle k \rangle$ in terms of $S$:

\begin{align*} S &= 1 - e^{-\langle k \rangle S} \\ 1 - S &= e^{-\langle k \rangle S} \\ \ln (1 - S) &= -\langle k \rangle S \\ \langle k \rangle &= - \frac{\ln (1 - S)}{S} \end{align*}

Since we're actually interested in $S$ we can plot the curve rotated through ninety degrees for clarity, which yields:

In [3]:

```
fig = plt.figure(figsize = (5, 5))
ss = numpy.linspace(0.0, 1.0, endpoint = False)[1:]   # omit 0.0 (divide-by-zero) and 1.0 (log of zero)
plt.xlim([0, 4])
plt.xlabel("$\\langle k \\rangle$")
plt.ylabel("$S$")
plt.plot([ -math.log(1.0 - S) / S for S in ss ], ss, 'r-')
plt.title('Expected size of giant component')
_ = plt.show()
```

This makes the critical nature of $\langle k \rangle_c = 1$ even more clear. As $\langle k \rangle$ grows beyond $\langle k \rangle_c$, the expected size of the giant component rapidly approaches the size of the network itself.

The existence and value of the critical threshold was first proven by Erdős and Rényi [ER59] in a paper that really marks the very start of network science. It shows that, even for small mean degrees, an ER network will have a giant component, and as the mean degree gets larger, that component will span the entire network. Looking at the graph above, you can see that the curve asymptotically approaches $S = 1$ as $\langle k \rangle \rightarrow \infty$. It is never *certain* that the process will connect the network – it's stochastic, after all – but it rapidly becomes overwhelmingly likely.

So much for the mathematics: let's look at the emergence of the giant component computationally.

The `networkx` function `number_connected_components()` computes the number of components in a network. To look at the giant component forming, we therefore need to count the number of components over the region around the critical threshold. We expect to see the number of components rapidly drop towards 1, and the fraction of nodes in the largest component rapidly increase towards 1.
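
As a quick reminder of its behaviour, assuming current `networkx`:

```python
import networkx

g = networkx.empty_graph(5)     # five isolated nodes, so five components
print(networkx.number_connected_components(g))   # 5

g.add_edge(0, 1)                # joining two nodes merges their components
print(networkx.number_connected_components(g))   # 4
```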

We could therefore create an empty network and progressively add edges to it, counting the number of components as we go. We already have the code for this in our earlier from-scratch ER network generator: however, looking at the code, while the *result* is a random network, the *process* by which edges are added is actually very regular, and we should probably avoid such unnecessary regularity in case it makes a difference. One could easily imagine that adding edges in a regular fashion might generate components faster (or slower?) than truly random addition.

What we could do instead is to build a random network and then re-construct it by emptying it and adding the same edges back in a random order. This destroys any artefacts coming from the way in which we added the edges in the first place.

We first define an iterator that will randomise a list:

In [4]:

```
class permuted:
    """An iterator for the elements of an array in a random order."""

    def __init__( self, es ):
        """Create an iterator for the elements of an array in a random order.

        :param es: the original elements"""
        self.elements = list(es)    # copy the data to be permuted

    def __iter__( self ):
        """Return the iterator.

        :returns: a random iterator over the elements"""
        return self

    def __next__( self ):
        """Return a random element.

        :returns: a random element of the original collection"""
        n = len(self.elements)
        if n == 0:
            raise StopIteration
        i = int(numpy.random.random() * n)
        v = self.elements[i]
        del self.elements[i]
        return v

    next = __next__     # Python 2 compatibility
```
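
(If we didn't need the incremental iterator protocol, the same effect could be had more compactly from the standard library; a sketch, not part of the book's own code:)

```python
import random

def shuffled(es):
    """Yield the elements of es in a random order, leaving es untouched."""
    elements = list(es)         # copy, so the original is not disturbed
    random.shuffle(elements)    # permute the copy in place
    for e in elements:
        yield e
```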

In [9]:

```
def growing_component_numbers( n, es ):
    """Build a graph with n nodes, adding the edges in es in a random
    order, and return a list of the number of components in the graph
    as each edge is added.

    :param n: the number of nodes
    :param es: the edges
    :returns: the number of components as each edge is added"""

    # create an empty graph
    g = networkx.empty_graph(n)

    # add edges to g taken at random from the edge set,
    # and compute components after each edge
    cs = []
    for e in permuted(es):
        g.add_edge(*e)
        nc = networkx.number_connected_components(g)
        cs.append(nc)
    return cs
```

In [10]:

```
# create an ER network and grab its edges
er = networkx.erdos_renyi_graph(2000, 0.01)
es = list(er.edges())

# replay these edges
component_number = growing_component_numbers(2000, es)

# plot components against edges
fig = plt.figure(figsize = (5, 5))
plt.title("Consolidation of components as edges are added")
plt.xlabel("$|E|$")
plt.ylabel("Components")
plt.plot(range(len(component_number)), component_number, 'b-')

# edge at which the network first becomes a single component
i = component_number.index(1)

# highlight the formation of the giant component
ax = fig.gca()
ax.annotate("$|E| = {e} ({p}\\%)$".format(e = i, p = int(((i + 0.0) / len(es)) * 100)),
            xy = (i, 1),
            xytext = (len(component_number) / 2, component_number[0] / 2),
            arrowprops = dict(facecolor = 'black', width = 1, shrink = 0.05))
_ = plt.show()
```

The giant component forms well before we've added all the edges.

(Remember that this is a stochastic process. It's *possible* that a giant component would *never* form for a network, just by chance. However, for an ER network with 2000 nodes $\phi_c = \frac{1}{N} = 0.0005$, so $\phi = 0.01$ is well above the critical threshold.)

But *how* does the giant component form? Does it steadily accrete, or does it form suddenly as previously disconnected components connect? We can explore this by plotting the size of the largest component as we add edges, using the function `connected_components()` that returns the components of the graph, from which we pick the largest:

In [11]:

```
def growing_component_sizes( n, es ):
    """Build a graph with n nodes, adding the edges in es in a random
    order, and return a list of the size of the largest component as
    each edge is added.

    :param n: the number of nodes
    :param es: the edges
    :returns: list of the largest component's size as each edge is added"""
    g = networkx.empty_graph(n)
    cs = []
    for e in permuted(es):
        g.add_edge(*e)

        # pick the largest component (the one with the most node members)
        gc = len(max(networkx.connected_components(g), key = len))
        cs.append(gc)
    return cs
```

We can then plot the size of the largest component and the *number* of components on the same axes:

In [12]:

```
# compute the list of component sizes as we add edges, re-using the
# ER edges we computed earlier
component_size = growing_component_sizes(2000, es)

fig = plt.figure(figsize = (5, 5))
plt.title("Emergence of the giant component as edges are added")

# plot the number of components
ax1 = fig.gca()
ax1.set_xlabel("Edges")
ax1.set_ylabel("Components", color = 'b')
ax1.plot(range(i), component_number[:i], 'b-', label = 'Components')
for t in ax1.get_yticklabels():
    t.set_color('b')

# plot component sizes against edges
ax2 = ax1.twinx()
ax2.set_ylabel("Component size", color = 'r')
ax2.plot(range(i), component_size[:i], 'r-', label = "Component size")
for t in ax2.get_yticklabels():
    t.set_color('r')
_ = plt.show()
```

Now isn't *that* interesting... Let's try to interpret what's happening. Quite early-on in the process of adding edges, there's a sudden jump in the size of the largest component in the network. Well before we get to the giant component, we start getting a component of hundreds, and then thousands, of nodes. The process by which we're adding edges is random and smooth, but nonetheless results in a sudden change in the connectivity of the network. The network consists of lots of small components that suddenly – over the course of adding a relatively small number of edges – join up and create an enormously larger component consisting of most of the nodes, which then itself gradually grows until it contains *all* the nodes. Below this threshold the network is composed of small, isolated collections of nodes; above it, it rapidly becomes one big component.

This is the first example we've seen of a critical transition, also known as a **phase change**: during a steady, incremental, process, the network changes from one state into another, very different state – and does so almost instantaneously.

We should examine the area around the critical point in more detail. First we need to locate it. Since the characteristic of the critical point is that the slope of the graph suddenly increases, we can look for it by looking at the slope of the data series:

In [13]:

```
def critical_point( cs, slope = 1 ):
    """Find the critical point in a sequence. We define the critical point
    as the index where the derivative of the sequence becomes greater than
    the desired slope. We ignore the direction of the slope.

    :param cs: the sequence of component sizes
    :param slope: the desired slope of the graph (defaults to 1)
    :returns: the point at which the slope of the sequence exceeds the desired slope"""
    for i in range(1, len(cs)):
        if abs(cs[i] - cs[i - 1]) > slope:
            return i
    return None
```

In [14]:

```
# find the critical point
cp = critical_point(component_size, slope = 50)

# some space either side of the critical point, with the
# right-hand side being more interesting and so getting more room
bcp = int(cp * 0.8)
ucp = int(cp * 3)

fig = plt.figure(figsize = (5, 5))
plt.title("Details of the phase transition")

# plot the number of components
ax1 = fig.gca()
ax1.set_xlabel("Edges")
ax1.set_ylabel("Components", color = 'b')
ax1.plot(range(bcp, ucp), component_number[bcp:ucp], 'b-', label = 'Components')
for t in ax1.get_yticklabels():
    t.set_color('b')

# plot component sizes against edges
ax2 = ax1.twinx()
ax2.set_ylabel("Component size", color = 'r')
ax2.plot(range(bcp, ucp), component_size[bcp:ucp], 'r-', label = "Component size")
for t in ax2.get_yticklabels():
    t.set_color('r')

# add a line to show where we decided the critical point was
ax1.plot([cp, cp],          # x's: vertical line at the critical point
         ax1.get_ylim(),    # y's: the y axis' extent
         'k:')
_ = plt.show()
```

Around the critical point we can see that, while the *number of components* comes down fairly smoothly, the *size of the largest component* jumps quickly as smaller components amalgamate.

Finally, we can compare the predicted size of the giant component against the sizes we actually observe, by building ER networks across a range of mean degrees and measuring their largest components:

In [16]:

```
def make_er_giant_component_size_by_kmean( n ):
    """Return a model function for a network with the given number
    of nodes, computing the fractional size of the giant component
    for different mean degrees.

    :param n: the number of nodes"""
    def model( kmean ):
        phi = kmean / n
        er = networkx.erdos_renyi_graph(n, phi)
        gc = len(max(networkx.connected_components(er), key = len))
        S = (gc + 0.0) / n
        return S
    return model

fig = plt.figure(figsize = (5, 5))

# plot the observed behaviour
kmeans = numpy.linspace(0.0, 5.0, num = 20)
model = make_er_giant_component_size_by_kmean(2000)
sz = [ model(kmean) for kmean in kmeans ]
plt.scatter(kmeans, sz, color = 'r', marker = 'D', label = 'experimental')

# plot the theoretical behaviour
ss = numpy.linspace(0.0, 1.0, endpoint = False)[1:]   # omit 0.0 to avoid a divide-by-zero
plt.plot([ -math.log(1.0 - S) / S for S in ss ], ss, 'k,', label = 'predicted')

plt.xlim([0, 5])
plt.ylim([0.0, 1.0])
plt.title('Expected vs observed sizes of giant component')
plt.xlabel('$\\langle k \\rangle$')
plt.ylabel('$S$')
plt.legend(loc = 'lower right')
_ = plt.show()
```

Bear in mind that each experimental data point comes from *one specific* ER network that *might* happen to have properties that cause a giant component to form, or not form, or form with a slightly different size than predicted, just because of some fluke of the way the edges are added. The mathematical expression gives us the expected behaviour that's overwhelmingly probable in the case of large ($N \rightarrow \infty$) networks – but it can be misleading in any single case, and in smaller networks.

There are many more properties of components we could explore, but we'll stop here: Newman [New10] presents many more calculations, for example about how the distribution of component sizes changes as edges are added.

There's an important point to make about all we've said above. You'll have noticed that a lot of the arguments relied on averaging, for example in identifying the *average* (mean) degree as greater than 1, or finding the *expected* size of the giant component. You might have wondered whether these sorts of calculations would be possible if for whatever reason we weren't able to do averaging.

Averaging works well for large networks: indeed, for really large networks we *have* to rely on statistical techniques, as all the details will generally be unavailable. And it's certainly the case that a lot of phenomena of interest for complex networks (and complex processes) depend strongly on these statistical properties, with only very weak dependence on the details. This means we can often ignore the fine structure, the **micro-scale structure**, of a network, and treat networks as instances of classes defined by their **macro-scale structure**, the high-level summary statistics. Indeed, this is the basis for the techniques for managing variance by repetition that we'll see later when we scale-out our simulations.

*But*. (There was obviously a *but* coming.) There are also examples in which fine structure *does* matter – and even more cases where variations or irregularities in the structure make a huge difference. We'll see examples of these later, but an easily-understood example is the way an epidemic spreads on a network with communities of more-than-averagely-connected nodes: easily within communities, but with more difficulty between them because of the lesser connectivity. This is true even for networks with the same mean degree: the modular structure changes the process' behaviour.

The ER networks are special not because they're random – lots of networks have randomness – but because they're *so perfectly* random. They have, on average (that word again...), no fine structure to worry about, and so arguments based on averaging work, both for properties like the degrees of nodes and also for repeating experiments over different networks with the same parameters.

What about for more complex situations? It turns out that the other main class of networks, the powerlaw networks, have similar (but different) regularities that can similarly be exploited. There are other cases that don't have such nice features, and – while we can sometimes fall back on more powerful mathematical techniques, such as those associated with generating functions – we'll often be placed in situations where only extensive and careful simulation will get us anywhere. And simulation often requires an understanding of how the network is put together at a macro level as well as some understanding at least of the micro level, so the mathematical and computational views remain entwined.