Graphs and Social Networks

Paths and Biological Networks

The fundamental notion of connectivity in a network is closely related to the notion of paths in graphs.


A path in a graph is a sequence of nodes, where any pair of consecutive nodes in the sequence is (linked by) an edge in the graph.


Example: $A$-$B$-$C$-$D$ is a path in the graph above.

Such a path can have repeated nodes. If it doesn’t, the path is called a simple path. The length of a path is the number of edges it involves (that is the number of nodes, minus 1). At each vertex $x$, there is a unique path of length zero, consisting only of that vertex $x$.

A cycle is a path of length at least 3 that is simple, except for the first and last nodes being the same.

Example: $B$-$C$-$D$-$B$ is a cycle in the graph above.

A cycle in a simple graph provides, for any two nodes on that cycle, (at least) two different paths from one to the other.

Note that each edge (and node) of the 1970 Internet graph belongs to a cycle. This makes the other way around the cycle an alternative route in case one of the edges should fail.

In a directed network, paths are directed, too. A path from a vertex $x$ to a vertex $y$ is a sequence of vertices $x = x_0, x_1, \dots, x_k = y$ such tha, for any $i = 1, \dots, k$, there is and edge from $x_{i-1}$ to $x_i$ in the graph.

Connected Components

A simple graph is connected if, for every pair of nodes, there is a path between them.

Communication and transportation networks tend to be connected, as this is their main purpose: to connect.

If a graph is not connected, it naturally breaks into pieces, its connected components.

The relation ‘there is a path from $x$ to $y$ on the node set $X$ of a graph is the transitive closure of the graph relation ‘there is an edge between $x$ and $y$’. It is reflexive (as each node $x$ is connected to itself by the zero length path starting and ending at $x$), symmetric (as a path from $x$ to $y$ can be used backwards as a path from $y$ to $x$), and transitive (as a path from $x$ to $y$ and a path from $y$ to $z$ together make up a path from $x$ to $z$), hence an equivalence relation. The connected components of the graph are the parts (equivalence classes) of the corresponding partition of $X$.

Example. The connected components of the graph below are the node sets $\{A, B\}$, $\{C, E\}$, $\{D\}$, and $\{F,G,H,I,J,K,L,M\}$. Note that a component can consist of a single node only.


Biological Networks

Networks are widely used in many branches of biology as a convenient representation of patterns of interaction between biological elements.

Biochemical networks

Biochemical networks represent molecular level patterns of interaction and control mechanisms in the biological cell. One distinguishes between

  • metabolic networks,
  • protein-protein interaction networks and
  • genetic regulatory networks.

Metabolism is the chemical process by which a cell breaks down food or other nutrients into smaller blocks and then reassembles those blocks to form the biological molecules needed to complete other tasks. This breakdown and reassembly process can involve chains or pathways of successive chemical reactions, involving different groups of blocks at each step. The complete set of all possible reactions and all the pathways forms a metabolic network.

The most precise representation of a metabolic network uses a bipartite graph, like an affiliation network. Here one type of vertices are metabolites, the chemicals produced and consumed by the reactions. The other type of vertices represents the reactions themselves. Metabolites then are linked to the reactions they participate in. Arrows on the edges can be used to distinguish between ingoing and outgoing metabolites.


A third type of vertices can be used to represent enzymes which catalyze certain reactions (but are not consumed by them). The resulting graph then is a partly directed and partly undirected tripartite network.


In practice, however, the most common representation of a metabolic network is a projection on the set of metabolites, with an edge between any two of them, if they participate in a common reaction. Metabolic pathways can be highlighted by mapping the network in a subway map style.

In a protein-protein interaction network the vertices are proteins and two vertices are connected by an undirected edge if the corresponding protein interact, physically rather than chemically.

Proteins are long-chain molecules formed by the concatenation of a series of basic units called amino acids. Once created, a protein folds on itself in a form whose shape depends on the amino acid sequence. The folded form dictates the physical interaction it can have with other molecules.


Again, a bipartite network could be used as a more faithful representation of protein-protein interaction, with proteins and interactions as different types of vertex, and undirected edges connection proteins to the interactions in which they participate.

A genetic regulatory network also has proteins as its vertices, or equivalently the genes that code for them. There is a directed edge from gene A to gene B indicating that A regulates the expression of B. One can distinguish between promoting and inhibiting transcription factors, by giving the network two distinct types of edges. A more detailed description of this relationship between proteins follows.

A protein’s amino acid sequence is determined by a corresponding sequence stored in the DNA of the cell in which the protein is synthesized. The DNA code for a single protein is called a gene. The process of creating a protein from a gene via a copy of the gene made of RNA is called expressing the gene.

It is important for the cell to respond to its environment by turning on or off the production of individual proteins. It does this by the use of transcription factors, which are themselves proteins. The presence in the cell of the transcription factor for the gene turns on or enhances the expression of that gene, or inhibits it, depending on the type of transcription factor.

Neural networks

The primary information processing element in the brain is the neuron, a specialized brain cell that combines several inputs to generate a single output. A typical neuron consists of a cell body, along with a number of protruding tentacles, called dendrites, which are input wires for carrying signals in the cell. Most neurons have only one output, called the axon, which is typically longer than the dendrites. It usually branches near its end into axon terminals to allow the output of the cell to feed the input of several others.


The signals that travel within neurons are electrochemical in nature. Transmission of a signal from one neuron to the next happens through chemical neuro-transmitters. The receiving neuron sums the inputs from its dendrites and as a result may or may not send an output signal down its own axon. Inputs can also be inhibiting; signals received at inhibiting inputs make the receiving neuron less likely to fire.

At the simplest level, a neural network can be represented as a set of vertices, the neurons, connected by two types of directed edges, one for excitatory inputs and one for inhibiting inputs. In practice, neurons are not all the same. This variation can be encoded in the network representation by different types of vertices.

human connectome project

Ecological networks

Ecological networks are networks of ecological interactions between species. Species in an ecosystem can interact in different ways: they can eat one another, they can parasitize one another, or they can have any of a variety of mutually advantageous interactions, such as pollination or seed dispersal.

A food web is a directed network that represents which species prey on which others in a given ecosystem. The vertices of the network correspond to species and the directed edges to predator-prey interactions. In fact, ecologists conventionally draw edges in the opposite direction, from prey to predator, that is, in the direction of energy flow. Food webs are typically directed acyclic graphs (DAGs). The acyclic nature of food webs indicate that there is an intrinsic hierarchy among the species in ecosystems.

food web

Those higher up the hierarchy prey on those lower down, but not vice versa, although there are some counterexamples. A species’s position in this hierarchy is called by ecologists its trophic level. This is the rank of the species’ vertex on the acyclic food graph, that is the length of the longest path leading to the vertex representing the species. Those species in lower trophic levels tend to be smaller and more abundant, while those in higher trophic positions are usually larger-bodied and less numerous predators. A food chain is one particular route, or path, through a food web.