Graphs and Social Networks
Random Graphs, Giant Components
Large, complex networks often have one giant component that contains a significant proportion of the nodes. Erdős and Rényi (1960) have found this to be almost certainly the case with a random graph on $n$ vertices with $\frac{n}{2}$ edges.
Example.
A random graph is a mathematical model of a family of networks, where certain parameters (like the number of nodes and edges) have fixed values, but other aspects (like the actual edges) are randomly assigned. The simplest example of a random graph is in fact the network $G(n,m)$ with fixed numbers $n$ of nodes and $m$ of edges, placed randomly between the vertices. Although a random graph is not a specific object, many of its properties can be described precisely, in the form of expected values, or probability distributions.
Here we will look at degree distributions and the emergence of a giant component in a random graph. Other properties of random graphs and their variants are discussed in [Newman, Chapter 12], in [Latora et el., Chapter 3] and in [Barabasi, Chapter 3].
Random Graphs.
Let us denote by $G(n, m)$ a network with $n$ nodes and $m$ chosen edges, chosen uniformly at random (out of the possible $\binom{n}{2}$). Equivalently, one can choose uniformly at random one network in the set $\mathfrak{G}(n, m)$ of all networks on a given set of $n$ nodes with exactly $m$ edges.
More precisely, one should think of $G(n, m)$ as a probability distribution $P \colon \mathfrak{G}(n, m) \to \mathbb{R}$, that assigns to each network $G \in \mathfrak{G}(n, m)$ the same probability $P(G) = \Omega^{1}$ for
where $N = \binom{n}{2}$.
Then, for example, the diameter $\langle l \rangle$ of $G(n, m)$ can be determined as the average of the diameters $l(G)$ of the networks $G \in \mathfrak{G}(n, m)$:
In the same way, the number $\langle m \rangle$ of edges of $G(n, m)$ is
as every network $G \in \mathfrak{G}(n, m)$ has exactly $m$ edges, and the average degree $\langle k \rangle$ of the nodes in $G(n, m)$ is
as every network $G$ with $n$ nodes and $m$ edges has average degree $\frac{2m}{n}$.
The ErdősRényi model $G(n, p)$
The calculation of other properties of $G(n, m)$ are less straightforward. It turns out that it is easier to work in the related, but different random graph $G(n, p)$. Here $p \in [0, 1]$ is a probability, and $G(n, p)$ is a network on a given set of $n$ nodes, where each of the $N = \binom{n}{2}$ possible edges is present with probability $p$ (and absent with probability $(1p)$). This model gives a probability distribution $P$ on the set of all networks on the given $n$ nodes, where
if $G$ has exactly $m$ nodes. The networks in this model may have different numbers of edges. In fact $G(n, p)$ could be the empty network with no edges between the $n$ nodes at all (with (small) probability $(1p)^N$), or the complete graph on $n$ nodes (with (small) probability $p^N$).
Number of Edges. The expected number of edges $\langle m \rangle$ of $G(n, p)$ is determined as follows. First note that the probability of $G$ having exactly $m$ edges is
as each of the $\binom{N}{m}$ graphs $G$ with $m$ edges occurs with equal probability $P(G) = p^m (1p)^{Nm}$. This is a binomial distribution. The expected number of edges then is
taking into account that the case $m = 0$ contributes $0$ to the sum. Using the absorption law, $\binom{n}{k} = \frac{n}{k} \binom{n1}{k1}$, one finally gets
as
by the Binomial Theorem.
So $\langle m \rangle = N p$, which is not really surprising since $p = \langle m \rangle / N$ is the expected proportion of edges present in $G(n, p)$.
Mean Degree. The expected mean degree $\langle k \rangle$ is derived from the expected number of edges as
This again is no surprise, as any node is connected with probability $p$ to any of its $(n1)$ potential neighbours in the network. We use the notation
from now on for the mean degree of $G(n, p)$.
Degree Distribution
The probability of a node to have exactly degree $k$ is
reflecting the fact that there are $\binom{n1}{k}$ ways to choose $k$ other nodes to connect to, and that the probability of being connected to exactly those (and not connected to the remaining $n  1  k$ nodes) is $p^k (1p)^{n1k}$. This is again a binomial distribution.
Poisson’s Theorem, also known as the Law of Rare Events, states that, for large values of $n$ (keeping the mean degree $c = (n1)p$ constant), the binomial distribution can be approximated by a Poisson distribution. This can be seen as follows.
Assuming $n$ large makes $p = c/(n1)$ small. Taking logarithms,
using the Taylor series for the logarithm,
and ignoring (small) terms of higher order. Thus
Moreover,
whence
This is the Poisson distribution. For large $n$, the degrees in the random graph $G(n, p)$ follow a Poisson distribution.
Clustering Coefficient.
Define the clustering coefficient $C$ of a network $G = (X, E)$ as the probability that two neighbours of a node in the network are connected to each other, that is the proportion of the paths of length $2$ whose vertices form a triangle in the network. The clustering coefficient thus measures the transitivity, or the degree of triadic closure, in a network. (This notion of a clustering coefficient in general is different from the average of the (local) clustering coefficients of the individual nodes.)
In the random graph $G(n, p)$ the probability of any edge is just $p$, hence
For fixed $c$, this number tends to $0$ with $n \to \infty$.
The Giant Component.
For $p = 0$, the largest component of the random graph $G(n, p)$ has size $1$ (as each node is a connected component of its own).
At the other extreme, for $p = 1$, the entire network forms a single component of size $n$.
Somewhere between these extreme ends, the network undergoes a phase transition, from a network with a largest component of constant size (independent of $n$), to a network with largest component of a size proportional to $n$ (growing in size with $n$). A giant component is a component whose size is extensive, i.e., proportional to $n$.
Interactive Example. In this example, a network of $n = 100$ nodes, starting off as an empty network, becomes more and more connected, through the addition of randomly chosen edges. Most of the time, one can observe the emergence of a dominant large component after about $\tfrac12 n = 50$ edges have been added. The growth stops when there is only a single connected component left, usually after adding about 250 edges. Nodes grow in size according to their degree. Press CtrlR to reload.
Using properties of the Poisson distribution, it can be shown that in a random graph $G(n, p)$, when the average degree $c = (n1)p$ increases from $0$ upwards,
 for $c < 1$, the graph has no giant component, and the individual (small) components are trees;
 at $c = 1$, the graph has large components and the individual components may contain cycles;
 for $c > 1$, there is a single giant component which has loops, whereas all other components are trees;
 for $c$ larger than $\ln n$ the network is a single giant component.
Real networks are not random
Most real networks have a giant component, and the random graph helps to explain this.
However, in many respects a real network differs from a random network:

Social networks are small worlds due to short paths and cycles; a random graph consists mostly of trees and cycles, if any, tend to be long.

Social networks contain highly connected communities, forming almost complete subgraphs; a random network only becomes highly connected when $p$ is very large and almost all edges are present.

The clustering coefficient in a social network is high (as high as $0.20$ in the actors costarring network), whereas the clustering coefficient $C = \tfrac{c}{n1}$ of the random $G(n, p)$ is small, tending to $0$ for growing $n$ and fixed $c$.

Most significantly, the degree distribution of a real network does not follow a Poisson distribution due to the presence of highly connected vertices. More on this later.