Giant Component in Random Multipartite Graphs with Given Degree Sequences

We study the problem of the existence of a giant component in a random multipartite graph. We consider a random multipartite graph with $p$ parts generated according to a given degree sequence $n_i^{\mathbf{d}}(n)$, which denotes the number of vertices in part $i$ of the multipartite graph with degree given by the vector $\mathbf{d}$. We assume that the empirical distribution of the degree sequence converges to a limiting probability distribution. Under certain mild regularity assumptions, we characterize the conditions under which, with high probability, there exists a component of linear size. The characterization involves checking whether the Perron-Frobenius eigenvalue of the matrix of means of a certain associated edge-biased distribution is greater than unity. We also specify the size of the giant component when it exists. We use the exploration process of Molloy and Reed combined with techniques from the theory of multidimensional Galton-Watson processes to establish this result.


Introduction
The problem of the existence of a giant component in random graphs was first studied by Erdős and Rényi. In their classical paper [ER60], they considered a random graph model on $n$ vertices and $m$ edges where each such possible graph is equally likely. They showed that if $m/n > 1/2 + \epsilon$, then with high probability as $n \to \infty$ there exists a component of size linear in $n$ in the random graph, and that the size of this component as a fraction of $n$ converges to a given constant.
The degree distribution of the classical Erdős-Rényi random graph has Poisson tails. However, in many applications the degree distribution associated with an underlying graph does not satisfy this. For example, many so-called "scale-free" networks exhibit a power law distribution of degrees. This motivated the study of random graphs generated according to a given degree sequence. The giant component problem on a random graph generated according to a given degree sequence was considered by Molloy and Reed [MR95]. They provided conditions on the degree distribution under which a giant component exists with high probability. Further, in [MR98], they also showed that the size of the giant component as a fraction of the number of vertices converges in probability to a given positive constant. They used an exploration process to analyze the components of vertices of the random graph to prove their results. Similar results were established by Janson and Luczak in [JL08] using different techniques based on the convergence of empirical distributions of independent random variables. Several subsequent papers proved similar results under related but distinct assumptions, with tighter error bounds [HM12], [BR12], [Rio12]. Results for the critical phase for random graphs with given degree sequences were derived by Kang and Seierstad in [KS08]. All of these results consider a random graph on $n$ vertices with a given degree sequence where the distribution is uniform among all feasible graphs with the given degree sequence. The degree sequence is then assumed to converge to a probability distribution, and the results provide conditions on this probability distribution under which a giant component exists with high probability.
In this paper, we consider random multipartite graphs with $p$ parts with given degree distributions. Here $p$ is a fixed positive integer. Each vertex is associated with a degree vector $\mathbf{d}$, where each component $d_i$, $i \in [p]$, dictates the number of neighbors of the vertex in the corresponding part $i$ of the graph. As in previous papers, we assume that the empirical distribution associated with the number of vertices of degree $\mathbf{d}$ converges to a probability distribution. We then pose the problem of finding conditions under which there exists a giant component in the random graph with high probability. Our approach is based on the analysis of the Molloy and Reed exploration process. The major bottleneck is that the exploration process is a multidimensional process, and the technique of Molloy and Reed of directly underestimating the exploration process by a one-dimensional random walk does not apply to our case. In order to overcome this difficulty, we construct a linear Lyapunov function based on the Perron-Frobenius theorem, a technique often used in the study of multidimensional branching processes. Then we carefully couple the exploration process with an underestimating process to prove our results. The coupling construction is also more involved due to the multidimensionality of the process. This is because, in contrast to the unipartite case, there are multiple types of clones (or half-edges) involved in the exploration process, corresponding to which pair of parts of the multipartite graph they belong to. At every step of the exploration process, revealing the neighbor of such a clone leads to the addition of clones of several types to the component being currently explored. The particular numbers and types of these newly added clones also depend on the type of the clone whose neighbor was revealed. So the underestimating process needs to be constructed in a way such that it simultaneously underestimates the exploration process for each possible type of clone involved. We do this by choosing the parameters of the underestimating process such that for each type of clone, the vector of additional clones added by revealing its neighbor is always component-wise smaller than the corresponding vector for the exploration process.
All results regarding giant components typically use a configuration model corresponding to the given degree distribution, obtained by splitting vertices into clones and performing a uniform matching of the clones. In the standard unipartite case, at every step of the exploration process all available clones can be treated as identical. In the multipartite case, this is no longer true. For example, the neighbor of a vertex in part 1 of the graph with degree $\mathbf{d}$ can lie in part $j$ only if $d_j > 0$. Further, this neighbor must itself have a degree vector $\mathbf{d}'$ such that $d'_1 > 0$. This poses the issue of the graph breaking down into parts, with some of the $p$ parts of the graph getting disconnected from the others. To get past this we make a certain irreducibility assumption which we will carefully state later. This assumption not only addresses the above problem, but also enables us to construct linear Lyapunov functions by using the Perron-Frobenius theorem for irreducible non-negative matrices. We also prove that under the irreducibility assumption, the giant component, when it exists, is unique and has linearly many vertices in each of the $p$ parts of the graph. In [BR12], Bollobás and Riordan show that the existence and the size of the giant component in the unipartite case are closely associated with an edge-biased branching process. In this paper, we construct an analogous edge-biased branching process, which is now a multi-type branching process, and prove similar results.
Our study of random multipartite graphs is motivated by the fact that several real-world networks naturally demonstrate a multipartite nature. The author-paper network, actor-movie network, the network of company ownership, the financial contagion model, heterogeneous social networks, etc., are all multipartite [New01], [BEST04], [Jac08]. Examples of biological networks which exhibit multipartite structure include drug target networks, protein-protein interaction networks and human disease networks [GCV + 07], [YGC + 07], [MBHG06]. In many cases evidence suggests that explicitly modeling the multipartite structure results in more accurate models and predictions.
Random bipartite graphs ($p = 2$) with given degree distributions were considered by Newman et al. in [NSW01]. They used generating function heuristics to identify the critical point in the bipartite case. However, they did not provide rigorous proofs of the result. Our result establishes a rigorous proof, and we show that in the special case $p = 2$, the conditions we derive are equivalent to theirs.
The rest of the paper is structured as follows. In Section 2, we start by introducing the basic definitions and the notion of a degree distribution for multipartite graphs. In Section 3, we formally state our main results. Section 4 is devoted to the description of the configuration model. In Section 5, we describe the exploration process of Molloy and Reed and the associated distributions that govern the evolution of this process. In Section 6 and Section 7, we prove our main results for the supercritical case, namely when a giant component exists with high probability. In Section 8, we prove a sublinear upper bound on the size of the largest component in the subcritical case.

Definitions and preliminary concepts
We consider a finite simple undirected graph $G = (V, E)$ where $V$ is the set of vertices and $E$ is the set of edges. We use the words "vertices" and "nodes" interchangeably. A path between two vertices $u$ and $v$ is a sequence of distinct vertices starting at $u$ and ending at $v$ such that consecutive vertices are adjacent. A component $C \subseteq V$ is a maximal set of vertices such that there is a path between any two vertices in $C$. A family of random graphs $\{G_n\}$ on $n$ vertices is said to have a giant component if there exists a positive constant $\epsilon > 0$ such that $\mathbb{P}(\text{there exists a component } C \subseteq G_n \text{ for which } |C|/n \geq \epsilon) \to 1$. Subsequently, when a property holds with probability converging to one as $n \to \infty$, we will say that the property holds with high probability, or w.h.p. for short.
For any positive integer $p$, we use $[p]$ to denote the set $\{1, 2, \ldots, p\}$. For any matrix $M \in \mathbb{R}^{m \times n}$, we denote by $\|M\| \triangleq \max_{i,j} |M_{ij}|$ the largest element of the matrix $M$ in absolute value.
It is easy to check that $\|\cdot\|$ is a valid matrix norm. We use $\delta_{ij}$ to denote the Kronecker delta function, defined by $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ otherwise. We denote by $\mathbf{1}$ the all-ones vector, whose dimension will be clear from context. The notion of an asymptotic degree distribution was introduced by Molloy and Reed [MR95]. In the standard unipartite case, a degree distribution dictates the fraction of vertices of a given degree. In this section we introduce an analogous notion of an asymptotic degree distribution for random multipartite graphs. We consider a random multipartite graph $G$ on $n$ vertices with $p$ parts denoted by $G_1, \ldots, G_p$. For any $i \in [p]$, a vertex $v \in G_i$ is associated with a vector $\mathbf{d} \in \mathbb{Z}_+^p$ which we call the "type" of $v$. This means that for each $i = 1, 2, \ldots, p$, a node with type $\mathbf{d}$ has $d(i) \triangleq d_i$ neighbors in $G_i$. A degree distribution describes the fraction of vertices of type $\mathbf{d}$ in $G_i$, $i \in [p]$. We now define an asymptotic degree distribution as a sequence of degree distributions which prescribe the number of vertices of type $\mathbf{d}$ in a multipartite graph on $n$ vertices. For a fixed $n$, let $D(n) = \{n_i^{\mathbf{d}}(n) : i \in [p], \mathbf{d} \in \{0, 1, \ldots, n\}^p\}$ denote the number of vertices of each type in each part. Accordingly, we write $p_i^{\mathbf{d}}(n) \triangleq n_i^{\mathbf{d}}(n)/n$ for the corresponding fractions. For any degree vector $\mathbf{d}$, the quantity $\mathbf{1}'\mathbf{d}$ is simply the total degree of the vertex. We define the quantity $\omega(n) \triangleq \max\{\mathbf{1}'\mathbf{d} : n_i^{\mathbf{d}}(n) > 0 \text{ for some } i \in [p]\}$, which is the maximum degree associated with the degree distribution $D(n)$. To prove our main results, we need additional assumptions on the degree sequence.
Assumption 1. The degree sequence $\{D(n)\}_{n \in \mathbb{N}}$ satisfies the following conditions: (a) For each $n \in \mathbb{N}$ there exists a simple graph with the degree distribution prescribed by $D(n)$, i.e., the degree sequence is a feasible degree sequence.
(b) There exists a probability distribution $p = \{p_i^{\mathbf{d}} : i \in [p], \mathbf{d} \in \mathbb{Z}_+^p\}$ such that the sequence of probability distributions $p(n)$ associated with $D(n)$ converges to the distribution $p$.
The second moment of the degree distribution is given by $\sum_{i, \mathbf{d}} (\mathbf{1}'\mathbf{d})^2\, p_i^{\mathbf{d}}(n)$. Note that the quantity $\sum_{i, \mathbf{d}} (\mathbf{1}'\mathbf{d})\, p_i^{\mathbf{d}}(n)$ is twice the number of edges divided by $n$. So condition (c) implies that the total number of edges is $O(n)$, i.e., the graph is sparse. In condition (e) the relevant quantity is the second moment above, so this condition says that the sum of the squares of the degrees is $O(n)$. It follows from condition (c) that $\lambda_i^j < \infty$ and that $\lambda_i^j(n) \to \lambda_i^j$, where $\lambda_i^j(n) \triangleq \sum_{\mathbf{d}} d_j\, p_i^{\mathbf{d}}(n)$. The quantity $\lambda_i^j$ is asymptotically the fraction of outgoing edges from $G_i$ to $G_j$. For $p$ to be a valid degree distribution of a multipartite graph, we must have $\lambda_i^j = \lambda_j^i$ for each $1 \leq i < j \leq p$, and for every $n$ we must have $\lambda_i^j(n) = \lambda_j^i(n)$. We have not included this in the above conditions because it follows from condition (a). Condition (d) excludes the case where there is a sublinear number of edges between $G_i$ and $G_j$.
There is an alternative way to represent some parts of Assumption 1. For any probability distribution $p$ on $\mathbb{Z}_+^p$, let $D_p$ denote a random variable distributed as $p$. Then (b), (c) and (e) are equivalent to the following: (b') $D_{p(n)}$ converges in distribution to $D_p$; (c') $\mathbb{E}[\mathbf{1}'D_{p(n)}] \to \mathbb{E}[\mathbf{1}'D_p] < \infty$; (e') $\mathbb{E}[(\mathbf{1}'D_{p(n)})^2] \to \mathbb{E}[(\mathbf{1}'D_p)^2] < \infty$.
The following preliminary lemmas follow immediately.
Lemma 1. The conditions (b'), (c') and (e') together imply that the random variables $\mathbf{1}'D_{p(n)}$ and $(\mathbf{1}'D_{p(n)})^2$ are uniformly integrable.
Then using Lemma 1, we prove the following statement.
Lemma 2. The maximum degree satisfies $\omega(n) = o(\sqrt{n})$.

Proof. For any $\epsilon > 0$, by Lemma 1 there exists $q \in \mathbb{N}$ such that $\mathbb{E}[(\mathbf{1}'D_{p(n)})^2 \mathbf{1}\{\mathbf{1}'D_{p(n)} \geq q\}] \leq \epsilon$ for all $n$. Observe that for large enough $n$, either $\omega(n) \leq q$, in which case the claim is immediate, or a vertex achieving the maximum degree contributes at least $\omega(n)^2/n$ to the expectation above, so that $\omega(n) \leq \sqrt{\epsilon n}$. Since $\epsilon$ was arbitrary, $\omega(n) = o(\sqrt{n})$.

Note that by condition (a), the set of feasible graphs with the degree distribution is non-empty. The random multipartite graph $G$ we consider in this paper is drawn uniformly at random among all simple graphs with degree distribution given by $D(n)$. The asymptotic behavior of $D(n)$ is captured by the quantities $p_i^{\mathbf{d}}$. The existence of a giant component in $G$ as $n \to \infty$ is determined by the distribution $p$.

Statements of the main results
The neighborhood of a vertex in a random graph with given degree distribution closely resembles a special branching process associated with that degree distribution, called the edge-biased branching process. A detailed discussion of this phenomenon, and results with strong guarantees for the giant component problem in random unipartite graphs, can be found in [BR12] and [Rio12]. The edge-biased branching process is defined via the edge-biased degree distribution associated with the given degree distribution. Intuitively, the edge-biased degree distribution can be thought of as the degree distribution of vertices reached at the end point of an edge. Its importance will become clear when we describe the exploration process in the sections that follow. We say that an edge is of type (i, j) if it connects a vertex in $G_i$ with a vertex in $G_j$. Then, as we will see, the type of the vertex in $G_j$ reached by following a random edge of type (i, j) is $\mathbf{d}$ with probability $d_i\, p_j^{\mathbf{d}} / \lambda_j^i$. We now introduce the edge-biased branching process, which we denote by $T$. Here $T$ is a multidimensional branching process. The vertices of $T$ except the root are associated with types $(i, j) \in S$, where $S$ denotes the set of ordered pairs (i, j) with $\lambda_i^j > 0$. So, other than the root, $T$ has $N \leq p^2$ types of vertices. The root is assumed to be of a special type, which will become clear from the description below. The process starts off with a root vertex $v$. With probability $p_i^{\mathbf{d}}$, the root $v$ gives rise to $d_j$ children of type (i, j) for each $j \in [p]$. To describe the subsequent levels of $T$, consider any vertex of type (i, j). With probability $d_i\, p_j^{\mathbf{d}} / \lambda_j^i$, it gives rise to $d_m - \delta_{im}$ children of type (j, m) for each $m \in [p]$. The numbers of children generated by the vertices of $T$ are independent across vertices. For each $n$, we define an edge-biased branching process $T_n$ in the same way as $T$, using the distribution $D(n)$ instead of $D$.
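As an illustration, the edge-biased sampling step can be sketched in a few lines of Python. The toy distribution `p`, the zero-based part indices, and the function names below are our own assumptions for this sketch, not notation from the paper:

```python
import random

# Toy two-part degree distribution: p[(i, d)] is the probability that a vertex
# of part i has degree vector d (parts are indexed 0 and 1 here).
p = {
    (0, (0, 2)): 0.6, (0, (0, 3)): 0.4,   # part-1 vertices have neighbors only in part 2
    (1, (2, 0)): 0.5, (1, (3, 0)): 0.5,   # part-2 vertices have neighbors only in part 1
}

def edge_biased_type(p, i, j):
    """Sample the degree vector of the vertex in part j reached by following a
    uniformly random edge of type (i, j): type d is drawn with probability
    proportional to d_i * p_j^d (size-biasing by the number of (j, i) clones)."""
    weighted = [(d, d[i] * q) for (part, d), q in p.items() if part == j and d[i] > 0]
    r = random.uniform(0, sum(w for _, w in weighted))
    for d, w in weighted:
        r -= w
        if r <= 0:
            return d
    return weighted[-1][0]

def children(p, i, j):
    """One generation of the edge-biased process T: a particle of type (i, j)
    produces d_m - delta_{im} children of type (j, m) for each part m."""
    d = edge_biased_type(p, i, j)
    counts = {}
    for m in range(len(d)):
        c = d[m] - (1 if m == i else 0)
        if c > 0:
            counts[(j, m)] = c
    return counts
```

For the toy distribution above, a particle of type (0, 1) reaches a part-2 vertex of degree (2, 0) or (3, 0) and therefore produces one or two children of type (1, 0).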
We will also use the notations $T(v)$ and $T_n(v)$ whenever the type of the root node $v$ is specified. We denote the expected number of children of type (j, m) generated by a vertex of type (i, j) by
$$\mu_{ijjm} \triangleq \sum_{\mathbf{d}} (d_m - \delta_{im}) \frac{d_i\, p_j^{\mathbf{d}}}{\lambda_j^i}. \qquad (2)$$
It is easy to see that $\mu_{ijjm} \geq 0$. Assumption 1(e) guarantees that $\mu_{ijjm}$ is finite. Note that a vertex of type (i, j) cannot have children of type (l, m) if $j \neq l$, but for convenience we also introduce $\mu_{ijlm} = 0$ when $j \neq l$. By means of a remark, we note that it is also possible to conduct the analysis when the second moments are allowed to be infinite (see for example [MR95], [BR12]), but for simplicity we do not pursue this route in this paper.
Introduce a matrix $M \in \mathbb{R}^{N \times N}$ defined as follows. Index the rows and columns of the matrix with double indices $(i, j) \in S$. There are $N$ such pairs, denoting the $N$ rows and columns of $M$. The entry of $M$ corresponding to row index (i, j) and column index (l, m) is set to be $\mu_{ijlm}$.
Definition 1. Let $A \in \mathbb{R}^{N \times N}$ be a matrix. Define a graph $H$ on $N$ nodes where for each pair of nodes $i$ and $j$, the directed edge (i, j) exists if and only if $A_{ij} > 0$. Then the matrix $A$ is said to be irreducible if the graph $H$ is strongly connected, i.e., there exists a directed path in $H$ between any two nodes of $H$.
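Definition 1 can be checked mechanically. A minimal Python sketch (the function name and the nested-list matrix encoding are our own choices): build the directed graph of positive entries and test strong connectivity with one forward and one backward breadth-first search from node 0.

```python
from collections import deque

def is_irreducible(A):
    """Test irreducibility of a nonnegative square matrix A (nested lists):
    form the directed graph with an edge i -> j whenever A[i][j] > 0 and check
    strong connectivity with one forward and one backward BFS from node 0."""
    n = len(A)
    def reaches_all(edge):
        seen, queue = {0}, deque([0])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if edge(u, v) and v not in seen:
                    seen.add(v)
                    queue.append(v)
        return len(seen) == n
    return reaches_all(lambda u, v: A[u][v] > 0) and reaches_all(lambda u, v: A[v][u] > 0)
```

A matrix with a block-diagonal zero pattern, such as the mean matrix of a union of two disjoint bipartite graphs, fails this test.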
We now state the well-known Perron-Frobenius Theorem for non-negative irreducible matrices. This theorem has extensive applications in the study of multidimensional branching processes (see for example [KS66]).
Theorem 1 (Perron-Frobenius Theorem). Let $A$ be a non-negative irreducible matrix. Then (a). $A$ has a positive eigenvalue $\gamma > 0$ such that every other eigenvalue of $A$ is no larger than $\gamma$ in absolute value.
(b). There exists a left eigenvector $x$ of $A$ associated with the eigenvalue $\gamma$, unique up to scalar multiplication, such that all entries of $x$ are positive.
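For computation, the Perron-Frobenius eigenvalue can be approximated by power iteration. The sketch below is our own illustration and assumes the matrix is primitive (irreducible with an aperiodic associated graph), in which case the normalized iterates converge to the positive eigenvector:

```python
def perron_eigenvalue(A, iters=200):
    """Approximate the Perron-Frobenius eigenvalue of a nonnegative matrix by
    power iteration.  Convergence holds when A is primitive (irreducible with
    an aperiodic graph); the iterate x then tends to the positive eigenvector."""
    n = len(A)
    x = [1.0] * n
    gamma = 1.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        gamma = max(y)            # x is normalized so that max(x) = 1
        x = [yi / gamma for yi in y]
    return gamma
```

In the criterion stated below, one checks whether this eigenvalue of the mean matrix $M$ exceeds one.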
We introduce the following additional assumption before we state our main results.
Assumption 2. The degree sequence $\{D(n)\}_{n \in \mathbb{N}}$ satisfies the following conditions.
(a). The matrix $M$ associated with the degree distribution $p$ is irreducible.
Assumption 2 eliminates several degenerate cases. For example, consider a degree distribution with $p = 4$, i.e., a 4-partite random graph. Suppose for $i = 1, 2$, the probability $p_i^{\mathbf{d}}$ is non-zero only when $d_3 = d_4 = 0$, and for $i = 3, 4$, $p_i^{\mathbf{d}}$ is non-zero only when $d_1 = d_2 = 0$. In essence, this distribution is associated with a random graph which is simply the union of two disjoint bipartite graphs. In particular, such a graph may contain more than one giant component. However, this is ruled out under our assumption. Further, our assumption allows us to show that the giant component has linearly many vertices in each of the $p$ parts of the multipartite graph. Let $\eta \triangleq \mathbb{P}(|T| = \infty)$; namely, $\eta$ is the survival probability of the branching process $T$. We now state our main results.
Theorem 2. Suppose that the Perron-Frobenius eigenvalue of $M$ satisfies $\gamma > 1$. Then the following statements hold.
(a) The random graph $G$ has a giant component $C \subseteq G$ w.h.p. Further, the size of this component $C$ satisfies $\mathbb{P}\left( \left| |C|/n - \eta \right| > \epsilon \right) \to 0$ for any $\epsilon > 0$.
(b) All components of G other than C are of size O(log n) w.h.p.
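Theorem 2 expresses the size of the giant component through the survival probability $\eta$ of the edge-biased branching process. As a simple illustration of how such a quantity is computed in the single-type case, the extinction probability is the smallest fixed point of the offspring probability generating function; the offspring distribution below is a made-up example, not one derived from the paper:

```python
def extinction_probability(offspring, iters=200):
    """Smallest fixed point of the offspring probability generating function
    f(s) = sum_k p_k s^k, obtained by iterating q <- f(q) starting from q = 0."""
    q = 0.0
    for _ in range(iters):
        q = sum(pk * q ** k for k, pk in offspring.items())
    return q

# Supercritical toy example: mean offspring 1.5 > 1, so survival is positive.
# Here q solves q = 0.25 + 0.75 q^2, whose smallest root is q = 1/3.
eta = 1.0 - extinction_probability({0: 0.25, 2: 0.75})
```

For the multitype process $T$, the same idea applies with a vector of extinction probabilities, one per type (i, j).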
Theorem 3. Suppose that the Perron-Frobenius eigenvalue of $M$ satisfies $\gamma < 1$. Then, w.h.p., all components of the random graph $G$ have size $o(n)$.

The conditions of Theorem 2, where a giant component exists, are generally referred to in the literature as the supercritical case, and those of Theorem 3, marked by the absence of a giant component, as the subcritical case. The conditions under which a giant component exists in random bipartite graphs were derived in [NSW01] using generating function heuristics. We now consider the special case of a bipartite graph and show that the conditions implied by Theorem 2 and Theorem 3 reduce to those in [NSW01]. In this case $p = 2$ and $N = 2$. The types of all vertices in $G_1$ are of the form $\mathbf{d} = (0, j)$ and those in $G_2$ are of the form $\mathbf{d} = (k, 0)$. To match the notation in [NSW01], we let $p_1^{\mathbf{d}} = p_j$ when $\mathbf{d} = (0, j)$ and $p_2^{\mathbf{d}} = q_k$ when $\mathbf{d} = (k, 0)$. Using the definition of $\mu_{1221}$ from equation (2), we get $\mu_{1221} = \sum_k k(k-1) q_k / \sum_k k q_k$. Similarly, we can compute $\mu_{2112} = \sum_j j(j-1) p_j / \sum_j j p_j$. The Perron-Frobenius eigenvalue of $M$ is its spectral radius, given by $\sqrt{\mu_{1221}\, \mu_{2112}}$. So the condition for the existence of a giant component according to Theorem 2 is $\mu_{1221}\, \mu_{2112} - 1 > 0$, which after some algebra reduces to $\sum_{j,k} jk(jk - j - k)\, p_j q_k > 0$. This is identical to the condition mentioned in [NSW01]. The rest of the paper is devoted to the proof of Theorem 2 and Theorem 3.
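The bipartite reduction can be checked numerically. In the sketch below (the function names and the sample distributions are our own), we verify on an example that $\mu_{1221}\mu_{2112} - 1$ and the [NSW01] expression $\sum_{j,k} jk(jk - j - k) p_j q_k$ agree up to the positive factor $(\sum_j j p_j)(\sum_k k q_k)$, so they have the same sign:

```python
def criticality_gap(p, q):
    """mu_1221 * mu_2112 - 1 for a bipartite degree pair: p[j] (resp. q[k]) is
    the probability that a part-1 (part-2) vertex has degree j (k)."""
    mu2112 = sum(j * (j - 1) * pj for j, pj in p.items()) / sum(j * pj for j, pj in p.items())
    mu1221 = sum(k * (k - 1) * qk for k, qk in q.items()) / sum(k * qk for k, qk in q.items())
    return mu1221 * mu2112 - 1.0

def nsw_criterion(p, q):
    """The Newman-Strogatz-Watts expression sum_{j,k} j*k*(j*k - j - k) p_j q_k."""
    return sum(j * k * (j * k - j - k) * pj * qk
               for j, pj in p.items() for k, qk in q.items())
```

Both quantities are positive exactly in the supercritical regime of Theorem 2.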

Configuration Model
The configuration model [Wor78], [Bol85], [BC78] is a convenient tool to study random graphs with given degree distributions. It provides a method to generate a multigraph from the given degree distribution. When conditioned on the event that the graph is simple, the resulting distribution is uniform among all simple graphs with the given degree distribution. We describe below the way to generate a configuration model from a given multipartite degree distribution.
1. For each of the $n_i^{\mathbf{d}}(n)$ vertices in $G_i$ of type $\mathbf{d}$, introduce $d_j$ clones of type (i, j) for each $j \in [p]$. The ordered pair (i, j) associated with a clone designates that the clone belongs to $G_i$ and has a neighbor in $G_j$. From the discussion following Assumption 1, the number of clones of type (i, j) is the same as the number of clones of type (j, i).
2. For each pair (i, j), perform a uniform random matching of the clones of type (i, j) with the clones of type (j, i).
3. Collapse all the clones associated with a certain vertex back into a single vertex. This means all the edges attached to the clones of a vertex are now considered to be attached to the vertex itself.
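The three steps above can be sketched as follows. The input encoding (a list of (part, degree-vector) pairs) and the function name are our own choices; for a feasible degree sequence the clone counts of types (i, j) and (j, i) match, as noted in step 1:

```python
import random

def configuration_model(vertices, p):
    """Toy multipartite configuration model.  `vertices` is a list of
    (part, degree_vector) pairs with parts indexed 0..p-1; returns the edge
    list of the resulting multigraph.  Assumes a feasible degree sequence."""
    clones = {(i, j): [] for i in range(p) for j in range(p)}
    for v, (part, d) in enumerate(vertices):
        for j in range(p):
            clones[(part, j)].extend([v] * d[j])    # step 1: d_j clones of type (part, j)
    edges = []
    for i in range(p):
        for j in range(i, p):
            a, b = clones[(i, j)], clones[(j, i)]
            if i == j:                              # step 2: match (i, i) clones among themselves
                random.shuffle(a)
                edges.extend((a[t], a[t + 1]) for t in range(0, len(a), 2))
            else:                                   # step 2: uniform matching of (i, j) with (j, i)
                random.shuffle(b)
                edges.extend(zip(a, b))
    return edges                                    # step 3: clones collapsed to vertex labels
```

The returned edge list may contain multi-edges; conditioning on simplicity recovers the uniform simple graph, per Lemma 3 below.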
The following useful lemma allows us to transfer results related to the configuration model to uniformly drawn simple random graphs.

Lemma 3. If the degree sequence $\{D(n)\}_{n \in \mathbb{N}}$ satisfies Assumption 1, then the probability that the configuration model results in a simple graph is bounded away from zero as $n \to \infty$.
As a consequence of the above lemma, any statement that holds with high probability for the random configuration model is also true with high probability for the simple random graph model. So we only need to prove Theorem 2 and Theorem 3 for the configuration model.
The proof of Lemma 3 can be obtained easily by using a similar result on directed random graphs proved in [COC13]. The specifics of the proof follow.
Proof of Lemma 3. In the configuration model for multipartite graphs described above, we can classify all clones into two categories: clones of type (i, i) ∈ S, and clones of type (i, j) ∈ S with $i \neq j$. Since the outcomes of the matchings associated with the two categories are independent, we can treat them separately in this proof. For the first category, the problem is equivalent to the configuration model for standard unipartite graphs. More precisely, for a fixed $i$, we can construct a standard unipartite degree distribution from $D(n)$ by taking the $i$th component of the corresponding vector degrees. By Assumption 1, our proof then follows from previous results for the unipartite case.
For the second category, first fix (i, j) with $i \neq j$. We consider a bipartite graph where the degree distribution $D_1(n)$ of the vertices in part 1 is given by the $j$th components of the vector degrees in $G_i$, and $D_2(n)$ is obtained by interchanging $i$ and $j$. We form the corresponding configuration model and perform the usual uniform matching between the clones generated from $D_1(n)$ and the clones generated from $D_2(n)$. This exactly mimics the outcome of the matching that occurs in our original multipartite configuration model between clones of type (i, j) and (j, i). With this formulation, the problem of controlling the number of double edges is very closely related to a similar problem concerning the configuration model for directed random graphs, which was studied in [COC13]. To precisely match their setting, add "dummy" vertices with zero degree to both $D_1(n)$ and $D_2(n)$ so that they have exactly $n$ vertices each, and then arbitrarily enumerate the vertices in each with indices from $[n]$. From Assumption 1 it can be easily verified that the degree distributions $D_1(n)$ and $D_2(n)$ satisfy Condition 4.2 in [COC13]. To switch between our notation and theirs, use $D_1(n) \to M^{[n]}$ and $D_2(n) \to D^{[n]}$. Then Theorem 4.3 in [COC13] says that the probability of having no self-loops and no double edges is bounded away from zero. In particular, observing that self-loops are irrelevant in our case, we conclude that $\lim_{n \to \infty} \mathbb{P}(\text{no double edges}) > 0$. Since the number of pairs (i, j) is at most $p(p-1)$, which is constant with respect to $n$, the proof is complete.

Exploration Process
In this section we describe the exploration process, introduced by Molloy and Reed in [MR95], which reveals the component containing a given vertex of the random graph. We say a clone is of type (i, j) if it belongs to a vertex in $G_i$ and has its neighbor in $G_j$. We say a vertex is of type (i, $\mathbf{d}$) if it belongs to $G_i$ and has degree type $\mathbf{d}$. We start at time $k = 0$. At any time $k$ in the exploration process, there are three kinds of clones: 'sleeping' clones, 'active' clones and 'dead' clones. For each $(i, j) \in S$, the number of active clones of type (i, j) at time $k$ is denoted by $A_i^j(k)$, and the total number of active clones at time $k$ is given by $A(k) = \sum_{(i,j) \in S} A_i^j(k)$. Two clones are said to be "siblings" if they belong to the same vertex. The sleeping and active clones are collectively called 'living' clones. We denote by $L_i(k)$ the number of living clones in $G_i$ and by $L_i^j(k)$ the number of living clones of type (i, j) at time $k$. It follows that $L_i(k) = \sum_{j \in [p]} L_i^j(k)$. If all clones of a vertex are sleeping then the vertex is said to be a sleeping vertex; if all its clones are dead, then the vertex is considered dead; otherwise it is considered active. At the beginning of the exploration process all clones (and hence all vertices) are sleeping. We denote by $N_i^{\mathbf{d}}(k)$ the number of sleeping vertices in $G_i$ of type $\mathbf{d}$ at time $k$, and by $N_S(k)$ the total number of sleeping vertices, so that $N_S(0) = n$. We now describe the exploration process used to reveal the components of the configuration model.

Exploration Process.
1. Initialization: Pick a vertex uniformly at random from the set of all sleeping vertices and set the status of all its clones to active.
2. Repeat the following two steps as long as there are active clones: (a) Pick a clone uniformly at random from the set of active clones and kill it. (b) Reveal the neighbor of the clone by picking uniformly at random one of its candidate neighbors. Kill the neighboring clone and make its siblings active.
3. If there are living clones left, restart the process by picking a living clone uniformly at random, setting it and all its siblings to active, and going back to step 2. If there are no living clones, the exploration process is complete.
Note that in step 2(b), the candidate neighbors of a clone of type (i, j) are the living clones of type (j, i).
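The component-revealing bookkeeping can be sketched as follows. This is a simplified illustration, not the paper's exact process: we ignore clone types and simply track sleeping vertices and dead clones over a fixed clone matching (the data layout and names are our own assumptions):

```python
from collections import deque

def explore_components(matching, owner, n):
    """Simplified Molloy-Reed exploration over a clone matching.  `matching`
    maps each clone to its matched partner, `owner` maps a clone to the vertex
    it belongs to, and n is the number of vertices.  Returns component sizes."""
    clones_of = {}
    for c, v in owner.items():
        clones_of.setdefault(v, []).append(c)
    sleeping = set(range(n))
    dead = set()
    sizes = []
    while sleeping:
        v = sleeping.pop()                       # restart: wake a sleeping vertex
        active = deque(clones_of.get(v, []))     # all its clones become active
        size = 1
        while active:
            c = active.popleft()                 # step 2(a): pick an active clone
            if c in dead:
                continue
            dead.add(c)                          # ... and kill it
            partner = matching[c]                # step 2(b): reveal its neighbor
            if partner in dead:
                continue
            dead.add(partner)                    # kill the neighboring clone
            u = owner[partner]
            if u in sleeping:                    # sleeping neighbor: wake the vertex
                sleeping.remove(u)
                size += 1
                active.extend(s for s in clones_of[u] if s != partner)
        sizes.append(size)
    return sizes
```

When the revealed partner is itself active, it is simply killed; this is the back-edge event discussed below.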
The exploration process enables us to conveniently track the evolution in time of the number of active clones of each type. We denote the change in $A_i^j(k)$ by $Z_i^j(k+1) \triangleq A_i^j(k+1) - A_i^j(k)$, and define $Z(k) \triangleq (Z_i^j(k) : (i, j) \in S)$ to be the vector of changes in the number of active clones of all types. To describe the probability distribution of the changes $Z_i^j(k+1)$, we consider the following two cases.
Case 1: $A(k) > 0$. Let $E_i^j$ denote the event that in step 2(a) of the exploration process, the active clone picked was of type (i, j). The probability of this event is $A_i^j(k)/A(k)$. In that case we kill the clone that we chose, and the number of active clones of type (i, j) decreases by one. We then proceed to reveal its neighbor, which is of type (j, i). One of the following events happens:

(i) $E_a$: the neighbor revealed is an active clone. The probability of the joint event is given by $\frac{A_i^j(k)}{A(k)} \cdot \frac{A_j^i(k) - \delta_{ij}}{L_j^i(k) - \delta_{ij}}$. Such an edge is referred to as a back-edge in [MR95]. The change in active clones of different types in this joint event is as follows: if $i \neq j$, then $Z_i^j(k+1) = Z_j^i(k+1) = -1$; if $i = j$, then $Z_i^i(k+1) = -2$; all other components of $Z(k+1)$ are zero.

(ii) $E_s^{\mathbf{d}}$: the neighbor revealed is a sleeping clone belonging to a vertex of type $\mathbf{d}$ in $G_j$. The probability of this joint event is given by $\frac{A_i^j(k)}{A(k)} \cdot \frac{d_i N_j^{\mathbf{d}}(k)}{L_j^i(k) - \delta_{ij}}$. The neighboring clone is killed and its siblings become active, so that if $i \neq j$, then $Z_i^j(k+1) = -1$ and $Z_j^m(k+1) = d_m - \delta_{im}$ for each $m \in [p]$, while if $i = j$, then $Z_i^i(k+1) = d_i - 2$ and $Z_i^m(k+1) = d_m$ for each $m \neq i$.

Note that the above events are exhaustive, i.e., their probabilities sum to one.

Case 2: $A(k) = 0$. In this case, we choose a sleeping clone at random and make it and all its siblings active. Let $E_i^j$ be the event that the sleeping clone chosen was of type (i, j), and let $E^{\mathbf{d}}$ be the event that this clone belongs to a vertex of type (i, $\mathbf{d}$). Then the probability of the joint event $E_i^j \cap E^{\mathbf{d}}$ is $\frac{d_j N_i^{\mathbf{d}}(k)}{\sum_{l \in [p]} L_l(k)}$. In this case, the change in the number of active clones of different types is given by $Z_i^m(k+1) = d_m$ for each $m \in [p]$.

We emphasize here that there are two ways in which the evolution of the exploration process deviates from that of the edge-biased branching process. First, a back-edge can occur in the exploration process when the neighbor of an active clone is revealed to be another active clone. Second, the degree distribution of the exploration process is time dependent. However, close to the beginning of the process, these two events do not have a significant impact. We exploit this fact in the following sections to prove Theorems 2 and 3.

Supercritical Case
In this section we prove the first part of Theorem 2. To do this, we show that the number of active clones in the exploration process grows to a linear size with high probability. Using this fact, we then prove the existence of a giant component. The idea behind the proof is as follows. We start the exploration process described in the previous section at an arbitrary vertex $v \in G$. Close to the beginning of the exploration, a clone of type (i, j) gives rise to $d_m - \delta_{im}$ clones of type (j, m) with probability close to $d_i\, p_j^{\mathbf{d}}(n)/\lambda_j^i(n)$, which in turn is close to $d_i\, p_j^{\mathbf{d}}/\lambda_j^i$ for large enough $n$. If we consider the exploration process on a very small linear time scale, i.e., for $k < \epsilon n$ with small enough $\epsilon$, then the changes in the quantities $\lambda_j^i(n)$ and $p_i^{\mathbf{d}}(n)$ are negligible. We use this observation to construct a process which underestimates the exploration process in an appropriate sense, but whose parameters are time invariant and "close" to the initial degree distribution. We then use this somewhat easier to analyze process to prove our result.
We now get into the specific details of the proof. We define a stochastic process $B_i^j(k)$ which we couple with $A_i^j(k)$ in such a way that $B_i^j(k)$ underestimates $A_i^j(k)$ with probability one. We denote the evolution in time of $B_i^j(k)$ by $\hat{Z}_i^j(k+1) \triangleq B_i^j(k+1) - B_i^j(k)$. To define $\hat{Z}_i^j(k+1)$, we choose quantities $\pi_{ij}^{\mathbf{d}}$, for some $0 < \gamma < 1$ to be chosen later.
We now show that in a small time frame, the parameters associated with the exploration process do not change significantly from their initial values. This is made precise in Lemma 4 and Lemma 5 below. Before that, we first introduce some useful notation to describe these parameters for a given $n$ and at a given step $k$ in the exploration process. Let $M(n)$ denote the matrix of means defined analogously to $M$ by replacing $p$ with $p(n)$. Also, for a fixed $n$, define $M_k(n)$ similarly by replacing the initial counts with their values at time $k$. From Assumption 1 it follows that $\|M(n) - M\| \to 0$.

Lemma 4. Given $\delta > 0$, there exist $\epsilon > 0$ and an integer $\bar{n}$ such that for all $n \geq \bar{n}$ and for all time steps $k \leq \epsilon n$ in the exploration process, we have, for every $(i, j) \in S$ and every $\mathbf{d}$, $\left| \frac{d_i N_j^{\mathbf{d}}(k)}{L_j^i(k)} - \frac{d_i\, p_j^{\mathbf{d}}(n)}{\lambda_j^i(n)} \right| \leq \delta$.

Proof. Fix $\epsilon_1 > 0$. From Lemma 1 we have that the random variables $\mathbf{1}'D_{p(n)}$ are uniformly integrable. Then there exists $q \in \mathbb{N}$ such that for all $n$ we have $\mathbb{E}[\mathbf{1}'D_{p(n)} \mathbf{1}\{\mathbf{1}'D_{p(n)} \geq q\}] \leq \epsilon_1$. For each time step $k \leq \epsilon n$ in the exploration process, each of the counts $N_j^{\mathbf{d}}(k)$ and $L_j^i(k)$ changes by at most $2\epsilon n$ from its initial value. So for small enough $\epsilon$, for every $(i, j) \in S$, the displayed difference restricted to degrees with $\mathbf{1}'\mathbf{d} \leq q$ is small, where the last inequality can be obtained by choosing small enough $\epsilon_1$. Since $q$ is a constant, by choosing small enough $\epsilon$ we can ensure the bound for all $\mathbf{d}$ with $\mathbf{1}'\mathbf{d} \leq q$. Additionally, from Assumption 1, for large enough $n$ the contribution of degrees with $\mathbf{1}'\mathbf{d} > q$ is negligible. The lemma follows by combining the above inequalities.
Lemma 5. Given $\delta > 0$, there exist $\epsilon > 0$ and an integer $\bar{n}$ such that for all $n \geq \bar{n}$ and for all time steps $k \leq \epsilon n$ in the exploration process we have $\|M_k(n) - M\| \leq \delta$.

Proof. The argument is very similar to the proof of Lemma 4. Fix $\epsilon_1 > 0$. From Lemma 1 we know that the random variables $(\mathbf{1}'D_{p(n)})^2$ are uniformly integrable. It follows that there exists $q \in \mathbb{N}$ such that for all $n$ we have $\mathbb{E}[(\mathbf{1}'D_{p(n)})^2 \mathbf{1}\{\mathbf{1}'D_{p(n)} \geq q\}] \leq \epsilon_1$. From this we can conclude a corresponding tail bound for all $i, j, m$. Also, $L_j^i(k)$ can change by at most $2\epsilon n$. So, for small enough $\epsilon$, by an argument similar to the proof of Lemma 4, we can prove, analogously to (7), that $\|M_k(n) - M(n)\| \leq \delta/2$. Since $M(n)$ converges to $M$, we can choose $\bar{n}$ such that $\|M(n) - M\| \leq \delta/2$ for all $n \geq \bar{n}$. By combining the last two inequalities, the proof is complete.

Lemma 6. Given any $0 < \gamma < 1$, there exist $\epsilon > 0$, an integer $\bar{n} \in \mathbb{N}$ and quantities $\pi_{ij}^{\mathbf{d}}$ satisfying (5) and (6) and the following conditions for all $n \geq \bar{n}$: (a) inequality (11) holds for each $(i, j) \in S$.
(b) The matrix $\hat{M}$, defined analogously to $M$ by replacing $\frac{d_i\, p_j^{\mathbf{d}}}{\lambda_j^i}$ with $\pi_{ji}^{\mathbf{d}}$, satisfies $\|\hat{M} - M\| \leq \mathrm{err}(\gamma)$, where $\mathrm{err}(\gamma)$ is a term that satisfies $\lim_{\gamma \to 0} \mathrm{err}(\gamma) = 0$.
Proof. Choose $q = q(\gamma) \in \mathbb{N}$ such that $\mathbb{E}[\mathbf{1}'D_p \mathbf{1}\{\mathbf{1}'D_p > q\}] \leq \gamma$. Now choose $\pi_{ji}^{\mathbf{d}}$ satisfying (5) and (6) such that $\pi_{ji}^{\mathbf{d}} = 0$ whenever $\mathbf{1}'\mathbf{d} > q$. Using Lemma 4, we can now choose $\bar{n}$ and $\epsilon$ such that for every $(i, j) \in S$ and $\mathbf{d}$ such that $\mathbf{1}'\mathbf{d} \leq q$, (11) is satisfied for all $n \geq \bar{n}$ and all $k \leq \epsilon n$. The condition in part (a) is thus satisfied by this choice of $\pi_{ji}^{\mathbf{d}}$. For any $\gamma$, let us denote the choice of $\pi_{ji}^{\mathbf{d}}$ made above by $\pi_{ji}^{\mathbf{d}}(\gamma)$. By construction, whenever $M_{ijlm} = 0$, we also have $\hat{M}_{ijlm} = 0$. Also, by construction we have $0 \leq \pi_{ji}^{\mathbf{d}}(\gamma) \leq \frac{d_i\, p_j^{\mathbf{d}}}{\lambda_j^i}$. Let $X_\gamma$ be the random variable that takes the value $(d_m - \delta_{im})$ with probability $\pi_{ji}^{\mathbf{d}}(\gamma)$ and $0$ with the remaining probability. Similarly, let $X$ be the random variable that takes the value $(d_m - \delta_{im})$ with probability $\frac{d_i\, p_j^{\mathbf{d}}}{\lambda_j^i}$. Then, from the above argument, $X_\gamma \to X$ as $\gamma \to 0$, and the random variable $X$ dominates the random variable $X_\gamma$ for all $\gamma \geq 0$. Note that $X$ is integrable. The proof of part (b) is now complete by the Dominated Convergence Theorem.
Assume now that the quantities $\epsilon$ and $\pi_{ij}^{\mathbf{d}}$ have been chosen to satisfy inequalities (11) and (12). We consider each of the events that can occur at each step of the exploration process until time $\epsilon n$ and describe the coupling between $Z_i^j(k+1)$ and $\hat{Z}_i^j(k+1)$ in each case.
Case 1: $A(k) > 0$. Suppose the event $E_i^j$ happens. We describe the coupling for each of the following two events.

(i) $E_a$: the neighbor revealed is an active clone. In this case we simply mimic the evolution of the number of active clones in the original exploration process; namely, $\hat{Z}_l^m(k+1) = Z_l^m(k+1)$ for all $l, m$.

(ii) $E_s^{\mathbf{d}}$: the neighbor revealed is a sleeping clone of type $\mathbf{d}$. In this case we split the event further into two events, $E_{s,0}^{\mathbf{d}}$ and $E_{s,1}^{\mathbf{d}}$. For this splitting to make sense, $\pi_{ji}^{\mathbf{d}}$ must be bounded by the corresponding conditional probability, which is guaranteed by our choice of $\pi_{ij}^{\mathbf{d}}$. We describe the evolution of $B_i^j(k)$ in each of the two cases.

(a) $E_{s,0}^{\mathbf{d}}$: in this case set $\hat{Z}_l^m(k+1) = Z_l^m(k+1)$ for all $l, m$.

(b) $E_{s,1}^{\mathbf{d}}$: in this case we mimic the evolution of the active clones as under the event $E_a$ instead of $E_s^{\mathbf{d}}$.

Case 2: $A(k) = 0$. Suppose the event $E_i^j \cap E^{\mathbf{d}}$ happens. In this case we split $E^{\mathbf{d}}$ into two disjoint events, $E_0^{\mathbf{d}}$ and $E_1^{\mathbf{d}}$. Again, the probabilities defining this splitting are guaranteed to be less than one for time $k \le \epsilon n$ because of the choice of $\pi_{ij}^{\mathbf{d}}$. The change in $B_i^j(k+1)$ in each of the above events is defined as follows.
(a) $E_0^{\mathbf{d}}$ and (b) $E_1^{\mathbf{d}}$: the corresponding changes are defined analogously to the events $E_{s,0}^{\mathbf{d}}$ and $E_{s,1}^{\mathbf{d}}$ of Case 1. This completes the description of the probability distribution of the joint evolution of the processes $A_i^j(k)$ and $B_i^j(k)$. Intuitively, we are decreasing the probability of the cases that actually help the growth of the component and compensating by increasing the probability of the event which hampers the growth of the component (back-edges). From the description of the coupling between $Z_i^j(k+1)$ and $\hat{Z}_i^j(k+1)$ it can be seen that for time $k < \epsilon n$ we have, with probability one, $B_i^j(k) \le A_i^j(k)$ for all $(i, j) \in S$. Our next goal is to show that for some $(i, j) \in S$ the quantity $B_i^j(k)$ grows to a linear size by time $\epsilon n$. Let $H(k) = \sigma(\{A_i^j(r), B_i^j(r) : (i, j) \in S,\ 1 \le r \le k\})$ denote the filtration of the joint exploration process up to time $k$. The expected conditional change in $B_i^j(k)$ can be computed by considering the two cases above. First suppose that at time step $k$ we have $A(k) > 0$, i.e., we are in Case 1, and assume first that $i \ne j$. Note that the only events that affect $\hat{Z}_i^j(k+1)$ are those described above. The event $E_m^i \cap E_a$ affects $\hat{Z}_i^j(k+1)$ only when $m = j$, and in this case $\hat{Z}_i^j(k+1) = -1$; the same is true for the corresponding event of Case 1(ii), and the resulting expression for the conditional expectation follows from (6). Now suppose that at time $k$ we have $A(k) = 0$, i.e., we are in Case 2. In this case we can compute the conditional expectation similarly, using the description of the coupling in Case 2. For the case $i = j$, a similar computation yields expressions very similar to those for $i \ne j$; we give the expressions below and omit the computation. Define the vector of expected changes accordingly; then the expected change of $B_i^j(k)$ can be written compactly in terms of $\widetilde{M}$. Fix $\delta > 0$, and let $\gamma$ be small enough that the function $\mathrm{err}(\gamma)$ in (12) satisfies $\mathrm{err}(\gamma) \le \delta$.
Using Lemma 6 we can choose $\epsilon$ and $\pi_{ij}^{\mathbf{d}}$ satisfying (11) and (12). In particular, we have $\|\widetilde{M} - M\| \le \delta$. For small enough $\delta$, both $M$ and $\widetilde{M}$ have strictly positive entries in exactly the same locations. Since $M$ is irreducible, it follows that $\widetilde{M}$ is irreducible. The Perron-Frobenius eigenvalue of a nonnegative matrix, which equals its spectral radius, is a continuous function of its entries. Hence, for small enough $\delta$, the Perron-Frobenius eigenvalue of $\widetilde{M}$ is bigger than $1$, say $1 + 2\zeta$ for some $\zeta > 0$. Let $z$ be the corresponding left eigenvector with all positive entries, and let $z_m = \min_{(i,j) \in S} z_i^j$ and $z_M = \max_{(i,j) \in S} z_i^j$. Define the random process $W(k) = z' B(k)$. The first term of its expected one-step change satisfies $2\zeta z_m \le 2\zeta z' A(k) \le 2\zeta z_M$; this is because $\mathbf{1}' A(k) = 1$ and hence $z' A(k)$ is a convex combination of the entries of $z$. By choosing $\gamma$ small enough, we can also ensure $\gamma z' Q A(k) \le \zeta z_m$. Let $\kappa = \zeta z_m > 0$. Then we have $\mathbb{E}[\Delta W(k+1) \mid H(k)] \ge \kappa$. We now use a one-sided Hoeffding bound argument to show that with high probability the quantity $W(k)$ grows to a linear size by time $\epsilon n$. Let $X(k+1) = \kappa - \Delta W(k+1)$, so that $\mathbb{E}[X(k+1) \mid H(k)] \le 0$. Also note that $|X(k+1)| \le c\,\omega(n)$ almost surely, for some constant $c > 0$.
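The supercriticality criterion used here, namely that the Perron-Frobenius eigenvalue of the matrix of means exceeds unity, is easy to check numerically. The sketch below uses power iteration on small hypothetical matrices of means (the paper's matrices are indexed by pairs in $S$; the entries here are arbitrary illustrative values, and the method assumes the matrix is primitive, i.e., irreducible and aperiodic).

```python
def perron_frobenius_eigenvalue(M, iters=1000):
    """Power iteration with sup-norm normalization: for a primitive
    (irreducible, aperiodic) nonnegative matrix, the growth rate of the
    iterates converges to the Perron-Frobenius eigenvalue."""
    p = len(M)
    v = [1.0] * p
    lam = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = max(w)
        v = [wi / lam for wi in w]
    return lam

def has_giant_component(M):
    """The paper's characterization: a giant component exists with high
    probability iff the Perron-Frobenius eigenvalue of the matrix of means
    of the edge-biased distribution is greater than 1."""
    return perron_frobenius_eigenvalue(M) > 1.0

# Hypothetical 2x2 matrices of means, with spectral radii 1.4 and 0.7.
M_super = [[0.5, 0.9], [0.9, 0.5]]
M_sub = [[0.3, 0.4], [0.4, 0.3]]
```

Power iteration is used instead of a general eigensolver because the matrices of interest are nonnegative and their dominant eigenvalue is real and positive.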
For any $B > 0$ and any $-B \le x \le B$, it can be verified (by convexity of $e^{tx}$) that $e^{tx} \le \cosh(tB) + \frac{x}{B}\sinh(tB)$. Using this, we get, for any $t > 0$, a bound on the conditional exponential moment $\mathbb{E}[e^{tX(k+1)} \mid H(k)]$, where the last step follows from (16). We can then bound $\mathbb{E}[e^{t\sum_k X(k)}]$, and optimizing over $t$ gives an exponentially small tail bound, which follows by using Lemma 2. Substituting the definition of $X(k+1)$, we conclude that w.h.p. $W(\epsilon n)$ is of linear size. Recalling the definition of $W(k)$, it then follows from (17) that there exists a pair $(i', j')$ such that w.h.p. $A_{i'}^{j'}(\epsilon n) > \mu n$. Using the fact that the number of active clones grows to a linear size, we now show that the corresponding component is of linear size. To do this, we continue the exploration process in a modified fashion from time $\epsilon n$ onwards: instead of choosing active clones uniformly at random in step 2(a) of the exploration process, we follow a more specific order in which we choose the active clones and then reveal their neighbors. This is still a valid way of continuing the exploration process. The main technical result required for this purpose is Lemma 7 below.
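The one-sided bound used above is the standard Azuma-Hoeffding computation: increments with nonpositive conditional mean and absolute value at most $B$ satisfy a tail bound $\exp(-\lambda^2/(2\,\mathrm{steps}\,B^2))$ after optimizing the exponential moment. The sketch below evaluates this bound with hypothetical constants ($\kappa = 0.1$, $\epsilon = 0.01$, $c = 1$, $\omega(n) = n^{1/4}$); the exponent is of order $\kappa^2 \epsilon n / \omega(n)^2$, which diverges whenever $\omega(n) = o(\sqrt{n})$.

```python
import math

def hoeffding_tail_bound(steps, B, deviation):
    """Azuma-Hoeffding: if X(1), ..., X(steps) have conditional mean <= 0 and
    |X(k)| <= B, then P(sum_k X(k) >= deviation) is at most
    exp(-deviation**2 / (2 * steps * B**2)), obtained from
    E[exp(t*X)] <= exp(t**2 * B**2 / 2) and optimizing over t > 0."""
    return math.exp(-deviation ** 2 / (2.0 * steps * B ** 2))

def failure_bound(n, eps=0.01, kappa=0.1, c=1.0):
    """Bound on P(W(eps*n) < kappa*eps*n/2) with omega(n) = n**0.25.
    All constants here are hypothetical illustrative values; the exponent is
    kappa**2 * eps * n / (8 * c**2 * omega(n)**2)."""
    return hoeffding_tail_bound(int(eps * n), c * n ** 0.25, kappa * eps * n / 2.0)
```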
Lemma 7. Suppose that after $\epsilon n$ steps of the exploration process we have $A_{i'}^{j'}(\epsilon n) > \mu n$ for some pair $(i', j')$. Then there exist $\epsilon_1 > \epsilon$ and $\delta_1 > 0$ for which we can continue the exploration process in a modified way, by altering the order in which active clones are chosen in step 2(a) of the exploration process, such that at time $\epsilon_1 n$, w.h.p., we have $A_i^j(\epsilon_1 n) > \delta_1 n$ for all $(i, j) \in S$.

The above lemma says that we can reach a point in the exploration process at which there are linearly many active clones of every type. An immediate consequence is Corollary 1 below. We remark that Corollary 1 is merely one of the consequences of Lemma 7 and can be proved in a much simpler way; but, as we will see later, we need the full power of Lemma 7 to prove Theorem 2(b).
Corollary 1. Suppose that after $\epsilon n$ steps of the exploration process we have $A_{i'}^{j'}(\epsilon n) > \mu n$ for some pair $(i', j')$. Then there exists $\delta_2 > 0$ such that, w.h.p., the component being explored contains at least $\delta_2 n$ vertices in $G_j$ for every $j \in [p]$.

Before proving Lemma 7, we state a well-known result; its proof can be obtained by standard large deviation techniques, and we omit it.

Lemma 8. Fix $m$. Suppose there are $n$ objects consisting of $\alpha_i n$ objects of type $i$ for $1 \le i \le m$. Let $\beta > 0$ be a constant satisfying $\beta < \max_i \alpha_i$. Suppose we pick $\beta n$ objects at random from these $n$ objects without replacement. Then for a given $\epsilon' > 0$ there exists $z = z(\epsilon', m) > 0$ such that, with probability at least $1 - e^{-zn}$, the number of objects of type $i$ picked is within $\epsilon' n$ of $\alpha_i \beta n$ for every $i$.

Proof of Lemma 7. The proof relies on the fact that the matrix $M$ is irreducible. If we denote the underlying graph associated with $M$ by $H$, then $H$ is strongly connected. We consider the subgraph $T_{i'}^{j'}$ of $H$ which is the shortest-path tree in $H$ rooted at the node $(i', j')$, and we traverse $T_{i'}^{j'}$ breadth first. Let $d$ be the depth of $T_{i'}^{j'}$. We continue the exploration process from this point in $d$ stages $1, 2, \ldots, d$. Stage 1 begins right after time $\epsilon n$; denote the time at which stage $l$ ends by $\epsilon_l n$. For convenience, we assume a base stage 0 which includes all events until time $\epsilon n$. For $1 \le l \le d$, let $I_l$ be the set of nodes $(i, j)$ at depth $l$ in $T_{i'}^{j'}$, and let $I_0 = \{(i', j')\}$. We will prove by induction that for $l = 0, 1, \ldots, d$ there exists $\delta^{(l)} > 0$ such that at the end of stage $l$ we have, w.h.p., $A_i^j > \delta^{(l)} n$ for each $(i, j) \in \bigcup_{x=0}^{l} I_x$. Note that at the end of stage 0 we have, w.h.p., $A_{i'}^{j'} > \mu n$, so we can choose $\delta^{(0)} = \mu$ to satisfy the base case of the induction. Suppose $|I_l| = r$. Stage $l+1$ consists of $r$ substages $(l+1, 1), (l+1, 2), \ldots, (l+1, r)$, where each substage addresses exactly one $(i, j) \in I_l$. We start substage $(l+1, 1)$ by considering any $(i, j) \in I_l$. We reveal the neighbors of $\alpha \delta^{(l)} n$ clones among the $A_i^j > \delta^{(l)} n$ active clones one by one, where $0 < \alpha < 1$ is a constant that we will describe shortly. The evolution of active clones in each of these $\alpha \delta^{(l)} n$ steps is identical to that in the event $E_i^j$ in Case 1 of the original exploration process. Fix any $(j, m) \in I_{l+1}$. Note that $M_{ijjm} > 0$ by construction of $T_{i'}^{j'}$. So, by making $\epsilon, \epsilon_1, \ldots, \epsilon_l$ smaller if necessary and choosing $\alpha$ small enough, we can conclude using Lemma 5 that for all time steps $k < \epsilon_l n + \alpha \delta^{(l)} n$ we have $\|M_k(n) - M\| < \delta$ for any $\delta > 0$; similarly, using Lemma 4, we get the analogous bound for the edge-biased probabilities. By referring to the description of the exploration process for the event $E_i^j$ in Case 1, the expected change in $Z_j^m(k+1)$ during substage $(l+1, 1)$ can be computed as in (13) and shown to be bounded away from zero, where (a) follows from (18) and (b) can be guaranteed by choosing $\delta$ small enough. The above argument can be repeated for each $(j, m) \in I_{l+1}$. We now have all the ingredients needed to repeat the one-sided Hoeffding inequality argument from earlier in this section, and we conclude that there exists $\delta_j^m > 0$ such that w.h.p. we have at least $\delta_j^m n$ active clones of type $(j, m)$ by the end of substage $(l+1, 1)$. By the same argument, this is also true for all children of $(i, j)$ in $T_{i'}^{j'}$. Before starting substage $(l+1, 2)$, we set $\delta^{(l)} = \min\{(1 - \alpha)\delta^{(l)}, \delta_j^m\}$; this makes sure that at every substage of stage $l+1$ we have at least $\delta^{(l)} n$ clones of each type that has been considered before, which enables us to use the same argument for all substages of stage $l+1$. Continuing in this fashion, we conclude that at the end of stage $l+1$ we have $\delta^{(l+1)} n$ clones of each type $(i, j) \in \bigcup_{x=0}^{l+1} I_x$ for an appropriately defined $\delta^{(l+1)}$. The proof is now complete by induction.
Proof of Corollary 1. Consider any $j \in [p]$. We will prove that the giant component has linearly many vertices in $G_j$ with high probability.
Let $\mathbf{d}$ be such that $p_j^{\mathbf{d}} > 0$ and $d_i > 0$ for some $i \in [p]$. This means that in the configuration model each vertex of type $\mathbf{d}$ in $G_j$ has at least one clone of type $(j, i)$. Continue the exploration process as in Lemma 7. For small enough $\epsilon_1$, at least $n(p_j^{\mathbf{d}} - \epsilon_1)$ clones of type $(j, i)$ are still unused at time $\epsilon_1 n$. From Lemma 7, with high probability we have at least $\delta_1 n$ active clones of type $(i, j)$ at this point. Proceed by simply revealing the neighbors of each of these. From Lemma 8, it follows that with high probability we will cover at least a constant fraction of the unused clones of type $(j, i)$, which corresponds to a linear number of vertices covered. Each of these vertices is in the giant component, and the proof is complete.
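Lemma 8's concentration for sampling without replacement, invoked in the proof above, can be illustrated numerically. The sketch below draws $\beta n$ of $n$ typed objects and compares the empirical type counts with $\alpha_i \beta n$; the type proportions and $\beta$ are arbitrary illustrative values.

```python
import random

def sample_type_fractions(alphas, beta, n, seed=0):
    """Among n objects, a fraction alphas[i] has type i.  Draw beta*n objects
    uniformly at random without replacement and return, for each type, the
    number drawn divided by n.  Lemma 8 says these fractions deviate from
    alphas[i]*beta by more than eps' only with probability exp(-z*n)."""
    rng = random.Random(seed)
    pool = [i for i, a in enumerate(alphas) for _ in range(round(a * n))]
    drawn = rng.sample(pool, round(beta * n))
    return [drawn.count(i) / n for i in range(len(alphas))]
```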
We now prove part (b) of Theorem 2; part (a) will be proved in the next section. We use the argument of Molloy and Reed, except that in the multipartite case we need Lemma 7 to complete the argument.
Proof of Theorem 2(b). Consider two vertices $u, v \in G$. We will upper bound the probability that $u$ lies in the component $C$ being explored at time $\epsilon n$ while $v$ lies in a component of size bigger than $\beta \log n$ other than $C$. To do so, start the exploration process at $u$ and proceed to the time step $\epsilon_1 n$ in the statement of Lemma 7. At this time we are in the midst of revealing the component $C$, but this may not be the component of $u$, because we may have restarted the exploration process using the "Initialization step" at some time between $0$ and $\epsilon_1 n$. If it is not the component of $u$, then $u$ does not lie in $C$. So let us assume that we are indeed exploring the component of $u$. At this point, continue the exploration process in a different way by switching to revealing the component of $v$. For $v$ to lie in a component of size greater than $\beta \log n$, the number of active clones in the exploration process associated with the component of $v$ must remain positive for each of the first $\beta \log n$ steps. At each step, choices of neighbors are made uniformly at random, and, by Lemma 7, $C$ has at least $\delta_1 n$ active clones of each type. For the component of $v$ to be distinct from the component of $u$, each such choice must avoid all of these active clones of the component of $u$. It follows that the probability of this event is bounded above by $(1 - \delta_1)^{\beta \log n}$. For large enough $\beta$, this bound is $o(n^{-2})$, and a union bound over all pairs of vertices $u$ and $v$ completes the proof.
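The final estimate can be made concrete: $(1 - \delta_1)^{\beta \log n} = n^{\beta \log(1 - \delta_1)}$, so the union bound over the $n^2$ pairs $(u, v)$ vanishes as soon as $\beta \log(1/(1 - \delta_1)) > 2$. A numerical sketch with hypothetical values $\delta_1 = 0.1$ and $\beta = 25$:

```python
import math

def pair_probability_bound(delta1, beta, n):
    """Chance that beta*log(n) uniform neighbor choices all avoid the at least
    delta1*n active clones of C:
    (1 - delta1)**(beta*log n) = n**(beta*log(1 - delta1))."""
    return (1.0 - delta1) ** (beta * math.log(n))

def all_pairs_bound(delta1, beta, n):
    """Union bound over the n**2 pairs (u, v); this is o(1) precisely when
    beta * log(1/(1 - delta1)) > 2, i.e. for beta large enough."""
    return n ** 2 * pair_probability_bound(delta1, beta, n)
```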

Size of the Giant Component
In this section we complete the proof of Theorem 2(a) regarding the size of the giant component. For the unipartite case, the first result regarding the size of the giant component was obtained by Molloy and Reed [MR98] using Wormald's results [Wor95] on differential equation approximations of random processes. As with previous results for the unipartite case, we show that the size of the giant component as a fraction of $n$ is concentrated around the survival probability of the edge-biased branching process. We do this in two steps. First we show that the probability that a given vertex $v$ lies in the giant component is approximately equal to the probability that the edge-biased branching process with $v$ as its root grows to infinity. Linearity of expectation then shows that the expected fraction of vertices in the giant component equals this probability. We then prove a concentration result around this expected value to complete the proof of Theorem 2. These statements are proved formally in Lemma 10.
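The survival probability around which the giant component's size concentrates is the complement of the smallest fixed point of the offspring generating functions of the multitype branching process. The sketch below does not use the paper's edge-biased offspring law: purely for illustration, it assumes a type-$i$ individual has an independent Poisson number of type-$j$ children with mean $M[i][j]$.

```python
import math

def survival_probabilities(M, iters=1000):
    """Multitype Galton-Watson process with Poisson(M[i][j]) type-j offspring
    per type-i individual, so the offspring pgf is
        f_i(s) = exp(sum_j M[i][j] * (s_j - 1)).
    The extinction vector q is the smallest fixed point of f; iterating
    q <- f(q) from q = 0 converges to it monotonically.  The survival
    probability of a type-i root is 1 - q_i, and it is positive iff the
    spectral radius of the mean matrix M exceeds 1."""
    p = len(M)
    q = [0.0] * p
    for _ in range(iters):
        q = [math.exp(sum(M[i][j] * (q[j] - 1.0) for j in range(p)))
             for i in range(p)]
    return [1.0 - qi for qi in q]
```

For example, the symmetric two-type mean matrix with off-diagonal entries $1.5$ is supercritical (spectral radius $1.5$) and yields a positive survival probability, while off-diagonal entries $0.5$ give a subcritical process with survival probability zero.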
Before going into the details of the proof, we first prove a lemma which is a widely used application of Azuma's inequality.
Lemma 9. Let $X = (X_1, X_2, \ldots, X_t)$ be a vector-valued random variable and let $f(X)$ be a function of $X$. Let $F_k = \sigma(X_1, \ldots, X_k)$. Assume that $|\mathbb{E}[f(X) \mid F_k] - \mathbb{E}[f(X) \mid F_{k-1}]| \le c_k$ for $1 \le k \le t$. Then $\Pr\big(|f(X) - \mathbb{E}[f(X)]| \ge \lambda\big) \le 2\exp\big(-\lambda^2 / (2\sum_{k=1}^t c_k^2)\big)$.

Proof. The proof is a standard martingale argument; we include it here for completeness. Define the random variables $Y_0, \ldots, Y_t$ by $Y_k = \mathbb{E}[f(X) \mid F_k]$, so that $\{Y_k\}$ is a Doob martingale with $Y_0 = \mathbb{E}[f(X)]$ and $Y_t = f(X)$. The lemma then follows by applying Azuma's inequality to the martingale sequence $\{Y_k\}$.
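Azuma's inequality as applied in Lemma 9 can be sketched as follows; `empirical_tail` checks the bound against the simplest bounded-difference martingale, a sum of independent fair $\pm 1$ steps.

```python
import math
import random

def azuma_bound(c, deviation):
    """Azuma's inequality: for a martingale Y_0, ..., Y_t with increments
    |Y_k - Y_{k-1}| <= c[k-1],
        P(|Y_t - Y_0| >= deviation) <= 2*exp(-deviation**2 / (2*sum_k c_k**2))."""
    return 2.0 * math.exp(-deviation ** 2 / (2.0 * sum(ck * ck for ck in c)))

def empirical_tail(t, deviation, trials=2000, seed=0):
    """Empirical frequency of |sum of t fair +/-1 steps| >= deviation."""
    rng = random.Random(seed)
    hits = sum(abs(sum(rng.choice((-1, 1)) for _ in range(t))) >= deviation
               for _ in range(trials))
    return hits / trials
```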
Lemma 10. Let $\epsilon > 0$ be given, and let $v \in G$ be chosen uniformly at random. Then for large enough $n$ we have $|\Pr(v \text{ lies in the giant component}) - \eta| \le \epsilon$, where $\eta$ is the survival probability of the edge-biased branching process.

Proof. We use a coupling argument similar to one used by Bollobás and Riordan [BR12] to prove a similar result for "local" properties of random graphs. We couple the exploration process starting at $v$ with the branching process $T_n(v)$ by replicating the events of the branching process as closely as possible. We describe the details below.
The parameters of the distribution associated with $T_n$ are given in Section 5; in the exploration process, the corresponding parameters at time step $k$ involve the quantities $N_j^{\mathbf{d}}(k)$ and $L_i^j(k)$. We first show that during each of the first $\beta \log n$ steps of the exploration process these two sets of parameters are close to each other. The quantity $d_i N_j^{\mathbf{d}}(k)$ is the total number of sleeping clones of type $(j, i)$ in $G_j$ at time $k$ that belong to a vertex of type $\mathbf{d}$. At each step of the exploration process the total number of sleeping clones can change by at most $\omega(n)$. Also, $L_i^j(k)$ is the total number of living clones of type $(j, i)$ in $G_j$ and can change by at most two in each step.
Initially, for all $(i, j)$, we have $L_i^j(0) = \Theta(n)$, and until time $\beta \log n$ it remains $\Theta(n)$. Therefore, the difference between the two sets of parameters, summed over $i, j, \mathbf{d}$, splits into two terms; from the explanation above, the first term is $O(\omega(n)/n)$ and the second term is $O(1/n)$.
Recall the definition of the branching process parameters. Using a telescoping sum and the triangle inequality, we conclude that for time indices $k \le \beta \log n$ the difference, summed over $i, j, \mathbf{d}$, is $O(\omega(n) \log n / n)$. So the total variation distance between the distribution of the exploration process and that of the branching process at each of the first $\beta \log n$ steps is $O(\omega(n) \log n / n)$. We now describe the coupling between the branching process and the exploration process. For the first time step, note that the root of $T_n$ has type $(i, \mathbf{d})$ with probability $p_i^{\mathbf{d}}$. We can couple this with the exploration process by letting the vertex awakened in the "Initialization step" of the exploration process be of type $(i, \mathbf{d})$. Since the two probabilities are the same, this step of the coupling succeeds with probability one. Suppose that we have defined the coupling until time $k < \beta \log n$. To describe the coupling at time step $k+1$ we need to consider two events. The first is the event that the coupling has succeeded until time $k$, i.e., the two processes are identical. In this case, since the total variation distance between the parameters of the two processes is $O(\omega(n) \log n / n)$, we perform a maximal coupling, i.e., a coupling which fails with probability equal to the total variation distance. For our purposes, we do not need to describe the coupling at time $k+1$ in the event that the coupling has failed at some previous time step. The probability that the coupling succeeds at each of the first $\beta \log n$ steps is at least $(1 - O(\omega(n) \log n / n))^{\beta \log n} = 1 - O(\omega(n) (\log n)^2 / n) = 1 - o(1)$. We have shown that the coupling succeeds until time $\beta \log n$ with high probability. Assume that it indeed succeeds. In that case the component explored thus far is a tree: at every step of the exploration process a sleeping vertex is awakened, because landing on an active clone would create a cycle. This means that if the branching process has survived up to this point, then the corresponding exploration process has also survived, and the component revealed has at least $\beta \log n$ vertices. Hence the probability that $v$ lies in a component of size at least $\beta \log n$ differs from the corresponding survival probability of the branching process by $o(1)$. But Theorem 2(b) states that with high probability there is only one component of size greater than $\beta \log n$, namely the giant component, and the claim follows. It remains to show that the size of the giant component concentrates around its expected value.
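A maximal coupling of two discrete distributions, one that fails exactly with probability equal to their total variation distance as used in the coupling argument above, can be sketched generically as follows (distributions are given as dictionaries; these are not the paper's specific exploration- and branching-process laws).

```python
import random

def tv_distance(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def _draw(dist, rng):
    """Sample a key of `dist` with probability equal to its value."""
    u, acc = rng.random(), 0.0
    items = sorted(dist.items())
    for k, v in items:
        acc += v
        if u < acc:
            return k
    return items[-1][0]  # guard against floating-point shortfall

def maximal_coupling(p, q, rng):
    """Sample (X, Y) with X ~ p, Y ~ q and P(X != Y) = tv_distance(p, q).
    With probability 1 - TV, draw both from the normalized overlap min(p, q)
    and set X = Y; otherwise draw X and Y from the normalized excesses
    p - min(p, q) and q - min(p, q), which have disjoint supports."""
    keys = sorted(set(p) | set(q))
    overlap = {k: min(p.get(k, 0.0), q.get(k, 0.0)) for k in keys}
    tv = 1.0 - sum(overlap.values())
    if rng.random() < 1.0 - tv:
        x = _draw({k: v / (1.0 - tv) for k, v in overlap.items()}, rng)
        return x, x
    x = _draw({k: (p.get(k, 0.0) - overlap[k]) / tv for k in keys}, rng)
    y = _draw({k: (q.get(k, 0.0) - overlap[k]) / tv for k in keys}, rng)
    return x, y
```

The disjointness of the two excess measures is what makes the coupling maximal: whenever the overlap branch is not taken, the two samples necessarily differ.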
Proof of Theorem 2(a) (size of the giant component). From the first two parts of Theorem 2, with high probability we can categorize all the vertices of $G$ into two classes: those which lie in the giant component, and those which lie in a component of size smaller than $\beta \log n$, i.e., in small components. The expected fraction of vertices in small components is $1 - \eta + o(1)$. We now show that the fraction of vertices in small components concentrates around this mean.
Recall that $cn = n \sum_{i \in [p], \mathbf{d} \in D} \mathbf{1}'\mathbf{d}\, p_i^{\mathbf{d}}$ is the number of edges in the configuration model. Consider the random process in which the edges of the configuration model are revealed one by one; each edge corresponds to a matching between two clones. Let $E_i$, $1 \le i \le cn$, denote the (random) edges. Let $N_S$ denote the number of vertices in small components, i.e., in components of size smaller than $\beta \log n$. We wish to apply Lemma 9 to obtain the desired concentration result, for which we need to bound $|\mathbb{E}[N_S \mid E_1, \ldots, E_k] - \mathbb{E}[N_S \mid E_1, \ldots, E_{k+1}]|$.

The sleeping vertex to which the neighbor clone belongs is now active. The change in the number of active clones of different types is governed by the type $\mathbf{d}$ of this new active vertex. The changes in active clones of different types in this event are as follows.