at the University of Birmingham in the UK.
My research is a mix of probability, statistics, computer science, and combinatorics.
Information-theoretic thresholds from the cavity method,
preprint, 2016, (with Amin Coja-Oghlan, Florent Krzakala, and Lenka Zdeborová).
Vindicating a sophisticated but non-rigorous physics approach called the cavity method,
we establish a formula for the mutual information in statistical inference problems induced by random graphs.
This general result implies the conjecture on the information-theoretic threshold in the disassortative stochastic block model
[Decelle et al. (2011)] and allows us to pinpoint the exact condensation phase transition
in random constraint satisfaction problems such as random graph coloring, thereby proving a conjecture from
[Krzakala et al. (2007)]. As a further application we establish the formula for the mutual information in Low-Density Generator Matrix codes as
conjectured in [Montanari (2005)]. The proofs provide a conceptual underpinning of the replica symmetric variant of the cavity method.
Extremes of the internal energy of the Potts model on cubic graphs,
preprint, 2016, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).
We prove tight upper and lower bounds on the internal energy per particle (expected number of monochromatic edges per vertex) in the anti-ferromagnetic Potts model on cubic graphs at every temperature and for all q>=2.
This immediately implies corresponding tight bounds on the anti-ferromagnetic Potts partition function.
Taking the zero-temperature limit gives new results in extremal combinatorics: the number of q-colorings of any 3-regular graph, for any q>=2, is maximized by a union of K_{3,3}'s.
This proves the d=3 case of a conjecture of Galvin and Tetali. See this
blog post and
survey article by Yufei Zhao for more on this conjecture and other extremal problems for regular graphs.
Counting independent sets in cubic graphs of given girth,
preprint, 2016, (with Guillem Perarnau).
We prove a tight upper bound on the independence polynomial (and total number of independent sets) of cubic graphs of girth at least 5. The bound is achieved by unions of the Heawood graph, the point/line incidence graph of the Fano plane.
We also give a tight lower bound on the total number of independent sets of triangle-free cubic graphs. This bound is achieved by unions of the Petersen graph. Some calculations from the proof are available
here.
We conjecture that in fact all Moore graphs are extremal for the scaled number of independent sets in regular graphs of a given minimum girth, maximizing this quantity if their girth is even and minimizing if odd.
On the average size of independent sets in triangle-free graphs,
preprint, 2016, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).
We prove that the
average size of an independent set in any triangle-free graph of maximum degree d is at least (1+o_d(1)) log d/d·n, giving an alternative proof of
Shearer's upper bound on the Ramsey number R(3,k).
We then prove new lower bounds on the total number of independent sets in triangle-free graphs, including a result for triangle-free graphs of maximum degree d that is asymptotically tight in the exponent.
We conjecture that in any triangle-free graph there is at least a factor 4/3 gap between the maximum and average independent set size. Combined with the above theorem, a proof of this conjecture would lead to an improvement on the current upper bound on R(3,k).
The Widom-Rowlinson model, the hard-core model and the extremality of the complete graph,
to appear, European Journal of Combinatorics, (with Emma Cohen, Péter Csikvári, and Prasad Tetali).
We give a short alternative proof of the result that the Widom-Rowlinson free energy is maximized over d-regular graphs by the complete graph on d+1 vertices. We also resolve one of the conjectures from the
previous paper. The main tool is a bijection between the Widom-Rowlinson model and the hard-core model on a modified graph that appears in a paper of Brightwell, Haggstrom, and Winkler.
Belief propagation on replica symmetric random factor graph models, (extended abstract at
RANDOM 2016), (with Amin Coja-Oghlan).
Physicists predict that the free energy of random factor graph models satisfying a certain "static replica symmetry" condition
can be calculated via the Belief Propagation message passing scheme
Krzakala, Montanari, Ricci-Tersenghi, Semerjian, Zdeborova PNAS 2007. We prove this conjecture for two general classes of random factor graph models: Poisson and regular random factor graphs.
We show that the messages constructed as in the case of acyclic factor graphs asymptotically satisfy the Belief Propagation equations,
and that the free energy density is given by the Bethe free energy formula.
Limits of discrete distributions and Gibbs measures on random graphs,
preprint, 2015, (with Amin Coja-Oghlan and Kathrin Skubch).
Inspired by the theory of graph limits and the
work of Panchenko we develop a notion of convergence for sequences of discrete probability measures and prove a Regularity Lemma (in the sense of Szemeredi) in this setting.
We then apply this to Gibbs measures on sparse random graphs and verify the physicists' replica symmetric solution under the hypothesis of non-reconstruction and convergence in probability of the sequence of Gibbs measures on the random graphs, in the topology defined above.
On the Widom-Rowlinson Occupancy Fraction in Regular Graphs,
to appear, Combinatorics, Probability and Computing, (with Emma Cohen and Prasad Tetali).
The Widom-Rowlinson model on a graph is a random configuration of two types of particles with a hard-core interaction between particles of different types.
The
occupancy fraction of the model is the fraction of vertices occupied by a particle of either type. We show that over all d-regular graphs, the occupancy fraction
is maximized by K_(d+1), the complete graph on d+1 vertices. Via integration, this shows that the normalized partition function of the model is maximized by K_(d+1),
and this in turn shows that K_(d+1) maximizes the normalized number of homomorphisms from a d-regular graph to the path on 3 vertices with a self-loop on each vertex,
proving a
conjecture of David Galvin.
Independent Sets, Matchings, and Occupancy Fractions,
preprint, 2015, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).
We give tight upper bounds on the logarithmic derivative of the independence and matching polynomials of any d-regular graph.
For independent sets, this is a strengthening of a sequence of results of
Kahn,
Galvin and Tetali, and
Zhao that a disjoint union of K_dd's maximizes the independence polynomial and total number of independent sets among all d-regular graphs.
For matchings, the result implies that disjoint unions of K_dd's maximize the matching polynomial and total number of matchings, as well as proving the
asymptotic upper matching conjecture of Friedland, Krop, Lundow, and Markstrom.
The proof uses an
occupancy method: we show that the occupancy fraction in the hard-core model and the edge occupancy fraction in the monomer-dimer model are maximized over all d-regular graphs by K_dd.
Birthday Inequalities, Repulsion, and Hard Spheres,
Proceedings of the AMS, 2016.
I study a "Birthday Inequality" and a "Repulsion Inequality" for random geometric graphs. This gives a new method for bounding the partition function in two models from statistical physics:
the hard sphere model and the hard-core lattice gas model.
Two applications in combinatorics are new upper bounds on the number of independent sets and matchings of a given size in d-regular graphs. Another application is a surprising fact about spheres in dimension 24:
at certain densities, if we condition that a set of n randomly placed centers of spheres of radius r are at pairwise distance greater than r, then the expectation of the volume of their union decreases!
Spectral Thresholds in the Bipartite Stochastic Block Model, (extended abstract at
COLT 2016), (with Laura Florescu).
We study a stochastic block model on a bipartite graph with unequal sizes and a planted partition on each side. The model arose in previous work on
planted random k-SAT.
We show that the usual spectral algorithm based on the SVD has a edge-density threshold for recovery much higher than the algorithm based on subsampling (below) and a new algorithm we propose,
Diagonal Deletion SVD. We show the failure of the usual SVD is due to a localization phenomenon of the singular vectors.
We also show that the bipartite block model exhibits a sharp threshold for detection of the planted partition, as in the results for the SBM of
Mossel, Neeman,
Sly and
Massoulie.
Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSP's,
NIPS 2015,
(with Vitaly Feldman and Santosh Vempala)
We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems,
via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices
from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used.
On the Complexity of Random Satisfiability Problems with Planted Solutions,
STOC 2015,
(with Vitaly Feldman and Santosh Vempala)
We consider the problem of finding a planted assignment to a random k-SAT formula
(or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially
unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses.
We give a rigorous explanation of this gap by showing that any efficient
statistical algorithm requires n^{r/2} clauses,
where
r is a parameter that depends on the distribution and can be as high as k. We also describe a statistical algorithm that
matches the sample complexity up to log factors. This gives concrete evidence for the security
of Goldreich's
proposed one-way function,
and proves Feige's
3-SAT hypothesis for
statistical algorithms.
Large Deviations for the Empirical Distribution in the Branching Random Walk,
Electronic Journal of Probability, 2015, (with Oren Louidor)
We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.
On Sharp Thresholds in Random Geometric Graphs,
RANDOM 2014, (with Milan Bradonjic)
We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.
Computer-enabled Metrics of Statistical Significance for Discrete Data, (with Mark Tygert and Rachel Ward)
This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.
The Bohman-Frieze Process Near Criticality,
Random Structures and Algorithms, 2013 (with Mihyun Kang and Joel Spencer).
We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.
Hardness of Finding Independent Sets In Almost 3-Colorable Graphs,
FOCS 2010 (with Irit Dinur, Subhash Khot, and Muli Safra)
With techniques adapted from (
Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.
The Forgetfulness of Balls and Bins,
Random Structures and Algorithms, 2012.
I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.