- email:
*math@willperkins.org* - CV

Upcoming visits, conferences, and talks:

- March 26-30: Workshop on Graph Limits, Bohemian Switzerland
- April 2-6: Limits to inference in networks and noisy data, Santa Fe Institute
- April 26-27: Scottish Combinatorics Meeting, Edinburgh
- May 14-17: Workshop on Algorithms and Randomness, Georgia Tech
- August 20-31: Statistical physics and machine learning back together, Corsica
- September 30-October 5: Spin glasses and related topics, Banff
- Spring 2019: Geometry of Polynomials, Simons Institute

Teaching:

- Fall '17 2S: Statistics
- Fall '16 2S: Statistics
- Spring '16 3COM: Combinatorics
- Fall '13 Math 4221: Stochastic Processes I
- Spring '13 Math 6221: Advanced Classical Probability
- Fall '12 Math 4221: Stochastic Processes I
- Spring '12 Math 3215: Probability & Statistics
- Spring '12 CS 8803: Discrete Fourier Analysis

Bethe states of random factor graphs, *preprint, 2017*, (with Amin Coja-Oghlan).

We verify a key component of the replica symmetry breaking hypothesis put forward in the physics literature on random factor graph models. For a broad class of these models we verify that the Gibbs measure can be decomposed into a moderate number of Bethe states, subsets of the state space in which both short and long range correlations of the measure take a simple form. Moreover, we show that the marginals of these Bethe states can be obtained from fixed points of the Belief Propagation operator. We derive these results from a new result on the approximation of general probability measures on discrete cubes by convex combinations of product measures.

We verify a key component of the replica symmetry breaking hypothesis put forward in the physics literature on random factor graph models. For a broad class of these models we verify that the Gibbs measure can be decomposed into a moderate number of Bethe states, subsets of the state space in which both short and long range correlations of the measure take a simple form. Moreover, we show that the marginals of these Bethe states can be obtained from fixed points of the Belief Propagation operator. We derive these results from a new result on the approximation of general probability measures on discrete cubes by convex combinations of product measures.

On the hard sphere model and sphere packings in high dimensions, *preprint, 2017*, (with Matthew Jenssen and Felix Joos).

We give an alternative, statistical physics based proof of the Ω(d 2^{-d}) lower bound for the maximum sphere packing density in dimension d by showing that a random configuration from the hard sphere model has this density in expectation. While the leading constant we achieve is not as large as that in some other proofs, we do obtain additional geometric information: we prove a lower bound on the entropy density of sphere packings at this density, a measure of how plentiful such packings are.

We give an alternative, statistical physics based proof of the Ω(d 2^{-d}) lower bound for the maximum sphere packing density in dimension d by showing that a random configuration from the hard sphere model has this density in expectation. While the leading constant we achieve is not as large as that in some other proofs, we do obtain additional geometric information: we prove a lower bound on the entropy density of sphere packings at this density, a measure of how plentiful such packings are.

Tight bounds on the coefficients of partition functions via stability, *preprint, 2017*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We prove stability versions of several extremal theorems for partition functions (or graph polynomials) over classes of bounded-degree graphs, and then show these stability results can be used to obtain tight extremal bounds on the individual coefficients of the corresponding partition functions. As applications, we prove new bounds on the number of independent sets and matchings of a given size in regular graphs. For large enough graphs and almost all sizes, the bounds are tight and confirm the Upper Matching Conjecture of Friedland, Krop, and Markstrom and a conjecture of Kahn on independent sets for a wide range of parameters.

We prove stability versions of several extremal theorems for partition functions (or graph polynomials) over classes of bounded-degree graphs, and then show these stability results can be used to obtain tight extremal bounds on the individual coefficients of the corresponding partition functions. As applications, we prove new bounds on the number of independent sets and matchings of a given size in regular graphs. For large enough graphs and almost all sizes, the bounds are tight and confirm the Upper Matching Conjecture of Friedland, Krop, and Markstrom and a conjecture of Kahn on independent sets for a wide range of parameters.

Information-theoretic thresholds from the cavity method, extended abstract at *STOC 2017*, (with Amin Coja-Oghlan, Florent Krzakala, and Lenka Zdeborová).

Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs. This general result implies the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al. (2011)] and allows us to pinpoint the exact condensation phase transition in random constraint satisfaction problems such as random graph coloring, thereby proving a conjecture from [Krzakala et al. (2007)]. As a further application we establish the formula for the mutual information in Low-Density Generator Matrix codes as conjectured in [Montanari (2005)]. The proofs provide a conceptual underpinning of the replica symmetric variant of the cavity method.

Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs. This general result implies the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al. (2011)] and allows us to pinpoint the exact condensation phase transition in random constraint satisfaction problems such as random graph coloring, thereby proving a conjecture from [Krzakala et al. (2007)]. As a further application we establish the formula for the mutual information in Low-Density Generator Matrix codes as conjectured in [Montanari (2005)]. The proofs provide a conceptual underpinning of the replica symmetric variant of the cavity method.

Extremes of the internal energy of the Potts model on cubic graphs, *to appear, Random Structures and Algorithms*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We prove tight upper and lower bounds on the internal energy per particle (expected number of monochromatic edges per vertex) in the anti-ferromagnetic Potts model on cubic graphs at every temperature and for all q>=2.

This immediately implies corresponding tight bounds on the anti-ferromagnetic Potts partition function.

Taking the zero-temperature limit gives new results in extremal combinatorics: the number of q-colorings of any 3-regular graph, for any q>=2, is maximized by a union of K_{3,3}'s. This proves the d=3 case of a conjecture of Galvin and Tetali. See this blog post and survey article by Yufei Zhao for more on this conjecture and other extremal problems for regular graphs.

We prove tight upper and lower bounds on the internal energy per particle (expected number of monochromatic edges per vertex) in the anti-ferromagnetic Potts model on cubic graphs at every temperature and for all q>=2.

This immediately implies corresponding tight bounds on the anti-ferromagnetic Potts partition function.

Taking the zero-temperature limit gives new results in extremal combinatorics: the number of q-colorings of any 3-regular graph, for any q>=2, is maximized by a union of K_{3,3}'s. This proves the d=3 case of a conjecture of Galvin and Tetali. See this blog post and survey article by Yufei Zhao for more on this conjecture and other extremal problems for regular graphs.

Counting independent sets in cubic graphs of given girth, *preprint, 2016*, (with Guillem Perarnau).

We prove a tight upper bound on the independence polynomial (and total number of independent sets) of cubic graphs of girth at least 5. The bound is achieved by unions of the Heawood graph, the point/line incidence graph of the Fano plane.

We also give a tight lower bound on the total number of independent sets of triangle-free cubic graphs. This bound is achieved by unions of the Petersen graph. Some calculations from the proof are available here.

We conjecture that in fact all Moore graphs are extremal for the scaled number of independent sets in regular graphs of a given minimum girth, maximizing this quantity if their girth is even and minimizing if odd.

We prove a tight upper bound on the independence polynomial (and total number of independent sets) of cubic graphs of girth at least 5. The bound is achieved by unions of the Heawood graph, the point/line incidence graph of the Fano plane.

We also give a tight lower bound on the total number of independent sets of triangle-free cubic graphs. This bound is achieved by unions of the Petersen graph. Some calculations from the proof are available here.

We conjecture that in fact all Moore graphs are extremal for the scaled number of independent sets in regular graphs of a given minimum girth, maximizing this quantity if their girth is even and minimizing if odd.

On the average size of independent sets in triangle-free graphs, *to appear, Proceedings of the AMS*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We prove that the*average* size of an independent set in any triangle-free graph of maximum degree d is at least (1+o_d(1)) log d/d·n, giving an alternative proof of Shearer's upper bound on the Ramsey number R(3,k).
We then prove new lower bounds on the total number of independent sets in triangle-free graphs, including a result for triangle-free graphs of maximum degree d that is asymptotically tight in the exponent.

We conjecture that in any triangle-free graph there is at least a factor 4/3 gap between the maximum and average independent set size. Combined with the above theorem, a proof of this conjecture would lead to an improvement on the current upper bound on R(3,k).

We prove that the

We conjecture that in any triangle-free graph there is at least a factor 4/3 gap between the maximum and average independent set size. Combined with the above theorem, a proof of this conjecture would lead to an improvement on the current upper bound on R(3,k).

The Widom-Rowlinson model, the hard-core model and the extremality of the complete graph, *European Journal of Combinatorics, 2017*, (with Emma Cohen, Péter Csikvári, and Prasad Tetali).

We give a short alternative proof of the result that the Widom-Rowlinson free energy is maximized over d-regular graphs by the complete graph on d+1 vertices. We also resolve one of the conjectures from the previous paper. The main tool is a bijection between the Widom-Rowlinson model and the hard-core model on a modified graph that appears in a paper of Brightwell, Haggstrom, and Winkler.

We give a short alternative proof of the result that the Widom-Rowlinson free energy is maximized over d-regular graphs by the complete graph on d+1 vertices. We also resolve one of the conjectures from the previous paper. The main tool is a bijection between the Widom-Rowlinson model and the hard-core model on a modified graph that appears in a paper of Brightwell, Haggstrom, and Winkler.

Belief propagation on replica symmetric random factor graph models, (to appear, Annales de L'Institut Henri Poincare D; extended abstract at *RANDOM 2016*), (with Amin Coja-Oghlan).

Physicists predict that the free energy of random factor graph models satisfying a certain "static replica symmetry" condition can be calculated via the Belief Propagation message passing scheme Krzakala, Montanari, Ricci-Tersenghi, Semerjian, Zdeborova PNAS 2007. We prove this conjecture for two general classes of random factor graph models: Poisson and regular random factor graphs. We show that the messages constructed as in the case of acyclic factor graphs asymptotically satisfy the Belief Propagation equations, and that the free energy density is given by the Bethe free energy formula.

Physicists predict that the free energy of random factor graph models satisfying a certain "static replica symmetry" condition can be calculated via the Belief Propagation message passing scheme Krzakala, Montanari, Ricci-Tersenghi, Semerjian, Zdeborova PNAS 2007. We prove this conjecture for two general classes of random factor graph models: Poisson and regular random factor graphs. We show that the messages constructed as in the case of acyclic factor graphs asymptotically satisfy the Belief Propagation equations, and that the free energy density is given by the Bethe free energy formula.

Limits of discrete distributions and Gibbs measures on random graphs, *to appear, European Journal of Combinatorics*, (with Amin Coja-Oghlan and Kathrin Skubch).

Inspired by the theory of graph limits and the work of Panchenko we develop a notion of convergence for sequences of discrete probability measures and prove a Regularity Lemma (in the sense of Szemeredi) in this setting. We then apply this to Gibbs measures on sparse random graphs and verify the physicists' replica symmetric solution under the hypothesis of non-reconstruction and convergence in probability of the sequence of Gibbs measures on the random graphs, in the topology defined above.

Inspired by the theory of graph limits and the work of Panchenko we develop a notion of convergence for sequences of discrete probability measures and prove a Regularity Lemma (in the sense of Szemeredi) in this setting. We then apply this to Gibbs measures on sparse random graphs and verify the physicists' replica symmetric solution under the hypothesis of non-reconstruction and convergence in probability of the sequence of Gibbs measures on the random graphs, in the topology defined above.

On the Widom-Rowlinson Occupancy Fraction in Regular Graphs, *Combinatorics, Probability and Computing, 2017*, (with Emma Cohen and Prasad Tetali).

The Widom-Rowlinson model on a graph is a random configuration of two types of particles with a hard-core interaction between particles of different types. The*occupancy fraction* of the model is the fraction of vertices occupied by a particle of either type. We show that over all d-regular graphs, the occupancy fraction
is maximized by K_(d+1), the complete graph on d+1 vertices. Via integration, this shows that the normalized partition function of the model is maximized by K_(d+1),
and this in turn shows that K_(d+1) maximizes the normalized number of homomorphisms from a d-regular graph to the path on 3 vertices with a self-loop on each vertex,
proving a conjecture of David Galvin.

The Widom-Rowlinson model on a graph is a random configuration of two types of particles with a hard-core interaction between particles of different types. The

Independent Sets, Matchings, and Occupancy Fractions, *to appear, Journal of the London Mathematical Society*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We give tight upper bounds on the logarithmic derivative of the independence and matching polynomials of any d-regular graph.

For independent sets, this is a strengthening of a sequence of results of Kahn, Galvin and Tetali, and Zhao that a disjoint union of K_dd's maximizes the independence polynomial and total number of independent sets among all d-regular graphs.

For matchings, the result implies that disjoint unions of K_dd's maximize the matching polynomial and total number of matchings, as well as proving the asymptotic upper matching conjecture of Friedland, Krop, Lundow, and Markstrom.

The proof uses an*occupancy method*: we show that the occupancy fraction in the hard-core model and the edge occupancy fraction in the monomer-dimer model are maximized over all d-regular graphs by K_dd.

We give tight upper bounds on the logarithmic derivative of the independence and matching polynomials of any d-regular graph.

For independent sets, this is a strengthening of a sequence of results of Kahn, Galvin and Tetali, and Zhao that a disjoint union of K_dd's maximizes the independence polynomial and total number of independent sets among all d-regular graphs.

For matchings, the result implies that disjoint unions of K_dd's maximize the matching polynomial and total number of matchings, as well as proving the asymptotic upper matching conjecture of Friedland, Krop, Lundow, and Markstrom.

The proof uses an

Birthday Inequalities, Repulsion, and Hard Spheres, *Proceedings of the AMS, 2016*.

I study a "Birthday Inequality" and a "Repulsion Inequality" for random geometric graphs. This gives a new method for bounding the partition function in two models from statistical physics: the hard sphere model and the hard-core lattice gas model. Two applications in combinatorics are new upper bounds on the number of independent sets and matchings of a given size in d-regular graphs. Another application is a surprising fact about spheres in dimension 24: at certain densities, if we condition that a set of n randomly placed centers of spheres of radius r are at pairwise distance greater than r, then the expectation of the volume of their union decreases!

I study a "Birthday Inequality" and a "Repulsion Inequality" for random geometric graphs. This gives a new method for bounding the partition function in two models from statistical physics: the hard sphere model and the hard-core lattice gas model. Two applications in combinatorics are new upper bounds on the number of independent sets and matchings of a given size in d-regular graphs. Another application is a surprising fact about spheres in dimension 24: at certain densities, if we condition that a set of n randomly placed centers of spheres of radius r are at pairwise distance greater than r, then the expectation of the volume of their union decreases!

Spectral Thresholds in the Bipartite Stochastic Block Model, (extended abstract at *COLT 2016*), (with Laura Florescu).

We study a stochastic block model on a bipartite graph with unequal sizes and a planted partition on each side. The model arose in previous work on planted random k-SAT. We show that the usual spectral algorithm based on the SVD has a edge-density threshold for recovery much higher than the algorithm based on subsampling (below) and a new algorithm we propose, Diagonal Deletion SVD. We show the failure of the usual SVD is due to a localization phenomenon of the singular vectors. We also show that the bipartite block model exhibits a sharp threshold for detection of the planted partition, as in the results for the SBM of Mossel, Neeman, Sly and Massoulie.

We study a stochastic block model on a bipartite graph with unequal sizes and a planted partition on each side. The model arose in previous work on planted random k-SAT. We show that the usual spectral algorithm based on the SVD has a edge-density threshold for recovery much higher than the algorithm based on subsampling (below) and a new algorithm we propose, Diagonal Deletion SVD. We show the failure of the usual SVD is due to a localization phenomenon of the singular vectors. We also show that the bipartite block model exhibits a sharp threshold for detection of the planted partition, as in the results for the SBM of Mossel, Neeman, Sly and Massoulie.

Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSP's, *NIPS 2015*,
(with Vitaly Feldman and Santosh Vempala)

We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used.

We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used.

On the Complexity of Random Satisfiability Problems with Planted Solutions, *STOC 2015*,
(with Vitaly Feldman and Santosh Vempala)

We consider the problem of finding a planted assignment to a random k-SAT formula (or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses. We give a rigorous explanation of this gap by showing that any efficient*statistical* algorithm requires n^{r/2} clauses,
where *r* is a parameter that depends on the distribution and can be as high as k. We also describe a statistical algorithm that
matches the sample complexity up to log factors. This gives concrete evidence for the security
of Goldreich's proposed one-way function,
and proves Feige's 3-SAT hypothesis for
statistical algorithms.

We consider the problem of finding a planted assignment to a random k-SAT formula (or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses. We give a rigorous explanation of this gap by showing that any efficient

Large Deviations for the Empirical Distribution in the Branching Random Walk, *Electronic Journal of Probability, 2015*, (with Oren Louidor)

We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.

We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.

On Sharp Thresholds in Random Geometric Graphs, *RANDOM 2014*, (with Milan Bradonjic)

We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.

We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.

Computer-enabled Metrics of Statistical Significance for Discrete Data, (with Mark Tygert and Rachel Ward)

This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.

This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.

Random k-SAT and the Power of Two Choices, *Random Structures and Algorithms, 2014*.

I give an Achlioptas rule to shift the satisfiability threshold for random k-SAT. Previously this was known for k=2 (Sinclair/Vilenchik), and the parameters have since been improved (Dani et al). I also propose a semi-random decision problem for random k-SAT.

I give an Achlioptas rule to shift the satisfiability threshold for random k-SAT. Previously this was known for k=2 (Sinclair/Vilenchik), and the parameters have since been improved (Dani et al). I also propose a semi-random decision problem for random k-SAT.

Some deficiencies of χ^{2} and classical exact tests of significance, *Applied and Computational Harmonic Analysis*, 2014. (with Mark Tygert and Rachel Ward)

We indicate the serious problems faced by the chi-square test in dealing with sparse data and propose a remedy. See Andrew Gellman's blog post for a discussion.

We indicate the serious problems faced by the chi-square test in dealing with sparse data and propose a remedy. See Andrew Gellman's blog post for a discussion.

The Bohman-Frieze Process Near Criticality, * Random Structures and Algorithms*, 2013 (with Mihyun Kang and Joel Spencer).

We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.

We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.

Hardness of Finding Independent Sets In Almost 3-Colorable Graphs, * FOCS * 2010 (with Irit Dinur, Subhash Khot, and Muli Safra)

With techniques adapted from (Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.

With techniques adapted from (Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.

The Forgetfulness of Balls and Bins, * Random Structures and Algorithms*, 2012.

I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.

I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.

Computing the confidence levels for a root-mean-square test of goodness-of-fit, * Applied Mathematics and Computation*, 2011 (with Mark Tygert and Rachel Ward)

We give an efficient numerical algorithm for computing the asymptotic distribution of a root-mean-square goodness of fit test. The algorithm is a numerical evaluation of a contour integral.

We give an efficient numerical algorithm for computing the asymptotic distribution of a root-mean-square goodness of fit test. The algorithm is a numerical evaluation of a contour integral.