- email:
*math@willperkins.org* - CV

Upcoming visits, conferences, and talks:

- Feb 13: Stony Brook analysis seminar
- Mar 27: Minnesota probability seminar
- Apr 30: Wisconsin probability seminar
- June 1-4: SIAM Discrete Math conference, Portland, OR
- June 15-18: Geometry of Polynomials, Simons Institute, Berkeley
- Sep 21-25: Computational Phase Transitions, Simons Institute
- Nov 15-20: BIRS workshop on Learning in Networks, Oaxaca, Mexico

Events and seminars:

Past events:

Teaching:

- Spring '20 MCS 521: Combinatorial Optimization
- Fall '19 MATH 215: Introduction to Advanced Mathematics
- Fall '18 MATH 215: Introduction to Advanced Mathematics
- Fall '18 MCS 425: Codes and Cryptography
- Fall '17 2S: Statistics
- Fall '16 2S: Statistics
- Spring '16 3COM: Combinatorics
- Fall '13 Math 4221: Stochastic Processes I
- Spring '13 Math 6221: Advanced Classical Probability
- Fall '12 Math 4221: Stochastic Processes I
- Spring '12 Math 3215: Probability & Statistics
- Spring '12 CS 8803: Discrete Fourier Analysis

Correlation decay for hard spheres via Markov chains, *preprint, 2020*, (with Tyler Helmuth and Samantha Petti).

We prove that a certain Markov chain for sampling from the hard sphere model in dimension d is rapidly mixing when the fugacity is less than 2^{1-d}. We then prove that this implies that the model has the strong spatial mixing property and thus has a unique infinite volume Gibbs measure. Our results improve on all known lower bounds on the critical fugacity and critical density of the hard sphere model in dimensions 2 and above.

We prove that a certain Markov chain for sampling from the hard sphere model in dimension d is rapidly mixing when the fugacity is less than 2^{1-d}. We then prove that this implies that the model has the strong spatial mixing property and thus has a unique infinite volume Gibbs measure. Our results improve on all known lower bounds on the critical fugacity and critical density of the hard sphere model in dimensions 2 and above.

On the Number of Independent Sets in Uniform, Regular, Linear Hypergraphs, *preprint, 2020*, (with Emma Cohen, Michail Sarantis, and Prasad Tetali).

We bound the number of independent sets (both strong and weak) in uniform, regular, linear hypergraphs, and present a number of conjectures of optimal bounds.

We bound the number of independent sets (both strong and weak) in uniform, regular, linear hypergraphs, and present a number of conjectures of optimal bounds.

Efficient sampling and counting algorithms for the Potts model on Zd at all temperatures, *preprint, 2019*, (with Christian Borgs, Jennifer Chayes, Tyler Helmuth, and Prasad Tetali).

For d>=2 and q large enough as a function of d, we provide efficient sampling and approximate counting algorithms for the ferromagnetic Potts model on the discrete torus at all temperatures.

For d>=2 and q large enough as a function of d, we provide efficient sampling and approximate counting algorithms for the ferromagnetic Potts model on the discrete torus at all temperatures.

Independent sets in the hypercube revisited, *preprint, 2019*, (with Matthew Jenssen).

We revisit Sapozhenko's classic proof on the asymptotics of the number of independent sets in the discrete hypercube and Galvin's follow-up work on weighted independent sets. We combine Sapozhenko's graph container methods with the cluster expansion and abstract polymer models, two tools from statistical physics, to obtain considerably sharper asymptotics and detailed probabilistic information about the typical structure of (weighted) independent sets in the hypercube. These results refine those of Korshunov and Sapozhenko and Galvin, and answer several questions of Galvin.

We revisit Sapozhenko's classic proof on the asymptotics of the number of independent sets in the discrete hypercube and Galvin's follow-up work on weighted independent sets. We combine Sapozhenko's graph container methods with the cluster expansion and abstract polymer models, two tools from statistical physics, to obtain considerably sharper asymptotics and detailed probabilistic information about the typical structure of (weighted) independent sets in the hypercube. These results refine those of Korshunov and Sapozhenko and Galvin, and answer several questions of Galvin.

Counting independent sets in unbalanced bipartite graphs, *SODA 2020, to appear*, (with Sarah Cannon).

We use polymer models and the cluster expansion to give an FPTAS for approximating the partition function of the hard-core model on bipartite graphs when there is sufficient imbalance in the degrees or fugacities between the sides of the bipartition. The method also allows us to deduce exponential decay of correlations for the hard-core model on such graphs.

We use polymer models and the cluster expansion to give an FPTAS for approximating the partition function of the hard-core model on bipartite graphs when there is sufficient imbalance in the degrees or fugacities between the sides of the bipartition. The method also allows us to deduce exponential decay of correlations for the hard-core model on such graphs.

Fast algorithms at low temperatures via Markov chains, *RANDOM 2019*, (with Zongchen Chen, Andreas Galanis, Leslie Goldberg, James Stewart, and Eric Vigoda).

We define a Markov chain for abstract polymer models and show that under sufficient decay of the polymer weights, this chain mixes rapidly. We apply this chain to obtain an O(n log n) time sampling algorithm for the ferromagnetic Potts model on expander graphs at sufficiently low temperatures and an O(n^2 log^3 n) time sampling algorithm for the hard-core model on bipartite expander graphs at sufficiently high fugacities.

We define a Markov chain for abstract polymer models and show that under sufficient decay of the polymer weights, this chain mixes rapidly. We apply this chain to obtain an O(n log n) time sampling algorithm for the ferromagnetic Potts model on expander graphs at sufficiently low temperatures and an O(n^2 log^3 n) time sampling algorithm for the hard-core model on bipartite expander graphs at sufficiently high fugacities.

Storage capacity in symmetric binary perceptrons, *Journal of Physics A, 2019*, (with Benjamin Aubin and Lenka Zdeborova).

We study two natural symmetric variants of the classical binary perceptron, and using the first and second moment methods prove (up to a numerical hypothesis) that the capacity is given by the annealed computation (the first moment upper bound). The fact that the second moment method gives a tight lower bound is due in part to the behavior of the second moment entropy as a function of overlap near 1. This behavior leads us to conjecture the presence of frozen 1RSB structure in the solution space.

We study two natural symmetric variants of the classical binary perceptron, and using the first and second moment methods prove (up to a numerical hypothesis) that the capacity is given by the annealed computation (the first moment upper bound). The fact that the second moment method gives a tight lower bound is due in part to the behavior of the second moment entropy as a function of overlap near 1. This behavior leads us to conjecture the presence of frozen 1RSB structure in the solution space.

Spin systems on Bethe lattices, *Communications in Mathematical Physics, 2019*, (with Amin Coja-Oghlan).

In an influential paper Mezard and Parisi put forward an analytic but non-rigorous approach called the cavity method for studying spin systems on the Bethe lattice, i.e., the random d-regular graph. Their technique was based on certain hypotheses; most importantly, that the phase space decomposes into a number of Bethe states that are free from long-range correlations and whose marginals are given by Belief Propagation fixed points. In this paper we establish this decomposition rigorously for a very general family of spin systems. In addition, we show that the free energy can be computed from this decomposition.

In an influential paper Mezard and Parisi put forward an analytic but non-rigorous approach called the cavity method for studying spin systems on the Bethe lattice, i.e., the random d-regular graph. Their technique was based on certain hypotheses; most importantly, that the phase space decomposes into a number of Bethe states that are free from long-range correlations and whose marginals are given by Belief Propagation fixed points. In this paper we establish this decomposition rigorously for a very general family of spin systems. In addition, we show that the free energy can be computed from this decomposition.

Algorithms for BIS-hard problems on expander graphs, *SODA 2019*, (with Matthew Jenssen and Peter Keevash).

We give an FPTAS and an efficient sampling algorithm for the high-fugacity hard-core model on bounded-degree bipartite expander graphs and the low-temperature ferromagnetic Potts model on bounded-degree expander graphs. The results apply, for example, to random (bipartite) d-regular graphs, for which no efficient algorithms were known for these problems (with the exception of the Ising model) in the non-uniqueness regime of the infinite d-regular tree.

We give an FPTAS and an efficient sampling algorithm for the high-fugacity hard-core model on bounded-degree bipartite expander graphs and the low-temperature ferromagnetic Potts model on bounded-degree expander graphs. The results apply, for example, to random (bipartite) d-regular graphs, for which no efficient algorithms were known for these problems (with the exception of the Ising model) in the non-uniqueness regime of the infinite d-regular tree.

Algorithmic Pirogov-Sinai theory, *Probability Theory and Related Field, to appear*, extended abstract at *STOC 2019*, (with Tyler Helmuth and Guus Regts).

We develop an efficient algorithmic approach for approximate counting and sampling in the low-temperature regime of a broad class of statistical physics models. Our approach is based on combining contour representations from Pirogov-Sinai theory with Barvinok's approach to approximate counting using truncated Taylor series. Some consequences of our main results include an FPTAS for approximating the partition function of the hard-core model at sufficiently high fugacity on subsets of the lattice Z^d with appropriate boundary conditions and an efficient sampling algorithm for the ferromagnetic Potts model on the discrete torus at sufficiently low temperature.

We develop an efficient algorithmic approach for approximate counting and sampling in the low-temperature regime of a broad class of statistical physics models. Our approach is based on combining contour representations from Pirogov-Sinai theory with Barvinok's approach to approximate counting using truncated Taylor series. Some consequences of our main results include an FPTAS for approximating the partition function of the hard-core model at sufficiently high fugacity on subsets of the lattice Z^d with appropriate boundary conditions and an efficient sampling algorithm for the ferromagnetic Potts model on the discrete torus at sufficiently low temperature.

On kissing numbers and spherical codes in high dimensions, *Advances in Mathematics, 2018*, (with Matthew Jenssen and Felix Joos).

The kissing number in dimension d is the maximum number of non-overlapping d-dimensional unit spheres that can touch a single unit sphere. We improve the simple covering lower bound of Chabauty, Shannon, and Wyner by a linear factor in the dimension. We obtain a similar linear factor improvement to the best known lower bound on the maximal size of a spherical code of acute angle in high dimensions.

The kissing number in dimension d is the maximum number of non-overlapping d-dimensional unit spheres that can touch a single unit sphere. We improve the simple covering lower bound of Chabauty, Shannon, and Wyner by a linear factor in the dimension. We obtain a similar linear factor improvement to the best known lower bound on the maximal size of a spherical code of acute angle in high dimensions.

Bethe states of random factor graphs, *Communications in Mathematical Physics, 2019*, (with Amin Coja-Oghlan).

We verify a key component of the replica symmetry breaking hypothesis put forward in the physics literature on random factor graph models. For a broad class of these models we verify that the Gibbs measure can be decomposed into a moderate number of Bethe states, subsets of the state space in which both short and long range correlations of the measure take a simple form. Moreover, we show that the marginals of these Bethe states can be obtained from fixed points of the Belief Propagation operator. We derive these results from a new result on the approximation of general probability measures on discrete cubes by convex combinations of product measures.

We verify a key component of the replica symmetry breaking hypothesis put forward in the physics literature on random factor graph models. For a broad class of these models we verify that the Gibbs measure can be decomposed into a moderate number of Bethe states, subsets of the state space in which both short and long range correlations of the measure take a simple form. Moreover, we show that the marginals of these Bethe states can be obtained from fixed points of the Belief Propagation operator. We derive these results from a new result on the approximation of general probability measures on discrete cubes by convex combinations of product measures.

On the hard sphere model and sphere packings in high dimensions, *Forum of Mathematics, Sigma, 2019*, (with Matthew Jenssen and Felix Joos).

We give an alternative, statistical physics based proof of the Ω(d 2^{-d}) lower bound for the maximum sphere packing density in dimension d by showing that a random configuration from the hard sphere model has this density in expectation. While the leading constant we achieve is not as large as that in some other proofs, we do obtain additional geometric information: we prove a lower bound on the entropy density of sphere packings at this density, a measure of how plentiful such packings are.

We give an alternative, statistical physics based proof of the Ω(d 2^{-d}) lower bound for the maximum sphere packing density in dimension d by showing that a random configuration from the hard sphere model has this density in expectation. While the leading constant we achieve is not as large as that in some other proofs, we do obtain additional geometric information: we prove a lower bound on the entropy density of sphere packings at this density, a measure of how plentiful such packings are.

Information-theoretic thresholds from the cavity method, *Advances in Mathematics, 2018*; extended abstract at *STOC 2017*, (with Amin Coja-Oghlan, Florent Krzakala, and Lenka Zdeborová).

Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs. This general result implies the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al. (2011)] and allows us to pinpoint the exact condensation phase transition in random constraint satisfaction problems such as random graph coloring, thereby proving a conjecture from [Krzakala et al. (2007)]. As a further application we establish the formula for the mutual information in Low-Density Generator Matrix codes as conjectured in [Montanari (2005)]. The proofs provide a conceptual underpinning of the replica symmetric variant of the cavity method.

Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs. This general result implies the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al. (2011)] and allows us to pinpoint the exact condensation phase transition in random constraint satisfaction problems such as random graph coloring, thereby proving a conjecture from [Krzakala et al. (2007)]. As a further application we establish the formula for the mutual information in Low-Density Generator Matrix codes as conjectured in [Montanari (2005)]. The proofs provide a conceptual underpinning of the replica symmetric variant of the cavity method.

Tight bounds on the coefficients of partition functions via stability, *Journal of Combinatorial Theory, Series A, 2018*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We prove stability versions of several extremal theorems for partition functions (or graph polynomials) over classes of bounded-degree graphs, and then show these stability results can be used to obtain tight extremal bounds on the individual coefficients of the corresponding partition functions. As applications, we prove new bounds on the number of independent sets and matchings of a given size in regular graphs. For large enough graphs and almost all sizes, the bounds are tight and confirm the Upper Matching Conjecture of Friedland, Krop, and Markstrom and a conjecture of Kahn on independent sets for a wide range of parameters.

We prove stability versions of several extremal theorems for partition functions (or graph polynomials) over classes of bounded-degree graphs, and then show these stability results can be used to obtain tight extremal bounds on the individual coefficients of the corresponding partition functions. As applications, we prove new bounds on the number of independent sets and matchings of a given size in regular graphs. For large enough graphs and almost all sizes, the bounds are tight and confirm the Upper Matching Conjecture of Friedland, Krop, and Markstrom and a conjecture of Kahn on independent sets for a wide range of parameters.

Extremes of the internal energy of the Potts model on cubic graphs, *Random Structures and Algorithms, 2018*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We prove tight upper and lower bounds on the internal energy per particle (expected number of monochromatic edges per vertex) in the anti-ferromagnetic Potts model on cubic graphs at every temperature and for all q>=2.

This immediately implies corresponding tight bounds on the anti-ferromagnetic Potts partition function.

Taking the zero-temperature limit gives new results in extremal combinatorics: the number of q-colorings of any 3-regular graph, for any q>=2, is maximized by a union of K_{3,3}'s. This proves the d=3 case of a conjecture of Galvin and Tetali. See this blog post and survey article by Yufei Zhao for more on this conjecture and other extremal problems for regular graphs.

We prove tight upper and lower bounds on the internal energy per particle (expected number of monochromatic edges per vertex) in the anti-ferromagnetic Potts model on cubic graphs at every temperature and for all q>=2.

This immediately implies corresponding tight bounds on the anti-ferromagnetic Potts partition function.

Taking the zero-temperature limit gives new results in extremal combinatorics: the number of q-colorings of any 3-regular graph, for any q>=2, is maximized by a union of K_{3,3}'s. This proves the d=3 case of a conjecture of Galvin and Tetali. See this blog post and survey article by Yufei Zhao for more on this conjecture and other extremal problems for regular graphs.

Counting independent sets in cubic graphs of given girth, *Journal of Combinatorial Theory, Series B, 2018*, (with Guillem Perarnau).

We prove a tight upper bound on the independence polynomial (and total number of independent sets) of cubic graphs of girth at least 5. The bound is achieved by unions of the Heawood graph, the point/line incidence graph of the Fano plane.

We also give a tight lower bound on the total number of independent sets of triangle-free cubic graphs. This bound is achieved by unions of the Petersen graph. Some calculations from the proof are available here.

We conjecture that in fact all Moore graphs are extremal for the scaled number of independent sets in regular graphs of a given minimum girth, maximizing this quantity if their girth is even and minimizing if odd.

We prove a tight upper bound on the independence polynomial (and total number of independent sets) of cubic graphs of girth at least 5. The bound is achieved by unions of the Heawood graph, the point/line incidence graph of the Fano plane.

We also give a tight lower bound on the total number of independent sets of triangle-free cubic graphs. This bound is achieved by unions of the Petersen graph. Some calculations from the proof are available here.

We conjecture that in fact all Moore graphs are extremal for the scaled number of independent sets in regular graphs of a given minimum girth, maximizing this quantity if their girth is even and minimizing if odd.

On the average size of independent sets in triangle-free graphs, *Proceedings of the AMS, 2017*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We prove that the*average* size of an independent set in any triangle-free graph of maximum degree d is at least (1+o_d(1)) log d/d·n, giving an alternative proof of Shearer's upper bound on the Ramsey number R(3,k).
We then prove new lower bounds on the total number of independent sets in triangle-free graphs, including a result for triangle-free graphs of maximum degree d that is asymptotically tight in the exponent.

We conjecture that in any triangle-free graph there is at least a factor 4/3 gap between the maximum and average independent set size. Combined with the above theorem, a proof of this conjecture would lead to an improvement on the current upper bound on R(3,k).

We prove that the

We conjecture that in any triangle-free graph there is at least a factor 4/3 gap between the maximum and average independent set size. Combined with the above theorem, a proof of this conjecture would lead to an improvement on the current upper bound on R(3,k).

The Widom-Rowlinson model, the hard-core model and the extremality of the complete graph, *European Journal of Combinatorics, 2017*, (with Emma Cohen, Péter Csikvári, and Prasad Tetali).

We give a short alternative proof of the result that the Widom-Rowlinson free energy is maximized over d-regular graphs by the complete graph on d+1 vertices. We also resolve one of the conjectures from the previous paper. The main tool is a bijection between the Widom-Rowlinson model and the hard-core model on a modified graph that appears in a paper of Brightwell, Haggstrom, and Winkler.

We give a short alternative proof of the result that the Widom-Rowlinson free energy is maximized over d-regular graphs by the complete graph on d+1 vertices. We also resolve one of the conjectures from the previous paper. The main tool is a bijection between the Widom-Rowlinson model and the hard-core model on a modified graph that appears in a paper of Brightwell, Haggstrom, and Winkler.

Belief propagation on replica symmetric random factor graph models, *Annales de L'Institut Henri Poincare D, 2018*; extended abstract at *RANDOM 2016*, (with Amin Coja-Oghlan).

Physicists predict that the free energy of random factor graph models satisfying a certain "static replica symmetry" condition can be calculated via the Belief Propagation message passing scheme Krzakala, Montanari, Ricci-Tersenghi, Semerjian, Zdeborova PNAS 2007. We prove this conjecture for two general classes of random factor graph models: Poisson and regular random factor graphs. We show that the messages constructed as in the case of acyclic factor graphs asymptotically satisfy the Belief Propagation equations, and that the free energy density is given by the Bethe free energy formula.

Physicists predict that the free energy of random factor graph models satisfying a certain "static replica symmetry" condition can be calculated via the Belief Propagation message passing scheme Krzakala, Montanari, Ricci-Tersenghi, Semerjian, Zdeborova PNAS 2007. We prove this conjecture for two general classes of random factor graph models: Poisson and regular random factor graphs. We show that the messages constructed as in the case of acyclic factor graphs asymptotically satisfy the Belief Propagation equations, and that the free energy density is given by the Bethe free energy formula.

Limits of discrete distributions and Gibbs measures on random graphs, *European Journal of Combinatorics, 2017*, (with Amin Coja-Oghlan and Kathrin Skubch).

Inspired by the theory of graph limits and the work of Panchenko we develop a notion of convergence for sequences of discrete probability measures and prove a Regularity Lemma (in the sense of Szemeredi) in this setting. We then apply this to Gibbs measures on sparse random graphs and verify the physicists' replica symmetric solution under the hypothesis of non-reconstruction and convergence in probability of the sequence of Gibbs measures on the random graphs, in the topology defined above.

Inspired by the theory of graph limits and the work of Panchenko we develop a notion of convergence for sequences of discrete probability measures and prove a Regularity Lemma (in the sense of Szemeredi) in this setting. We then apply this to Gibbs measures on sparse random graphs and verify the physicists' replica symmetric solution under the hypothesis of non-reconstruction and convergence in probability of the sequence of Gibbs measures on the random graphs, in the topology defined above.

On the Widom-Rowlinson Occupancy Fraction in Regular Graphs, *Combinatorics, Probability and Computing, 2017*, (with Emma Cohen and Prasad Tetali).

The Widom-Rowlinson model on a graph is a random configuration of two types of particles with a hard-core interaction between particles of different types. The*occupancy fraction* of the model is the fraction of vertices occupied by a particle of either type. We show that over all d-regular graphs, the occupancy fraction
is maximized by K_(d+1), the complete graph on d+1 vertices. Via integration, this shows that the normalized partition function of the model is maximized by K_(d+1),
and this in turn shows that K_(d+1) maximizes the normalized number of homomorphisms from a d-regular graph to the path on 3 vertices with a self-loop on each vertex,
proving a conjecture of David Galvin.

The Widom-Rowlinson model on a graph is a random configuration of two types of particles with a hard-core interaction between particles of different types. The

Independent Sets, Matchings, and Occupancy Fractions, *Journal of the London Mathematical Society, 2017*, (with Ewan Davies, Matthew Jenssen, and Barnaby Roberts).

We give tight upper bounds on the logarithmic derivative of the independence and matching polynomials of any d-regular graph.

For independent sets, this is a strengthening of a sequence of results of Kahn, Galvin and Tetali, and Zhao that a disjoint union of K_dd's maximizes the independence polynomial and total number of independent sets among all d-regular graphs.

For matchings, the result implies that disjoint unions of K_dd's maximize the matching polynomial and total number of matchings, as well as proving the asymptotic upper matching conjecture of Friedland, Krop, Lundow, and Markstrom.

The proof uses an*occupancy method*: we show that the occupancy fraction in the hard-core model and the edge occupancy fraction in the monomer-dimer model are maximized over all d-regular graphs by K_dd.

We give tight upper bounds on the logarithmic derivative of the independence and matching polynomials of any d-regular graph.

For independent sets, this is a strengthening of a sequence of results of Kahn, Galvin and Tetali, and Zhao that a disjoint union of K_dd's maximizes the independence polynomial and total number of independent sets among all d-regular graphs.

For matchings, the result implies that disjoint unions of K_dd's maximize the matching polynomial and total number of matchings, as well as proving the asymptotic upper matching conjecture of Friedland, Krop, Lundow, and Markstrom.

The proof uses an

Birthday Inequalities, Repulsion, and Hard Spheres, *Proceedings of the AMS, 2016*.

I study a "Birthday Inequality" and a "Repulsion Inequality" for random geometric graphs. This gives a new method for bounding the partition function in two models from statistical physics: the hard sphere model and the hard-core lattice gas model. Two applications in combinatorics are new upper bounds on the number of independent sets and matchings of a given size in d-regular graphs. Another application is a surprising fact about spheres in dimension 24: at certain densities, if we condition that a set of n randomly placed centers of spheres of radius r are at pairwise distance greater than r, then the expectation of the volume of their union decreases!

I study a "Birthday Inequality" and a "Repulsion Inequality" for random geometric graphs. This gives a new method for bounding the partition function in two models from statistical physics: the hard sphere model and the hard-core lattice gas model. Two applications in combinatorics are new upper bounds on the number of independent sets and matchings of a given size in d-regular graphs. Another application is a surprising fact about spheres in dimension 24: at certain densities, if we condition that a set of n randomly placed centers of spheres of radius r are at pairwise distance greater than r, then the expectation of the volume of their union decreases!

Spectral Thresholds in the Bipartite Stochastic Block Model, *COLT 2016*, (with Laura Florescu).

We study a stochastic block model on a bipartite graph with unequal sizes and a planted partition on each side. The model arose in previous work on planted random k-SAT. We show that the usual spectral algorithm based on the SVD has a edge-density threshold for recovery much higher than the algorithm based on subsampling (below) and a new algorithm we propose, Diagonal Deletion SVD. We show the failure of the usual SVD is due to a localization phenomenon of the singular vectors. We also show that the bipartite block model exhibits a sharp threshold for detection of the planted partition, as in the results for the SBM of Mossel, Neeman, Sly and Massoulie.

We study a stochastic block model on a bipartite graph with unequal sizes and a planted partition on each side. The model arose in previous work on planted random k-SAT. We show that the usual spectral algorithm based on the SVD has a edge-density threshold for recovery much higher than the algorithm based on subsampling (below) and a new algorithm we propose, Diagonal Deletion SVD. We show the failure of the usual SVD is due to a localization phenomenon of the singular vectors. We also show that the bipartite block model exhibits a sharp threshold for detection of the planted partition, as in the results for the SBM of Mossel, Neeman, Sly and Massoulie.

Subsampled Power Iteration: a Unified Algorithm for Block Models and Planted CSP's, *NIPS 2015*,
(with Vitaly Feldman and Santosh Vempala)

We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used.

We present a new algorithm for recovering planted solutions in two models, the stochastic block model and planted constraint satisfaction problems, via a common generalization in terms of random bipartite graphs. The idea of the algorithm is to perform power iteration with a sequence of adjacency matrices from disjoint graphs subsampled from the original random graph. The algorithm is simple, easy to implement, and its running time is linear in the number of edges/constraints used.

On the Complexity of Random Satisfiability Problems with Planted Solutions, *SICOMP, 2018*, extended abstract at *STOC 2015*,
(with Vitaly Feldman and Santosh Vempala)

We consider the problem of finding a planted assignment to a random k-SAT formula (or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses. We give a rigorous explanation of this gap by showing that any efficient*statistical* algorithm requires n^{r/2} clauses,
where *r* is a parameter that depends on the distribution and can be as high as k. We also describe a statistical algorithm that
matches the sample complexity up to log factors. This gives concrete evidence for the security
of Goldreich's proposed one-way function,
and proves Feige's 3-SAT hypothesis for
statistical algorithms.

We consider the problem of finding a planted assignment to a random k-SAT formula (or a planted partition in a random hypergraph). This problem exhibits an algorithmic gap: while the planted solution is essentially unique with a large linear number of clauses, there are distributions for which efficient algorithms are only known for n^{k/2} clauses. We give a rigorous explanation of this gap by showing that any efficient

Large Deviations for the Empirical Distribution in the Branching Random Walk, *Electronic Journal of Probability, 2015*, (with Oren Louidor)

We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.

We consider large deviations in the empirical distribution of a super-critical branching random walk. In particular we answer questions such as: What is the asymptotic probability that at least 70% of particles are between -Sqrt(n) and Sqrt(n) after n steps, instead of the expected 68%? We find there are two distinct regimes if the set considered is a finite union of intervals, with rates interpolating between the two for sets that are countable unions of intervals.

On Sharp Thresholds in Random Geometric Graphs, *RANDOM 2014*, (with Milan Bradonjic)

We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.

We give a characterization of vertex-monotone properties with sharp thresholds in a Poisson random geometric graph or hypergraph. As an application we show that a geometric model of random k-SAT exhibits a sharp threshold for satisfiability.

Computer-enabled Metrics of Statistical Significance for Discrete Data, (with Mark Tygert and Rachel Ward)

This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.

This monograph addresses basic problems in statistical significance from an algorithmic perspective. We show that a simple root-mean-square test and the discrete Kolmogorov-Smirnov test are often superior to classical chi-square statistics, and give the relevant theory along with numerous examples.

Random k-SAT and the Power of Two Choices, *Random Structures and Algorithms, 2014*.

I give an Achlioptas rule to shift the satisfiability threshold for random k-SAT. Previously this was known for k=2 (Sinclair/Vilenchik), and the parameters have since been improved (Dani et al). I also propose a semi-random decision problem for random k-SAT.

I give an Achlioptas rule to shift the satisfiability threshold for random k-SAT. Previously this was known for k=2 (Sinclair/Vilenchik), and the parameters have since been improved (Dani et al). I also propose a semi-random decision problem for random k-SAT.

Some deficiencies of χ^{2} and classical exact tests of significance, *Applied and Computational Harmonic Analysis*, 2014. (with Mark Tygert and Rachel Ward)

We indicate the serious problems faced by the chi-square test in dealing with sparse data and propose a remedy. See Andrew Gellman's blog post for a discussion.

We indicate the serious problems faced by the chi-square test in dealing with sparse data and propose a remedy. See Andrew Gellman's blog post for a discussion.

The Bohman-Frieze Process Near Criticality, * Random Structures and Algorithms*, 2013 (with Mihyun Kang and Joel Spencer).

We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.

We study the fine behavior of the phase transition of the Bohman-Frieze random graph process, and show that it shares several critical exponents with the Erdos-Renyi random graph. Our technique involves deriving a PDE associated to the process and using singularity analysis to extra information about small components from a multivariate generating function.

Hardness of Finding Independent Sets In Almost 3-Colorable Graphs, * FOCS * 2010 (with Irit Dinur, Subhash Khot, and Muli Safra)

With techniques adapted from (Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.

With techniques adapted from (Dinur/Safra) and Fourier analysis on the discrete q-ary hypercube, we prove hardness of approximation for graph coloring. In particular, we show that it is NP-hard to distinguish a graph that is almost k-colorable from one whose largest independent set is a 1/k^2 fraction of the graph.

The Forgetfulness of Balls and Bins, * Random Structures and Algorithms*, 2012.

I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.

I answer the question 'How many additional random balls are necessary to disguise an arbitrary initial placement of m balls in n bins?' with a precise answer that depends on the empirical variance of the initial placement.

Computing the confidence levels for a root-mean-square test of goodness-of-fit, * Applied Mathematics and Computation*, 2011 (with Mark Tygert and Rachel Ward)

We give an efficient numerical algorithm for computing the asymptotic distribution of a root-mean-square goodness of fit test. The algorithm is a numerical evaluation of a contour integral.

We give an efficient numerical algorithm for computing the asymptotic distribution of a root-mean-square goodness of fit test. The algorithm is a numerical evaluation of a contour integral.