# A General Framework for Symmetric Property Estimation

@inproceedings{Charikar2019AGF, title={A General Framework for Symmetric Property Estimation}, author={Moses Charikar and Kirankumar Shiragur and Aaron Sidford}, booktitle={NeurIPS}, year={2019} }

In this paper we provide a general framework for estimating symmetric properties of distributions from i.i.d. samples. For a broad class of symmetric properties we identify the {\em easy} region where empirical estimation works and the {\em difficult} region where more complex estimators are required. We show that by approximately computing the profile maximum likelihood (PML) distribution \cite{ADOS16} in this difficult region we obtain a symmetric property estimation framework that is sample… Expand

#### Topics from this paper

#### 6 Citations

Instance Based Approximations to Profile Maximum Likelihood

- Computer Science, Mathematics
- NeurIPS
- 2020

A new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation, and the first provable computationally efficient implementation of PseudoPML, a general framework for estimating a broad class of symmetric properties. Expand

The Optimality of Profile Maximum Likelihood in Estimating Sorted Discrete Distributions

- Computer Science, Mathematics
- ArXiv
- 2020

This paper strengthens the above result and shows that using a careful chaining argument, the error probability can be reduced to $\delta^{1-c}\exp(c'n^{1/3+c})$ for arbitrarily small constants $c>0$ and some constant $c'>0$. Expand

Compressed Maximum Likelihood

- Computer Science
- ICML
- 2021

This work shows that CML is sample-efficient for several fundamental learning tasks over both discrete and continuous domains, including learning structural densities, estimating probability multisets, and inferring symmetric distribution functionals. Expand

On the Competitive Analysis and High Accuracy Optimality of Profile Maximum Likelihood

- Computer Science, Mathematics
- SODA
- 2021

This paper strengthens the above result and shows that using a careful chaining argument, the error probability can be reduced to $\delta^{1-c}\cdot \exp(c'n^{1/3+c})$ for arbitrarily small constants $c>0$ and some constant $c'>0$. Expand

On the High Accuracy Limitation of Adaptive Property Estimation

- Mathematics, Computer Science
- AISTATS
- 2021

It is shown that under a mild assumption that the distribution estimator is close to the true sorted distribution in expectation, any adaptive approach cannot achieve the optimal sample complexity for every $1$-Lipschitz property within accuracy $\varepsilon \ll n^{-1/3}$. Expand

Minimax Estimation of Divergences Between Discrete Distributions

- Computer Science, Mathematics
- IEEE Journal on Selected Areas in Information Theory
- 2020

The first minimax rate-optimal estimator which does not require any Poissonization, sample splitting, or explicit construction of approximating polynomials is constructed. Expand

#### References

SHOWING 1-10 OF 29 REFERENCES

Efficient profile maximum likelihood for universal symmetric property estimation

- Mathematics, Computer Science
- STOC
- 2019

An algorithm is provided that, given n samples from a distribution, computes an approximate PML distribution up to a multiplicative error of exp(n2/3 poly log(n)) in nearly linear time and yields a universal plug-in estimator that is competitive with a broad range of estimators up to accuracy. Expand

A Unified Maximum Likelihood Approach for Optimal Distribution Property Estimation

- Computer Science, Mathematics
- Electron. Colloquium Comput. Complex.
- 2016

It is proved that for all these properties, a single, simple, plug-in estimator---profile maximum likelihood (PML)---performs as well as the best specialized techniques, raising the possibility that PML may optimally estimate many other symmetric properties. Expand

Data Amplification: A Unified and Competitive Approach to Property Estimation

- Mathematics, Computer Science
- NeurIPS
- 2018

This work designs the first unified, linear-time, competitive, property estimator that for a wide class of properties and for all underlying distributions uses just 2n samples to achieve the performance attained by the empirical estimator with n\sqrt{\log n} samples. Expand

Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance

- Mathematics, Computer Science
- COLT
- 2018

An efficiently computable estimator is constructed that achieves the minimax rates in estimating the distribution up to permutation, and it is shown that the plug-in approach of the authors' unlabeled distribution estimators is "universal" in estimating symmetric functionals of discrete distributions. Expand

Minimax Estimation of Functionals of Discrete Distributions

- Mathematics, Computer Science
- IEEE Transactions on Information Theory
- 2015

The minimax rate-optimal mutual information estimator yielded by the framework leads to significant performance boosts over the Chow-Liu algorithm in learning graphical models and the practical advantages of the schemes for the estimation of entropy and mutual information. Expand

Approximate Profile Maximum Likelihood

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2019

An efficient algorithm for approximate computation of the profile maximum likelihood (PML), a variant of maximum likelihood maximizing the probability of observing a sufficient statistic rather than the empirical sample, is proposed and the empirical performance is competitive and sometimes significantly better than state-of-the-art performance for various estimation problems. Expand

The Power of Linear Estimators

- Mathematics, Computer Science
- 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science
- 2011

The main result is that for any property in this broad class of practically relevant distribution properties, there exists a near-optimal linear estimator, and a practical and polynomial-time algorithm for constructing such estimators for any given parameters. Expand

The Broad Optimality of Profile Maximum Likelihood

- Computer Science, Mathematics
- NeurIPS
- 2019

The profile maximum likelihood estimator is established as the first unified sample-optimal approach to a wide range of learning tasks and achieves the optimal sample complexity up to logarithmic factors of k. Expand

Data Amplification: Instance-Optimal Property Estimation

- Mathematics, Computer Science
- ICML
- 2020

Novel linear-time-computable estimators are presented that significantly "amplify" the effective amount of data available and outperform the previous state-of-the-art estimators designed for each specific property. Expand

Algorithms for modeling distributions over large alphabets

- Mathematics, Computer Science
- International Symposium onInformation Theory, 2004. ISIT 2004. Proceedings.
- 2004

Simulations show that the computed distribution models the data well and yields general estimators that evaluate various data attributes as well as specific estimators designed especially for these tasks. Expand