Local Density Estimation in High Dimensions

Xian Wu
Xian Wu
[email protected]
https://orcid.org/0000-0003-4650-5066
Simons Institute and University of California Berkeley, Berkeley, California 94705;
Search for more papers by this author
,
Moses Charikar
Moses Charikar
[email protected]
Department of Computer Science, Stanford University, Stanford, California 94305;
Search for more papers by this author
,
Vishnu Natchu
Vishnu Natchu
[email protected]
Laserlike, Inc., Mountain View, California 94035
Search for more papers by this author

Simons Institute and University of California Berkeley, Berkeley, California 94705;

Search for more papers by this author

Moses Charikar

[email protected]

Department of Computer Science, Stanford University, Stanford, California 94305;

Search for more papers by this author

Vishnu Natchu

[email protected]

Laserlike, Inc., Mountain View, California 94035

Search for more papers by this author

Published Online:19 Jan 2022https://doi.org/10.1287/moor.2021.1221

References

[1] Ahle TD, Aumüller M, Pagh R (2017) Parameter-free locality sensitive hashing for spherical range reporting. Proc. 28th Annual ACM-SIAM Sympos. Discrete Algorithms (Society for Industrial and Applied Mathematics), 239–256.Google Scholar
[2] Andoni A, Razenshteyn I (2015) Optimal data-dependent hashing for approximate near neighbors. Proc. 47th Annual ACM Sympos. Theory Comput. (ACM, New York), 793–801.Google Scholar
[3] Andoni A, Razenshteyn I (2015) Tight lower bounds for data-dependent locality-sensitive hashing. Preprint, submitted July 15, https://arxiv.org/abs/1507.04299.Google Scholar
[4] Andoni A, Indyk P, Laarhoven T, Razenshteyn I, Schmidt L (2015) Practical and optimal LSH for angular distance. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 28 (Neurips 2015, Montreal, Canada), 1225–1233.Google Scholar
[5] Arya S, da Fonseca GD, Mount DM (2010) A unified approach to approximate proximity searching. Algorithms – ESA 2010, 18th Annual European Symposium, Liverpool, UK, September 6–8, 2010. Proceedings, Part I.Google Scholar
[6] Aumüller M, Christiani T, Pagh R, Silvestri F (2017) Distance-sensitive hashing Preprint, submitted March 22, https://arxiv.org/abs/1703.07867.Google Scholar
[7] Cao S, Lu W, Xu Q (2015) GraRep: Learning graph representations with global structural information. Proc. 24th ACM Internat. Conf. Inform. Knowledge Management (ACM), 891–900.Google Scholar
[8] Charikar M (2002) Similarity estimation techniques from rounding algorithms. Proc. 34th Annual ACM Sympos. Theory Comput. (ACM, New York), 380–388.Google Scholar
[9] Charikar M, Siminelakis P (2017) Hashing-based-estimators for kernel density in high dimensions. IEEE Sympos. Foundations Comput. Sci.Google Scholar
[10] Coleman B, Shrivastava A, Baraniuk RG (2019) Race: Sub-linear memory sketches for approximate near-neighbor search on streaming data. Preprint, submitted February 18, https://arxiv.org/abs/1902.06687.Google Scholar
[11] Datar M, Immorlica N, Indyk P, Mirrokni VS (2004) Locality-sensitive hashing scheme based on p-stable distributions. Proc. 20th Annual Sympos. Comput. Geometry (ACM), 253–262.Google Scholar
[12] Goemans MX, Williamson DP (1995) Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 42(6):1115–1145.Crossref, Google Scholar
[13] Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM), 855–864.Google Scholar
[14] Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Proc. 31st Internat. Conf. Neural Inform. Processing Systems, 1025–1035.Google Scholar
[15] Huang EH, Socher R, Manning CD, Ng AY (2012) Improving word representations via global context and multiple word prototypes. Proc. 50th Annual Meeting Assoc. Comput. Linguistics: Long Papers, vol. 1 (Association for Computational Linguistics, Stroudsburg, PA), 873–882.Google Scholar
[16] Indyk P, Motwani R (1998) Approximate nearest neighbors: Toward removing the curse of dimensionality. Proc. 30th Annual ACM Sympos. Theory Comput. (ACM, New York), 604–613.Google Scholar
[17] Jegou H, Douze M, Schmid C (2011) Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Machine Intelligence 33(1):117–128.Crossref, Google Scholar
[18] Kennedy C, Ward R (2016) Fast cross-polytope locality-sensitive hashing. Preprint, submitted February 22, https://arxiv.org/abs/1602.06922.Google Scholar
[19] Lai S, Liu K, He S, Zhao J (2016) How to generate a good word embedding. IEEE Intelligent Systems 31(6):5–14.Crossref, Google Scholar
[20] Livni R, Shalev-Shwartz S, Shamir O (2014) On the computational efficiency of training neural networks. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Montreal, Quebec, Canada), 855–863.Google Scholar
[21] Luo C, Shrivastava A (2018) Arrays of (locality-sensitive) count estimators (ACE): Anomaly detection on the edge. Proc. 2018 World Wide Web Conf. (International World Wide Web Conferences Steering Committee), 1439–1448.Google Scholar
[22] Lv Q, Josephson W, Wang Z, Charikar M, Li K (2007) Multi-probe LSH: Efficient indexing for high-dimensional similarity search. Proc. 33rd Internat. Conf. Very Large Data Bases (VLDB Endowment), 950–961.Google Scholar
[23] Melamud O, McClosky D, Patwardhan S, Bansal M (2016) The role of context types and dimensionality in learning word embeddings Preprint, submitted January 5, https://arxiv.org/abs/1601.00893.Google Scholar
[24] Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space Preprint, submitted January 16, https://arxiv.org/abs/1301.3781.Google Scholar
[25] Motwani R, Naor A, Panigrahi R (2006) Lower bounds on locality sensitive hashing. Proc. 22nd Annual Sympos. Comput. Geometry (ACM), 154–157.Google Scholar
[26] O’Donnell R, Wu Y, Zhou Y (2014) Optimal lower bounds for locality-sensitive hashing (except when q is tiny). ACM Trans. Comput. Theory 6(1):1–13.Crossref, Google Scholar
[27] Panigrahy R (2006) Entropy based nearest neighbor search in high dimensions. Proc. 17th Annual ACM-SIAM Sympos. Discrete Algorithm (Society for Industrial and Applied Mathematics), 1186–1195.Google Scholar
[28] Pennington J, Socher R, Manning CD (2014) GLOVE: Global vectors for word representation. Empirical Methods in Natural Language Processing (Association for Computational Linguistics, Doha, Qatar), 1532–1543.Google Scholar
[29] Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: Online learning of social representations. Proc. 20th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM), 701–710.Google Scholar
[30] Spring R, Shrivastava A (2017) A new unbiased and efficient class of LSH-based samplers and estimators for partition function computation in log-linear models Preprint, submitted March 15, https://arxiv.org/abs/1703.05160.Google Scholar
[31] Sugawara K, Kobayashi H, Iwasaki M (2016) On approximately searching for similar word embeddings. Proc. 54th Annual Meeting Assoc. Comput. Linguistics.Google Scholar
[32] Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) Line: Large-scale information network embedding. Proc. 24th Internat. Conf. World Wide Web (International World Wide Web Conferences Steering Committee), 1067–1077.Google Scholar
[33] Terasawa K, Tanaka Y (2007) Spherical LSH for approximate nearest neighbor search on unit hypersphere. Workshop Algorithms Data Structures (Springer), 27–38.Google Scholar
[34] Wang X, Cui P, Wang J, Pei J, Zhu W, Yang S (2017) Community preserving network embedding. Proc. 31st AAAI Conf. Artificial Intelligence, 203–209.Google Scholar
[35] Yang Z, Cohen W, Salakhudinov R (2016) Revisiting semi-supervised learning with graph embeddings. Internat. Conf. Machine Learn., 40–48.Google Scholar
[36] Zhang C, Bengio S, Hardt M, Recht B, Vinyals O (2016) Understanding deep learning requires rethinking generalization Preprint, submitted November 10, https://arxiv.org/abs/1611.03530.Google Scholar

cover image Mathematics of Operations Research

Volume 47, Issue 4

November 2022

Pages 2547-3399, C2

Article Information

Metrics

Information

Received:October 15, 2018
Accepted:September 12, 2021
Published Online:January 19, 2022

Cite as

Xian Wu, Moses Charikar, Vishnu Natchu (2022) Local Density Estimation in High Dimensions. Mathematics of Operations Research 47(4):2614-2640.

https://doi.org/10.1287/moor.2021.1221

Keywords

Acknowledgments

The authors thank Tatsunori Hashimoto and Virag Shah for helpful comments and feedback. This work was initiated when the authors were visiting Laserlike, Inc.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Local Density Estimation in High Dimensions

References

Volume 47, Issue 4

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News