# Statistical analysis of Mapper for stochastic and multivariate filters

Journal of Applied and Computational Topology, Volume 6 (3) – Sep 1, 2022
39 pages

Publisher: Springer Journals

Copyright: © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2022

ISSN: 2367-1726

eISSN: 2367-1734

DOI: 10.1007/s41468-022-00090-w

### Abstract

Reeb spaces, as well as their discretized versions called Mappers, are common descriptors used in topological data analysis, with many applications in various fields of science, such as computational biology and data visualization. The stability of the Mapper and the rate of its convergence to the Reeb space have been studied extensively in recent works (Brown et al. in CoRR, arXiv:1909.03488, 2019; Carrière and Oudot in Found Comput Math 18(6):1333–1396, 2017; Carrière et al. in J Mach Learn Res 19(12):1–39, 2018; Munch and Wang in: 32nd international symposium on computational geometry (SoCG 2016), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 51, 53:1–53:16, 2016), focusing on the case where a scalar-valued filter is used for the computation of Mapper. On the other hand, much less is known in the multivariate case, when the codomain of the filter is $\mathbb{R}^p$, and in the general case, when it is a general metric space $(\mathcal{Z}, d_{\mathcal{Z}})$, instead of $\mathbb{R}$. The few results that are available in this setting (Dey et al. in: 33rd international symposium on computational geometry (SoCG 2017), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 77, 36:1–36:16, 2017; Munch and Wang, 2016) can only handle continuous topological spaces and cannot be used as is for finite metric spaces representing data, such as point clouds and distance matrices. In this article, we introduce a slight modification of the usual Mapper construction and we give risk bounds for estimating the Reeb space using this estimator. Our approach applies in particular to the setting where the filter function used to compute Mapper is itself estimated from data, such as the eigenfunctions of PCA. Our results are given with respect to the Gromov-Hausdorff distance, computed with specific filter-based pseudometrics for Mappers and Reeb spaces defined in Dey et al. (2017). Finally, we provide examples of this setting in statistics and machine learning for different kinds of target filters, as well as numerical experiments that demonstrate the relevance of our approach.
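To make the object under study concrete, the following is a minimal sketch of the usual Mapper graph for a multivariate filter on a point cloud. It is not the modified construction introduced in the article, which the abstract does not detail. All names and parameters here (`mapper`, `interval_ids`, `components`, the interval length `r`, the overlap fraction `g`, the clustering threshold `delta`) are assumptions chosen for illustration, and the identity map on the plane stands in for a filter that would in practice be estimated from data, for instance by PCA.

```python
from itertools import combinations, product
from math import ceil, cos, dist, floor, pi, sin


def interval_ids(v, r, g):
    """Indices k of the intervals [k*s, k*s + r], with s = r*(1 - g), that contain v."""
    s = r * (1.0 - g)
    return range(ceil((v - r) / s), floor(v / s) + 1)


def components(ids, points, delta):
    """Connected components of the delta-neighborhood graph on points[ids]."""
    unseen, comps = set(ids), []
    while unseen:
        stack = [unseen.pop()]
        comp = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unseen if dist(points[i], points[j]) <= delta}
            unseen -= near
            comp |= near
            stack.extend(near)
        comps.append(frozenset(comp))
    return comps


def mapper(points, filt, r, g, delta):
    """Return (nodes, edges): nodes are sets of point indices, edges index into nodes."""
    # Pull back a cover of R^p by overlapping hypercubes through the filter.
    cells = {}
    for i, x in enumerate(points):
        for cell in product(*(interval_ids(v, r, g) for v in filt(x))):
            cells.setdefault(cell, []).append(i)
    # Cluster each preimage; every cluster becomes one node of the graph.
    nodes = [c for cell in sorted(cells) for c in components(cells[cell], points, delta)]
    # Two nodes are joined by an edge when they share at least one data point.
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2) if nodes[a] & nodes[b]]
    return nodes, edges


# Toy data: 60 points on the unit circle, with a bivariate filter (the identity on R^2).
circle = [(cos(2 * pi * i / 60), sin(2 * pi * i / 60)) for i in range(60)]
nodes, edges = mapper(circle, filt=lambda x: x, r=1.0, g=0.3, delta=0.2)
print(len(nodes), "nodes,", len(edges), "edges")
```

Replacing `filt` with a map into $\mathbb{R}^p$ computed from the data (for example, projections onto estimated principal directions) gives the stochastic-filter setting that the abstract refers to. The cover and clustering choices above are one common convention among several.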

### Journal

Journal of Applied and Computational Topology, Springer Journals

Published: Sep 1, 2022

Keywords: Topological data analysis; Mapper; Confidence regions; 55N31; 62R40
