Access the full text.
Sign up today, get DeepDyve free for 14 days.
Jeff Erickson, Kim Whittlesey (2005)
Greedy optimal homotopy and homology generators
Peter Bubenik (2012)
Statistical topological data analysis using persistence landscapesJ. Mach. Learn. Res., 16
M. Saadatfar, Hiroshi Takeuchi, V. Robins, N. Francois, Y. Hiraoka (2017)
Pore configuration landscape of granular crystallizationNature Communications, 8
Philip Smith, V. Kurlin (2019)
Skeletonisation algorithms with theoretical guarantees for unorganised point clouds with high levels of noisePattern Recognit., 115
A. Zomorodian (2010)
Computational topology
A. Suzuki, M. Miyazawa, J. Minto, T. Tsuji, I. Obayashi, Y. Hiraoka, Takatoshi Ito (2021)
Flow estimation solely from image data through persistent homology analysisScientific Reports, 11
T. Dey, T. Hou, S. Mandal (2018)
Persistent 1-Cycles: Definition, Computation, and Its ApplicationArXiv, abs/1810.04807
Y Onodera, S Kohara, S Tahara, A Masuno, H Inoue, M Shiga, A Hirata, K Tsuchiya, Y Hiraoka, I Obayashi, K Ohara, A Mizuno, O Sakata (2019)
Understanding diffraction patterns of glassy, liquid and amorphous materials via persistent homology analysesJ. Ceram. Soc. Jpn., 127
Matija Čufar (2020)
Ripserer.jl: flexible and efficient persistent homology computation in JuliaJ. Open Source Softw., 5
Y. Onodera, Y. Onodera, S. Kohara, S. Tahara, S. Tahara, Atsunobu Masuno, Atsunobu Masuno, H. Inoue, M. Shiga, Motoki Shiga, A. Hirata, K. Tsuchiya, Y. Hiraoka, I. Obayashi, K. Ohara, A. Mizuno, O. Sakata (2019)
Understanding diffraction patterns of glassy, liquid and amorphous materials via persistent homology analysesJournal of the Ceramic Society of Japan
G. Carlsson, T. Ishkhanov, V. Silva, A. Zomorodian (2007)
On the Local Behavior of Spaces of Natural ImagesInternational Journal of Computer Vision, 76
Joseph Chan, G. Carlsson, R. Rabadán (2013)
Topology of viral evolutionProceedings of the National Academy of Sciences, 110
M. Lesnick (2011)
The Theory of the Interleaving Distance on Multidimensional Persistence ModulesFoundations of Computational Mathematics, 15
Y Hiraoka, T Nakamura, A Hirata, EG Escolar, K Matsue, Y Nishiura (2016)
Hierarchical structures of amorphous solids characterized by persistent homologyProc. Natl. Acad. Sci., 113
G. Kusano, K. Fukumizu, Y. Hiraoka (2017)
Kernel Method for Persistence Diagrams via Kernel Embedding and Weight FactorJ. Mach. Learn. Res., 18
F Iuricich (2022)
Persistence cycles for visual exploration of persistent homologyIEEE Trans. Vis. Comput. Graph., 28
Henry Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, Lori Ziegelmeier (2015)
Persistence Images: A Stable Vector Representation of Persistent HomologyJ. Mach. Learn. Res., 18
H Adams (2017)
1J. Mach. Learn. Res., 18
D. Cohen-Steiner, H. Edelsbrunner, J. Harer (2005)
Stability of Persistence DiagramsDiscrete & Computational Geometry, 37
Paul Bendich, Peter Bubenik, Alexander Wagner (2015)
Stabilizing the unstable output of persistent homology computationsJournal of Applied and Computational Topology, 4
Günter Rote, Gert Vegter (2007)
Effective Computational Geometry for Curves and Surfaces Chapter 7 Computational Topology : An Introduction
F. Chazal, D. Cohen-Steiner, M. Glisse, L. Guibas, S. Oudot (2009)
Proximity of persistence modules and their diagrams
Chao Chen, D. Freedman (2010)
Hardness Results for Homology LocalizationDiscrete & Computational Geometry, 45
H. Edelsbrunner, D. Letscher, A. Zomorodian (2000)
Topological Persistence and SimplificationDiscrete & Computational Geometry, 28
T. Dey, A. Hirani, B. Krishnamoorthy (2010)
Optimal homologous cycles, total unimodularity, and linear programming
Sébastien Roux, V. Petkov
Electronic Reprint Applied Crystallography Isaacs – Interactive Structure Analysis of Amorphous and Crystalline Systems Applied Crystallography Isaacs – Interactive Structure Analysis of Amorphous and Crystalline Systems
(2016)
Escolar and Yasuaki Hiraoka
Ippei Obayashi and HomCloud developer team
I. Obayashi (2017)
Volume Optimal Cycle: Tightest representative cycle of a generator on persistent homologySIAM J. Appl. Algebra Geom., 2
G. Carlsson (2009)
Topology and dataBulletin of the American Mathematical Society, 46
H. Edelsbrunner, Ernst Mücke (1992)
Three-dimensional alpha shapes
A. Hirata, Tomohide Wada, I. Obayashi, Y. Hiraoka (2020)
Structural changes during glass formation extracted by computational homology with machine learningCommunications Materials, 1
If there is an edge ω (cid:48) τ −→ ω is in ¯ G ∗ , we have ω (cid:48) ≺ r ω
The optimal volume for ( τ, ω ) ∈ D n − 1 ( X r ) is dec( ω, ¯ G ∗ ) , where dec( ω, ¯ G ∗ ) the set of all descendant nodes of ω in ¯ G ∗ including ω itself
SL Rouxa, V Petkova (2010)
Isaacs - interactive structure analysis of amorphous and crystalline systemsJ. Appl. Crystallogr., 43
CJA Delfinado, H Edelsbrunner (1995)
An incremental algorithm for betti numbers of simplicial complexes on the 3-sphereComput. Aided Geom. Des., 12
JM Chan, G Carlsson, R Rabadan (2013)
Topology of viral evolutionProc. Natl. Acad. Sci., 110
A. Zomorodian, G. Carlsson (2004)
Computing Persistent HomologyDiscrete & Computational Geometry, 33
Benjamin Schweinhart (2015)
Statistical Topology of Embedded Graphs
A. Tahbaz-Salehi, A. Jadbabaie (2008)
Distributed coverage verification in sensor networks without location information2008 47th IEEE Conference on Decision and Control
Emerson Escolar, Y. Hiraoka (2016)
Optimal Cycles for Persistent Homology Via Linear Programming
Ulrich Bauer, M. Lesnick (2013)
Induced Matchings of Barcodes and the Algebraic Stability of PersistenceProceedings of the thirtieth annual symposium on Computational geometry
H. Hughes, Sarah Bridges-Rhoads (2013)
Beyond the Scope of this PaperInternational Review of Qualitative Research, 6
V. Kurlin (2015)
A one‐dimensional homologically persistent skeleton of an unstructured point cloud in any metric spaceComputer Graphics Forum, 34
Y. Hiraoka, Takenobu Nakamura, A. Hirata, Emerson Escolar, K. Matsue, Y. Nishiura (2015)
Hierarchical structures of amorphous solids characterized by persistent homologyProceedings of the National Academy of Sciences, 113
This paper proposes a stable volume and a stable volume variant, referred to as a stable sub-volume, for more reliable data analysis using persistent homology. In prior research, an optimal cycle and similar ideas have been proposed to identify the homo- logical structure corresponding to each birth-death pair in a persistence diagram. While this is helpful for data analysis using persistent homology, the results are sensitive to noise. The sensitivity affects the reliability and interpretability of the analysis. In this paper, stable volumes and stable sub-volumes are proposed to solve this problem. For a special case, we prove that a stable volume is the robust part of an optimal volume against noise. We implemented stable volumes and sub-volumes on HomCloud, a data analysis software package based on persistent homology, and show examples of stable volumes and sub-volumes. Keywords Persistent homology · Topological data analysis · Computational algebraic topology · Optimization on homology algebra Mathematics Subject Classification 55N31 · 62R40 · 55-04 · 55-08 1 Introduction Topological data analysis (TDA) (Edelsbrunner and Harer 2010; Carlsson 2009)isa data analysis method utilizing the mathematical concept of topology. In recent years, persistent homology (PH) (Edelsbrunner et al. 2002; Zomorodian and Carlsson 2005) has become one of the most important tools of TDA. PH is mathematically formalized by using homology on a filtration. We can characterize the geometric information by encoding the information regarding the scale of the data on the filtration. PH has developed rapidly over the most recent decade and has been applied in a variety of areas, B Ippei Obayashi i.obayashi@okayama-u.ac.jp Center for Artificial Intelligence and Mathematical Data Science, Okayama University, 3-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Okayama, Japan 123 I.Obayashi Fig. 1 How to compute a PD for 5 points a Input pointcloud b Pointcloud and circles with radii r < a/2. √ √ c r = a/2, d r = a/ 3, e r = a/ 2, f PD including natural image analysis (Carlsson et al. 2008), biology (Chan et al. 2013), geology (Suzuki et al. 2021), and materials science (Hiraoka et al. 2016; Saadatfar et al. 2017; Onodera et al. 2019; Hirata et al. 2020). PH information can be described by a persistence diagram (PD) or a persistence barcode. A PD is a scatter plot on the XY plane, where each point (called a birth-death pair) on the plot corresponds to a homological structure in the input data. 1.1 Persistent homology and volume-optimal cycles Figure 1 can be used to intuitively explain how a PD is determined from data. The input data in the example constitute a pointcloud with five points, as shown in Fig. 1a. Let a be the length of the edges of the regular triangle and the square. Since the five points themselves do not carry any interesting topological information, we construct a topological structure by putting circles with values of radii r as indicated in Fig. 1b–e. We now face the problem of how to determine the proper size of the circles. If the circles are too small, as in Fig. 1b, the topology is simply equivalent to the five points. On the other hand, if the circles are too large, as in Fig. 1e, the shape becomes acyclic, which makes it impossible to uncover any interesting topological information. PH solves this problem by considering the appearance and disappearance of homology generators associated with radii changes; that is, PH considers the increasing sequence from Fig. 1b–e. In the Fig. 1 example, two holes (homology generators in the 1st homology) appear at Fig. 1c, one of which disappears at d. The other hole disappears at Fig. 1e. The pair of radii of the appearance and disappearance of each homology generator is called a birth- death pair, and the multiset of all birth-death pairs is called a persistence diagram.In √ √ this example, the PD for the 1st homology is {(a/2, a/ 3), (a/2, a/ 2)}. The PD is visualized by a scatter plot or a 2D histogram (Fig. 1f). The two pairs correspond to a regular triangle and a square. We can extract the geometric information of the pointcloud in Fig. 1ausing thePD. It is beneficial in data analysis using PH to detect the original homological structures corresponding to each birth-death pair. This is sometimes called “inverse analysis on PH”. Some applications of PH (Hiraoka et al. 2016;Hirataetal. 2020) already use inverse analysis on PH. However, detection is not an easy task, as the representative cycle corresponding to a homology generator is not unique. A persistence diagram and a persistence barcode contain the same information. The difference is how the information is visualized. 123 Stable volumes for persistent homology Fig. 2 Two types of optimization results To solve this problem, various methods involving the solution of an optimiza- tion problem in homology algebra have been proposed (Chen and Freedman 2011; Escolar and Hiraoka 2016; Dey et al. 2019; Erickson and Whittlesey 2005;Dey et al. 2011; Obayashi 2018; Tahbaz-Salehi and Jadbabaie 2008; Schweinhart 2015) in various settings. For PH, optimal cycles (Escolar and Hiraoka 2016), persistence trees (Schweinhart 2015), volume-optimal cycles (Obayashi 2018), and persistent 1- cycles (Dey et al. 2019), have been proposed. The “tightest” or “minimal” cycle is considered the best to interpret, and solving the optimization problem in homology algebra produces the tightest cycle. Persistence cycles (Iuricich 2022) is also proposed for a similar purpose using discrete Morse theory. 1.2 Problems of homology optimization The existing methods have problems with noise. The example in Fig. 1 can be used to demonstrate these problems. The pointcloud in Fig. 1 has two minimal building blocks, a triangle and a square, and two birth-death pairs correspond to these shapes. However, the optimization technique sometimes fails to find these minimal building blocks when a small noise is added. Figure 2a shows the pointcloud in Fig. 1a when a small noise is added. This figure also shows circles, and, as can be seen, a pentagon appears before either a triangle or a square appears. By applying the homology optimization technique to the data in Fig. 2a, we produce a triangle and a pentagon, as shown in Fig. 2b. On the other hand, consider the pointcloud in Fig. 2c. The pointcloud here is also very close to that in Fig. 1a; however, while a square appears first, the homology optimization technique gives a square and a triangle, as shown in Fig. 2d. The example demonstrates the following problems: (1) A small noise can change the result, and (2) The optimization technique sometimes fails to give minimal building blocks of the data. The first problem is related to the reliability of the analysis; the second problem is related to the interpretability of the result. This means that the stability theorem does not hold for the solutions of the homology optimization problems. Previous studies (Cohen-Steiner et al. 2007; Chazal et al. 2009; Bauer and Lesnick 2014; Lesnick 2015) have proved the stability theorem for PDs. The literature indicates that a PD is continuously changed by a small noise in the input data if we consider reasonable metrics. The stability of other PH outputs has also been studied in the context of machine learning and PH (Bubenik 2015; Kusano et al. 2018; Adams et al. 2017) or pointcloud summary method (Kurlin 2015; Smith and Kurlin 2021). Although these types of stability play an important role in the study of PH, such a stability theorem does not hold for optimized results. 123 I.Obayashi Fig. 3 Optimal volumes for 3 × 3 × 3 cubical lattice. a 1st PD, b Optimal volumes of two pairs, c Optimal volumes of other two pairs Fig. 4 Schematic result of applying the statistical approach to the birth-death pair (1/2, 1/ 2) These problems are practically significant, especially when we analyze crystalline structures using PH. We can demonstrate the problems using the synthetic crystalline datainFig. 3. Here, the pointcloud consists of 27 points, arranged in a 3 × 3 × 3 cubical lattice. The distance between two vertices on the cube is 1. A small noise is added to the pointcloud. Figure 3a shows the 1st PD of the pointcloud. As can be seen here, some birth-death pairs are concentrated around (1/4, 1/ 2) ≈ (0.5, 0.7). These pairs correspond to 1x1 squares in the lattice. By applying volume-optimal cycles (Obayashi 2018), we found some loops corresponding to the pairs in Fig. 3b and c. Some of the cycles shown in Fig. 3b are squares, as expected; however, others in Fig. 3c are larger structures resembling chairs. These larger structures were detected with the same mechanism in Fig. 2. The purpose of this paper is to present a new method to solve these problems. That is, we propose a method to find minimal building blocks that is robust to noise. 1.3 Statistical approach in previous research Bendich et al. (2020) proposed a statistical approach for the problem, which can be outlined as follows: 1. Computing a PD from the input data and choosing a birth-death pair to analyze 2. Adding a small noise to the input data, computing a PD, and applying inverse analysis 3. Repeating 2. multiple times 4. Computing an average of the results of the inverse analysis By applying this method to the birth-death pair (1/2, 1/ 2) in Fig. 1, we produce the result shown in Fig. 4. The result indicates that the four points on the square are robust to noise and that the leftmost point is less robust. This result is consistent with the fact that the pair (1/2, 1/ 2) corresponds to a square. 123 Stable volumes for persistent homology The stability of the method in a probabilistic sense was also proved in the paper. Notably, the method provides a more reliable inverse analysis. The idea is quite clever, simple, and easy to implement. However, the method has a high computation cost since it requires the user to compute PDs and optimal cycles a large number of times. In the paper, the authors repeat the computation of the generators 1,000 or 10,000 times. We suspect that fewer repetitions may be sufficient, but ten or a hundred trials will be necessary. Multiple trials and errors are typically needed to tune the noise bandwidth in order to apply the method, and thus the cost is not ignorable. 1.4 Reconstructed shortest cycles in Ripserer.jl The “reconstructed shortest cycles” functionality of Ripserer.jl by Cufar (2020)gives another solution of the problem. The functionality reconstructs the tightest 1-cycle using the shortest path algorithm and the representative of persistent cohomology.The functionality accepts a noise bandwidth parameter and computes a tighter loop. The advantage of reconstructed shortest cycles is their efficiency. Since it uses the shortest path algorithm, the computational complexity is small. However, recon- structed shortest cycles can be applied only to 1st PH since it uses the shortest path algorithm. Another disadvantage is the lack of mathematical justification for 4 5 the functionality. Now the functionality is declared as experimental and gives no mathematical documentation. We explain the algorithm in Appendix A. 1.5 Results In this paper, we propose stable volumes and a variant of stable volumes, called stable sub-volumes. The proposed method is based on the volume-optimal cycles and optimal volumes (Obayashi 2018). Stable volumes produce minimal building blocks with a lower computation cost than the statistical approach. The following list shows the outline of the results: • A stable volume for the (n − 1)th PH embedded in R is defined in a similar way to an optimal volume using persistence trees (Definition 3) • We prove that the stable volume is considered to be the “robust part” against noise (Theorem 3.1) • The stable volume is reformalized using an optimization problem on homology algebra (Definition 6, Theorem 4.1) • A stable volume without dimension conditions is also defined using the optimiza- tion problem (Sect. 4.2) • A stable sub-volume is also defined in a similar way (Definition 7) https://github.com/mtsch/Ripserer.jl. The demonstration of the reconstructed shortest cycles is available at https://mtsch.github.io/Ripserer.jl/ dev/generated/cocycles/#Reconstructed-Shortest-Cycles. Jan. 28, 2022. https://mtsch.github.io/Ripserer.jl/dev/api/#Ripserer.reconstruct_cycle. 123 I.Obayashi Fig. 5 Optimal and stable volumes computed for 3 × 3 × 3 cubical lattice. a Optimal volumes of two pairs (the same as Fig. 3c), b Stable volumes of the same two pairs As the stable volume method has already been implemented in HomCloud (Obayashi et al. 2021), stable volumes can now be used to analyze the data. We can demonstrate stable volumes by using the same input data as in Fig. 3. Figure 5 shows the optimal volumes (Fig. 5a) and stable volumes (Fig. 5b) of the same two birth-death pairs. The birth-death pairs correspond to 1x1 squares in the lattice points, but Fig. 5a shows larger loops than expected. In contrast, Fig. 5bshows the expected 1x1 squares. The author confirmed that all stable volumes are 1x1 squares. This example suggests that stable volumes give better results and help promote a better intuitive understanding of PDs. 1.6 Organization of the paper The remainder of the paper is organized as follows: Sect. 2 introduces the concept of PH, optimal volumes, and volume-optimal cycles. Section 3 defines stable volumes for the (n − 1)th PH embedded in R . In this section, we also discuss the limitation of stable volumes. Section 4 shows an alternative formalization of stable volumes using mathematical optimization. Section 4.1 provides a proof that the two formalizations are consistent for the (n − 1)th PH when the simplicial complex is embedded in R . Section 4.2 extends the formalization to other cases, and Sect. 4.3 defines a stable sub- volume, which is another variant of stable volume. Section 5 discusses the software implementation of stable volumes. Section 6 gives examples using synthetic and real data. This section also provides a comparison with previous methods, the statistical approach, and the reconstructed shortest cycles. Section 7 discusses the way in which the noise bandwidth parameter can be tuned. Section 8 compares stable volumes and sub-volumes. Section 9 summarizes the paper and offers concluding remarks. In this section, we also discuss the differences between the proposed method and the methods of previous works. https://homcloud.dev/index.en.html. 123 Stable volumes for persistent homology 2 Persistent homology, optimal volumes, and volume-optimal cycles In this section, we introduce the mathematical concepts of PH, PDs, optimal volumes, and volume-optimal cycles, which provide the foundation for our discussion. 2.1 Persistent homology PH is defined on a filtration of topological spaces. Let X be a finite simplicial complex and X :∅= X X ··· X = X (1) 0 1 N be a filtration of subcomplexes of X.The kth persistent homology H (X; k) with a coefficient field k is defined as H (X; k) : H (X ; k) → H (X ; k) → ··· → H (X ; k), (2) k k 0 k 1 k N where H (X ; k) is the kth homology k-vector space of X and → is the linear map k i i induced by the inclusion maps. We define the direct sum of the sequences of linear maps to describe the fundamental theorem of PH. Definition 1 For sequences of linear maps with the same length N + 1, {V } = m=1 f f m,1 m,N {V −−→ ··· − −− → V } , the direct sum of these sequences V = m,0 m,N i m=1 m=1 f f V − → ··· −→ V is defined as follows: 0 N L L V = V (i = 0,..., N ), f = f (i = 1,..., N ). i m,i i m,i m=1 m=1 The fundamental theorem of PH is as follows. Theorem A There exists a unique decomposition of H (X; k), H (X; k) I (b , d ), (3) k m m m=1 where 0 < b < d ≤∞, k → k is an identity map, 0 → k, k → 0 are zero maps, m m and I (b, d) are as shown below: bth (d−1)th dth I (b, d) =0 → ··· → 0 → k → ··· → k → 0 → ··· → 0, if d =∞, (4) bth I (b, d) =0 → ··· → 0 → k → ··· → k, if d =∞. (5) 123 I.Obayashi On an intuitive level, the map 0 → k indicates the appearance of a homology generator, k → k indicates the persistence of the generator, and k → 0 indicates the disappearance of the generator. We note that we should use a field as a coefficient ring of the homology module since the theorem holds only when k is a field. 2.2 Order with level We earlier introduced the idea of a level function. We now introduce the concept of total order consistent with the level function. The total order is used to rigorously describe the algorithms; the level function is needed to rigorously describe the noise. Simplices with the same level often appear in an alpha filtration or a Vietoris-Rips filtration; thus, the concept of the total order is required to deal with such filtrations. The pair consisting of the total order and the level function is called an order with level. Let ≺ be a total order on the set of all simplices of X. We assume that σ σ implies σ ≺ σ . (6) We can number all simplices in X by this ordering as follows: σ ≺ σ ≺ ··· ≺ σ . (7) 1 2 N From the condition (6), X ={σ ,...,σ } is always a subcomplex of X and i 1 i ∅= X X ··· X = X (8) 0 1 N is a filtration of simplicial complexes. In this filtration, the number of simplices is increased one by one, which simplifies the mathematical description. To include additional information in the filtration, we consider an order with level. Definition 2 A pair r = (rˆ, ) of a real-valued map on X and a total order on X satisfying the following conditions is called an order with level. 1. σ σ implies rˆ(σ ) ≤ˆr (σ ) 2. The order satisfies (6) From the total order , a filtration X is defined by (7) and (8). Theorem A gives r r the unique decomposition H (X ; k) I (b , d ). For each (b , d ),the k r m m m m m=1 simplex σ is called a birth simplex, σ is called a death simplex, and the pair b d m m (σ ,σ ) is called a birth-death simplices pair. When d =∞, we write the pair as b d m m m (σ ,) using the special symbol . D (X ) denotes the set of all birth-death simplices b k r pairs. Moreover, rˆ(σ ) is called a birth time, rˆ(σ ) is called a death time, and the b d m m pair of the birth time and death time, (rˆ(σ ), rˆ(σ )), is called a birth-death pair. b d m m When σ = , rˆ() is defined as +∞. The multiset of birth-death pairs is called a persistence diagram: {(rˆ(σ ), rˆ(σ )) | m = 1,..., L;ˆr (σ ) =ˆr (σ )}. (9) b d b d m m m m 123 Stable volumes for persistent homology We write the persistence diagram as PD (X , r ). 2.3 Optimal volumes Optimal volume is proposed by Obayashi (2018) as a way to extract homological structures corresponding to a birth-death pair. For an order with levels r = (rˆ, ) and a birth-death simplices pair (τ ,ω ) ∈ D (X ) with ω = ,the optimal volume 0 0 k r 0 for the pair is formalized as the solution to the following minimization problem: minimize z , subject to z = ω + α ω ∈ C (X ; k), 0 ω k+1 ω∈F (10) k+1 τ (∂z) = 0 for every τ ∈ F , τ (∂z) = 0, (k) where C (X ; k) is a chain complex whose coefficient field is k, F ={σ ∈ X | k+1 k ∗ k τ ≺ σ ≺ ω }, τ is an element of the dual basis of cochain complex C (X ; k), and 0 r r 0 z ={ω | α = 0} is the norm of z. For solution z, ∂z is called a volume-optimal 0 ω cycle. Obayashi (2018) shows the following theorem, which indicates that the volume- optimal cycle is suitable for representing a birth-death pair. Theorem B Let (τ ,ω ) ∈ D (X ) be a birth-death simplices pair and z an optimal 0 0 k r volume of that pair. Then the following relations hold: ∂z ∈ / Z ({σ ∈ X | σ ≺ τ }), (11) k r 0 ∂z ∈ Z ({σ ∈ X | σ τ }), (12) k r 0 ∂z ∈ / B ({σ ∈ X | σ ≺ σ }), (13) k r 0 ∂z ∈ B ({σ ∈ X | σ σ }), (14) k r 0 where Z (·) represents the cycles and B (·) represents the boundaries. k k 2.4 Optimal volumes for the (n − 1)th PH and persistence trees In this section, we consider a triangulation of a convex set in R and its (n − 1)th PH with n ≥ 2. More precisely, we assume the following: Condition 1 A simplicial complex X in R satisfies two conditions. • Any k-simplex (k < n) is a face of an n-simplex •|X|:= σ is contractible σ ∈X Any alpha filtration for a pointcloud with more than (n + 1) points in general posi- tion (Edelsbrunner and Mücke 1994) satisfies the above conditions. 123 I.Obayashi Delfinado and Edelsbrunner (1995) presented an efficient algorithm to compute the second Betti number of a simplicial complex on the 3-sphere or R using a dual graph and union-find algorithm (Section 19, Cormen et al (2022)). Schweinhart (2015) pointed out that the (n − 1)th PH was isomorphic to the 0th persistent cohomology of the dual filtration by Alexander duality. Zeroth cohomology is deeply related to the connected components in the dual filtration. This gives rise to another formalization of (n − 1)th PH. The algorithm is the PH version of Delfinado and Edelsbrunner (1995)’s algorithm. Obayashi (2018) generalizes the idea to optimal volumes. We now examine an order with levels r on X. Consider the filtration X in (7) and (8) given by . For simplicity, we will use Z as the coefficient field. Then the r 2 following theorems hold (Schweinhart 2015; Obayashi 2018). Theorem C For any birth-death pair in PD (X , r ), the death time is not infinity. n−1 Therefore we can always define the optimal volume of the pair. Moreover, the optimal volume is uniquely determined. Theorem D If z and z are the optimal volumes for two different birth-death pairs, one of the following holds: • z ∩ z =∅ • z z • z z Note that we can naturally regard any z = (n) k σ ∈ C (X ) as a subset of ω n ω∈X (n) n-simplices of X, {ω ∈ X | k = 0}, since we use Z as the homology coefficient ω 2 field. From Theorem D, we know that D (X ) can be regarded as a forest (i.e., the n−1 r union of distinct trees) by the inclusion relation. Schweinhart (2015) calls the forest persistence trees. Moreover, we can compute the persistence trees by the merge-tree algorithm (Algorithm 1). To describe the algorithm, consider the one-point compactification space R ∪ {∞} S . The following facts from Condition 1 are well known. n n • X ∪{ω } is a cell decomposition of R ∪{∞}, where ω = (R ∪{∞})\|X | ∞ ∞ (n−1) 7 • For any τ ∈ X , the proper cofaces of τ are just two n-cells in X ∪{ω } We can extend the order with levels r onto X ∪{ω } by regarding ω as the maximum ∞ ∞ element and rˆ(ω ) =+∞. To describe the algorithm, we consider a directed graph (n−1) G whose nodes are n-cells in X ∪{ω }. An edge has extra data in X , and we can write the edge from ω to ω with extra data τ as (ω − → ω ). The directed graph G is increasingly updated in Algorithm 1. Since the graph is always a forest throughout the algorithm (Obayashi 2018), we can find a root of a tree that contains an n-cell ω in the graph G by recursively following the edges from ω. We call this procedure Root(ω, G). The following theorem gives the interpretation of the result of the algorithm as persistence information. If τ is a face of ω, ω is called a coface of τ . If the difference of the dimensions is one, ω is a proper coface of τ . 123 Stable volumes for persistent homology Algorithm 1 Computing persistence trees by the merge-tree algorithm. procedure Compute- Tree(X ) initialize G ={ω } for σ ∈ X in ( )-descending order do (LOOP) if σ is an n-simplex then add σ to G as a vertex else if σ is an (n − 1)-simplex then let ω and ω be two cofaces of σ 1 2 ω ← Root(ω , G) ω ← Root(ω , G) if ω = ω then 1 2 continue else if ω ω then 1 2 Add (ω − → ω ) to G as an edge 2 1 else Add (ω − → ω ) to G as an edge 1 2 end if end if end for return G end procedure procedure Root(ω, G) if ω has no outgoing edge then return ω else return Root(ω , G), where ω − → ω parent parent end if end procedure ¯ ¯ ¯ Theorem E Let G be the result of Algorithm 1. That is, G is the final graph of G. ∗ ∗ Then the following holds: 1. G is a tree whose root is ω ∗ ∞ 2. D (X ) ={(τ, ω) | (ω − →∗) is an edge of G }.Here ∗ means another vertex n−1 r ∗ 3. If there is an edge ω − → ω is in G , we have ω ≺ ω ∗ r ¯ ¯ 4. The optimal volume for (τ, ω) ∈ D (X ) is dec(ω, G ), where dec(ω, G ) the n−1 r ∗ ∗ set of all descendant nodes of ω in G including ω itself 5. G gives the persistence trees. That is, (τ, ω) is the parent of (τ ,ω ) in the persis- tence trees if and only if there are edges ω − → ω We can prove the above theorems using Alexander duality. Appendix A in (Obayashi 2017) provides a detailed discussion. We also remark that the merge-tree algorithm is remarkably accelerated by adding a shortcut path to the root in a similar way as in the union-find algorithm. This paper is the preprint version of (Obayashi 2018). The contents of Appendix A in (Obayashi 2017) are omitted from (Obayashi 2018); thus, we sometimes refer to the preprint version. 123 I.Obayashi 3 Stable volumes for the (n-1)th PH We can now describe stable volume for (n − 1)th PH using persistence trees under Condition 1. The parameter in the following definition is called a bandwidth param- eter. Definition 3 Let X be a simplicial complex and r an order with levels on X, and X be the filtration of X given by (7) and (8). Let G be the persistence trees of the filtration X .For (τ ,ω ) ∈ D (X ) and a positive number , the stable volume of r 0 0 n−1 r the birth-death simplices pair with a noise bandwidth parameter ,SV (r,τ ,ω ),is 0 0 defined as follows: ⎛ ⎞ ⎝ ¯ ⎠ SV (r,τ ,ω ) ={ω }∪ dec(ω, G ) , (15) 0 0 0 ∗ ω∈C (τ ,ω ) 0 0 where C (τ ,ω ) is given as 0 0 (n) C (τ ,ω ) ={ω ∈ X | ω − → ω , and rˆ(τ ) ≥ˆr (τ ) + }. (16) 0 0 0 0 We can compute SV (r,τ ,ω ) by traversing persistence trees as shown in Algo- 0 0 rithm 2. Algorithm 2 Computing stable volumes for (n − 1)th PH procedure Stable- Volume-for-n-1(τ ,ω , , G ) 0 0 ∗ S ←{ω } for ω satisfying ω − → ω do if rˆ(τ ) ≥ˆr (τ ) + in G then Append dec(ω, G ) to S end if end for return S end procedure We specify the following (r,ω )-order condition in order to describe the main theorem. Definition 4 Let r and q be two orders with levels on a simplicial complex X and ω be an n-simplex. Then q satisfies an (r,ω )-order condition if σ ≺ ω implies 0 0 r 0 σ ≺ ω for all σ ∈ X. q 0 We also define the symbol τ . For an order with levels q and an n-simplex q,ω ω , τ is the birth simplex paired with ω in D (X ). That is, (τ ,ω ) ∈ 0 q,ω 0 n−1 q q,ω 0 0 0 D (X ). n−1 q The following theorem is the first main theorem of the paper. It states that the stable volume is an invariant part of optimal volumes in the presence of small noise. 123 Stable volumes for persistent homology Theorem 3.1 SV (r,τ ,ω ) = OV(q,ω ), (17) 0 0 0 q∈R where R ={q = (q ˆ , ) : an order with levels | (18) ˆ q −ˆr < /2 and q satisfies (r,ω )-order condition}, ∞ 0 and OV(q,ω ) = the optimal volume of (τ ,ω ) ∈ D (X ). (19) 0 q,ω 0 n−1 q The theorem treats two different filtrations, X and X , on the same simplicial complex r q X. It should be noted that the reader needs to treat these filtrations carefully. The following two claims immediately imply Theorem 3.1 and are discussed in the next three subsections. Claim 1 For any q ∈ R , the following relation holds: SV (r,τ ,ω ) ⊆ OV(q,ω ). (20) 0 0 0 Claim 2 For any ω/ ˜ ∈ SV (r,τ ,ω ), there is an order with levels q ∈ R satisfying 0 0 ω/ ˜ ∈ OV(q,ω ). 3.1 Dual graph and its subgraphs To show Theorem 3.1, we introduce a dual graph of X and its subgraphs. Definition 5 V and E are defined as follows: X X (n) V = X ∪{ω }, X ∞ (21) (n−1) E = X . G = V ∪ E is an undirected graph when the two endpoints of τ ∈ E are defined X X X X as the two cofaces of τ . We call the graph G the dual graph of X. We define the subgraphs of G , G ( X X X r σ) and G (rˆ ≥ s),as G ( σ) = V ( σ) ∪ E ( σ), X r X r X r V ( σ) ={ω ∈ V | ω σ }, X r X r E ( σ) ={τ ∈ E | τ σ }, X r X r (22) G (rˆ ≥ s) = V (rˆ ≥ s) ∪ E (rˆ ≥ s), X X X V (rˆ ≥ s) ={ω ∈ V |ˆr (ω) ≥ s}, X X E (rˆ ≥ s) ={τ ∈ E |ˆr (τ ) ≥ s}, X X 123 I.Obayashi where σ is a simplex in X and s is a real number. The condition (6) ensures that G ( σ) and G (rˆ ≥ s) are subgraphs of G . We also define G ( σ) by X r X X X r replacing with . r r ¯ ¯ We introduce the notation G as G in Algorithm 1 when the inside of the (LOOP) is finished at σ . The following facts are essential for the proof of Theorem 3.1.The facts are shown as Fact 17 and Fact 18 in (Obayashi 2017). Fact 1 The topological connectivity of vertices in G is the same as G ( σ). That σ X r is, ω ,...,ω ∈ V ( σ) are all vertices of a connected component in G ( σ) 1 k X r X r if and only if there is a tree in G whose vertices are ω ,...,ω . σ 1 k Fact 2 For each tree in G , the root of the tree is the ( )-maximum vertex in the tree. σ r The next lemma describes the role of the birth and death simplices. K (G,ω) denotes the connected component of a graph G which contains a vertex ω. K (G,ω) and K (G,ω) denote all the vertices and all the edges of K (G,ω). Lemma 3.2 Let (τ ,ω ) ∈ D (X ). Then ω is the ( )-maximum n-simplex in 0 0 n−1 r 0 r K (G ( τ ), ω ) and the optimal volume of (τ ,ω ) is K (G ( τ ), ω ).Fur- X r 0 0 0 0 X r 0 0 thermore, there exists ω ∈ V satisfying the following conditions: 1 X • ω − → ω in G 0 1 ∗ • ω ω 1 r 0 • K (G ( τ ), ω ) = K (G ( τ ), ω ) ∪ K (G ( τ ), ω ) ∪{τ }. That is, X r 0 0 X r 0 0 X r 0 1 0 τ is the edge between K (G ( τ ), ω ) and K (G ( τ ), ω ) 0 X r 0 0 X r 0 1 We can easily prove the lemma by considering Algorithm 1 and Facts 1 and 2. 3.2 Proof of Claim 1 The following equality holds from the definition of stable volume (15), Theorem E, Facts 1 and 2, and Lemma 3.2. SV (r,τ ,ω ) = K (G (rˆ ≥ˆr (τ ) + ), ω ). (23) 0 0 V X 0 0 The following equality also holds from Lemma 3.2. OV(q,ω ) = K (G ( τ ), ω ). (24) 0 V X q q,ω 0 Therefore, we can show the following relationship to prove Claim 1. K (G (rˆ ≥ˆr (τ ) + ), ω ) ⊆ K (G ( τ ), ω ). (25) V X 0 0 V X q q,ω 0 The following lemma is essential to prove Claim 1. Lemma 3.3 Let (τ ,ω ) ∈ D (X ) be a birth-death simplices pair and q ∈ R . 0 0 n−1 r Then the following inequality holds: q ˆ (τ ) − /2 < rˆ(τ ). (26) q,ω 0 123 Stable volumes for persistent homology Proof We assume that q ˆ (τ ) − /2 ≥ˆr (τ ) and will show a contradiction. Since q,ω 0 ˆ q −ˆr < /2, for any σ ∈ G with τ σ,wehave ∞ X q,ω q rˆ(τ ) ≤ˆ q(τ ) − /2 ≤ˆ q(σ ) − /2 <(rˆ(σ ) + /2) − /2 =ˆr (σ ). (27) 0 q,ω From the definition of order with levels, the inequality immediately implies τ ≺ σ 0 r and G ( τ ) is a subgraph of G ( τ ). X q q,ω X r 0 Since (τ ,ω ) ∈ D (X ), Lemma 3.2 gives ω ∈ V satisfying q,ω 0 n−1 q 1 X ω ω , (28) 1 q 0 K (G ( τ ), ω ) = K (G ( τ ), ω ) ∪ K (G ( τ ), ω ) ∪{τ }. X q q,ω 0 X q q,ω 0 X q q,ω 1 q,ω 0 0 0 0 (29) (28) leads to ω ω from the (r,ω )-order condition. At the same time, (29) leads to 1 r 0 0 ω ∈ K (G ( τ ), ω ) since G ( τ ) is a subgraph of G ( τ ). These facts 1 X r 0 0 X q q,ω X r 0 lead to a contradiction since ω is the ( )-maximum vertex in K (G ( τ ), ω ), 0 r X r 0 0 but K (G ( τ ), ω ) contains ω and ω ω . X r 0 0 1 1 r 0 Using Lemma 3.3, we prove that G (rˆ ≥ˆr (τ ) + ) is a subgraph of G ( τ ) X 0 X q q,ω to show (25). For any σ ∈ G (rˆ ≥ˆr (τ ) + ),wehave X 0 q ˆ (σ ) > rˆ(σ ) − /2 ≥ (rˆ(τ ) + ) − /2 =ˆr (τ ) + /2 > q ˆ (τ ), (30) 0 0 q,ω and hence σ τ . This means that G (rˆ ≥ˆr (τ ) + ) is a subgraph of G ( q q,ω X 0 X q τ ). q,ω 3.3 Proof of Claim 2 If ω/ ˜ ∈ OV (r,ω ), the conclusion of Claim 2 is trivial. Therefore, we assume ω ˜ ∈ OV (r,ω ). Since ω ˜ ∈ OV(r,ω ) = K (G ( τ ), ω ), 0 V X r 0 0 ω/ ˜ ∈ SV (r,τ ,ω ) = K (G (rˆ ≥ˆr (τ ) + ), ω ), 0 0 V X 0 0 and G (rˆ > rˆ(τ ) + ) is a subgraph of G ( τ), there is τ˜ ∈ E satisfying the X 0 X r X following conditions: τ ≺ τ, ˜ 0 r rˆ(τ ) ≤ˆr (τ) ˜ < rˆ(τ ) + , 0 0 (31) ω/ ˜ ∈ K (G ( τ) ˜ , ω ), V X r 0 ω ˜ ∈ K (G ( τ) ˜ , ω ). V X r 0 Let =ˆr (τ) ˜ −ˆr (τ ) and η = ( + )/4. From (31), we have ∗ 0 ∗ /2 >η > /2 ≥ 0. (32) 123 I.Obayashi Fig. 6 The relationship between connected components and path P in Case 1. Circles represent vertices, rectangles by dotted lines represent connected components, solid lines represent edges between connected components, and dashed curves represent paths Lemma 3.2 gives ω ∈ V satisfying 1 X ω ω , and (33) 1 r 0 K (G ( τ), ω ) = K (G ( τ ), ω ) ∪ K (G ( τ ), ω ) ∪{τ }. (34) X r 0 X r 0 0 X r 0 1 0 Let ω be the endpoint of τ contained in K (G ( τ ), ω ). A path from ω to ω e 0 X r 0 0 0 e exists in K (G ( τ ), ω ). We consider the following two cases regarding the path. X r 0 0 Case 1 There is a path P from ω to ω in G ( τ ) without passing through K (G ( 0 e X r 0 X r τ) ˜ , ω) ˜ . Case 2 Any path ω to ω in K (G ( τ ) passes through K (G ( τ) ˜ , ω) ˜ . 0 e X r 0 X r We will divide the proof into the above two cases. 3.3.1 Case 1 Fig. 6 shows the relationship between connected components and the path P in Case 1. We can construct an order with levels q satisfying ω/ ˜ ∈ K (G ( τ ), ω ) V X q q,ω 0 based on this relationship. Using η, we define a function q ˆ on X as follows: (n) rˆ(σ ) + η if σ ∈ X , rˆ(σ ) + η if σ ∈ Q , q ˆ (σ ) = (35) (n−1) rˆ(σ ) − η if σ ∈ X \Q , (0) (n−2) rˆ(σ ) − η if σ ∈ X ∪ ··· ∪ X , where Q = P ∪{τ }∪ K (G ( τ), ω ), and Q is the set of all edges in Q.We 0 X r 1 E can easily show that q ˆ is a level function from the fact that rˆ is a level function. We 123 Stable volumes for persistent homology define the order on X ∪{ω }, , as follows: ∞ q σ ≺ σ if and only if q ˆ (σ ) < q ˆ (σ ), or (36) q ˆ (σ ) =ˆ q(σ ) and σ σ , or q ˆ (σ ) =ˆ q(σ ) and σ ∩ σ =∅ and σ ≺ σ . This is a kind of lexicographic order by the total preorder (σ, σ ) →ˆ q(σ ) ≤ˆ q(σ ), the partial order ⊆, and the total order . Therefore is a total order. r q The following two facts are easily shown. Fact 3 q = ( , q ˆ ) is an order with levels. (n−1) Fact 4 The order coincides with on V . The same is true on Q or X \Q . q r X E E We also show the following fact. Fact 5 q ∈ R . Proof From the definition, we have ˆ q −ˆr≤ η< /2. To show the (r,ω )-order condition, we assume σ ≺ ω and show σ ≺ ω . From the assumption, we have r 0 r 0 rˆ(σ ) ≤ˆr (ω ). Since ω is n-simplex, q ˆ (ω ) =ˆr (ω ) + η, and so q ˆ (ω ) ≥ˆ q(σ ). 0 0 0 0 0 Therefore, we consider the following cases: 1. q ˆ (ω )> q ˆ (σ ). In this case, we can see ω σ from the definition of order with 0 0 q levels. 2. q ˆ (ω ) =ˆ q(σ ) and ω σ . In this case, we have ω σ from the definition of 0 0 0 q 3. q ˆ (ω ) =ˆ q(σ ) and ω σ . In this case, the condition ω σ leads to a contra- 0 0 0 diction since ≺ satisfies (6)but σ ≺ ω . r r 0 4. q ˆ (ω ) =ˆ q(σ ) and ω ∩ σ =∅. In this case, we have ω σ from the assumption 0 0 0 q of σ ≺ ω . r 0 In all cases except 3, we have σ ≺ ω . r 0 The following fact comes from Fact 4. Fact 6 τ is the ( )-minimum element in Q. 0 q The above fact leads to the following. Fact 7 Q is a subgraph of K (G ( τ ), ω ). X q 0 0 We also prove the following fact about K (G ( τ ), ω ) and K (G ( τ ), ω ). X q 0 0 X q 0 1 Fact 8 K (G ( τ ), ω ) and K (G ( τ ), ω ) are different connected compo- X q 0 0 X q 0 1 nents in G ( τ ). X q 0 123 I.Obayashi Proof We assume that K (G ( τ ), ω ) = K (G ( τ ), ω ) and this will lead X q 0 0 X q 0 1 to a contradiction. From the assumption, there is a path Q from ω to ω in G ( 0 1 X q τ ). Any edge τ on Q satisfies τ τ and so q ˆ (τ ) ≥ˆ q(τ ). Therefore, since 0 q 0 0 |ˆr (τ ) −ˆ q(τ )|≤ η and q ˆ (τ ) =ˆr (τ ) + η, the inequality rˆ(τ ) ≥ˆr (τ ) holds for any 0 0 0 τ ∈ Q , where Q is the set of all edges on Q . E E We conclude that τ τ for any τ ∈ Q from (36), τ τ , and rˆ(τ ) ≥ˆr (τ ). r 0 q 0 0 This means that Q is a subgraph of G ( τ ). However, this contradicts the fact X r 0 that ω and ω are contained in different connected components of G ( τ ). 0 1 X r 0 From Facts 4, 7, and 8, and (33), we have the following fact. Fact 9 τ = τ . q,ω 0 Next, we examine τ˜. Fact 10 τ τ˜. 0 q Proof Since τ ∈ Q and τ/ ˜ ∈ Q ,wehave 0 E E q ˆ (τ ) −ˆ q(τ) ˜ = (rˆ(τ ) + η) − (rˆ(τ) ˜ − η) 0 0 =ˆr (τ ) −ˆr (τ) ˜ − 2η = − 2η> 0. The inequality leads to τ τ˜ since q is an order with levels. 0 q We can prove the following fact in a similar way to Fact 8. Fact 11 ω/ ˜ ∈ K (G ( τ) ˜ , ω ). X q 0 From Facts 9, 10, and 11, the following fact is easily established. Fact 12 ω/ ˜ ∈ K (G ( τ ), ω ). X q q,ω 0 Finally, since OV(q,ω ) = K (G ( τ ), ω ), we conclude that ω/ ˜ ∈ 0 V X q q,ω 0 OV(q,ω ). 3.3.2 Case 2 In Case 2, there is a path T from K (G ( τ) ˜ , ω) ˜ to ω without passing through X r e K (G ( τ) ˜ , ω ). Figure 7 shows the relationship between connected components X r 0 and paths between the components in this case. We now define a function sˆ on X as follows: (n) rˆ(σ ) + η if σ ∈ X , rˆ(σ ) + η if σ ∈ S , sˆ(σ ) = (37) (n−1) rˆ(σ ) − η if σ ∈ X \S , (0) (n−2) rˆ(σ ) − η if σ ∈ X ∪ ··· ∪ X , where S = K (G ( τ) ˜ , ω) ˜ ∪ T ∪{τ }∪ K (G ( τ ), ω ) and S is the set of all X r 0 X r 0 1 E edges of S. 123 Stable volumes for persistent homology Fig. 7 The relationship between connected components and path T in Case 2. Circles represent vertices, rectangles by dotted lines represent connected components, solid lines represent edges between connected components, and dashed curves represent paths We also define the total order on X, , as follows: σ ≺ σ if and only if sˆ(σ ) < sˆ(σ ), or (38) sˆ(σ ) =ˆs(σ ) and σ σ , or sˆ(σ ) =ˆs(σ ) and σ ∩ σ =∅ and σ ≺ σ . We can prove the following fact in a similar way to Facts 3 and 5. Fact 13 s = ( , sˆ) is an order with levels on X, and s ∈ R . The following fact arises directly from the definition of sˆ and . (n−1) Fact 14 The order coincides with on V . The same is true on Q or X \Q . s r X E E The following fact can be shown in a similar way to Facts 6 and 7. Fact 15 S is a subgraph of K (G ( τ ), ω) ˜ . X s 0 We can show the following fact in the same way as Fact 10. From the following fact, we have K (G ( τ ), ω) ˜ ⊆ K (G ( τ) ˜ , ω) ˜ . X s 0 X s Fact 16 τ τ˜. 0 s We can prove the following in a similar way to Facts 8 and 9 using Facts 14, 15, and 16. Fact 17 K (G ( τ) ˜ , ω) ˜ and K (G ( τ) ˜ , ω ) are different connected components X s X s 0 in G ( τ) ˜ and τ˜ connects the two connected components. Therefore, τ =˜ τ . X s s,ω From the above facts, we have the following. Fact 18 ω/ ˜ ∈ K (G ( τ ), ω ). X s s,ω 0 Finally, since OV(q,ω ) = K (G ( τ ), ω ), we conclude that ω/ ˜ ∈ 0 V X q q,ω 0 OV(q,ω ). 123 I.Obayashi Fig. 8 Two filtrations on the same simplicial complex 3.4 Limitation of stable volumes In this subsection, we discuss some of the limitations of stable volumes. The first relates to the (r,ω )-order condition. Theorem 3.1 requires the (r,ω )- 0 0 order condition; however, this condition seems somewhat curious. To clarify, we offer a brief discussion of the role of the condition. First, we examine what happens when orders with levels q and r satisfy ˆ q −ˆr≤ /2 but not the (r,ω )-order condition. Figure 8 shows an example. In this example, we assume the following conditions: Y = Y ∪{τ ,τ ,ω ,ω } is a simplicial complex, (39) 0 1 2 1 2 rˆ(σ ) ≈ˆ q(σ ) for any σ ∈ Y , (40) rˆ(ω ) ≈ˆr (ω ), q ˆ (ω ) ≈ˆ q(ω ), (41) 1 2 1 2 rˆ(σ ) < rˆ(τ ) for any σ ∈ Y , (42) 1 0 q ˆ (σ ) < q ˆ (τ ) for any σ ∈ Y , (43) 1 0 rˆ(τ )< rˆ(τ )< rˆ(ω )< rˆ(ω ) for any σ ∈ Y , (44) 1 2 1 2 0 q ˆ (τ )< q ˆ (τ )< q ˆ (ω )< q ˆ (ω ) for any σ ∈ Y . (45) 1 2 2 1 0 Here, we have (rˆ(τ ), rˆ(ω )), (rˆ(τ ), rˆ(ω )) ∈ PD (X , r ), 1 2 2 1 1 (q ˆ (τ ), q ˆ (ω )), (q ˆ (τ ), q ˆ (ω )) ∈ PD (X , q), 1 1 2 2 1 (rˆ(τ ), rˆ(ω )) ≈ (q ˆ (τ ), q ˆ (ω )), 1 2 1 1 (rˆ(τ ), rˆ(ω )) ≈ (q ˆ (τ ), q ˆ (ω )). 2 1 2 2 The optimal volumes are {ω ,ω }= OV(rˆ, rˆ(τ ), rˆ(ω )) = OV(q ˆ , q ˆ (τ ), q ˆ (ω )), (46) 1 2 1 2 1 1 {ω }= OV(rˆ, rˆ(τ ), rˆ(ω )), (47) 1 2 1 {ω }= OV(q ˆ , q ˆ (τ ), q ˆ (ω )). (48) 2 2 2 Therefore, (rˆ(τ ), rˆ(ω )) ≈ (q ˆ (τ ), q ˆ (ω )); however, OV(rˆ, rˆ(τ ), rˆ(ω )) does not 2 1 2 2 2 1 intersect with OV(q ˆ , q ˆ (τ ), q ˆ (ω )). Thus, the example shows that OV(rˆ, rˆ(τ ), rˆ(ω )) 2 2 2 1 does not have a robust part to small noise if we do not assume the (r,ω )-order condition. 123 Stable volumes for persistent homology In contrast, (46) indicates that OV(rˆ, rˆ(τ ), rˆ(ω )) has a stable part even if we do 1 2 not assume the (r,ω )-order condition. We expect that if an optimal volume is large in some sense, then the optimal volume has a stable part even if the (r,ω )-order condition is not satisfied. Mathematically clarifying what is meant by “large in some sense” is a subject for future research. Next, we focus on the construction of simplicial filtrations. An alpha shape (Edels- brunner and Mücke 1994) is often used to construct a filtration from a pointcloud. Alpha complexes have some good properties. If a pointcloud is in R and satisfies the general position condition, then the alpha shape is embedded in R . An alpha shape has the same information as the union-balls model. More precisely, an alpha shape is homotopy equivalent to the union of balls. However, the alpha shape may vary dis- continuously with noise. Theorem 3.1 assumes that the simplicial complex X does not change with noise; however, the assumption does not hold if the noise is large. On the other hand, if the noise is small, the assumption holds and the theorem works perfectly. In Sect. 6, we show that a large noise bandwidth parameter gives unexpected results. Therefore, it is reasonable to assume that the noise is small; however, care should be taken. 4 Another formalization of a stable volume as the solution to an optimization problem The formalization of stable volumes is based on persistence trees. Importantly, how- ever, persistence trees are available only for the (n − 1)th persistence homology, and we cannot apply the concept directly to another degree. Another version of a stable volume can be defined as the solution to an optimization problem. Definition 6 Let r be an order with levels and (τ ,ω ) ∈ D (X ). SV (r,τ ,ω ) is 0 0 k r 0 0 defined by the solution to the following minimization problem. minimize z , subject to z = ω + α ω ∈ C (X ; k), (49) 0 ω k+1 ω∈F ,k+1 τ (∂z) = 0 for any τ ∈ F , (50) ,k (k) where k is the coefficient field and F ={σ ∈ X |ˆr (τ )+ ≤ˆr (σ ) and σ ≺ ω }. ,k 0 r 0 4.1 Another formalization of a stable volume: a special case For (n − 1)th PH, we will prove the following second main theorem, which guarantees the validity of the definition. Theorem 4.1 If X satisfies Condition 1, k = Z/2Z, and (τ ,ω ) ∈ D (X ), 0 0 n−1 r SV (r,τ ,ω ) coincides with SV (r,τ ,ω ) for k = n − 1. 0 0 0 0 123 I.Obayashi Recall the remark associated with Theorem D that SV (r,τ ,ω ) can be regarded 0 0 (n) as a subset of X . For z = ω + α ω ∈ C (X ; Z/2Z), we define V and E as follows. 0 ω n z z ω∈F ,n V ={ω ∈ F | α = 1}∪{ω }, z ,n ω 0 (51) E ={τ ∈ F |∃ω ∈ V τ (∂ω) = 1}. z ,n−1 z To prove the theorem, we show the following lemma. Lemma 4.2 If z satisfies (49) and (50), G := V ∪ E is a subgraph of G and G z z z X z is a disjoint union of some connected components of G (rˆ ≥ˆr (τ ) + ). X 0 We now assume that z satisfies (49) and (50). To show the above lemma, we show the following three facts, Fact 19, 20, and 21. Fact 19 For any τ ∈ E , τ is not a face of ω . z ∞ (n) Proof Assume τ is a face of ω . Then there is a unique n-simplex ω ∈ X which ∞ 1 is the coface of τ . From the definition of E and ω ∈ / V , V should contain ω . z ∞ z z 1 However, ∗ ∗ ∗ ∗ τ (∂z) = τ (∂( ω)) = τ (∂ω) = τ (∂ω ) = 1 = 0, (52) ω∈V ω∈V z z which contradicts (50). The following fact guarantees that G is a subgraph of G . z X Fact 20 For any τ ∈ E , both cofaces of τ , ω and ω , are contained in V . z 1 2 z Proof From the definition E , V contains either ω or ω and one of the following z z 1 2 holds: α = 1or α = 1. From the condition (50), ω ω 1 1 ∗ ∗ τ (∂z) = α τ (∂(ω)) = α + α = 0. (53) ω ω ω 1 2 ω∈F ,n Therefore, α = α = 1, which means that both ω and ω are contained in V . ω ω 1 2 z 1 2 Fact 21 For ω ∈ V , A denotes the set of all (n − 1)-dimensional faces of ω .If 1 X ω 1 ω ∈ V , then the following holds: 1 z A ∩ E (rˆ ≥ˆr (τ ) + ) = A ∩ E . (54) ω X 0 ω z 1 1 Proof Since E ⊆ F ⊆ E (rˆ ≥ˆr (τ ) + ), A ∩ E (rˆ ≥ˆr (τ ) + ) ⊇ z ,n−1 X 0 ω X 0 A ∩ E is trivial. Therefore, we assume τ ∈ A ∩ E (rˆ ≥ˆr (τ ) + ) and will ω z ω X 0 1 1 show τ ∈ A ∩ E . Since τ ≺ ω and ω ∈ V ⊂ F ∪{ω },wehave τ ≺ ω . ω z r 1 1 z ,n 0 r 0 Therefore, τ ∈ F since τ ∈ E (rˆ ≥ˆr (τ ) + ).Wealsohave τ (∂ω ) = 1 ,n−1 X 0 1 because τ ∈ A and we conclude τ ∈ E . ω z 123 Stable volumes for persistent homology By repeatedly using Fact 21, we can see that any path in G (rˆ ≥ˆr (τ ) + ) starting X 0 from ω ∈ V is also a path in G . This means that G is a disjoint union of some 1 z z z connected components of G (rˆ ≥ˆr (τ ) + ), and Lemma 4.2 is shown. X 0 Lemma 4.3 z = ω satisfies (49) and (50). ω∈K (G (rˆ≥ˆr (τ )+ ),ω ) V X 0 0 Proof First, we show (49). The condition (49) is equivalent to the following inclusion relationship: K (G (rˆ ≥ˆr (τ ) + ), ω ) ⊂ F ∪{ω }. (55) V X 0 0 ,n 0 Since G (rˆ ≥ˆr (τ ) + ) is a subgraph of G ( τ ) and ω is ( )-maximum X 0 X r 0 0 r element in K (G ( τ ), ω ), ω ω holds for any ω ∈ K (G (rˆ ≥ˆr (τ ) + X r 0 0 r 0 V X 0 ), ω ). This equates to (55). Next we show (50). We can show the following relationship in a similar way to (55). K (G (rˆ ≥ˆr (τ ) + ), ω ) ⊆ F . (56) E X 0 0 ,n−1 Let τ be an element of F . We consider the following two cases to show τ (∂z) = ,n−1 1. τ ∈ K (G (rˆ ≥ˆr (τ ) + ), ω ). In this case, K (G (rˆ ≥ˆr (τ ) + ), ω ) E X 0 0 E X 0 0 contains both cofaces ω ,ω of τ , and 1 2 ∗ ∗ ∗ τ (∂z ) = τ (∂ω ) + τ (∂ω ) = 1 + 1 = 0. (57) 1 1 2 2. τ ∈ F \K (G (rˆ ≥ˆr (τ )+ ), ω ). In this case, K (G (rˆ ≥ˆr (τ )+ ), ω ) ,n−1 E X 0 0 E X 0 0 does not contain both cofaces of τ , and τ (∂z ) = 0. In both cases, we have τ (∂z ) = 0. From Lemma 4.2 and Lemma 4.3, we can show that z in Lemma 4.3 is the solution to the optimization problem in Definition 6. This means that K (G (rˆ ≥ V X rˆ(τ ) + ), ω ) = SV (rˆ,τ ,ω ).From(23), this set is equal to SV(r,τ ,ω ), which 0 0 0 0 0 0 completes the proof of the theorem. 4.2 Another formalization of a stable volume: general case We can apply Definition 6 to cases other than k = n − 1. In such cases, we also call the solutions of the optimization problems the stable volumes. In such cases, Theorem 4.1 does not hold in general, and it is difficult to mathematically ensure the good property shown in Theorem 3.1. Empirically, however, the stable volumes often work well. (We examine the property using computer experiments in Sect. 6.) The reason for this is likely that an optimal volume is often included in a lower- dimensional structure (a submanifold or a lower-dimensional simplicial complex) and the solution of the stable volume is also included in the structure. 123 I.Obayashi 4.3 Stable sub-volume In the previous subsection, we expressed our belief that a lower-dimensional structure may make the stable volume work well. We can develop a variant of stable volume using this idea. If we find such a lower-dimensional structure, we can consider the stable volume constrained on the structure. An optimal volume is one possible lower- dimensional structure. We define a stable sub-volume as follows. Definition 7 Let r be an order with levels and (τ ,ω ) ∈ D (X ).The stable sub- 0 0 k r volume of (τ ,ω ) is defined by the solution to the following minimization problem: 0 0 minimize z , subject to z = ω + α ω ∈ C (X ; k), (58) 0 ω k+1 ω∈F ∩OV(r ,τ ,ω ) ,k+1 0 0 τ (∂z) = 0 for any τ ∈ F , (59) ,k (k) where k is the coefficient field and F ={σ ∈ X |ˆr (τ )+ ≤ˆr (σ ) and σ ≺ ω }. ,k 0 r 0 For stable sub-volumes, the parameter is also called a noise bandwidth parameter. The following proposition holds for the (n − 1)th PH in R since the stable volume is a subset of an optimal volume. Proposition 4.4 Let X be a simplicial complex embedded in R satisfying Condition 1. Let r be an order with levels. For (τ ,ω ) ∈ D (X ) and > 0, the stable volume of 0 0 n−1 r (τ ,ω ) with bandwidth parameter coincides with the stable sub-volume of (τ ,ω ) 0 0 0 0 with bandwidth parameter . The advantages and disadvantages of a stable sub-volume will be discussed in Sect. 8. 5 Implementation In this section, we discuss how to produce a stable volume using a computer. While we can directly implement stable volumes by persistence trees (Definition 3) using the persistence trees constructed by Algorithm 1, implementing stable volumes as an optimization problem is more difficult. As discussed in (Chen and Freedman 2011; Dey et al. 2019), these kinds of optimization problems are NP-hard in general, which makes them difficult to solve on a computer. To resolve this problem, we can apply the following approximation techniques (Tahbaz-Salehi and Jadbabaie 2008; Escolar and Hiraoka 2016; Obayashi 2018): • Use R as a coefficient field instead of Z/2Z 0 1 • Change norm to norm The second technique is often used in sparse modeling. Following the above approx- imations, the optimization problem can be formulated as a linear programming problem. 123 Stable volumes for persistent homology We need to translate the problem into an acceptable form for a linear programming solver. That is, we need to specify the variables, the objective function (the function to be minimized), and the constraints in the following form: • The objective function should be (a linear combination of variables) + (constant) • Each constraint should be (a linear combination of variables) + (constant) OP (another constant), where OP is any of the relational operators =, ≥, ≤ Algorithm 3 shows the algorithm of a stable volume. We can also compute a stable sub-volume using the same algorithm expect that F is replaced by F ∩ ,k+1 ,k+1 OV(r,τ ,ω ). It should be noted that R is used as the coefficient field and that we 0 0 need to consider the sign of c . ω,τ Algorithm 3 Computing stable volumes procedure Stable- Volume(ω ,τ , ) 0 0 Solve the following optimization problem by a linear programming solver and return {ω ∈ F | ,k+1 α = 0} variables: α , α ¯ for all ω ∈ F ω ω ,k+1 minimize α ¯ , subject to ω∈F ,k+1 α ¯ − α ≥ 0 for each ω ∈ F , ω ω ,k+1 α ¯ + α ≥ 0 for each ω ∈ F , ω ω ,k+1 c + c α = 0 for each τ ∈ F , ω ,τ ω,τ ω ,k ω∈F ,k+1 where c = τ (∂ω). ω,τ end procedure Section 4.2 in (Obayashi 2018) introduced the idea of accelerating the computation of optimal volume using locality. The same idea is applicable to stable volumes. HomCloud already implements Algorithm 2 and Algorithm 3, and was used for the examples in Sect. 6. 6 Examples In this section, we give examples of stable volumes and stable sub-volumes. Alpha filtration (Edelsbrunner and Mücke 1994) is used to compute the PDs. 123 I.Obayashi 6.1 2D lattice with random defects Figure 9 shows an example of stable volume. Here, we consider 2D lattice points with defects (Fig. 9a). We prepared 30x30 lattice points in a 2D space. The distance between the two closest vertices is 1. The configuration of the points is perturbed from the complete lattice using random noises sampled from a uniform distribution on (−0.1, 0.1) , and some points are removed randomly with a probability of 0.5 except for the outermost points. Figure 9b shows the 1st PD of the points, and Fig. 9c shows the optimal volume of (0.705, 2.091). Figure 9d shows the stable volume of the same birth-death pair with bandwidth parameter = 0.12. The optimal volume in Fig. 9d seems to surround the holes in the pointcloud more naturally than is the case in Fig. 9c. We also examined the effect of changing the bandwidth parameter, computing stable volumes for bandwidth parameter = 0.0, 0.01, 0.02,... , 0.40. Figure 9eshows the graph of the bandwidth parameter versus the size of the stable volumes. The size was measured as the number of simplices in the stable volume. From the optimal volume ( = 0), the size of the stable volume rapidly decreases. A wide plateau appears at = 0.04 and is completely flat from = 0.06 to = 0.27, meaning that the stable volume is stable to changes in bandwidth parameter over the range [0.06, 0.27]. This suggests that should be somewhere within this range. It should be noted that the scale of this range coincides with the scale of the noise. This is consistent with the fact that the bandwidth parameter indicates the strength of the virtual noise. 6.2 Comparison with previous studies We apply the statistical approach by Bendich et al. (2020) and reconstructed shortest cycles by Ripserer.jl using the same data as in Fig. 9 to compare with our method. The two methods are applied to the same birth-death pair as in the example of Sect. 6.1. Figure 10 shows the results of the statistical approach. The distribution of the noise 2 2 is the uniform distribution on [−0.03, 0.03] (left) and [−0.06, 0.06] (right). In the computation, the optimal volumes are computed 100 times. We put large black dots in the figure to indicate points whose frequencies are more than 70% or 90%. The result is consistent with Fig 9 and has richer information. However, the result looks more difficult to interpret than Fig 9. Figure 11 shows the reconstructed shortest cycles with two noise bandwidth param- eters, 0.1 (left) and 0.3 (right). As shown in the figure, the result for 0.3 is consistent with other results, but the result for 0.1 is not consistent. This means that the shortest loop criterion sometimes gives a result inconsistent with the structure of persistent homology. Figure 12 shows the reason for the phenomenon using schematic pictures. The 1st PD for the pointcloud Fig. 12a has two prominent birth-death pairs corre- sponding to and in Fig. 12b, but the shortest loop criterion finds the loop 1 2 3 instead of since and share the two cross-marked points and is shorter than 2 2 3 3 123 Stable volumes for persistent homology Fig. 9 An optimal volume and a stable volume for a 2D lattice with defects. a The input pointcloud, b 1st PD, c The optimal volumes of (0.705, 2.091), d The stable volume of the same birth-death pair, e Bandwidth parameter versus the size of stable volume I.Obayashi Fig. 10 Result of the statistical approach. Upper two figures: Noise source is the uniform distribution on 2 2 [−0.03, 0.03] . Lower two figures: Noise source is the uniform distribution on [−0.06, 0.06] . The left two figures: The points with frequencies > 70% are marked. The right two figures: The points with frequencies > 90% are marked Fig. 11 Result of reconstructed shortest cycles. Left: The noise bandwidth parameter is 0.1. Right: The noise bandwidth parameter is 0.3 Fig. 12 Schematic pictures to explain why the shortest loop criterion sometimes gives a result inconsistent with the structure of persistent homology 123 Stable volumes for persistent homology Fig. 13 1st PD for amorphous silica In this comparison, we also remark that the meaning of noise bandwidth parameters used in those three methods is different. Therefore a direct comparison of the results with the same parameter is meaningless. We should focus on the changes in the results to the change of noise bandwidth parameters. 6.3 Atomic configuration of amorphous silica The proposed approach was applied to more realistic data. In this case, we used the atomic configuration of amorphous silica. The data are from ISAACS (Rouxa and Petkova 2010), generated by Reverse Monte Carlo simulations guided by synchrotron X-ray radiation data. The data are available at http://isaacs.sourceforge.net/ex.html. The PDs were computed from the atomic configuration. Two types of atoms, silicons and oxygens, were mixed in a 1:2 ratio in the data. The atomic type was ignored in this example, and only the positions of the atoms were used. Figure 13 shows the 1st PD. The PD has birth-death pairs on the vertical line with (birth time) ≈ 0.7. The birth- death pairs on the vertical line correspond to rings formed by the chemical bonds between oxygen and silicon. Oxygen and silicon atoms appear alternatively on these rings. Previous studies (Hiraoka et al. 2016; Onodera et al. 2019) have found that the existence of the vertical line on a PD implies network structures. Figure 14a shows the boundary of the optimal volume (orange) and the boundary of the stable volume (green) with = 0.2 of (0.677, 5.007). The red and blue points are oxygen atoms and silicon atoms, respectively. In this case, the stable volume and stable sub-volume are identical. The optimal volume is a large twisted ring, while the stable volume is a simpler ring. Figure 14bshows the versus volume plot. The plot indicates that the size of the stable volume quickly decreases from = 0to = 0.1, 123 I.Obayashi Fig. 14 a Optimal volume and stable volume of (0.677, 5.007). The orange ring is the boundary of the optimal volume; the green ring is the boundary of the stable volume. The red and blue points are oxygen atoms and silicon atoms, respectively. b The plot of the bandwidth parameter versus the size of stable volumes and stable sub-volumes Fig. 15 a Optimal volume and stable volume of (0.669, 5.053). The orange ring is the boundary of the optimal volume; the green ring is the boundary of the stable volume. The red and blue points are oxygen atoms and silicon atoms, respectively. b Plot of the bandwidth parameter versus the size of the stable volumes and stable sub-volumes. c Schematic of the optimal volume and the stable volume. The orange area is the optimal volume, the dark orange ring is its boundary, and the green ring is the boundary of the stable volume meaning that the optimal volume is sensitive to noise. Therefore, we can consider that the stable volume is the essential part of the birth-death pair. It should be noted that the stable volume with ∈[1.1, 1.6] on the second plateau is not a reasonable representation of the birth-death pair since only oxygen atoms exist on the boundary of the volume. The large causes the removal of the information of the silicon atoms. Figure 15a shows the optimal volume (orange) of (0.669, 5.053) and the stable volume (green) of the same pair with = 0.2. In this case, the optimal volume and the stable sub-volume are identical. However, the stable volume and sub-volume are not identical since the stable volume is not included in the optimal volume. Figure 15b shows the plot of against the volume. In Fig. 15a, the stable volume is tighter than the stable sub-volume. Figure 15c shows the schematic of the optimal volume and the stable volume. The stable volume and sub-volume are different since path Z in the figure is not included in the optimal 123 Stable volumes for persistent homology volume. In our opinion, the stable volume appears superior since it surrounds the tunnel in the pointcloud more tightly. However, we do not have a theoretical guarantee. In general, the stable volume is tighter than the stable sub-volume since the optimization for a stable volume is more aggressive than that for a stable sub-volume. 7 Tuning of noise bandwidth parameter In applying stable volumes, properly tuning the noise bandwidth parameter is essen- tial. The examples shown in the previous section suggest two possible approaches to such tuning: • Thescaleof should be the same scale as the noise of the data • A value on the plateau shown in the plot of against volume size (Sect. 6.1) should be used In applying the first approach, we can use the domain knowledge about the data to tune the parameter. Here, the scale of the system noise gives an estimate of the parameter. For example, the scale of thermal fluctuation can be used as the parameter if the data consist of the atomic configuration, and thermal fluctuation is the dominant noise. To apply the second approach, we need to plot the figure. As noted in Sect. 6.3, when there are multiple plateaus, we need additional criteria to determine which is better. If we use stable volumes by persistence trees, such a plot is easy to produce. From Definition 3, the size of a stable volume is computable as follows: 1 + (size of dec(ω, V , E )). (60) ω∈C (τ ,ω ) 0 0 We can compute the size of dec(ω, V , E ) by counting all the descendant elements of ω. It is also easy to judge whether ω ∈ C (τ ,ω ) by comparing rˆ(τ ) and rˆ(τ ) in 0 0 0 (16). HomCloud already has this functionality. When we cannot use stable volumes by persistence trees and we want to plot against volume size, a stable sub-volume is more useful than a stable volume. 8 Comparison between stable volumes and stable sub-volumes The question of whether stable volumes or stable sub-volumes are better requires further discussion. The differences between stable volumes and stable sub-volumes can be described as follows: • If we want to compute stable volumes or stable sub-volumes using multiple band- width parameters, the computation cost of stable sub-volumes is smaller than that of stable volumes. Thus, stable sub-volumes are desirable for constructing versus volume plots. 123 I.Obayashi • If the size of the optimal volume is large, a stable volume often gives a more aggressive and likely better result than a stable sub-volume. In other words, a stable sub-volume gives a more conservative result than a stable volume. The first listed characteristic shows the advantage of using a stable sub-volume, while the second characteristic implies that the choice between stable volumes and stable sub-volumes depends on the problem. In the author’s opinion, stable sub-volumes are easy to handle in general. However, users should compare both options and make a decision as to which is better for their data. 9 Concluding remarks We have proposed the concept of stable volumes and stable sub-volumes as a means for identifying good geometric realization of homology generators that, unlike many of the methods proposed in prior research, is robust to noise. The idea of stable volumes and stable sub-volumes is based on optimal volumes by Obayashi (2018). The statistical approach taken by Bendich et al. (2020) offers another solution to our problem. However, one advantage of stable volumes and stable sub-volumes is that they do not require a large number of repeated computations. Moreover, stable volumes and sub-volumes are easier to visualize than the output of the statistical approach since they give a deterministic rather than a probabilistic description. On the other hand, the advantage of the statistical approach is its flexibility. For 1st PH, we can only apply the statistical approach when we want to minimize the length of a loop rather than minimize the volume. The statistical approach is also applicable to the spatial distribution of points. Another advantage of the statistical approach is that it gives richer probabilistic information than stable volumes and sub-volumes. Reconstruct shortest cycles in Ripserer.jl by Cufar (2020) offers the other solu- tion. The approach uses the representative of persistent cohomology and the shortest path algorithm. Our proposed method has two advantages over reconstructed short- est cycles. The first is its mathematical background. We prove Theorem 3.1 to show the good property of stable volumes and stable sub-volumes. Reconstruct shortest cycles do not have such mathematical justification. The second is the difference in the scope of the application. Stable volumes and sub-volumes can apply to the PH of any degree; however, reconstructed shortest cycles are only applicable to 1st PH. On the other hand, the advantage of reconstructed shortest cycles is computation efficiency. The algorithm of reconstructed shortest cycles uses the shortest path algorithm and is faster than stable volumes and sub-volumes in general. Although we used the number of simplices as the volume size in this paper, it is possible to use the real volume by changing the objective function from |¯ α | ω∈F ,k+1 to vol(ω)|¯ α |, where vol(ω) is the volume of the simplex ω. This idea may ω∈F ,k+1 improve the result. In this paper, we presented algorithms only for the filtration of a simplicial complex; however, the concept can be easily applied to other filtrations, such as cubical and cell 123 Stable volumes for persistent homology filtrations. The proposed algorithms should be helpful in the study of two-dimensional, three-dimensional, or higher-dimensional digital images. Some problems raised in Sect. 3.4 are still unresolved. A more reliable mathematical justification for stable volumes other than (n − 1)th PH is also an open problem. These problems show the limitation of our research and are worth addressing in future research for the further development of homology optimization. Acknowledgements This research is partially supported by JSPS KAKENHI JP19H00834, JP20H05884, 19KK0068, JST Presto JPMJPR1923, JST CREST JPMJCR15D3, and JST-Mirai Program JPMJMI18G3. I would like to thank M. Cufar for answering questions about Ripserer.jl. Funding Open access funding provided by Okayama University. Declarations Conflict of interest The author states that there is no conflict of interest. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Appendix A: Reconstructed shortest cycles in Ripserer.jl We illustrate the mechanism of reconstructed shortest cycles in Ripserer.jl using an example. Figure 16 shows the example filtration, and we consider Z/2Z-persistent homology. In this filtration, a homology generator appearing at X disappears at X . That is, (2, 7) 2 7 is a birth-death pair in the 1st PD. The representative cycle corresponding to the pair is (1) σ . This representative cycle is the loop appearing at X . σ ∈X Now we consider persistent cohomology. The diagram defined by persistent coho- mology is identical to the diagram by persistent homology since we use a field as a coefficient, but the representative cocycle is, of course, different. The representative ∗ ∗ ∗ cocycle corresponding to (2, 7) is σ + σ + σ . 1 2 3 From the observation of the example, we find that the representative cocycle is the “cut” of the loop. This means that any representative cycle corresponding to the birth-death pair disappears if we remove all 1-simplices in the representative cocycle from the 1-skeleton of X as shown in Fig. 17a. Using this fact, we can compute representative cycles using the shortest path algo- rithm. Let C be {σ ,σ ,σ }, the set of all 1-simplices in the representative cocycle. 1 2 3 (1) (1) (1) Now X is divided into two parts, one is X \C and another is X ∩ C ={σ }. 2 2 2 (1) Then we compute the shortest path in X \C whose two endpoints are the endpoints of σ . The result is shown in Fig. 17b. 123 I.Obayashi Fig. 16 Example filtration for reconstructed shortest cycles Fig. 17 a Representative (a) (b) (c) (d) cocycle, b Reconstructed shortest cycle for X c for X d 2 4 for X (1) We can extend the idea to compute tighter cycles. We can similarly divide X into (1) (1) (1) X \C and X ∩ C ={σ ,σ }. For each σ ∈ X ∩ C, we compute the shortest 1 2 4 4 4 (1) path in X \C whose two endpoints are the endpoints of σ . The results are (b) and (c) in Fig. 17, and we choose (c) as the shortest loop in these loops. Using X instead of X , we compute an even tighter loop (d). This is the mechanism of the reconstructed shortest cycles. The algorithm is summarized as follows. 1. The 1st PD, PD (X) for a filtration X : X ⊂ ··· ⊂ X , is computed using 1 1 N persistent cohomology. The representative cocycles are also computed 2. We choose a birth-death pair (b, d), and take the corresponding representative cocycle σ σ ∈C 3. We also choose k between b and d (1) (1) 4. For each σ ∈ X ∩ C, we compute the shortest path in X \C, P(σ ), whose two k k endpoints are the endpoints of σ 5. We search the shortest one from {P(σ )} σ ∈C The mathematical justification of the algorithm is not yet given, but empirically it works. A deeper analysis of the algorithm is beyond the scope of this paper. References Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Hanson, E., Motta, F., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent homology. J. Mach. Learn. Res. 18(8), 1–35 (2017) Bauer, U., Lesnick, M.: Induced matchings of barcodes and the algebraic stability of persistence. In: Pro- ceedings of the Thirtieth Annual Symposium on Computational Geometry. Association for Computing Machinery, New York, NY, USA, SOCG’14, p 355-364 (2014). 10.1145/2582112.2582168 Bendich, P., Bubenik, P., Wagner, A.: Stabilizing the unstable output of persistent homology computations. J. Appl. Comput. Topol. 4, 309–338 (2020). https://doi.org/10.1007/s41468-019-00044-9 Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16(3), 77–102 (2015) Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46, 255–308 (2009). https://doi.org/10.1090/S0273- 0979-09-01249-X 123 Stable volumes for persistent homology Carlsson, G., Ishkhanov, T., de Silva, V., Zomorodian, A.: On the local behavior of spaces of natural images. Int. J. Comput. Vis. 76(1), 1–12 (2008). https://doi.org/10.1007/s11263-007-0056-x Chan, J.M., Carlsson, G., Rabadan, R.: Topology of viral evolution. Proc. Natl. Acad. Sci. 110(46), 18566– 18571 (2013). https://doi.org/10.1073/pnas.1313480110 Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L. J., Oudot, S. Y.: Proximity of persistence mod- ules and their diagrams. In: Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry. Association for Computing Machinery, New York, NY, USA, SCG ’09, p 237-246 (2009). 10.1145/1542362.1542407 Chen, C., Freedman, D.: Hardness results for homology localization. Discret. Comput. Geom. 45(3), 425– 448 (2011). https://doi.org/10.1007/s00454-010-9322-8 Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discret. Comput. Geom. 37(1), 103–120 (2007). https://doi.org/10.1007/s00454-006-1276-5 Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C.: Introduction to algorithms, Fourth Edition. MIT press (2022) Delfinado, C.J.A., Edelsbrunner, H.: An incremental algorithm for betti numbers of simplicial complexes on the 3-sphere. Comput. Aided Geom. Des. 12(7), 771–784 (1995). https://doi.org/10.1016/0167- 8396(95)00016-Y Dey, T.K., Hirani, A.N., Krishnamoorthy, B.: Optimal homologous cycles, total unimodularity, and linear programming. SIAM J. Comput. 40(4), 1026–1044 (2011). https://doi.org/10.1137/100800245 Dey T. K., Hou T., Mandal S.: Persistent 1-cycles: Definition, computation, and its application. In: Marfil R., Calderón M., Díaz del Río F., Real P., Bandera A.: (eds) Computational Topology in Image Context. Springer International Publishing, Cham, pp 123–136 (2019) Edelsbrunner, H., Harer, J.: Computational topology: an introduction. American Mathematical Soc. (2010) Edelsbrunner, H., Mücke, E.P.: Three-dimensional alpha shapes. ACM Trans. Graph 13(1), 43–72 (1994). https://doi.org/10.1145/174462.156635 Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discret. Com- put. Geom. 28(4), 511–533 (2002). https://doi.org/10.1007/s00454-002-2885-2 Erickson J., Whittlesey K.: Greedy optimal homotopy and homology generators. In: Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, SODA ’05, pp 1038–1046 (2005) Escolar, E. G., Hiraoka ,Y.: Optimal Cycles for persistent homology via linear programming, Springer Japan, Tokyo, pp 79–96 (2016), 10.1007/978-4-431-55420-2_5 Hiraoka, Y., Nakamura, T., Hirata, A., Escolar, E.G., Matsue, K., Nishiura, Y.: Hierarchical structures of amorphous solids characterized by persistent homology. Proc. Natl. Acad. Sci. 113(26), 7035–7040 (2016). https://doi.org/10.1073/pnas.1520877113 Hirata, A., Wada, T., Obayashi, I., Hiraoka, Y.: Structural changes during glass formation extracted by computational homology with machine learning. Commun. Mater. 1(1), 98 (2020). https://doi.org/10. 1038/s43246-020-00100-3 Iuricich, F.: Persistence cycles for visual exploration of persistent homology. IEEE Trans. Vis. Comput. Graph. 28(12), 4966–4979 (2022). https://doi.org/10.1109/TVCG.2021.3110663 Kurlin, V.: A one-dimensional homologically persistent skeleton of an unstructured point cloud in any metric space. Comput. Graph. Forum 34(5), 253–262 (2015) Kusano, G., Fukumizu, K., Hiraoka, Y.: Kernel method for persistence diagrams via kernel embedding and weight factor. J. Mach. Learn. Res. 18(189), 1–41 (2018) Lesnick, M.: The theory of the interleaving distance on multidimensional persistence modules. Found. Comput. Math. 15(3), 613–650 (2015). https://doi.org/10.1007/s10208-015-9255-y Obayashi, I.: Volume optimal cycle: Tightest representative cycle of a generator on persistent homology (2017). Preprint version of Obayashi (2018), arXiv:1712.05103 Obayashi, I.: Volume-optimal cycle: Tightest representative cycle of a generator in persistent homology. SIAM J. Appl. Algebra Geom. 2(4), 508–534 (2018). https://doi.org/10.1137/17M1159439 Obayashi, I., Nakamura, T., Hiraoka, Y.: Persistent homology analysis for materials research and persistent homology software: Homcloud (2021). arXiv:2112.03610 Onodera, Y., Kohara, S., Tahara, S., Masuno, A., Inoue, H., Shiga, M., Hirata, A., Tsuchiya, K., Hiraoka, Y., Obayashi, I., Ohara, K., Mizuno, A., Sakata, O.: Understanding diffraction patterns of glassy, liquid and amorphous materials via persistent homology analyses. J. Ceram. Soc. Jpn. 127(12), 853–863 (2019). https://doi.org/10.2109/jcersj2.19143 123 I.Obayashi Rouxa, S.L., Petkova, V.: Isaacs - interactive structure analysis of amorphous and crystalline systems. J. Appl. Crystallogr. 43(1), 181–185 (2010). https://doi.org/10.1107/S0021889809051929 Saadatfar, M., Takeuchi, H., Robins, V., Francois, N., Hiraoka, Y.: Pore configuration landscape of granular crystallization. Nat. Commun. (2017). https://doi.org/10.1038/ncomms15082 Schweinhart, B.: Statistical topology of embedded graphs. PhD thesis, Princeton University (2015). https:// web.math.princeton.edu/~bschwein/ Smith, P., Kurlin, V.: Skeletonisation algorithms with theoretical guarantees for unorganised point clouds with high levels of noise. Pattern Recognit. 115, 107,902 (2021). https://doi.org/10.1016/j.patcog. 2021.107902 Suzuki, A., Miyazawa, M., Minto, J.M., Tsuji, T., Obayashi, I., Hiraoka, Y., Ito, T.: Flow estimation solely from image data through persistent homology analysis. Sci. Rep. 11(1), 17–948 (2021). https://doi. org/10.1038/s41598-021-97222-6 Tahbaz-Salehi, A., Jadbabaie, A.: Distributed coverage verification in sensor networks without location information. In: 2008 47th IEEE Conference on Decision and Control, pp 4170–4176 (2008). https:// doi.org/10.1109/CDC.2008.4738751 Zomorodian, A., Carlsson, G.: Computing persistent homology. Discret. Comput. Geom. 33(2), 249–274 (2005). https://doi.org/10.1007/s00454-004-1146-y Cufar, M.: Ripserer.jl: flexible and efficient persistent homology computation in julia. J. Open Sour. Softw. 5(54), 2614 (2020). https://doi.org/10.21105/joss.02614 Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Journal of Applied and Computational Topology – Springer Journals
Published: Dec 1, 2023
Keywords: Persistent homology; Topological data analysis; Computational algebraic topology; Optimization on homology algebra; 55N31; 62R40; 55-04; 55-08
You can share this free article with as many people as you like with the url below! We hope you enjoy this feature!
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.