Get 20M+ Full-Text Papers For Less Than $1.50/day. Subscribe now for You or Your Team.

Learn More →

Computing the Matching Distance of 2-Parameter Persistence Modules from Critical Values

Computing the Matching Distance of 2-Parameter Persistence Modules from Critical Values The exact computation of the matching distance for multi-parameter persistence modules is an active area of research in computational topology. Achieving an easily obtainable exact computation of this distance would allow multi-parameter persistent homology to be a viable option for data analysis. In this paper, we provide theoretical results for the computation of the matching distance in two dimensions along with a geometric interpretation of the lines through parameter space realizing this distance. The crucial point of the method we propose is that it can be easily implemented. 1 Introduction Persistent Homology is known as one of the most-often used tools in Topological Data Analysis, allowing one to use topological invariants to understand the underlying shape of data. In particular, single pa- rameter persistence yields a summary of data through a one-dimensional ltration, allowing an overview of the data at many di erent scales. Single parameter persistent homology has been the subject of much study and has proven to be useful in many applications [2, 3, 15, 21, 24]. However, some data requires ltration along multiple parameters to fully capture its information: this is the role of multi-parameter persistent homology [9, 8]. In some contexts, it can be helpful to use multiple parameters to capture the details of the data [10, 16, 22, 23, 25]. Additionally, single parameter persistent homology is not robust to outliers in a point cloud; these outliers can lead to a misinterpretation of the persistent homology, with the unnecessary creation or destruction of homology classes. Multi- parameter persistence can be a natural x to this problem by adding a dimension depending on the density of the samples (see, e.g., [14, 7]). Unfortunately, understanding, visualizing, and computing invariants in multi-parameter persistent homology remains a dicult task both mathematically and computationally. This diculty holds as well when it comes to computing distances between such invariants. In the single parameter case, there are several ways to compare persistent homology modules | such as the bottleneck distance and Wasserstein distances | which exhibit some stability properties with respect to variations in the input [13]. For more than one parameter, there are also several de nitions of distances between persistence modules. Amongst them, the matching distance [4] attracts the attention of multi-parameter persistence practitioners. Australian National University, Canberra, Australia Boston College, Massachusetts, United States EPFL, Lausanne, Switzerland Universit a di Modena e Reggio Emilia, DISMI, Italy KTH Royal Institute of Technology Stockholm, Sweden Institute of Science and Technology Austria, Klosterneuburg, Austria arXiv:2210.12868v1 [math.AT] 23 Oct 2022 To compute the matching distance between two n-parameter persistence modules, one uses the fact that restricting an n-parameter ltration to any ane line of positive slope through the parameter space yields a 1-parameter ltration. There is therefore a corresponding restriction of the n-parameter persistence module to a 1-parameter module along that line. This construction allows for the knowledge and computational methods from the 1-dimensional case to be applied to the n-dimensional case. Indeed, following this idea, the matching distance is de ned as a supremum of the one-dimensional bottleneck distance, over the collection of all lines of positive slope in the parameter space, i.e., L L L d (M; N ) := sup m ^  d (dgm M ; dgm N ); match B L : u=sm ~ +b where m ^ is a normalization term explained in Section 2. However, computing exactly this distance is not an easy task given the nature of its de nition. As a rst step towards an exact computation, several approximations of this distance have been provided for 2-dimensional persistence modules [5, 19]. The exact computation of the matching distance is currently only possible for 2-dimensional modules [17], with recent computational improvements in terms of time complexity [6]. These methods exploit the duality of points and lines in the plane. In this paper, we propose a method to compute the matching distance based on a re nement of the framework developed in [1], in which we compute the rank invariant of a multi-persistence module from a nite set of critical parameter values. These critical parameter values capture all the changes in homology occurring throughout the multi- ltration. They may also be used to partition the set of positive lines in the parameter space into equivalence classes, where each equivalence class maintains the same birth and death ordering within the restricted module for all possible homology classes. Based on these results, we formulate the idea that the critical values must be relevant to the choice of lines for the computation of the matching distance. Leveraging the framework of [1] to compute the matching distance allows us to reduce the number of lines necessary to compute the matching distance to a nite set, thus reducing the computation to a maximum rather than a supremum without exploiting the point-line duality used in [17]. Through some examples, we show that considering only lines passing through pairs of critical values is not sucient. This is because the de nition of matching distance depends on the bottleneck distance, and lines in the same equivalence class might not be such that the bottleneck distance is always achieved by the same pairing of births/deaths, as will be further explained in Section 3. To overcome this problem, we analyze where switches might happen in the matching which achieves the bottleneck distance, identifying a set of points in the parameter space which we later on refer to as the set of switch points. This set allows us to re ne our equivalence relation on the set of positive lines (see Section 4) by partitioning lines through the parameter space using the union of the critical parameter values with the switch points. Using this union, it is possible to identify all the lines at which the matching distance can be realised. We explain in detail how to compute all relevant parameter values and show that the matching distance is attained either at: a line through a pair of points in this union, or a line of diagonal slope through exactly one of the points in the union (see Theorem 3.1). The contribution of this paper is to advance the state of the art, although still restricted to two dimen- sions, in two ways: computing the matching distance in a way which is both geometrically interpretable and implementable. In contrast to other methods such as [17, 6], we provide a geometric understanding of di erent lines, including horizontal, vertical, and diagonal lines, as well as lines through critical values, and their contribution to the matching distance. In addition, the method we propose leads to explicit computations of important lines, which will lead to an algorithm that can be implemented. The paper is structured as follows. In Section 2, we give necessary background from multi-parameter persistent homology, as well as some of the concepts coming from the framework developed in [1]. We also provide some examples of two-parameter persistence modules for which we explicitly show which lines are necessary to compute the matching distance. In Section 3, we provide the statement and proof of our main theoretical result Theorem 3.1 and include a discussion on the role of vertical and horizontal lines. In Section 4, we provide guidelines for the computation of the switch points, which are necessary to the exact computation of the matching distance. Finally, in Section 5, we discuss future direction of study. 2 Background While we will subsequently specialize to the case of n = 2 parameters, we begin with some general de nitions to set the scene. 2 2.1 Notation and de nitions De nition 2.1 (Parameter Spaces). Let n 2 N. If n = 1, endow R with the usual order . If n > 1, n n take the following partial order on R : for u = (u ); v = (v ) 2 R , set u  v if and only if u  v for i i i i i = 1; : : : ; n. The poset (R ;) is called an nD parameter space. For n > 1, the nD parameter space can be sliced by lines with positive slope. A line L  R is a 1D parameter space when considered parameterized by s 2 R as L : u = ms ~ + b n n where m ~ 2 R and b 2 R . L has positive slope if each coordinate m of m ~ is positive: m > 0. i i Throughout the paper, we consider the following standard normalisation for parameterizations of lines: km ~ k := maxfm j i = 1; : : : ; ng = 1; b = 0 (1) 1 i i i=1 This normalization is the one used in papers such as [11, 20]. Other choices are possible. Changing the normalization would impact all the intermediate computations that follow but not the overall results. De nition 2.2 (Persistence Modules). Let F be a xed eld. An (n-parameter) persistence module M n n over the parameter space R is an assignment of an F-vector space M to each u 2 R , and linear maps, called transition or internal maps, i (u; v) : M ! M to each pair of points u  v 2 R , satisfying the M u v following properties: • i (u; u) is the identity map for all u 2 R . • i (v; w) i (u; v) = i (u; w) for all u  v  w 2 R . M M M In this paper, persistence modules will always assumed to be nitely presented. To x the ideas, we will assume that they are obtained applying homology in a xed degree, say q, over a xed coecient eld F, to a tame and one-critical (cf. [1]) n-parameter ltration K of a nite simplicial complex: M = H (K;F). In the particular case when n = 1, a nitely presented persistence module M can be uniquely decomposed as a nite sum of interval modules [26]. De nition 2.3 (Interval Module). An interval module is a 1-parameter persistence module of the form I [b; d) with b < d  1 2 R := R[f1g such that, for every s  t 2 R, F if b  s < d id if b  s  t < d I [b; d) = ; I [b; d)(s  t) = 0 otherwise 0 otherwise The interval [b; d) can be represented as a point (b; d) in R R, above the diagonal. This encoding permits de ning persistence diagrams. De nition 2.4 (Persistence Diagram). If M =  I [b ; d ) , then the persistence diagram of M , j2J j j denoted dgm M , is the nite multi-set of points (b ; d ) of R R with multiplicity m for j 2 J . j j j We now review the de nitions of bottleneck distance between persistence diagrams and matching distance between two n-parameter persistence modules M and N . When n = 1, the bottleneck distance d is an easily computable extended metric on persistence diagrams [18], de ned as follows. De nition 2.5 (Bottleneck Distance). Let M , N be two 1-parameter persistence modules, with persis- tence diagrams dgm M and dgm N . A matching between dgm M and dgm N is a multi-bijection  : P ! Q between the points of a multi-subset P in dgm M and a multi-subset Q in dgm N . The cost of a matching , denoted c(), is the greatest among the costs of each matched pair of points, taken to be equal to the ` distance of the corresponding pair of points in R  R, with the convention that 11 = 0 and 1 a = 1 for every a 2 R, and the costs of each unmatched point, 1 2 taken to be equal to the ` distance of that point to the diagonal of R : p p 2 1 c() := max maxkp (p)k ; max : p2P p2=P Q 2 3 The bottleneck distance d is de ned as the smallest possible cost of any matching  between dgm M and dgm N : d (dgm M; dgm N ) := min c(): :P!Q Pdgm M;Qdgm N 2 0 Observe that a matching  that pairs a point (b; d) 2 R with a point (b ;1) 2 R R has cost equal to in nity. Hence, the bottleneck distance between two persistence diagrams with a di erent number of points at in nity cannot be nite. On the contrary, matching points of the same type always gives a nite (more convenient) cost that can be expressed as follows. Remark 2.6. By de nition, the cost c() of  : P ! Q takes one of the two forms. • If c() is realized by matching p 2 P to q = (p), then c() = jstj, where s and t are either both abscissas or both ordinates of p and q, respectively; jstj • If c() is realized by some p not in P Q, then c() = , where s and t are the abscissa and the ordinate of p. jstj Brie y, we can say that c() is attained by some , with  2 f1; 2g and s; t coordinates of points in dgm M [ dgm N . When the number of parameters is n  2, we can use the bottleneck distance to de ne an extended pseudo-metric between persistence modules M and N via restrictions to lines with positive slope. n L The restriction of M to a line of positive slope L  R is the persistence module M that assigns L L M to each u 2 L, and whose transition maps i L (u; v) : (M ) ! (M ) for u  v 2 L are the same u u v as in M . Once a parameterization u = ms ~ + b of L is xed, the persistence module M is isomorphic to the 1-parameter persistence module, by abuse of notation still denoted by M , that • assigns to each s 2 R the vector space (M ) := M , and s u L L • whose transition map between (M ) and (M ) for s  t is equal to that of M between M and s t u M with u = ms ~ + b and v = mt ~ + b. De nition 2.7 (Matching Distance). The matching distance between M and N is de ned by L L L d (M; N ) := sup m ^  d (dgm M ; dgm N ) match B L : u=sm ~ +b where m ^ := minfm j i = 1; : : : ; ng > 0 is the minimal coordinate of the direction vector m ~ of L, and L varies among all the lines of R with positive slope, taken with standard normalization. The role of the weight m ^ in the de nition of the matching distance is that of ensuring stability of persistence modules against perturbations of the ltrations that originate them (cf. [11, 20]). Such weight would vanish for lines parallel to the coordinate axes. Therefore, such lines are not considered in the de nition of the matching distance. In the context of the matching distance between multi-parameter persistence modules, it is more L L convenient to include the factor m ^ when computing the cost of a matching  between the points of L L a multi-subset in dgm M and the points of a multi-subset in dgm N for a xed line L with positive slope. Therefore, we introduce the following notation: L L L cost( ) := m ^ c( ) : Examples of computation of the matching distance in some simple cases are given in Section 2.3. By de nition, the matching distance is computed by taking the supremum of the bottleneck distance over all (in nitely many) lines with positive slope through the parameter space. However, in [1], we have shown that this set of lines may be partitioned into a nite number of equivalence classes. For the two-parameter case (i.e., for n = 2), we will use these equivalence classes to generate a nite set of lines from which one may compute the matching distance. Therefore, we introduce notation and de nitions necessary to understand these equivalence classes below. Henceforth in this paper, we always assume n = 2, which is the case for which we have extremely explicit results. We introduce the following de nitions (with associated gures) following [1]. 4 2 De nition 2.8 (Positive Cone). For every point u in R , let S (u) be the positive cone with vertex u: S (u) := fv 2 R : u  vg : The boundary of the positive cone, @S (u), decomposes into open faces. In particular, @S (u) can + + be partitioned by non-empty subsets A of f1; 2g in the following way. For ; =6 A  f1; 2g, de ne fug if A = f1; 2g ; S (u) := f(x ; x ) 2 R j x = u ; x > u g if A = f1g ; 1 2 1 1 2 2 f(x ; x ) 2 R j x > u ; x = u g if A = f2g : 1 2 1 1 2 2 f1g S (u) u = S S f1;2g f2g Figure 1: The positive cone S (u) of u 2 R and the decomposition of its boundary into S , S and f1g f2g S , which correspond respectively to the vertical boundary, horizontal boundary, and u. f1;2g De nition 2.9. A line L with positive slope intersects @S (u) in exactly one point, which we call the push of u onto L, and denote by push (u): push (u) := L\ @S (u) : push (u) Figure 2: The push of u along the line L. In this case, A = f2g, and so u is to the left of L. Because of the partition of @S (u) by its open faces, there is a unique non-empty subset of f1; 2g, L L denoted A , such that push (u) = L \ S L (u). Concretely, in the plane, A = f1g means that u is u L A u L L strictly to the right of L, A = f2g means that u is strictly to the left of L, A = f1; 2g means that u is u u L L on L. More generally, when f1g  A , we say u is to the right of L, and when f2g  A , we say u is to u u the right of L, allowing for the fact that it may be true in either of these cases that u is on L. 0 2 We say that two lines L; L  R with positive slope have the same reciprocal position with respect L L 2 0 0 to u if and only if A = A . Given a non-empty subset U of R , we write L  L if L and L have u u the same reciprocal position with respect to u for all u 2 U . Note that  is an equivalence relation on lines with positive slope in R [1]. 5 In this paper, it is necessary for us to extend this equivalence relation to include points in the 2 2 2 projective completion of the real plane, P = R [ ` , with ` the line at in nity of P . Points on the 1 1 line at in nity are given by homogeneous coordinates [x : x : x ] with x = 0: Note that a line with 0 1 2 0 positive slope in R will intersect ` at exactly one point [0 : v : v ] with ~v = (v ; v )  0 giving the 1 1 2 1 2 direction of the line. Given a point v = [0 : v : v ] 2 ` with v ; v > 0, de ne 1 2 1 1 2 >fvg if A = f1; 2g ; S (v) := f[0 : x : x ] 2 ` j x = v ; x > v g if A = f1g ; A 1 2 1 1 1 2 2 f[0 : x : x ] 2 ` j x > v ; x = v g if A = f2g : 1 2 1 1 1 2 2 2 2 L For every such v 2 P and line L  R with positive slope, de ne A to be the largest subset of f1; 2g 0 2 such that L\ S L 6= ;. As above, we say that two lines L; L in R with positive slope are in the same L L reciprocal position with respect to v if and only if A = A . v v We then extend the equivalence relation on lines in R with positive slope de ned in [1] as follows: two lines L and L are equivalent if and only if they are in the same reciprocal position with respect to a nite set U of points in P : 0 L L L  L if and only if A = A 8u 2 U : u u We introduce the following notation for simplicity. For a line L  R parameterized as ms ~ + b, and for u 2 R , we set p (u) to be the unique real number such that push (u) = m ~  p (u) + b: In other words, p (u) is the parameter value of the push of u to L. We note that p (u) depends on L L the chosen parameterization of the line L. However, we implicitly assume that the parameterization we choose is the standardly normalized one. The formula for the quantity p (u) depends on whether u lies to the left (i.e. A = f2g) or to the right (i.e. A = f2g) of the line L. If u lies to the left of L, then u b 2 2 p (u) = : (2) m m 2 2 If u lies to the right of L, then u b 1 1 p (u) = : (3) m m 1 1 If u lies on L, then u b u b 1 1 2 2 p (u) = = : (4) m m m m 1 1 2 2 Note that, thanks to the standard parametrization of a line L, pushing any two points u; v 2 R onto L yields to parameter values p (u); p (v) 2 R such that L L push (u) push (v) if L has slope greater than 1, 2 2 L L p (u) p (v) = (5) L L push (u) push (v) otherwise. 1 1 L L 2.2 Working assumptions Our goal in this paper is to identify a nite set of lines with positive slope from which to compute the matching distance between two nitely presented persistence modules, M and N . Our strategy is that of partitioning the set of lines with positive slope into equivalence classes where it is easy to understand how the cost of a matching between persistence diagrams along lines changes when moving such lines around, for example by translations and rotations. To this end, we start with requesting that the following property holds for all the n-parameter persistence modules M considered hereafter. Property (P). There exists a nite set C of points in R , called critical values, for which the ranks of transition maps in the n-parameter persistence module M are completely determined the ranks of transition maps between values in C , the closure of C under least upper bound. More speci cally, M M for any u; v 2 R , rank i (u; v) = rank i (u;  v ) M M 6 with 0 0 u = maxfu 2 C ju  ug; 0 0 v = maxfv 2 C jv  vg 0 0 if fu 2 C ju  ug is non-empty, and rank i (u; v) = 0 otherwise. Note that C is empty if and only M M M if M is trivial. In the case of a nitely presented persistence module M , one such set C exists and is given by the set of grades of its generators and relators. More concretely, property (P) is satis ed by modules M obtained by taking the persistent homology over a led F of simplicial complexes K ltered by a tame and one-critical ltration K = fK g n : M = H (K;F). Indeed, xing a discrete gradient vector eld u2R q V consistent with the ltration K and de ning C to be the set of entrance values of simplices  that are critical in the gradient vector eld V , M satis es property (P) with set of critical values given by C as proved in [1][Thm. 1]. If both M and N have property (P ), then we may partition the set of lines with positive slope in n 0 R via the equivalence relation  . For any two lines L  L , the persistence diagrams C [C C [C M N M N 0 0 L L L L dgm M and dgm M (respectively dgm N and dgm N ) are in bijection [1][Thm. 2]. As we will see in Lemma 3.2 later, these bijections will extend to a bijection, denoted 0 , between L;L 0 0 L L L L the sets of matchings that are used to compute both d (M ; N ) and d (M ; N ) . Moreover, the B B bijection 0 can be used to compute the cost of a matching for line L from the cost of the corresponding L;L matching for line L, provided the lines are in the same equivalence class. L L Ideally, one would like to use 0 to show that a line L on which d (M ; N ) attains a maximum L;L B is a line which passes through two points in C [ C : However, this is not the case, as Examples 2.12 M N and 2.13 show. The reason is that, within an  -equivalence class, it is possible that there is no C [C M N \winning" matched pair of points or \winning" matching that works for all lines within that equivalence class, i.e., the winning matching may \switch" within the same  equivalence class. C [C M N Therefore, the equivalence relation  must be re ned, in order to generate equivalence classes C [C M N in which there exists at least one matching and matched pair which computes the bottleneck distance for all lines in a given equivalence class. This requires the addition of switch points: points ! in the projective completion for which the cost of matching some pair u; v equals the cost of matching some other pair w; x, with u; v; w; x 2 C [ C for any line through !. M N So, with the goal of computing the matching distance using the method described above in the case n = 2, we introduce another property, called (Q), that pairs M; N of 2-parameter persistence modules are required to satisfy, involving a set of switch points. Property (Q). There exists a nite set of points (M; N ) in P such that for all u; v; w; x 2 C , not necessarily distinct, and ;  2 f1; 2g, also not necessarily distinct, the sign (positive, negative or null) of jp (u) p (v)j jp (w) p (x)j L L L L (u; v; w; x; ; ) := ; remains the same for all lines L in the same equivalence class with respect to C [ C [ M N Once the equivalence relation  is re ned to  , a winning matching  and a winning C [C C [C [ M N M N matched pair u; v 2 C [C can be determined for each equivalence class. Indeed, Property (Q) implies M N the following property (Q'), as Proposition 2.10 will prove. Property (Q'). There exists a nite set of points (M; N ) in P with the following property. Let L be any line, and let  be an optimal matching with respect to computing the bottleneck distance L L between dgm M and dgm N . Suppose that the cost c() of the matching  is realised by the pushes of u; v 2 C [ C onto L. Then, if L is any other line in the same equivalence class as L with respect to M N C [ C [ , the matching 0 () is an optimal matching with respect to computing the bottleneck M N L;L 0 0 L L distance between dgm M and dgm N . Moreover, its cost c( ()) is realised by the pushes of u L;L and v onto L . In other words, 0 0 jp (u) p (v)j jp (u) p (v)j L L L L c() = () c( ()) = : L;L Proposition 2.10. Property (Q) implies Property (Q'). 7 L L Proof. Choose a line L and let  be a matching between dgm M and dgm N that has cost jp (u) p (v)j L L c() = for some u; v 2 C [ C and  2 f1; 2g. M N Let L be a line that is equivalent to L with respect to C [ C [ . Property (Q) implies that M N jp (u) p (v)j L L c( ()) = : L;L This tells us that within an equivalence class under  , if the pair u; v 2 C [ C obtains the M N C [C [ M N cost of , then that pair also obtains the cost of (), although it may not be the unique such pair L;L to do so. Now suppose that  is the optimal matching with respect to computing the bottleneck distance L L between dgm M and dgm N . Let u ; v 2 C [ C and  2 f1; 2g such that M N jp (u ) p (v )j L  L c( ) = : L L Since  is the optimal matching, for any other matching  between dgm M and dgm N , it is true that c( )  c(). Therefore, again by property (Q), we have c( 0 ( ))  c( 0 ()): L;L L;L Therefore ( ) is a (possibly not unique) matching that obtains the bottleneck distance between L;L 0 0 L L dgm M and dgm N , and u ; v 2 C [ C is a (possibly not unique) pair of critical values which M N obtains the cost of ( ). L;L Finally, because Example 2.12 shows that the diagonal direction is also needed, we set to be the union of with the singleton f[0 : 1 : 1]g. This will allow for the computation of the matching distance between M and N from lines through two points in C [ C [ via Theorem 3.1, assuming that M M N and N satisfy properties (P) and (Q). As mentioned above, all nitely presented persistence modules satisfy property (P). In Section 4, we provide the theoretical justi cation also for the existence of the set ensuring property (Q). 2.3 Examples We now give some simple examples of computation of the matching distance in the case of two parameters. The considered persistence modules will be direct sums of rectangle modules, i.e. persistence module that are trivial outside a supporting rectangle [u ; v )  [u ; v ), with u = (u ; u ), v = (v ; v ) 2 R , 1 1 2 2 1 2 1 2 and equal to F with all the transition maps equal the identity on the supporting rectangle with bottom left corner u and a top right corner v. The following two examples, illustrated in Figure 3 and in Figure 4, respectively, show that is does not suce to consider only lines with diagonal direction or only lines though critical values of the modules. Example 2.11. Let M be the rectangle decomposable module given as the direct sum of two rectangle modules, one with support [0; 7) [0; 7), the other one with support [0; 7) [4; 11). Let N be the rectangle decomposable module given as the direct sum of the rectangle modules with support [0; 7) [0; 11) and [0; 7) [4; 7). L L For diagonal lines L with equation y = x + q for q  1 or q  3, we have that M = N so that L L d (M ; N ) = 0. For 0 < q < 4, a straightforward computation shows that the maximum bottleneck distance achieved is for q = 2, where the cost is 2. However, for the line L through the points (0; 0) L L L 7 and (7; 11), the bottleneck distance is d (M ; N ) = 4 and m ^ = , so that the matching distance between M and N is at least . This example shows that diagonal lines do not suce. Example 2.12. Let M be the rectangle module with support [2;1) [2; 7), and let N be the rectangle module with support [2;1) [2; 10). For lines L with equation y = mx + q with m  1 and q  2, we get weighted bottleneck distance 5, thus achieving the matching distance, because the weighted distance is smaller than 5 for all other lines. In particular, the line through (2; 2), point in C \C , with diagonal direction achieves the matching M N distance between M and N . 8 Figure 3: The persistence modules considered in Example 2.11: The matching distance is not attained at any diagonal line. Figure 4: The persistence module considered in Example 2.12: The matching distance is not attained at any line through two critical points. Symmetrically, if we consider the rectangle persistence modules M = [2; 7) [2;1) and N = [2; 10) [2;1), the situation is reversed, that is, the matching distance is realised by lines of slope greater or equal than 1. Note that the only slope that achieves the matching distance in both cases is 1. Also note that in neither of these two examples is it possible to achieve the matching distance by a line through two points in C [ C or their closure with respect to the least upper bound. M N The next example, illustrated in Figures 5 and 6, shows that not even considering both lines through critical values or their least upper bounds and diagonal lines ensures we achieve the matching distance. This fact will motivate the introduction of a further set of points, called switch points, in Section 4. Example 2.13. Let M be the rectangle decomposable module which is the direct sum of two rectangle modules, one with underlying rectangle [0; 7)  [0; 8), the other one with underlying rectangle [0; 7) [4; 11). Let N be the rectangle decomposable module sum of two rectangle modules whose underlying rectangles are [0; 7) [4; 8) and [0; 7) [0; 11). L L For diagonal lines L with equation y = x + q for q  1 or q  4, we have that M = N so that L L d (M ; N ) = 0. For 1 < q < 4, straightforward computation shows that the maximum bottleneck 3 0 7 distance achieved is for q = 2:5, where the cost is . However, for the line L through the points ( ; 6) 2 2 L L L 7 and (7; 11), the bottleneck distance is d (M ; N ) = 3 and m ^ = , so that the matching distance between M and N is at least . Out of the lines through two points in the closure with respect to L L lowest upper bound of C [ C , the only line that does not give M = N is the line through (0; 0) M N and (7; 11), which has weighted bottleneck distance . This example shows that it does not suce to consider only diagonal lines and lines through two points in the closure with respect to lowest upper bound of C [ C . M N 9 Figure 5: The persistence modules considered in Example 2.13: Neither lines through critical values nor diagonal lines suce to achieve the matching distance. 3 Main result The goal of this section is to prove the following main result. Theorem 3.1. Let M; N be two 2-parameter persistence modules. If M; N are both trivial, then d (M; N ) = 0. Otherwise, assume that M and N satisfy properties (P) and (Q). Let C and match M C be nite sets of critical values in R as ensured by Property (P), and let (M; N ) be a nite set of points in P as ensured by property (Q). Then the matching distance between M and N is achieved by a line through two (not necessarily unique) points in C [ C [ , with (M; N )[f[0 : 1 : 1]g. M N In particular, since nitely presented persistence modules satisfy properties (P) and (Q), Theorem 3.1 holds for such persistence modules. We prepare the proof of this theorem, given in Section 3.3, showing some preliminary claims in which the set is written as the union of its proper part R and its part at in nity ` . P 1 1 3.1 Preliminary claims 0 2 0 0 0 0 0 Let L and L be lines in R with positive slope parameterized by L : u = ms ~ + b and L : u = m ~ s + b L L 0 respectively. Let S be the set of all matchings between M and N , and let S be the set of all matchings 0 0 L L between M and N . 0 0 Lemma 3.2. If L  L , then there exists a bijection between S and S . L;L C [C M N 0 0 0 Proof. Since L  L , we know that L  L and L  L . Therefore, by our previous C [C C C M N M N 0 0 L L L L paper [1], there exist multi-bijections : dgm M ! dgm M and : dgm N ! dgm N . For any M N L L 1 matching  : P ! Q in S such that P  dgm M and Q  N , set () =  ( ) . In other L;L N M words, the following diagram commutes: L L dgm M dgm N M N 0 () 0 0 L;L L L dgm M dgm N Since and are bijections, () is easily seen to be a matching of S . Similarly, we could M N L;L 0 1 have just as easily taken a matching  2 S , and created a matching ( )    in S. Therefore, N M establishes a bijection between the sets S and S . L;L More explicitly, the multi-bijection behaves as follows. Let u; v 2 C be such that the pair (p 0 (u); p 0 (v)) 2 dgm M . Then gives a bijection between M L L M the parametrized values of pairs u; v 2 C on L and L, i.e., 1 L (p 0 (u); p 0 (v)) = (p (u); p (v)) 2 dgm M : L L L L Under the assumptions above, since L  L , both lines intersect the same face of the positive cone 0 0 0 of u. This means that there exists some i 2 f1; 2g such that (m ~  p (u) + b) = (m ~  p (u) + b ) = u . L i L i i This enables us to solve for p (u) from p (u): L L 10 Figure 6: Weighted bottleneck distance across the line parameters. On the horizontal axis we have the values of the angle , giving the slope of the line. On the vertical axis we have the parameter b giving the origin of the line. The red cross corresponds to the maximum value over all lines, the black cross the maximum value for lines through pairs of critical points and the blue cross is the maximum value for diagonal lines. This particular example shows that it is not sucient to consider only lines through pairs of critical points or diagonal lines. Indeed the line achieving the matching distance (represented by the red cross), is not diagonal, nor passes through two critical values. 11 0 0 (m ~  p (u) + b) = (m ~  p 0 (u) + b ) , L i L i 0 0 m p (u) + b = m p 0 (u) + b , i L i L i i m b b i i p 0 (u) = p (u) + : (6) L L 0 0 m m i i Similarly, there is some j 2 f1; 2g such that b b j j p 0 (v) = p (v) + : L L 0 0 m m j j Analogously, if u 2 C is such that the pair (p 0 (u);1) 2 dgm M , we can solve for p 0 (u) from M L L p (u) as in (6). L L For the rest of the section, we x a line L with positive slope, we let P  dgm M and Q  dgm N , and we take  : P ! Q to be a matching such that the pair u; v 2 C [ C achieves the cost of : M N jp (u)p (v)j L L cost() = m ^ ; with 1; if cost() is achieved by a matched pair, 2; if cost() is achieved by an unmatched point . Also, for any other line L in the same equivalence class as L, we simply write for the bijection L;L between matchings along L and matchings along L from Lemma 3.2. As we are assuming that M; N satisfy Property (Q), for L equivalent to L via  , gives C [C [ M N a bijection between the matchings of L to those of L and, if u; v achieve the cost of , then u; v also achieve the cost of (). In this situation, the following lemmas tell us how to move a line around in its equivalence class with respect to C [ C [ in order to increase (or at least not decrease) the cost of a matching. M N We rst analyze what happens when we translate a line. 2 L L Lemma 3.3. Let L  R be a line with positive slope and let  : dgm M ! dgm N be a matching jp (u)p (v)j L L such that c() = , with u; v 2 C [ C [ and  2 f1; 2g. Then, for any other line M N L  L, obtained by translating L, the cost of () is given by the following formulas: C [C [ M N jp (u)p (v)j L L L L L L L 1. cost(()) = cost() = m ^ , if A  A or A  A , u v v u jp (u)p (v)+(b b )(1=m +1=m )j L L L 1 1 2 2. cost(()) = m ^ , otherwise. 0 0 0 0 ~0 Proof. Suppose that L is parameterized as ms ~ + b and L as m s + b . Let L be a line that is parallel to and in the same reciprocal position as L with respect to C [ C [ . Since L is parallel to L, we M N L L ~0 have m ~ = m , and in particular, m ^ = m ^ . As a consequence of Property (Q), we see that 0 0 jp (u) p (v)j L L c(()) = : L L 0 0 If either u or v lies on L, meaning A or A = f1; 2g, then L  L and L parallel to L u v C [C [ M N 0 L L implies that L = L , and therefore c(()) = c(). Otherwise, we have that both A and A are strictly u v contained in f1; 2g. We now have two cases. 1. First suppose that L (and hence L ) intersects the same face of the positive cone of u and of v, L L L L that is A  A or A  A . u v v u By Equation (6), this means that there is some i 2 f1; 2g such that: 0 0 m b b b b i i i i i p 0 (u) = p (u) + = p (u) + ; and L L L 0 0 m m m i i 0 0 m b b b b i i i i i p 0 (v) = p (v) + = p (v) + : L L L 0 0 m m m i i 12 We can then compute directly: jp 0 (u) p 0 (v)j L L c(()) = 0 0 jp (u) + (b b )=m p (v) (b b )=m j L i i L i i i i jp (u) p (v)j L L = c(): 2. Now suppose that L (and hence also L ) intersects di erent faces of the positive cones of the points u and v. Without loss of generality, assume that u lies strictly to the right of both L and L , and that v lies strictly to the left. This means that b b p (u) = p (u) + ; L L b b p 0 (v) = p (v) + : L L Recall that because of our standard normalization (see Equation (1)), we have b = b and 1 2 0 0 b = b . So we have 1 2 jp 0 (u) p 0 (v)j L L c(()) = 0 0 jp (u) + (b b )=m p (v) (b b )=m j L 1 1 L 2 2 1 2 jp (u) p (v) + (b b )(1=m + 1=m )j L L 1 1 2 = : Remark 3.4. Since translation preserves the slope of a line, Lemma 3.3 holds with respect to C [C [ M N as well. Lemma 3.5. Let L  R be a line with positive slope not intersecting C [ C [ . For each M N P L L 0 matching  : dgm M ! dgm N , there exists a direction of translation such that if L is a line obtained by translating L in that direction and L  L , then C [C [ M N cost(())  cost(): Proof. With reference to the notations and cases analyzed in Lemma 3.3, we now have two cases: 1. First suppose that L (and hence L ) intersects the same face of the positive cone of u and of v. We then know that cost(()) = cost(). This shows that in this case, translating L in either of the two directions keeps the cost unchanged. Since we are assuming that C [ C [ contains at M N least the proper points u and v, at least one of the two directions of translation guarantees hitting a proper point in C [ C [ M N P 2. Now suppose that L (and hence also L ) intersects di erent positive faces of the points u and v so jp (u)p (v)+(b b )(1=m +1=m )j L L L 1 1 2 that cost(()) = m ^ . Note that m ; m > 0. If p (u)  p (v), 1 2 L L 0 0 then having b > b would imply that cost(())  cost(). If p (v)  p (u), then having b > b 1 L L 1 1 1 would imply that cost(())  cost(). 0 0 0 Note that b > b means that L is a left translate of L, and b > b means that L is a right 1 1 1 1 translate of L. In either case, there is a direction of translation in which the cost does not decrease. We may therefore translate L, increasing c(), until L hits a point in C [ C [ . Since u and M N v lie on opposite sides of L, we are guaranteed to hit a proper point in C [ C [ M N P Lemma Lemma 3.3 is used in the following way. 13 Remark 3.6. For each line L with positive slope that does not intersect any point of C [ C [ M N P there is a direction of translation such that • translating L in that direction does not decrease the weighted cost cost(), • during the translation the line remains in the same equivalence class as L with respect to C [ C [ , until • eventually, the translated line hits a proper point in C [ C [ R . M N P Let us now study what happens when we rotate a line. Lemma 3.7. Let L be a line through a point r 2 R with direction m ~ = (m ; m ) (with m ; m > 0), 1 2 1 2 jp (u)p (v)j L L L L and let  : dgm M ! dgm N be a matching such that c() = , with u; v 2 C [ C and M N 2 f1; 2g. Then, for any other line L  L, obtained by rotating L about r, the cost of () is C [C [ M N given by the following formulas: m ^ L L 1. cost(()) =  cost(); if both A and A contain argmax fm g. L i u v i2f1;2g m ^ L L 2. cost(()) = cost() if both A and A contain argmin fm g. u v i2f1;2g L L L 3. cost(()) =  m ^ (v r ) + r u ; if A = argmin fm g, A = argmax fm g 2 2 1 1 i i u i2f1;2g v i2f1;2g and p 0 (u) p 0 (v)  0. L L 1 L L L 4. cost(()) =  m ^ (r v ) + u r ; if A = argmin fm g, A = argmax fm g 2 2 1 1 i i u i2f1;2g v i2f1;2g 0 0 and p (u) p (v)  0. L L Proof. Assume, without loss of generality, that L has slope greater than 1. That is, m ^ = m < m = 1, 1 2 so that argmax fm g = 2 and argmin fm g = 1. i i i2f1;2g i2f1;2g L L 1. Suppose that A and A both contain 2. Then u v jp (u) p (v)j = ju v j = jp 0 (u) p 0 (v)j: L L 2 2 L L Therefore, 0jp (u) p (v)j m ^ L L cost(()) = m ^ =  cost(): m ^ L L 2. Next suppose that both A and A contain 1. Then u v jv u j 1 1 jp 0 (v) p 0 (u)j = : L L 0 So we have that 0 jp 0 (v) p 0 (u)j 1 0 jv u j jv u j L L 1 1 1 1 L L cost(()) = m ^  =  m  = = cost(): L L 3. Assume that A = f1g and A = f2g. That is, assume that u lies strictly to the right of the line u v L and that v lies strictly to the left of the line L. Let L be another line obtained by rotating L 0 0 L L L L around r without hitting any point in C [C [ . We have A = A and A = A . Moreover, M N u u v v push 0 (u) r 1 1 2 2 = = : push (u) r m ^ 0 m 1 1 L 1 Thus, push 0 (u) r u r 1 1 1 1 push (u) = + r = + r : 2 0 2 0 2 L L m m 1 1 By Equation (5), we therefore see that p 0 (v) p 0 (u) = push (v) push (u) 0 0 L L 2 2 L L u r 1 1 = v + r 2 0 2 r u 1 1 = v r + : 2 2 0 14 0 0 0 Hence, if p (v) p (u)  0, the cost of () for L is L L 0 p 0 (v) p 0 (u) L L cost(()) = m ^ 1 0 r u 1 1 =  m  v r + 2 2 0 1 0 =  m (v r ) + r u : 2 2 1 1 0 0 4. Similarly, if p (u) p (v)  0, then the cost of ()) for L is L L 1 0 cost(()) =  m (r v ) + u r : 2 2 1 1 Lemma 3.8. Let L  R be a line with positive slope and let r be any proper point on L. If L does not pass through any point of C [ C [ , apart from possibly the point r itself, then, for each matching M N L L 0 : dgm M ! dgm N , there exists a direction of rotation such that if L is a line obtained by rotating L around r in this direction and L  L , then C [C [ M N cost(())  cost(): Proof. Assume that L does not pass through any point of C [ C [ other than r. Recall that the M N point [0 : 1 : 1] lies in , so by our assumptions, L is not a line with diagonal direction. Assume, without loss of generality, that the slope of L is larger than 1, that is m ^ = m < m = 1. Consider the matching 1 2 , and suppose that jp (u) p (v)j L L c() = : By assumption, one of the following mutually exclusive cases applies: • at least one of the points u and v is not on L, and r equals the other point, • neither of the points u and v lie on L. We therefore have the following cases. 1. The points u and v lie on the same side of L, including the possibility that one of them lies on L. L L In other words, either A and A are equal, or one is a proper subset of the other one. u v 2. The points u and v lie strictly on di erent sides of L, implying that neither of them lie on L. In L L other words, one of A and A equals f1g, and the other one equals f2g. u v We prove each case separately. L L 1. First suppose that u and v lie on the same side of L. This means that either both A and A u v contain 1 or both contain 2. Let L be a line in the same equivalence class as L with respect to C [ C [ M N We have two subcases. L L (a) Suppose that A and A both contain 2. Then, by Lemma 3.7(1), we have that u v m ^ cost(()) =  cost(): m ^ Therefore rotating L towards the diagonal will increase the cost, at least until we hit a point of C [ C [ , possibly the diagonal direction. M N L L (b) Next suppose that both A and A contain 1. Then, by Lemma 3.7(2), we have u v cost(()) = cost(): So the cost of the matching does not change as we rotate L about r. We can once again rotate towards the diagonal without decreasing the cost, at least until we hit a point in C [C [ M N including the diagonal direction. 15 2. Next suppose that u and v lie on di erent sides of L. Assume, without loss of generality, that L L A = f1g and A = f2g. That is, u lies strictly to the right of the line L and v lies strictly to the u v left of the line L. Let L be another line obtained by rotating L around r without crossing any point in C [C [ M N 0 0 The sign of p (u) p (v) is determined by A . L L l:u:b:(u;v) 0 0 • If f1g  A , then p (u) p (v)  0. By Lemma 3.7(4), we have L L l:u:b:(u;v) 1 0 cost(()) =  m (r v ) + u r : (7) 2 2 1 1 • If f2g  A , then p 0 (u) p 0 (v)  0. By Lemma 3.7(3), we have L L l:u:b:(u;v) 1 0 cost(()) =  m (v r ) + r u : (8) 2 2 1 1 We now have the following further subcases to consider. L L (a) Suppose that f1g  A and v  r . By Equation (7), taking m < m rotates L 2 2 1 l:u:b:(u;v) away from the diagonal direction and does not decrease the cost of the matching . In this case, we may hit no point of C [ C [ before we get to the vertical line through r, that M N is, we may eventually hit the point [0 : 0 : 1] 2 ` . L L (b) Now, suppose that f1g  A and v < r . This time, we take m > m , which rotates 2 2 1 l:u:b:(u;v) the line L towards the diagonal direction without decreasing its cost. In this case, we may hit no point of C [ C [ before we get to the diagonal line through r, that is, we may M N eventually hit the point [0 : 1 : 1] 2 (c) Next, suppose that f2g  A and v > r . So by Equation (8), we see that taking 2 2 l:u:b:(u;v) m > m , that is, rotating the line L towards the diagonal slope, does not decrease the cost. L L (d) Finally, suppose that f2g  A and v  r . This time, we can take m < m , which 2 2 1 l:u:b:(u;v) rotates the line away from the diagonal slope. This does not decrease the cost, and we will hit the point u before getting to a vertical line. When the slope of L is smaller than 1, that is m ^ = m < m = 1, the argument is very similar except 2 1 that for the case 2(c) where we may hit the point [0 : 1 : 0] 2 ` instead of the point [0 : 0 : 1]. The case when the slope of L is exactly 1 is excluded by the assumption that L intersects C [ C [ only at M N the proper point r. Lemma 3.8 is used in the following way. Remark 3.9. For each line L as in Lemma 3.8, there is a direction of rotation such that • rotating L in such direction around r does not decrease cost(), • during the rotation the line remains in the same equivalence class as L with respect to C [C [ M N until • eventually, the rotated line hits a point in C [ C [ [f[0 : 1 : 0]; [0 : 0 : 1]g. M N 3.2 Vertical and Horizontal Lines As can be seen in the proof of Lemma 3.8, it is not immediately apparent that the bottleneck distance for horizontal or vertical lines is irrelevant for the overall matching distance computation - in other words, this lemma does not discount the possibility that the matching distance is achieved on such a line. In order to understand better the role that horizontal and vertical lines play in the computation of the matching distance, we must more formally derive the cost of a vertical line (De nition 3.13); the correctness of this de nition relies on Theorem 3.12. After de ning the cost of a vertical line, we are able to show that these lines do not determine the matching distance, via Corollary 3.14. The case dealing with horizontal lines is analogous. In this section we maintain the notations and assumptions of the previous section. In particular, the persistence modules M and N are assumed to satisfy property (P) for some nite sets C and C , and M N property (Q) for some set . Moreover, we maintain the notation := [f[0 : 1 : 1]g. In order to prove Theorem 3.12, we rst prove the following two lemmas concerning the cost of lines within the same equivalence class as a vertical line. 16 2 0 Lemma 3.10. Let V  R be a vertical line and let c; c be distinct points on V such that the open line segment on V between them does not intersect C [ C [ . Then there exists a m ~ > 0 suciently M N 1 small such that, for any two parallel lines L and L with direction m ~ = (m ; 1) and 0 < m < m ~ that 1 1 1 0 0 intersect V on the open line segment c; c , it holds that L  L . C [C [ M N Proof. Let us set " := minfm 2 R j [0 : m : m ] 2 ; m = 1g ; 1 1 1 2 2 c :=g:l:b: C [ C [ min M N P (9) c :=l:u:b: C [ C [ max M N P " := minfjx y j j x 6= y ; x; y 2 C [ C [ g : 1 1 1 1 M N P Assume, without loss of generality, that c < c . Taking "=0 to be equal to +1, choose " " 0 < m ~ < min ; ; " : 1 1 0 min max jc c j jc c j 2 2 2 0 0 As there are no points in C [ C [ on V between c and c , we get that L and L have the same M N reciprocal position with respect to all points in V \ (C [ C [ ). Moreover, by the choice of their M N slope, L and L are in the same reciprocal position with respect to all points in (C [ C [ )n V as M N well. Thus, L  L . C [C [ M N 2 0 Lemma 3.11. Let V  R be a vertical line, and let c; c be distinct points on V such that the line segment between them does not contain any point of C [C [ . Let L be a line through a point r 2 V M N between c and c with direction m ~ = (m ; 1), 0 < m < 1. Then the limit of the weighted bottleneck 1 1 distance along L as it rotates around r toward the vertical direction, written as L L L lim m ^ d (dgm M ; dgm N ); m !0 is the same for all points r in the open segment of V between c and c . Moreover, this convergence is uniform on the open line segment between c and c . Proof. We consider the behavior of the weighted bottleneck distance along lines L(r; m ) with direction m ~ = (m ; 1), and 0 < m < 1, through a point r on the open segment of V between c and c , as m 1 1 1 tends to 0. First of all we observe that, for any two points r and r on such open segment, and for suciently 0 0 small m > 0, Lemma 3.10 ensures that L := L(r; m )  L(r ; m ) =: L . Hence, by Property 1 1 1 C [C [ M N (Q'), if the optimal matching achieving the bottleneck distance along L is , then the optimal matching achieving the bottleneck distance along L is 0 (), and that the matched pair giving the cost of L;L 0 () is the image of the matched pair giving the cost of  under 0 : If p (u); p (v) give the L;L L;L L L bottleneck distance along L, then p 0 (u); p 0 (v) give the bottleneck distance along L . In other words, L L for any line L(; m ) that is steep enough and intersects V between c and c , the bottleneck distance is determined by the same pair u; v of critical points. Now, considering the cases listed in Lemma 3.7 separately, we show that the weighted bottleneck distance along L(r; m ) converges uniformly, independent of the choice of r, as m tends to 0. That is, 1 1 for any  > 0, there exists a m ~ > 0 such that for any 0 < m < m ~ and any point r 2 V in between c 1 1 1 0 0 00 and c , the line L through r with direction (m ; 1) gives 0 0 0 L L L L L L jm ^ d (dgm M ; dgm N ) lim m ^ d (dgm M ; dgm N )j <  : B B m !0 Let  > 0. L(r;m ) L(r;m ) 1 1 1. If both A and A contain 2 = argmax fm g, then the weighted bottleneck dis- v u i i2f1;2g ju v j 2 2 tance is m , which is independent of r and tends to 0 when m tends to 0. In particular, 1 1 ju v j 0  2 2 for any r 2 V between c and c and any 0 < m < , we have that m < , proving 1 1 ju v j 2 2 L(r;m ) L(r;m ) L(r;m ) 1 1 1 the uniform convergence of m ^ d dgm M ; dgm N to 0 as m tends to 0 in this B 1 case. 17 L(r;m ) L(r;m ) 1 1 2. If A and A contain 1 = argmin fm g, then the weighted bottleneck distance is v u i2f1;2g i ju v j 1 1 constantly equal to , independently of both r and m . Hence, it converges uniformly (for ju v j 0 1 1 steep enough lines intersecting V between c and c ) to as m tends to 0 in this case. L(r;m ) L(r;m ) 1 1 3. If A = f1g and A = f2g, and p (u) p (v)  0, then the weighted u v L(r;m ) L(r;m ) 1 1 1 1 bottleneck distance is  (m (v r ) + r u ). As m tends to 0, this tends to  (r u ) 1 2 2 1 1 1 1 1 regardless of r (that is, regardless of r , as r is the same for any point r between c and c ), thanks 2 1 to the fact that r is bounded between c and c . 2 2 In particular, we have that L(r;m ) L(r;m ) 1 1 m d dgm M ; dgm N  (r u ) = m  (v r ) 1 B 1 1 1 2 2 < m  maxfjv c j;jv c jg : 1 2 2 2 Therefore, for any r 2 V between c and c and any 0 < m < , we have 1 0 maxfjv c j;jv c jg 2 2 2 ju v j 2 2 1 m   (r u ) < , proving uniform convergence of the weighted bottleneck distance 1 1 1 along L(r; m ) to  (r u ) as m tends to 0 in this case. 1 1 1 1 L(r;m ) L(r;m ) 1 1 4. If A = f1g and A = f2g, and p (u) p (v)  0, then, by an analogous u v L(r;m ) L(r;m ) 1 1 argument to the one made in the previous case, we can see that the weighted bottleneck distance along L(r; m ) converges uniformly to  (u r ) as m tends to 0 in this case. 1 1 1 1 Theorem 3.12. Given a vertical line V  R , the limit L L L lim m ^ d (dgm M ; dgm N ) m !0 of the weighted bottleneck distance of the line L through the point r 2 V with direction m ~ = (m ; 1), 0 < m < 1, as L rotates toward the vertical direction, does not depend on the point r. Proof. By Lemma 3.11, we already know that, for every two points r; r 2 V such that the closed segment between them does not contain any point of C [C [ , the limit of the weighted bottleneck distance M N P of lines through the point r 2 V as they rotate toward the vertical direction is equal to that of lines through r . Let us now take c to belong to V \ (C [ C [ ) and r 2 V to be arbitrarily close to c. Let L(r) M N and L(c) be parallel lines with positive slope through r and c, respectively. L(r) L(r) L(r) Let us now assume that m ^ d (dgm M ; dgm N ) tends to the value `, which has to be the L(c) L(c) L(c) same for all r suciently close to c by Lemma 3.11, and m ^ d (dgm M ; dgm N ) tends to the value `(c), as L(c) and L(r) tend to V by rotating counterclockwise about r and c, respectively. Let us also assume, by contradiction, that j`(c) `j > 0 and take 0 < " < j`(c) `j=3. By internal stability [20, Thm. 2], we know that the weighted bottleneck distance for L(r) tends to that of L(c) as r tends to c, so L(r) L(r) L(r) L(c) L(c) L(c) m ^ d dgm M ; dgm N m ^ d dgm M ; dgm N  " (10) B B for every r suciently close to c. By Lemma 3.11, we also know that the weighted bottleneck distance for L(r) tends to ` uniformly as we rotate L(r) towards the vertical line and move r on the line. Hence, for all r suciently close to c, there is some real number m ~ > 0 such that for all 0 < m  m ~ , the lines L(r) through r with direction 1 1 1 m ~ = (m ; 1) satisfy L(r) L(r) L(r) m ^ d dgm M ; dgm N `  " (11) We may now take m > 0 to be suciently small so that for the lines L(c) through c with direction m ~ = (m ; 1) we also have L(c) L(c) L(c) m ^ d dgm M ; dgm N `(c)  " (12) By applying the triangle inequality, and using Equations (10) to (12), we get j`(c)`j  3" < j`(c)`j, giving a contradiction. 18 The above preposition guarantees that the following notion for the cost of a vertical line is well de ned. The case of the analogous statement for horizontal lines can be handled in a similar way. De nition 3.13. The cost of a vertical line V is the limit of the weighted bottleneck distance of a line L through a xed point r 2 V as it rotates toward the vertical direction: L L L cost(V ) := lim m ^ d (dgm M ; dgm N ) L + m ^ !0 where L is any line through r 2 V with direction m ~ = (m ; m ) and m ^ = m . 1 2 1 We can now conclude that the matching distance can always be achieved by a non-vertical line. Corollary 3.14. For every vertical line V , there exists a line L of positive slope through a point of C [ C [ and a di erent point of C [ C [ such that M N P M N L L L cost(V )  m ^ d (dgm M ; dgm N ) : Proof. By Theorem 3.12 the cost of a vertical line V does not depend on the point r 2 V used to compute it. Thus, we are free to choose r 2 V such that it is higher than any point in C [C [ . We can now M N P rotate V about r making sure we stop at the rst point of C [C [ we hit. The proof of Lemma 3.8 M N ensures that, if we rotate towards the diagonal, the obtained line L has a weighted bottleneck distance not smaller than cost(V ) except possibly when we are in case 2.(a) or 2.(d). However, since we have chosen r higher than any point in C [ C [ , we cannot be in case 2.(a). M N P Finally, suppose that when we slightly rotate V around r toward the diagonal direction we get lines L that fall in case 2.(d). As usual, we call u; v the two points realizing the weighted bottleneck distance along such lines. Let us recall that, in case 2.(d), u lies strictly to the right of L , v lies strictly to the left 0 0 0 of L , and l:u:b:(u; v) lies to the left of L . However, if l:u:b:(u; v) lies to the left of L and r > v ; u , 2 2 2 it must be the case that r  u , since r is a point on the line L . This would imply that u does not lie 1 1 strictly to the right of V , the vertical line through r, which is a contradiction. So, we cannot be in case 2.(d). If the obtained line L intersects only at a point at in nity, then Lemma 3.5 guarantees that we can translate L until we hit a point of C [ C [ without decreasing the weighted bottleneck distance, M N P and we are done. If the obtained line L is through a point c in C [ C [ , then c must be strictly to the left M N P of V because r was chosen to be strictly higher than C [ C [ . Now we can use c as a new M N P center of rotation applying again Lemma 3.8. If we get again a vertical line, then we can iterate the argument. Eventually, there will be no points strictly to the left of the newly obtained vertical line, and thus the process will terminate with a line L through a point of C [ C [ and a di erent point of M N P C [ C [ M N 3.3 Proof of the main theorem We are now ready to prove Theorem 3.1, thanks to Proposition 2.10, Lemma 3.5, 3.8, and Corollary 3.14, which we restate below for convenience. Theorem 3.1. Let M; N be two 2-parameter persistence modules. If M; N are both trivial, then d (M; N ) = 0. Otherwise, assume that M and N satisfy properties (P) and (Q). Let C and match M C be nite sets of critical values in R as ensured by Property (P), and let (M; N ) be a nite set of points in P as ensured by property (Q). Then the matching distance between M and N is achieved by a line through two (not necessarily unique) points in C [ C [ , with (M; N )[f[0 : 1 : 1]g. M N L L Proof. If M and N are both trivial, then M and N are also trivial for all lines L, yielding that d (M; N ) = 0. Otherwise, if M and N are not both trivial, consider the critical sets C and C match M N L L guaranteed by property (P), and which cannot be both empty. Consider a line L on which M and N are also not both trivial. Clearly, the matching distance between M and N is achieved on one such line L. L L L L If dgm(M ); dgm(N ) have a di erent number of points at in nity, then d dgm(M ); dgm(N ) = 0 L L 1. For any line L  L, the points at in nity of dgm(M ) (resp. dgm(N )) are in bi- C [C [ M N 0 0 L L jection with those of dgm(M ) (resp. dgm(N )) via (resp. ) as de ned in Lemma 3.2. M N 19 0 0 L L So, dgm(M ) and dgm(N ) will also have a di erent number of points at in nity, implying that 0 0 L L d dgm(M ); dgm(N = 1 as well. By internal stability [20, Thm. 2], we have that the bottleneck distance is equal to in nity on any line on the boundary of the equivalence class of L, implying the claim. L L L L If dgm(M ); dgm(N ) have the same number of points at in nity, then d dgm(M ); dgm(N ) is realized by a matching . As in Lemmas 3.3 and 3.7 above, we denote the matched pair that gives the bottleneck distance L L between M and N by u; v, so the corresponding real parameters on the line L are p (u) and p (v), L L respectively. Thus, jp (u) p (v)j L L L L L m ^ d dgm(M ); dgm(N ) = with  2 f1; 2g. 0 L Properties (P ) and (Q ) guarantee that the optimal matching  and the pair of points u; v 2 C [ C whose push onto L gives maximal distance in that matching do not change if we vary the M N line L remaining inside its equivalence class with respect to C [ C [ , i.e. without hitting any new M N points in C [ C [ M N Depending on the following exhaustive list of cases, we see that it is always possible to get a line L that gives no smaller weighted bottleneck distance and that passes through two points of C [ C [ M N Writing as the union of its set of proper points with its set of points at in nity , the list of P 1 cases to analyze is: (i) L is already through at least two distinct points of C [ C [ M N (ii) L is only through one point of C [ (iii) L is only through one point of (iv) L is through no point of C [ C [ M N In case (i) there is nothing to prove. In case (ii), the claim is proved by applying Lemma 3.8, i.e. by rotating the line until we either hit another point in C [C [ [f[0 : 1 : 0]; [0 : 0 : 1]g. If the point we M N hit is in C [ C [ , then we are done, otherwise the line is vertical, and we can apply Corollary 3.14 M N and we are also done. In case (iii), the claim is proven by applying Lemma 3.5, i.e. by translating the line until we get a line hitting a point of C [ C [ and still through the same point of as L. M N P 1 In case (iv), by Lemma 3.5, there exists a direction of translation such that we can translate the line in that direction without decreasing the cost and eventually hit a point in C [ C [ . We are now in M N P case (iii) from which we can reach the conclusion as explained above. 4 Computation of the switch points In this section we explain how to compute the possible switch points that we need to add to generate the set of switch points for the pair M; N . The main result of this section is Theorem 4.3 stating that, if M and N satisfy property (P), then also Property (Q), and hence (Q'), is satis ed. It is important to note that this result is not only an existence result but leads to a constructive procedure to determine the set of switch points (M; N ). Switch points are needed when there exist two matched pairs u; v and w; x with u; v; w; x 2 C [C M N for which, • there is a line L in a particular equivalence class via  on which the cost of matching u C [C M N with v equals the cost of matching w with x, but • the cost of matching u with v does not equal the cost of matching w with x for all lines in that equivalence class. Therefore, we use Lemma 4.1 to determine when, given a quadruple u; v; w; x and equivalence class E via  , it is possible to have a line L 2 E with the aforementioned equality. As the following C [C M N proposition shows, all such lines with this equality in E pass through some point ! 2 P . Spanning over all quadruples and all equivalence classes via  ; we obtain the set of switch points (M; N ): C [C M N To avoid degenerate or trivial cases, we assume that u 6= v and w 6= x. This is because the matched pair u; v with u = v would indicate a simultaneous birth and death, i.e. a point on the diagonal within a 20 persistence diagram or yield null contribution to the matching distance (similarly for w = x). However, it can be the case that one of the 4 points is repeated, such as v = x: In this case, a switch point would indicate a line L for which the cost of matching u to v is the same as matching u to x . u x v = x Figure 8: An example of 4 distinct points Figure 7: An example of 3 points u; w; v = x u; w; v; x which generate an ! point. Here, m that generate an ! point. Here, m denotes the and m denote the midpoints of the line seg- midpoint of the line segment between v and ments between u and w and v and x, respec- x. The pushes of the values u and w to L are tively. The pushes of the values u; w; v; x to denoted as red points, and the push of v is ! L are denoted as red points. Since the two red itself. For any line which passes through !, the segments in the gure have equal length for any cost of matching u with v is the same as the line which passes through !, on those lines the cost of matching w with x. cost of matching u with v is the same as the cost of matching w with x. Recall the notation p (u) for the parameter value of the push of a point u onto the line L, and 2 L L recall that, for every u 2 R , u pushes rightwards if A = f2g, upwards if A = f1g, and to itself if u u A = f1; 2g. We will use the following notation in Lemma 4.1: Let U be a non-empty set of points in the plane and set A(U ) = fA j u 2 U;; =6 A  f1; 2gg: u u The set A(U ) speci es, for each u 2 U , one or more push directions, and there are 3 possible choices for A(U ) if jUj = m. Given such a set A(U ), de ne E(A(U )) to be the set of all lines L such that A = A for each u 2 U . In other words, E(A(U )) is the set of all lines such that all points of U push onto these lines in the directions speci ed by A(U ). Note that it is a union of all equivalence classes in which the points of U push in the speci ed directions. We introduce this notation because switch points are generated based on how the pairs of points u; v and w; x push onto a line. Taking U = fu; v; w; xg, this is only a subset of the set C [ C , M N implying that there may be many equivalence classes via  for which these four points push in C [C M N the same way for each equivalence class. So, we create the union of all such equivalence classes via the set A(u; v; w; x). We can prove the following proposition. Lemma 4.1. Let u; v; w; x 2 C [ C . Assume that u 6= v and w 6= x, and that there are at least M N three distinct points among u; v; w; x. Let ;  2 f1; 2g . Fix a choice for A(u; v; w; x), and set E := E(A(u; v; w; x)) For any line L 2 E, consider the real quantity jp (u) p (v)j jp (w) p (x)j L L L L (u; v; w; x; ; ) := : If  (u; v; w; x; ; ) = 0, then exactly one of the following two possibilities is true, depending on E: 1. We have  (u; v; w; x; ; ) = 0 for every L 2 E. 2 0 2. There exists a point ! 2 P such that, for every L 2 E, we have  (u; v; w; x; ; ) = 0 if and only if ! 2 L . In fact,  and  are determined by the pairs u; v and w; x respectively. For example, if both points u; v are in C , then this pair represents matching point in dgm M to the diagonal for some L 2 E, and  = 2. If u 2 C and v 2 C ; M N L L then this pair represents the cost of matching a point in dgm M to dgm N , and  = 1. The value of  may be computed analogously. However, we prove the proposition for the slightly more general case of arbitrary ;  2 f1; 2g. 21 In other words, Lemma 4.1 says that, xed A(u; v; w; x), either the sign of  is constant on the equivalence classes of lines L onto which the points u; v; w; x push as prescribed by A(u; v; w; x), or otherwise the lines where there is a switch of the sign of  belong to a pencil though a point ! 2 P . Proof. Consider some L 2 E, parameterized as ms ~ + b. The four points can be in various di erent positions with respect to L and we want to nd the lines for which the quantity jp (u) p (v)j jp (w) p (x)j L L L L (u; v; w; x; ; ) := jp (u)p (v)j jp (w)p (x)j L L L L vanishes, as ( , respectively) represents the cost of pairing u with v (w with x, respectively), so to determine when these costs are equal. Recall, saying that a point is on one side of a line includes the possibility that the point is on the line. Saying that a point is strictly on one side of the line excludes the possibility that the point is on the line. Without loss of generality, we have the following possibilities. 1. There exists i 2 f1; 2g such that i 2 A for all p 2 fu; v; w; xg. That is, all four points lie on one side of L or, 2. There exists i 2 f1; 2g such that i 2 A for exactly 3 of p 2 fu; v; w; xg. That is, three points lie on one side of L, and the fourth lies strictly on the other side. 3. Two paired points have A = f1g for both points in the pair, and the other two paired points have A = f2g for both points in the pair. This is the case when two paired points lie strictly on one side of L, and the other two paired points lie strictly on the other side of L. 4. Two unpaired points have A = f1g, and the other two unpaired points have A = f2g. This is p p the case when two paired points lie strictly on opposite sides of L, and the other two paired points also lie strictly on opposite sides of L. We now address each possibility recalling Equations (2) to (4): 1. Suppose, without loss of generality, that p (u)  p (v) and p (w)  p (x): L L L L Then the value of  is given by: 1 u v 1 w x i i i i (u; v; w; x; ; ) = m m  m m i i i i u v w x 1 i i i i = + + + ; (13) with i = 1 (resp. i = 2) when all the points are to the right (resp. left) of L. 2. Suppose without loss of generality that u, v, w lie to the left of L and x lies strictly to the right. Suppose also, without loss of generality, that p (u)  p (v). L L If p (w)  p (x), then L L u v (w b ) (x b ) 2 2 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 2 2 1 x 1 u v w 1 1 b 1 b 1 2 2 2 1 2 = + + + : (14) m    m  m  m 1 2 1 2 If p (x)  p (w), then L L u v (w b ) (x b ) 2 2 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 2 2 1 x 1 u v w 1 1 b 1 b 1 2 2 2 1 2 = + + + + (15) m    m  m  m 1 2 1 2 The case when u, v, w lie to the right of L and x lies strictly to the left can be handled similarly. 22 3. Suppose, without loss of generality, that u and v lie strictly to the left of L and that w and x lie strictly to the right of L. Suppose also, without loss of generality, that p (u)  p (v) and L L p (w)  p (x). Then L L u v w x 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 2 1 1 w x 1 u v 1 1 1 2 2 = + + : (16) m   m 1 2 4. Suppose, without loss of generality, that u and w lie strictly to the left of L and that v and x lie strictly to the right of L. Also suppose, without loss of generality, that p (u)  p (v). If L L p (w)  p (x), then L L u b v b w b x b 2 2 1 1 2 2 1 1 (u; v; w; x; ; ) = + : m m m m 2 1 2 1 v x 1 u w 1 1 1 b 1 1 b 1 1 2 2 1 2 = + + + + + + + : (17) m   m   m   m 1 2 1 2 If p (x)  p (w), then L L u b v b w b x b 2 2 1 1 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 1 2 1 v x 1 u w 1 1 1 b 1 1 b 1 1 2 2 1 2 = + + + + + + + : (18) m   m   m   m 1 2 1 2 Observe that in all Equations (13) to (18), the expression for  (u; v; w; x; ; ) has the form 1 1 b b 1 2 (u; v; w; x; ; ) =  +  +   ; L 1 2 m m m m 1 2 1 2 where , , and are constants that depend only on u; v; w; x and ;  and not on the line L. 1 2 We now consider the following exhaustive cases: • = 0 and at least one of ; equal 0, 1 2 • = 0 and ; 6= 0, 1 2 • 6= 0: Suppose rst that is zero and one of and is zero. Assume, without loss of generality, that 1 2 = 0. Since m > 0, then the sign of  (u; v; w; x; ; ) is the same as the sign of . In other words, 1 2 L 2 either  (u; v; w; x; ; ) = 0 for every line L 2 E, or it is positive for every line L 2 E, or negative for every line L 2 E. Thus, this case gives rise to no switch point !. This implies that Equation (13) generates no switch points, as in that equation = 0 and one of and is zero. We can conclude that when all points u; v; w; x lie on the same side of any line L in E, no switch point is created. In Equation (16), it is always the case that = 0: If it is also the case that x = w or u = w , then 1 1 2 2 either or is zero. If we assume that x = w , for  to be zero, it must also be the case that u = v : 1 2 1 1 L 2 2 This implies that c (u; v) = c (w; x) = 0 for all lines L 2 E; and therefore  (u; v; w; x; ; ) = 0 for L L L every line L 2 E. (An identical argument works for u = v as well.) Thus, no switch point is created. 2 2 It is also possible to have = 0 in Equation (17), when  = : If, additionally, x = v or u = w , 1 1 2 2 then either or is zero. However, if x = v , then, for  to be zero, it must also be the case that 1 2 1 1 L u = w : This implies that c (u; v) = c (w; x) for all lines L 2 E; and therefore  (u; v; w; x; ; ) = 0 2 2 L L L for every line L 2 E. (An identical argument works for u = w as well.) So no switch point is created. 2 2 To summarize, no switch points are created for: • Equation (13) • Equation (16), when x = w or u = w ; nor 1 1 2 2 • Equation (17), when  =  and x = v or u = w . 1 1 2 2 23 Next suppose that = 0 and 6= 0 and 6= 0. In this case, the quantity  (u; v; w; x; ; ) is zero 1 2 L if and only if m  = m  : 2 1 1 2 This is a condition on the slope of the line, which states that L must have a xed slope. When considering the line L in P , this means that L passes through the point at in nity ! = [0 : : ]. Conversely, 1 2 it is easy to check that any line L 2 E with this slope has  (u; v; w; x; ; ) = 0. It is only possible for = 0 in Equation (16) and in Equation (17), when  = . In both of these, to ensure that 6= 0 and 6= 0, it must be that x 6= w and u 6= v : 1 2 1 1 2 2 Using this formula, we obtain the following ! points on the line at in nity: • From Equation (16), when x 6= w and u 6= v : 1 1 2 2 ! = [0 : (w x ) : (u v )] : 1 1 2 2 • From Equation (17), when  =  and x 6= w and u 6= v : 1 1 2 2 ! = [0 : v x : u w ] : 1 1 2 2 Finally, suppose that 6= 0. In this case,  (u; v; w; x; ; ) = 0 if and only if 1 1 b b 1 2 1 2 +  + = 0: m m m m 1 2 1 2 Again, viewing the lines L in P , it is easy to check that this equation is satis ed if and only if the line L 2 E passes through the point ! = [ : : ] : 1 2 When we require that 6= 0, we only need to consider Equations (14), (15) and (17), and Equa- tion (18) when  6= . Using this formula, we obtain the following proper ! points: • From Equation (14): ! = [ : x : (v u ) + w ] : 1 2 2 2 • From Equation (14), altered to consider the case when u; v; and w lie to the right of lines L 2 E, and x to the left: ! = [ : (v u ) + w : x ] : 1 1 1 2 • From Equation (15): ! = [ : x : (u v ) + w ] : 1 2 2 2 • From Equation (15), altered to consider the case when u; v; and w lie to the right of lines L 2 E, and x to the left: ! = [ : (u v ) + w : x ] : 1 1 1 2 • From Equation (17) when  6= : ! = [  : v x : u w ] : 1 1 2 2 • From Equation (18): ! = [ +  : x + v : w + u ] : 1 1 2 2 Remark 4.2. Lemma 4.1 describes how to construct certain ! points based on a choice of a quadruple u; v; w; x 2 C [ C and sets A for p 2 fu; v; w; xg. In fact, the same ! points can be obtained if we M N p choose fu; v; w; xg from the more restricted set C [ C . M N For example, let u 2 (C [ C n (C [ C ). Then u is the least upper bound either of two M N M N 0 00 points in C , or two points in C . Suppose, without loss of generality, that u ; u 2 C such that M N M 0 00 0 u = ((u ) ; (u ) ). If L is a line such that the push direction of u onto L is upward, then p (u) = p (u ). 1 2 L L Similarly, if L is a line such that the push direction of u onto L is rightward, then p (u) = p (u ). Since L L the formula for ! only depends on the direction in which u pushes, the value of ! does not change. 24 2 The proof of next Theorem 4.3 uses Lemma 4.1 to generate a set (M; N )  P of switch for which property (Q') holds. Theorem 4.3. Given two 2-parameter persistence modules M; N satisfying Property (P), there exists a set of switch points (M; N ) ensuring that M and N also satisfy Property (Q), and hence (Q'). Proof. As usual, let C and C denote the set of critical values of M and N , respectively, as guaranteed M N by Property (P), and let C and C be their closure with respect to the least upper bound. M N For each quadruple u; v; w; x 2 C [C (or simply each quadruple in C [C , thanks to Remark 4.2) M N M N and each set A(u; v; w; x) satisfying the conditions of Lemma 4.1, we may obtain a switch point !, either proper or at in nity. Iterating over all such quadruples and possible push directions for each quadruple, we obtain a nite set of points in P , which we denote (M; N ), or brie y We now show that the set satis es property (Q); then, by Proposition 2.10, it satis es property (Q'). 0 0 Let L and L be two lines such that L  L , and let u; v; w; x be any quadruple, with at C [C [ M N least three distinct points, in C [ C . We want to prove that the signs of  (u; v; w; x; ; ) and M N L 0 (u; v; w; x; ; ) agree. First, by Lemma 4.1, if  (u; v; w; x; ; ) = 0, then either  0 (u; v; w; x; ; ) = 0 for every L in the L L same equivalence class. of all lines in the same equivalence class contain a switch point ! 2 . So, by 0 0 the de nition of the equivalence relation, for any L  L, it must also be the case that ! 2 L . C [C [ M N By Lemma 4.1, this implies that  (u; v; w; x; ; ) = 0 as well. Next suppose, without loss of generality, that (u; v; w; x; ; ) > 0 and  0 (u; v; w; x; ; ) < 0: L L This implies that neither L nor L contain the point ! corresponding to the quadruple u; v; w; x and to the set A(u; v; w; x) which generate !. Note that, for xed  and  and within an equivalence class with respect to C [C [ , the quantity M N (u; v; w; x; ; ) is a continuous function with respect to translation or rotation. Moreover, L may be obtained from L by a translation and/or rotation which is always in the same equivalence class with respect to C [ C [ as both L and L . M N Therefore,  (u; v; w; x; ; ) is a continuous function of L which changes sign within an equivalence class with respect to C [ C [ , and by the Intermediate Value Theorem, there must be a line L , M N 0 00 equivalent to both L and L via  such that  00 (u; v; w; x) = 0. This implies that ! 2 L , C [C [ M N 00 0 00 0 and since L  L and L  L, then ! 2 L \ L. C [C [[ C [C [ M N M N This is a contradiction, and therefore, the sign of  (u; v; w; x; ; ) is the same as the sign of (u; v; w; x; ; ) for all lines L  L . This must be true for all quadruples in C [ C , so L M N C [C [ M N property (Q) holds for 5 Conclusion The main contributions of this paper include: • theoretical methods for computing the matching distance between two 2-parameter persistence modules, which can be used to produce algorithms with comparable time complexity to [17]; • the identi cation of the parameter space of a 2-parameter persistence module as a subset of the projective plane, which allows for an understanding of the importance of diagonal lines in the computation of the matching distance; and • new de nitions for the costs (w.r.t. computing the matching distance) of horizontal and vertical lines of a 2-parameter persistence module. To address the rst bullet, one advantage of the theoretical framework developed in this paper is that it allows us to formulate explicit ways of computing the switch-points from the critical values of two 2-parameter persistence modules. These explicit formulas will allow us to provide an algorithm performing the exact computation of the matching distance in two dimensions as the maximum over a nite set of lines, as guaranteed by Theorem 3.1. More speci cally, we expect to formulate algorithms based on the explicit computations of the switch- points provided in Section 4. An algorithm to compute the matching distance based on the framework in this paper can be described by the following main steps: 25 1. Identify the critical values in C and C ; M N 2. compute the switch-points using the results from Section 4, 3. compute all the lines through pairs of points in C [ C [ and lines of diagonal slope through M N at least one point of C [ C [ M N 4. compute the bottleneck distance together with its corresponding weight for each of the lines com- puted above and take the maximum. The above steps can all be easily implemented, yielding a new tool to compute the matching distance for 2-parameter persistence modules, while also giving an intuition on the lines that attain that distance. We expect the method to also be parallelizable, with computations carried out on di erent lines in parallel. The implementation of this method is currently underway. In addition to the explicit computation of the matching distance, we provide a detailed study of the lines through the parameter space and a geometric interpretation corresponding to a line being vertical, horizontal or diagonal. For instance, we show in Example 2.12 that we need to include diagonal lines through at least one point of the set C [ C [ in order to compute the matching distance, as lines M N through pairs of critical points are not sucient. This agrees with the observation in [12] that diagonal lines play a key role in the study of 2-parameter persistence. We are able to give an explicit formula for the computation of the weighted bottleneck distance along horizontal and vertical lines. Moreover, we are able to conclude that these lines are not important in computing the overall matching distance between two 2-parameter persistence modules | there is always a line with positive but nite slope for which the weighted bottleneck distance is greater than that along a horizontal or vertical line. Viewing the parameter space as a subset of the projective plane helps in understanding the importance of vertical, diagonal, and horizontal lines, and makes the treatment of these lines more consistent. In [17, 6], the authors leverage the duality between points and lines in the plane, performing compu- tations in the dual plane. In comparison, our method is directly de ned in the parameter space of the persistence modules, helping the understanding of the nite set of lines involved in the maximization for the computation of the matching distance. However, our approach to the computation of the matching distance remains possible only in two dimensions. This is due to the fact that, in two dimensions, a line is caged by a set of points, in our case the critical and switch points, whereas this is no longer the case in higher dimension. For this reason, the method we propose does not, for now, generalize to more parameters. Acknowledgements A.B. acknowledges the support of ICERM at Brown University, where part of this work was done. R.B. was supported for travel to BIRS by an AWM-NSF Travel Grant. C.H. is supported by NCCR-Synapsy Phase-3 SNSF grant number 51NF40-185897. C.L. carried out this work under the auspices of INdAM-GNSAGA and partially within the activities of ARCES (University of Bologna). B.I.M. is supported by Digital Futures (dBrain). E.S. is supported by European Research Council (ERC) grant no. 788183, `Alpha Shape Theory Ex- tended'. We are grateful to the Ban International Research Station for hosting us as a Focused Research Group. We also thank Martin Hafskjold Thoresen for help with computation. 26 References [1] Asilata Bapat, Robyn Brooks, Celia Hacker, Claudia Landi, and Barbara I. Mahler. Morse-Based Fibering of the Persistence Rank Invariant, volume 30 of Association for Women in Mathematics Series, pages 27{62. Springer International Publishing, 2022. [2] Paul Bendich, J. S. Marron, Ezra Miller, Alex Pieloch, and Sean Skwerer. Persistent homology analysis of brain artery trees. Ann. Appl. Stat., 10(1):198{218, 2016. [3] Subhrajit Bhattacharya, Robert Ghrist, and Vijay Kumar. Persistent homology for path planning in uncertain environments. IEEE Transactions on Robotics, 31(3):578{590, 2015. [4] S. Biasotti, A. Cerri, P. Frosini, D. Giorgi, and C. Landi. Multidimensional size functions for shape comparison. Journal of Mathematical Imaging and Vision, 32(2):161, 2008. [5] Silvia Biasotti, Andrea Cerri, Patrizio Frosini, and Daniela Giorgi. Approximating the 2-dimensional matching distance, 01 2010. [6] H avard Bakke Bjerkevik and Michael Kerber. Asymptotic improvements on the exact matching distance for 2-parameter persistence, 2021. [7] Andrew J. Blumberg and Michael Lesnick. Stability of 2-parameter persistent homology. 2020. [8] Gunnar Carlsson, Gurjeet Singh, and Afra Zomorodian. Computing multidimensional persistence. In Algorithms and computation, volume 5878 of Lect. Notes Comput. Sci., pages 730{739. Springer, Berlin, 2009. [9] Gunnar Carlsson and Afra Zomorodian. The theory of multidimensional persistence. Discrete Comput. Geom., 42(1):71{93, 2009. [10] Mathieu Carriere and Andrew J Blumberg. Multiparameter Persistence Images for Topological Machine Learning. In NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Vancouver / Virtuel, Canada, December 2020. [11] Andrea Cerri, Barbara Di Fabio, Massimo Ferri, Patrizio Frosini, and Claudia Landi. Betti num- bers in multidimensional persistent homology are stable functions. Math. Methods Appl. Sci., 36(12):1543{1557, 2013. [12] Andrea Cerri, Marc Ethier, and Patrizio Frosini. On the geometrical properties of the coherent matching distance in 2d persistent homology. Journal of Applied and Computational Topology, 3(4):381{422, 2019. [13] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of persistence diagrams. Discrete Comput. Geom., 37(1):103{120, 2007. [14] Patrizio Frosini and Claudia Landi. Persistent betti numbers for a noise tolerant shape-based approach to image retrieval. Pattern Recognition Letters, 34(8):863{872, 2013. Computer Analysis of Images and Patterns. [15] Sheridan B. Green, Abby Mintz, Xin Xu, and Jessi Cisewski-Kehe. Topology of our cosmology with persistent homology. CHANCE, 32(3):6{13, 2019. [16] Bryn Keller, Michael Lesnick, and Ted Willke. Persistent Homology for Virtual Screening, 2018. [17] Michael Kerber, Michael Lesnick, and Steve Oudot. Exact computation of the matching distance on 2-parameter persistence modules. In SoCG 2019, volume 129 of LIPIcs, pages 46:1{46:15, 2019. [18] Michael Kerber, Dmitriy Morozov, and Arnur Nigmetov. Geometry helps to compare persistence diagrams. ACM J. Exp. Algorithmics, 22, sep 2017. [19] Michael Kerber and Arnur Nigmetov. Ecient approximation of the matching distance for 2- parameter persistence, 2019. [20] C. Landi. The Rank Invariant Stability via Interleavings, volume 13 of Association for Women in Mathematics Series, pages 1{10. Springer, 2018. 27 [21] Yongjin Lee, Senja D. Barthel, Pawe c D cotko, S. Mohamad Moosavi, Kathryn Hess, and Berend Smit. Quantifying similarity of pore-geometry in nanoporous materials. Nature Communications, 8(1), 2017. [22] Ezra Miller. Data structures for real multiparameter persistence modules. arXiv: 1709.08155, 2017. [23] Hans Riess, Jakob Hansen, and Robert Ghrist. Multidimensional persistence module classi cation via lattice-theoretic convolutions. 2020. [24] Michael Sinhuber and Nicholas T. Ouellette. Phase coexistence in insect swarms. Phys. Rev. Lett., 119:178003, 2017. [25] Oliver Vipond, Joshua A. Bull, Philip S. Macklin, Ulrike Tillmann, Christopher W. Pugh, He- len M. Byrne, and Heather A. Harrington. Multiparameter persistent homology landscapes iden- tify immune cell spatial patterns in tumors. Proceedings of the National Academy of Sciences, 118(41):e2102166118, 2021. [26] Afra Zomorodian and Gunnar Carlsson. Computing persistent homology. Discrete Comput. Geom., 33(2):249{274, 2004. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Mathematics arXiv (Cornell University)

Computing the Matching Distance of 2-Parameter Persistence Modules from Critical Values

Loading next page...
 
/lp/arxiv-cornell-university/computing-the-matching-distance-of-2-parameter-persistence-modules-4qUFxzXPR0

References (33)

eISSN
ARCH-3343
DOI
10.48550/arXiv.2210.12868
Publisher site
See Article on Publisher Site

Abstract

The exact computation of the matching distance for multi-parameter persistence modules is an active area of research in computational topology. Achieving an easily obtainable exact computation of this distance would allow multi-parameter persistent homology to be a viable option for data analysis. In this paper, we provide theoretical results for the computation of the matching distance in two dimensions along with a geometric interpretation of the lines through parameter space realizing this distance. The crucial point of the method we propose is that it can be easily implemented. 1 Introduction Persistent Homology is known as one of the most-often used tools in Topological Data Analysis, allowing one to use topological invariants to understand the underlying shape of data. In particular, single pa- rameter persistence yields a summary of data through a one-dimensional ltration, allowing an overview of the data at many di erent scales. Single parameter persistent homology has been the subject of much study and has proven to be useful in many applications [2, 3, 15, 21, 24]. However, some data requires ltration along multiple parameters to fully capture its information: this is the role of multi-parameter persistent homology [9, 8]. In some contexts, it can be helpful to use multiple parameters to capture the details of the data [10, 16, 22, 23, 25]. Additionally, single parameter persistent homology is not robust to outliers in a point cloud; these outliers can lead to a misinterpretation of the persistent homology, with the unnecessary creation or destruction of homology classes. Multi- parameter persistence can be a natural x to this problem by adding a dimension depending on the density of the samples (see, e.g., [14, 7]). Unfortunately, understanding, visualizing, and computing invariants in multi-parameter persistent homology remains a dicult task both mathematically and computationally. This diculty holds as well when it comes to computing distances between such invariants. In the single parameter case, there are several ways to compare persistent homology modules | such as the bottleneck distance and Wasserstein distances | which exhibit some stability properties with respect to variations in the input [13]. For more than one parameter, there are also several de nitions of distances between persistence modules. Amongst them, the matching distance [4] attracts the attention of multi-parameter persistence practitioners. Australian National University, Canberra, Australia Boston College, Massachusetts, United States EPFL, Lausanne, Switzerland Universit a di Modena e Reggio Emilia, DISMI, Italy KTH Royal Institute of Technology Stockholm, Sweden Institute of Science and Technology Austria, Klosterneuburg, Austria arXiv:2210.12868v1 [math.AT] 23 Oct 2022 To compute the matching distance between two n-parameter persistence modules, one uses the fact that restricting an n-parameter ltration to any ane line of positive slope through the parameter space yields a 1-parameter ltration. There is therefore a corresponding restriction of the n-parameter persistence module to a 1-parameter module along that line. This construction allows for the knowledge and computational methods from the 1-dimensional case to be applied to the n-dimensional case. Indeed, following this idea, the matching distance is de ned as a supremum of the one-dimensional bottleneck distance, over the collection of all lines of positive slope in the parameter space, i.e., L L L d (M; N ) := sup m ^  d (dgm M ; dgm N ); match B L : u=sm ~ +b where m ^ is a normalization term explained in Section 2. However, computing exactly this distance is not an easy task given the nature of its de nition. As a rst step towards an exact computation, several approximations of this distance have been provided for 2-dimensional persistence modules [5, 19]. The exact computation of the matching distance is currently only possible for 2-dimensional modules [17], with recent computational improvements in terms of time complexity [6]. These methods exploit the duality of points and lines in the plane. In this paper, we propose a method to compute the matching distance based on a re nement of the framework developed in [1], in which we compute the rank invariant of a multi-persistence module from a nite set of critical parameter values. These critical parameter values capture all the changes in homology occurring throughout the multi- ltration. They may also be used to partition the set of positive lines in the parameter space into equivalence classes, where each equivalence class maintains the same birth and death ordering within the restricted module for all possible homology classes. Based on these results, we formulate the idea that the critical values must be relevant to the choice of lines for the computation of the matching distance. Leveraging the framework of [1] to compute the matching distance allows us to reduce the number of lines necessary to compute the matching distance to a nite set, thus reducing the computation to a maximum rather than a supremum without exploiting the point-line duality used in [17]. Through some examples, we show that considering only lines passing through pairs of critical values is not sucient. This is because the de nition of matching distance depends on the bottleneck distance, and lines in the same equivalence class might not be such that the bottleneck distance is always achieved by the same pairing of births/deaths, as will be further explained in Section 3. To overcome this problem, we analyze where switches might happen in the matching which achieves the bottleneck distance, identifying a set of points in the parameter space which we later on refer to as the set of switch points. This set allows us to re ne our equivalence relation on the set of positive lines (see Section 4) by partitioning lines through the parameter space using the union of the critical parameter values with the switch points. Using this union, it is possible to identify all the lines at which the matching distance can be realised. We explain in detail how to compute all relevant parameter values and show that the matching distance is attained either at: a line through a pair of points in this union, or a line of diagonal slope through exactly one of the points in the union (see Theorem 3.1). The contribution of this paper is to advance the state of the art, although still restricted to two dimen- sions, in two ways: computing the matching distance in a way which is both geometrically interpretable and implementable. In contrast to other methods such as [17, 6], we provide a geometric understanding of di erent lines, including horizontal, vertical, and diagonal lines, as well as lines through critical values, and their contribution to the matching distance. In addition, the method we propose leads to explicit computations of important lines, which will lead to an algorithm that can be implemented. The paper is structured as follows. In Section 2, we give necessary background from multi-parameter persistent homology, as well as some of the concepts coming from the framework developed in [1]. We also provide some examples of two-parameter persistence modules for which we explicitly show which lines are necessary to compute the matching distance. In Section 3, we provide the statement and proof of our main theoretical result Theorem 3.1 and include a discussion on the role of vertical and horizontal lines. In Section 4, we provide guidelines for the computation of the switch points, which are necessary to the exact computation of the matching distance. Finally, in Section 5, we discuss future direction of study. 2 Background While we will subsequently specialize to the case of n = 2 parameters, we begin with some general de nitions to set the scene. 2 2.1 Notation and de nitions De nition 2.1 (Parameter Spaces). Let n 2 N. If n = 1, endow R with the usual order . If n > 1, n n take the following partial order on R : for u = (u ); v = (v ) 2 R , set u  v if and only if u  v for i i i i i = 1; : : : ; n. The poset (R ;) is called an nD parameter space. For n > 1, the nD parameter space can be sliced by lines with positive slope. A line L  R is a 1D parameter space when considered parameterized by s 2 R as L : u = ms ~ + b n n where m ~ 2 R and b 2 R . L has positive slope if each coordinate m of m ~ is positive: m > 0. i i Throughout the paper, we consider the following standard normalisation for parameterizations of lines: km ~ k := maxfm j i = 1; : : : ; ng = 1; b = 0 (1) 1 i i i=1 This normalization is the one used in papers such as [11, 20]. Other choices are possible. Changing the normalization would impact all the intermediate computations that follow but not the overall results. De nition 2.2 (Persistence Modules). Let F be a xed eld. An (n-parameter) persistence module M n n over the parameter space R is an assignment of an F-vector space M to each u 2 R , and linear maps, called transition or internal maps, i (u; v) : M ! M to each pair of points u  v 2 R , satisfying the M u v following properties: • i (u; u) is the identity map for all u 2 R . • i (v; w) i (u; v) = i (u; w) for all u  v  w 2 R . M M M In this paper, persistence modules will always assumed to be nitely presented. To x the ideas, we will assume that they are obtained applying homology in a xed degree, say q, over a xed coecient eld F, to a tame and one-critical (cf. [1]) n-parameter ltration K of a nite simplicial complex: M = H (K;F). In the particular case when n = 1, a nitely presented persistence module M can be uniquely decomposed as a nite sum of interval modules [26]. De nition 2.3 (Interval Module). An interval module is a 1-parameter persistence module of the form I [b; d) with b < d  1 2 R := R[f1g such that, for every s  t 2 R, F if b  s < d id if b  s  t < d I [b; d) = ; I [b; d)(s  t) = 0 otherwise 0 otherwise The interval [b; d) can be represented as a point (b; d) in R R, above the diagonal. This encoding permits de ning persistence diagrams. De nition 2.4 (Persistence Diagram). If M =  I [b ; d ) , then the persistence diagram of M , j2J j j denoted dgm M , is the nite multi-set of points (b ; d ) of R R with multiplicity m for j 2 J . j j j We now review the de nitions of bottleneck distance between persistence diagrams and matching distance between two n-parameter persistence modules M and N . When n = 1, the bottleneck distance d is an easily computable extended metric on persistence diagrams [18], de ned as follows. De nition 2.5 (Bottleneck Distance). Let M , N be two 1-parameter persistence modules, with persis- tence diagrams dgm M and dgm N . A matching between dgm M and dgm N is a multi-bijection  : P ! Q between the points of a multi-subset P in dgm M and a multi-subset Q in dgm N . The cost of a matching , denoted c(), is the greatest among the costs of each matched pair of points, taken to be equal to the ` distance of the corresponding pair of points in R  R, with the convention that 11 = 0 and 1 a = 1 for every a 2 R, and the costs of each unmatched point, 1 2 taken to be equal to the ` distance of that point to the diagonal of R : p p 2 1 c() := max maxkp (p)k ; max : p2P p2=P Q 2 3 The bottleneck distance d is de ned as the smallest possible cost of any matching  between dgm M and dgm N : d (dgm M; dgm N ) := min c(): :P!Q Pdgm M;Qdgm N 2 0 Observe that a matching  that pairs a point (b; d) 2 R with a point (b ;1) 2 R R has cost equal to in nity. Hence, the bottleneck distance between two persistence diagrams with a di erent number of points at in nity cannot be nite. On the contrary, matching points of the same type always gives a nite (more convenient) cost that can be expressed as follows. Remark 2.6. By de nition, the cost c() of  : P ! Q takes one of the two forms. • If c() is realized by matching p 2 P to q = (p), then c() = jstj, where s and t are either both abscissas or both ordinates of p and q, respectively; jstj • If c() is realized by some p not in P Q, then c() = , where s and t are the abscissa and the ordinate of p. jstj Brie y, we can say that c() is attained by some , with  2 f1; 2g and s; t coordinates of points in dgm M [ dgm N . When the number of parameters is n  2, we can use the bottleneck distance to de ne an extended pseudo-metric between persistence modules M and N via restrictions to lines with positive slope. n L The restriction of M to a line of positive slope L  R is the persistence module M that assigns L L M to each u 2 L, and whose transition maps i L (u; v) : (M ) ! (M ) for u  v 2 L are the same u u v as in M . Once a parameterization u = ms ~ + b of L is xed, the persistence module M is isomorphic to the 1-parameter persistence module, by abuse of notation still denoted by M , that • assigns to each s 2 R the vector space (M ) := M , and s u L L • whose transition map between (M ) and (M ) for s  t is equal to that of M between M and s t u M with u = ms ~ + b and v = mt ~ + b. De nition 2.7 (Matching Distance). The matching distance between M and N is de ned by L L L d (M; N ) := sup m ^  d (dgm M ; dgm N ) match B L : u=sm ~ +b where m ^ := minfm j i = 1; : : : ; ng > 0 is the minimal coordinate of the direction vector m ~ of L, and L varies among all the lines of R with positive slope, taken with standard normalization. The role of the weight m ^ in the de nition of the matching distance is that of ensuring stability of persistence modules against perturbations of the ltrations that originate them (cf. [11, 20]). Such weight would vanish for lines parallel to the coordinate axes. Therefore, such lines are not considered in the de nition of the matching distance. In the context of the matching distance between multi-parameter persistence modules, it is more L L convenient to include the factor m ^ when computing the cost of a matching  between the points of L L a multi-subset in dgm M and the points of a multi-subset in dgm N for a xed line L with positive slope. Therefore, we introduce the following notation: L L L cost( ) := m ^ c( ) : Examples of computation of the matching distance in some simple cases are given in Section 2.3. By de nition, the matching distance is computed by taking the supremum of the bottleneck distance over all (in nitely many) lines with positive slope through the parameter space. However, in [1], we have shown that this set of lines may be partitioned into a nite number of equivalence classes. For the two-parameter case (i.e., for n = 2), we will use these equivalence classes to generate a nite set of lines from which one may compute the matching distance. Therefore, we introduce notation and de nitions necessary to understand these equivalence classes below. Henceforth in this paper, we always assume n = 2, which is the case for which we have extremely explicit results. We introduce the following de nitions (with associated gures) following [1]. 4 2 De nition 2.8 (Positive Cone). For every point u in R , let S (u) be the positive cone with vertex u: S (u) := fv 2 R : u  vg : The boundary of the positive cone, @S (u), decomposes into open faces. In particular, @S (u) can + + be partitioned by non-empty subsets A of f1; 2g in the following way. For ; =6 A  f1; 2g, de ne fug if A = f1; 2g ; S (u) := f(x ; x ) 2 R j x = u ; x > u g if A = f1g ; 1 2 1 1 2 2 f(x ; x ) 2 R j x > u ; x = u g if A = f2g : 1 2 1 1 2 2 f1g S (u) u = S S f1;2g f2g Figure 1: The positive cone S (u) of u 2 R and the decomposition of its boundary into S , S and f1g f2g S , which correspond respectively to the vertical boundary, horizontal boundary, and u. f1;2g De nition 2.9. A line L with positive slope intersects @S (u) in exactly one point, which we call the push of u onto L, and denote by push (u): push (u) := L\ @S (u) : push (u) Figure 2: The push of u along the line L. In this case, A = f2g, and so u is to the left of L. Because of the partition of @S (u) by its open faces, there is a unique non-empty subset of f1; 2g, L L denoted A , such that push (u) = L \ S L (u). Concretely, in the plane, A = f1g means that u is u L A u L L strictly to the right of L, A = f2g means that u is strictly to the left of L, A = f1; 2g means that u is u u L L on L. More generally, when f1g  A , we say u is to the right of L, and when f2g  A , we say u is to u u the right of L, allowing for the fact that it may be true in either of these cases that u is on L. 0 2 We say that two lines L; L  R with positive slope have the same reciprocal position with respect L L 2 0 0 to u if and only if A = A . Given a non-empty subset U of R , we write L  L if L and L have u u the same reciprocal position with respect to u for all u 2 U . Note that  is an equivalence relation on lines with positive slope in R [1]. 5 In this paper, it is necessary for us to extend this equivalence relation to include points in the 2 2 2 projective completion of the real plane, P = R [ ` , with ` the line at in nity of P . Points on the 1 1 line at in nity are given by homogeneous coordinates [x : x : x ] with x = 0: Note that a line with 0 1 2 0 positive slope in R will intersect ` at exactly one point [0 : v : v ] with ~v = (v ; v )  0 giving the 1 1 2 1 2 direction of the line. Given a point v = [0 : v : v ] 2 ` with v ; v > 0, de ne 1 2 1 1 2 >fvg if A = f1; 2g ; S (v) := f[0 : x : x ] 2 ` j x = v ; x > v g if A = f1g ; A 1 2 1 1 1 2 2 f[0 : x : x ] 2 ` j x > v ; x = v g if A = f2g : 1 2 1 1 1 2 2 2 2 L For every such v 2 P and line L  R with positive slope, de ne A to be the largest subset of f1; 2g 0 2 such that L\ S L 6= ;. As above, we say that two lines L; L in R with positive slope are in the same L L reciprocal position with respect to v if and only if A = A . v v We then extend the equivalence relation on lines in R with positive slope de ned in [1] as follows: two lines L and L are equivalent if and only if they are in the same reciprocal position with respect to a nite set U of points in P : 0 L L L  L if and only if A = A 8u 2 U : u u We introduce the following notation for simplicity. For a line L  R parameterized as ms ~ + b, and for u 2 R , we set p (u) to be the unique real number such that push (u) = m ~  p (u) + b: In other words, p (u) is the parameter value of the push of u to L. We note that p (u) depends on L L the chosen parameterization of the line L. However, we implicitly assume that the parameterization we choose is the standardly normalized one. The formula for the quantity p (u) depends on whether u lies to the left (i.e. A = f2g) or to the right (i.e. A = f2g) of the line L. If u lies to the left of L, then u b 2 2 p (u) = : (2) m m 2 2 If u lies to the right of L, then u b 1 1 p (u) = : (3) m m 1 1 If u lies on L, then u b u b 1 1 2 2 p (u) = = : (4) m m m m 1 1 2 2 Note that, thanks to the standard parametrization of a line L, pushing any two points u; v 2 R onto L yields to parameter values p (u); p (v) 2 R such that L L push (u) push (v) if L has slope greater than 1, 2 2 L L p (u) p (v) = (5) L L push (u) push (v) otherwise. 1 1 L L 2.2 Working assumptions Our goal in this paper is to identify a nite set of lines with positive slope from which to compute the matching distance between two nitely presented persistence modules, M and N . Our strategy is that of partitioning the set of lines with positive slope into equivalence classes where it is easy to understand how the cost of a matching between persistence diagrams along lines changes when moving such lines around, for example by translations and rotations. To this end, we start with requesting that the following property holds for all the n-parameter persistence modules M considered hereafter. Property (P). There exists a nite set C of points in R , called critical values, for which the ranks of transition maps in the n-parameter persistence module M are completely determined the ranks of transition maps between values in C , the closure of C under least upper bound. More speci cally, M M for any u; v 2 R , rank i (u; v) = rank i (u;  v ) M M 6 with 0 0 u = maxfu 2 C ju  ug; 0 0 v = maxfv 2 C jv  vg 0 0 if fu 2 C ju  ug is non-empty, and rank i (u; v) = 0 otherwise. Note that C is empty if and only M M M if M is trivial. In the case of a nitely presented persistence module M , one such set C exists and is given by the set of grades of its generators and relators. More concretely, property (P) is satis ed by modules M obtained by taking the persistent homology over a led F of simplicial complexes K ltered by a tame and one-critical ltration K = fK g n : M = H (K;F). Indeed, xing a discrete gradient vector eld u2R q V consistent with the ltration K and de ning C to be the set of entrance values of simplices  that are critical in the gradient vector eld V , M satis es property (P) with set of critical values given by C as proved in [1][Thm. 1]. If both M and N have property (P ), then we may partition the set of lines with positive slope in n 0 R via the equivalence relation  . For any two lines L  L , the persistence diagrams C [C C [C M N M N 0 0 L L L L dgm M and dgm M (respectively dgm N and dgm N ) are in bijection [1][Thm. 2]. As we will see in Lemma 3.2 later, these bijections will extend to a bijection, denoted 0 , between L;L 0 0 L L L L the sets of matchings that are used to compute both d (M ; N ) and d (M ; N ) . Moreover, the B B bijection 0 can be used to compute the cost of a matching for line L from the cost of the corresponding L;L matching for line L, provided the lines are in the same equivalence class. L L Ideally, one would like to use 0 to show that a line L on which d (M ; N ) attains a maximum L;L B is a line which passes through two points in C [ C : However, this is not the case, as Examples 2.12 M N and 2.13 show. The reason is that, within an  -equivalence class, it is possible that there is no C [C M N \winning" matched pair of points or \winning" matching that works for all lines within that equivalence class, i.e., the winning matching may \switch" within the same  equivalence class. C [C M N Therefore, the equivalence relation  must be re ned, in order to generate equivalence classes C [C M N in which there exists at least one matching and matched pair which computes the bottleneck distance for all lines in a given equivalence class. This requires the addition of switch points: points ! in the projective completion for which the cost of matching some pair u; v equals the cost of matching some other pair w; x, with u; v; w; x 2 C [ C for any line through !. M N So, with the goal of computing the matching distance using the method described above in the case n = 2, we introduce another property, called (Q), that pairs M; N of 2-parameter persistence modules are required to satisfy, involving a set of switch points. Property (Q). There exists a nite set of points (M; N ) in P such that for all u; v; w; x 2 C , not necessarily distinct, and ;  2 f1; 2g, also not necessarily distinct, the sign (positive, negative or null) of jp (u) p (v)j jp (w) p (x)j L L L L (u; v; w; x; ; ) := ; remains the same for all lines L in the same equivalence class with respect to C [ C [ M N Once the equivalence relation  is re ned to  , a winning matching  and a winning C [C C [C [ M N M N matched pair u; v 2 C [C can be determined for each equivalence class. Indeed, Property (Q) implies M N the following property (Q'), as Proposition 2.10 will prove. Property (Q'). There exists a nite set of points (M; N ) in P with the following property. Let L be any line, and let  be an optimal matching with respect to computing the bottleneck distance L L between dgm M and dgm N . Suppose that the cost c() of the matching  is realised by the pushes of u; v 2 C [ C onto L. Then, if L is any other line in the same equivalence class as L with respect to M N C [ C [ , the matching 0 () is an optimal matching with respect to computing the bottleneck M N L;L 0 0 L L distance between dgm M and dgm N . Moreover, its cost c( ()) is realised by the pushes of u L;L and v onto L . In other words, 0 0 jp (u) p (v)j jp (u) p (v)j L L L L c() = () c( ()) = : L;L Proposition 2.10. Property (Q) implies Property (Q'). 7 L L Proof. Choose a line L and let  be a matching between dgm M and dgm N that has cost jp (u) p (v)j L L c() = for some u; v 2 C [ C and  2 f1; 2g. M N Let L be a line that is equivalent to L with respect to C [ C [ . Property (Q) implies that M N jp (u) p (v)j L L c( ()) = : L;L This tells us that within an equivalence class under  , if the pair u; v 2 C [ C obtains the M N C [C [ M N cost of , then that pair also obtains the cost of (), although it may not be the unique such pair L;L to do so. Now suppose that  is the optimal matching with respect to computing the bottleneck distance L L between dgm M and dgm N . Let u ; v 2 C [ C and  2 f1; 2g such that M N jp (u ) p (v )j L  L c( ) = : L L Since  is the optimal matching, for any other matching  between dgm M and dgm N , it is true that c( )  c(). Therefore, again by property (Q), we have c( 0 ( ))  c( 0 ()): L;L L;L Therefore ( ) is a (possibly not unique) matching that obtains the bottleneck distance between L;L 0 0 L L dgm M and dgm N , and u ; v 2 C [ C is a (possibly not unique) pair of critical values which M N obtains the cost of ( ). L;L Finally, because Example 2.12 shows that the diagonal direction is also needed, we set to be the union of with the singleton f[0 : 1 : 1]g. This will allow for the computation of the matching distance between M and N from lines through two points in C [ C [ via Theorem 3.1, assuming that M M N and N satisfy properties (P) and (Q). As mentioned above, all nitely presented persistence modules satisfy property (P). In Section 4, we provide the theoretical justi cation also for the existence of the set ensuring property (Q). 2.3 Examples We now give some simple examples of computation of the matching distance in the case of two parameters. The considered persistence modules will be direct sums of rectangle modules, i.e. persistence module that are trivial outside a supporting rectangle [u ; v )  [u ; v ), with u = (u ; u ), v = (v ; v ) 2 R , 1 1 2 2 1 2 1 2 and equal to F with all the transition maps equal the identity on the supporting rectangle with bottom left corner u and a top right corner v. The following two examples, illustrated in Figure 3 and in Figure 4, respectively, show that is does not suce to consider only lines with diagonal direction or only lines though critical values of the modules. Example 2.11. Let M be the rectangle decomposable module given as the direct sum of two rectangle modules, one with support [0; 7) [0; 7), the other one with support [0; 7) [4; 11). Let N be the rectangle decomposable module given as the direct sum of the rectangle modules with support [0; 7) [0; 11) and [0; 7) [4; 7). L L For diagonal lines L with equation y = x + q for q  1 or q  3, we have that M = N so that L L d (M ; N ) = 0. For 0 < q < 4, a straightforward computation shows that the maximum bottleneck distance achieved is for q = 2, where the cost is 2. However, for the line L through the points (0; 0) L L L 7 and (7; 11), the bottleneck distance is d (M ; N ) = 4 and m ^ = , so that the matching distance between M and N is at least . This example shows that diagonal lines do not suce. Example 2.12. Let M be the rectangle module with support [2;1) [2; 7), and let N be the rectangle module with support [2;1) [2; 10). For lines L with equation y = mx + q with m  1 and q  2, we get weighted bottleneck distance 5, thus achieving the matching distance, because the weighted distance is smaller than 5 for all other lines. In particular, the line through (2; 2), point in C \C , with diagonal direction achieves the matching M N distance between M and N . 8 Figure 3: The persistence modules considered in Example 2.11: The matching distance is not attained at any diagonal line. Figure 4: The persistence module considered in Example 2.12: The matching distance is not attained at any line through two critical points. Symmetrically, if we consider the rectangle persistence modules M = [2; 7) [2;1) and N = [2; 10) [2;1), the situation is reversed, that is, the matching distance is realised by lines of slope greater or equal than 1. Note that the only slope that achieves the matching distance in both cases is 1. Also note that in neither of these two examples is it possible to achieve the matching distance by a line through two points in C [ C or their closure with respect to the least upper bound. M N The next example, illustrated in Figures 5 and 6, shows that not even considering both lines through critical values or their least upper bounds and diagonal lines ensures we achieve the matching distance. This fact will motivate the introduction of a further set of points, called switch points, in Section 4. Example 2.13. Let M be the rectangle decomposable module which is the direct sum of two rectangle modules, one with underlying rectangle [0; 7)  [0; 8), the other one with underlying rectangle [0; 7) [4; 11). Let N be the rectangle decomposable module sum of two rectangle modules whose underlying rectangles are [0; 7) [4; 8) and [0; 7) [0; 11). L L For diagonal lines L with equation y = x + q for q  1 or q  4, we have that M = N so that L L d (M ; N ) = 0. For 1 < q < 4, straightforward computation shows that the maximum bottleneck 3 0 7 distance achieved is for q = 2:5, where the cost is . However, for the line L through the points ( ; 6) 2 2 L L L 7 and (7; 11), the bottleneck distance is d (M ; N ) = 3 and m ^ = , so that the matching distance between M and N is at least . Out of the lines through two points in the closure with respect to L L lowest upper bound of C [ C , the only line that does not give M = N is the line through (0; 0) M N and (7; 11), which has weighted bottleneck distance . This example shows that it does not suce to consider only diagonal lines and lines through two points in the closure with respect to lowest upper bound of C [ C . M N 9 Figure 5: The persistence modules considered in Example 2.13: Neither lines through critical values nor diagonal lines suce to achieve the matching distance. 3 Main result The goal of this section is to prove the following main result. Theorem 3.1. Let M; N be two 2-parameter persistence modules. If M; N are both trivial, then d (M; N ) = 0. Otherwise, assume that M and N satisfy properties (P) and (Q). Let C and match M C be nite sets of critical values in R as ensured by Property (P), and let (M; N ) be a nite set of points in P as ensured by property (Q). Then the matching distance between M and N is achieved by a line through two (not necessarily unique) points in C [ C [ , with (M; N )[f[0 : 1 : 1]g. M N In particular, since nitely presented persistence modules satisfy properties (P) and (Q), Theorem 3.1 holds for such persistence modules. We prepare the proof of this theorem, given in Section 3.3, showing some preliminary claims in which the set is written as the union of its proper part R and its part at in nity ` . P 1 1 3.1 Preliminary claims 0 2 0 0 0 0 0 Let L and L be lines in R with positive slope parameterized by L : u = ms ~ + b and L : u = m ~ s + b L L 0 respectively. Let S be the set of all matchings between M and N , and let S be the set of all matchings 0 0 L L between M and N . 0 0 Lemma 3.2. If L  L , then there exists a bijection between S and S . L;L C [C M N 0 0 0 Proof. Since L  L , we know that L  L and L  L . Therefore, by our previous C [C C C M N M N 0 0 L L L L paper [1], there exist multi-bijections : dgm M ! dgm M and : dgm N ! dgm N . For any M N L L 1 matching  : P ! Q in S such that P  dgm M and Q  N , set () =  ( ) . In other L;L N M words, the following diagram commutes: L L dgm M dgm N M N 0 () 0 0 L;L L L dgm M dgm N Since and are bijections, () is easily seen to be a matching of S . Similarly, we could M N L;L 0 1 have just as easily taken a matching  2 S , and created a matching ( )    in S. Therefore, N M establishes a bijection between the sets S and S . L;L More explicitly, the multi-bijection behaves as follows. Let u; v 2 C be such that the pair (p 0 (u); p 0 (v)) 2 dgm M . Then gives a bijection between M L L M the parametrized values of pairs u; v 2 C on L and L, i.e., 1 L (p 0 (u); p 0 (v)) = (p (u); p (v)) 2 dgm M : L L L L Under the assumptions above, since L  L , both lines intersect the same face of the positive cone 0 0 0 of u. This means that there exists some i 2 f1; 2g such that (m ~  p (u) + b) = (m ~  p (u) + b ) = u . L i L i i This enables us to solve for p (u) from p (u): L L 10 Figure 6: Weighted bottleneck distance across the line parameters. On the horizontal axis we have the values of the angle , giving the slope of the line. On the vertical axis we have the parameter b giving the origin of the line. The red cross corresponds to the maximum value over all lines, the black cross the maximum value for lines through pairs of critical points and the blue cross is the maximum value for diagonal lines. This particular example shows that it is not sucient to consider only lines through pairs of critical points or diagonal lines. Indeed the line achieving the matching distance (represented by the red cross), is not diagonal, nor passes through two critical values. 11 0 0 (m ~  p (u) + b) = (m ~  p 0 (u) + b ) , L i L i 0 0 m p (u) + b = m p 0 (u) + b , i L i L i i m b b i i p 0 (u) = p (u) + : (6) L L 0 0 m m i i Similarly, there is some j 2 f1; 2g such that b b j j p 0 (v) = p (v) + : L L 0 0 m m j j Analogously, if u 2 C is such that the pair (p 0 (u);1) 2 dgm M , we can solve for p 0 (u) from M L L p (u) as in (6). L L For the rest of the section, we x a line L with positive slope, we let P  dgm M and Q  dgm N , and we take  : P ! Q to be a matching such that the pair u; v 2 C [ C achieves the cost of : M N jp (u)p (v)j L L cost() = m ^ ; with 1; if cost() is achieved by a matched pair, 2; if cost() is achieved by an unmatched point . Also, for any other line L in the same equivalence class as L, we simply write for the bijection L;L between matchings along L and matchings along L from Lemma 3.2. As we are assuming that M; N satisfy Property (Q), for L equivalent to L via  , gives C [C [ M N a bijection between the matchings of L to those of L and, if u; v achieve the cost of , then u; v also achieve the cost of (). In this situation, the following lemmas tell us how to move a line around in its equivalence class with respect to C [ C [ in order to increase (or at least not decrease) the cost of a matching. M N We rst analyze what happens when we translate a line. 2 L L Lemma 3.3. Let L  R be a line with positive slope and let  : dgm M ! dgm N be a matching jp (u)p (v)j L L such that c() = , with u; v 2 C [ C [ and  2 f1; 2g. Then, for any other line M N L  L, obtained by translating L, the cost of () is given by the following formulas: C [C [ M N jp (u)p (v)j L L L L L L L 1. cost(()) = cost() = m ^ , if A  A or A  A , u v v u jp (u)p (v)+(b b )(1=m +1=m )j L L L 1 1 2 2. cost(()) = m ^ , otherwise. 0 0 0 0 ~0 Proof. Suppose that L is parameterized as ms ~ + b and L as m s + b . Let L be a line that is parallel to and in the same reciprocal position as L with respect to C [ C [ . Since L is parallel to L, we M N L L ~0 have m ~ = m , and in particular, m ^ = m ^ . As a consequence of Property (Q), we see that 0 0 jp (u) p (v)j L L c(()) = : L L 0 0 If either u or v lies on L, meaning A or A = f1; 2g, then L  L and L parallel to L u v C [C [ M N 0 L L implies that L = L , and therefore c(()) = c(). Otherwise, we have that both A and A are strictly u v contained in f1; 2g. We now have two cases. 1. First suppose that L (and hence L ) intersects the same face of the positive cone of u and of v, L L L L that is A  A or A  A . u v v u By Equation (6), this means that there is some i 2 f1; 2g such that: 0 0 m b b b b i i i i i p 0 (u) = p (u) + = p (u) + ; and L L L 0 0 m m m i i 0 0 m b b b b i i i i i p 0 (v) = p (v) + = p (v) + : L L L 0 0 m m m i i 12 We can then compute directly: jp 0 (u) p 0 (v)j L L c(()) = 0 0 jp (u) + (b b )=m p (v) (b b )=m j L i i L i i i i jp (u) p (v)j L L = c(): 2. Now suppose that L (and hence also L ) intersects di erent faces of the positive cones of the points u and v. Without loss of generality, assume that u lies strictly to the right of both L and L , and that v lies strictly to the left. This means that b b p (u) = p (u) + ; L L b b p 0 (v) = p (v) + : L L Recall that because of our standard normalization (see Equation (1)), we have b = b and 1 2 0 0 b = b . So we have 1 2 jp 0 (u) p 0 (v)j L L c(()) = 0 0 jp (u) + (b b )=m p (v) (b b )=m j L 1 1 L 2 2 1 2 jp (u) p (v) + (b b )(1=m + 1=m )j L L 1 1 2 = : Remark 3.4. Since translation preserves the slope of a line, Lemma 3.3 holds with respect to C [C [ M N as well. Lemma 3.5. Let L  R be a line with positive slope not intersecting C [ C [ . For each M N P L L 0 matching  : dgm M ! dgm N , there exists a direction of translation such that if L is a line obtained by translating L in that direction and L  L , then C [C [ M N cost(())  cost(): Proof. With reference to the notations and cases analyzed in Lemma 3.3, we now have two cases: 1. First suppose that L (and hence L ) intersects the same face of the positive cone of u and of v. We then know that cost(()) = cost(). This shows that in this case, translating L in either of the two directions keeps the cost unchanged. Since we are assuming that C [ C [ contains at M N least the proper points u and v, at least one of the two directions of translation guarantees hitting a proper point in C [ C [ M N P 2. Now suppose that L (and hence also L ) intersects di erent positive faces of the points u and v so jp (u)p (v)+(b b )(1=m +1=m )j L L L 1 1 2 that cost(()) = m ^ . Note that m ; m > 0. If p (u)  p (v), 1 2 L L 0 0 then having b > b would imply that cost(())  cost(). If p (v)  p (u), then having b > b 1 L L 1 1 1 would imply that cost(())  cost(). 0 0 0 Note that b > b means that L is a left translate of L, and b > b means that L is a right 1 1 1 1 translate of L. In either case, there is a direction of translation in which the cost does not decrease. We may therefore translate L, increasing c(), until L hits a point in C [ C [ . Since u and M N v lie on opposite sides of L, we are guaranteed to hit a proper point in C [ C [ M N P Lemma Lemma 3.3 is used in the following way. 13 Remark 3.6. For each line L with positive slope that does not intersect any point of C [ C [ M N P there is a direction of translation such that • translating L in that direction does not decrease the weighted cost cost(), • during the translation the line remains in the same equivalence class as L with respect to C [ C [ , until • eventually, the translated line hits a proper point in C [ C [ R . M N P Let us now study what happens when we rotate a line. Lemma 3.7. Let L be a line through a point r 2 R with direction m ~ = (m ; m ) (with m ; m > 0), 1 2 1 2 jp (u)p (v)j L L L L and let  : dgm M ! dgm N be a matching such that c() = , with u; v 2 C [ C and M N 2 f1; 2g. Then, for any other line L  L, obtained by rotating L about r, the cost of () is C [C [ M N given by the following formulas: m ^ L L 1. cost(()) =  cost(); if both A and A contain argmax fm g. L i u v i2f1;2g m ^ L L 2. cost(()) = cost() if both A and A contain argmin fm g. u v i2f1;2g L L L 3. cost(()) =  m ^ (v r ) + r u ; if A = argmin fm g, A = argmax fm g 2 2 1 1 i i u i2f1;2g v i2f1;2g and p 0 (u) p 0 (v)  0. L L 1 L L L 4. cost(()) =  m ^ (r v ) + u r ; if A = argmin fm g, A = argmax fm g 2 2 1 1 i i u i2f1;2g v i2f1;2g 0 0 and p (u) p (v)  0. L L Proof. Assume, without loss of generality, that L has slope greater than 1. That is, m ^ = m < m = 1, 1 2 so that argmax fm g = 2 and argmin fm g = 1. i i i2f1;2g i2f1;2g L L 1. Suppose that A and A both contain 2. Then u v jp (u) p (v)j = ju v j = jp 0 (u) p 0 (v)j: L L 2 2 L L Therefore, 0jp (u) p (v)j m ^ L L cost(()) = m ^ =  cost(): m ^ L L 2. Next suppose that both A and A contain 1. Then u v jv u j 1 1 jp 0 (v) p 0 (u)j = : L L 0 So we have that 0 jp 0 (v) p 0 (u)j 1 0 jv u j jv u j L L 1 1 1 1 L L cost(()) = m ^  =  m  = = cost(): L L 3. Assume that A = f1g and A = f2g. That is, assume that u lies strictly to the right of the line u v L and that v lies strictly to the left of the line L. Let L be another line obtained by rotating L 0 0 L L L L around r without hitting any point in C [C [ . We have A = A and A = A . Moreover, M N u u v v push 0 (u) r 1 1 2 2 = = : push (u) r m ^ 0 m 1 1 L 1 Thus, push 0 (u) r u r 1 1 1 1 push (u) = + r = + r : 2 0 2 0 2 L L m m 1 1 By Equation (5), we therefore see that p 0 (v) p 0 (u) = push (v) push (u) 0 0 L L 2 2 L L u r 1 1 = v + r 2 0 2 r u 1 1 = v r + : 2 2 0 14 0 0 0 Hence, if p (v) p (u)  0, the cost of () for L is L L 0 p 0 (v) p 0 (u) L L cost(()) = m ^ 1 0 r u 1 1 =  m  v r + 2 2 0 1 0 =  m (v r ) + r u : 2 2 1 1 0 0 4. Similarly, if p (u) p (v)  0, then the cost of ()) for L is L L 1 0 cost(()) =  m (r v ) + u r : 2 2 1 1 Lemma 3.8. Let L  R be a line with positive slope and let r be any proper point on L. If L does not pass through any point of C [ C [ , apart from possibly the point r itself, then, for each matching M N L L 0 : dgm M ! dgm N , there exists a direction of rotation such that if L is a line obtained by rotating L around r in this direction and L  L , then C [C [ M N cost(())  cost(): Proof. Assume that L does not pass through any point of C [ C [ other than r. Recall that the M N point [0 : 1 : 1] lies in , so by our assumptions, L is not a line with diagonal direction. Assume, without loss of generality, that the slope of L is larger than 1, that is m ^ = m < m = 1. Consider the matching 1 2 , and suppose that jp (u) p (v)j L L c() = : By assumption, one of the following mutually exclusive cases applies: • at least one of the points u and v is not on L, and r equals the other point, • neither of the points u and v lie on L. We therefore have the following cases. 1. The points u and v lie on the same side of L, including the possibility that one of them lies on L. L L In other words, either A and A are equal, or one is a proper subset of the other one. u v 2. The points u and v lie strictly on di erent sides of L, implying that neither of them lie on L. In L L other words, one of A and A equals f1g, and the other one equals f2g. u v We prove each case separately. L L 1. First suppose that u and v lie on the same side of L. This means that either both A and A u v contain 1 or both contain 2. Let L be a line in the same equivalence class as L with respect to C [ C [ M N We have two subcases. L L (a) Suppose that A and A both contain 2. Then, by Lemma 3.7(1), we have that u v m ^ cost(()) =  cost(): m ^ Therefore rotating L towards the diagonal will increase the cost, at least until we hit a point of C [ C [ , possibly the diagonal direction. M N L L (b) Next suppose that both A and A contain 1. Then, by Lemma 3.7(2), we have u v cost(()) = cost(): So the cost of the matching does not change as we rotate L about r. We can once again rotate towards the diagonal without decreasing the cost, at least until we hit a point in C [C [ M N including the diagonal direction. 15 2. Next suppose that u and v lie on di erent sides of L. Assume, without loss of generality, that L L A = f1g and A = f2g. That is, u lies strictly to the right of the line L and v lies strictly to the u v left of the line L. Let L be another line obtained by rotating L around r without crossing any point in C [C [ M N 0 0 The sign of p (u) p (v) is determined by A . L L l:u:b:(u;v) 0 0 • If f1g  A , then p (u) p (v)  0. By Lemma 3.7(4), we have L L l:u:b:(u;v) 1 0 cost(()) =  m (r v ) + u r : (7) 2 2 1 1 • If f2g  A , then p 0 (u) p 0 (v)  0. By Lemma 3.7(3), we have L L l:u:b:(u;v) 1 0 cost(()) =  m (v r ) + r u : (8) 2 2 1 1 We now have the following further subcases to consider. L L (a) Suppose that f1g  A and v  r . By Equation (7), taking m < m rotates L 2 2 1 l:u:b:(u;v) away from the diagonal direction and does not decrease the cost of the matching . In this case, we may hit no point of C [ C [ before we get to the vertical line through r, that M N is, we may eventually hit the point [0 : 0 : 1] 2 ` . L L (b) Now, suppose that f1g  A and v < r . This time, we take m > m , which rotates 2 2 1 l:u:b:(u;v) the line L towards the diagonal direction without decreasing its cost. In this case, we may hit no point of C [ C [ before we get to the diagonal line through r, that is, we may M N eventually hit the point [0 : 1 : 1] 2 (c) Next, suppose that f2g  A and v > r . So by Equation (8), we see that taking 2 2 l:u:b:(u;v) m > m , that is, rotating the line L towards the diagonal slope, does not decrease the cost. L L (d) Finally, suppose that f2g  A and v  r . This time, we can take m < m , which 2 2 1 l:u:b:(u;v) rotates the line away from the diagonal slope. This does not decrease the cost, and we will hit the point u before getting to a vertical line. When the slope of L is smaller than 1, that is m ^ = m < m = 1, the argument is very similar except 2 1 that for the case 2(c) where we may hit the point [0 : 1 : 0] 2 ` instead of the point [0 : 0 : 1]. The case when the slope of L is exactly 1 is excluded by the assumption that L intersects C [ C [ only at M N the proper point r. Lemma 3.8 is used in the following way. Remark 3.9. For each line L as in Lemma 3.8, there is a direction of rotation such that • rotating L in such direction around r does not decrease cost(), • during the rotation the line remains in the same equivalence class as L with respect to C [C [ M N until • eventually, the rotated line hits a point in C [ C [ [f[0 : 1 : 0]; [0 : 0 : 1]g. M N 3.2 Vertical and Horizontal Lines As can be seen in the proof of Lemma 3.8, it is not immediately apparent that the bottleneck distance for horizontal or vertical lines is irrelevant for the overall matching distance computation - in other words, this lemma does not discount the possibility that the matching distance is achieved on such a line. In order to understand better the role that horizontal and vertical lines play in the computation of the matching distance, we must more formally derive the cost of a vertical line (De nition 3.13); the correctness of this de nition relies on Theorem 3.12. After de ning the cost of a vertical line, we are able to show that these lines do not determine the matching distance, via Corollary 3.14. The case dealing with horizontal lines is analogous. In this section we maintain the notations and assumptions of the previous section. In particular, the persistence modules M and N are assumed to satisfy property (P) for some nite sets C and C , and M N property (Q) for some set . Moreover, we maintain the notation := [f[0 : 1 : 1]g. In order to prove Theorem 3.12, we rst prove the following two lemmas concerning the cost of lines within the same equivalence class as a vertical line. 16 2 0 Lemma 3.10. Let V  R be a vertical line and let c; c be distinct points on V such that the open line segment on V between them does not intersect C [ C [ . Then there exists a m ~ > 0 suciently M N 1 small such that, for any two parallel lines L and L with direction m ~ = (m ; 1) and 0 < m < m ~ that 1 1 1 0 0 intersect V on the open line segment c; c , it holds that L  L . C [C [ M N Proof. Let us set " := minfm 2 R j [0 : m : m ] 2 ; m = 1g ; 1 1 1 2 2 c :=g:l:b: C [ C [ min M N P (9) c :=l:u:b: C [ C [ max M N P " := minfjx y j j x 6= y ; x; y 2 C [ C [ g : 1 1 1 1 M N P Assume, without loss of generality, that c < c . Taking "=0 to be equal to +1, choose " " 0 < m ~ < min ; ; " : 1 1 0 min max jc c j jc c j 2 2 2 0 0 As there are no points in C [ C [ on V between c and c , we get that L and L have the same M N reciprocal position with respect to all points in V \ (C [ C [ ). Moreover, by the choice of their M N slope, L and L are in the same reciprocal position with respect to all points in (C [ C [ )n V as M N well. Thus, L  L . C [C [ M N 2 0 Lemma 3.11. Let V  R be a vertical line, and let c; c be distinct points on V such that the line segment between them does not contain any point of C [C [ . Let L be a line through a point r 2 V M N between c and c with direction m ~ = (m ; 1), 0 < m < 1. Then the limit of the weighted bottleneck 1 1 distance along L as it rotates around r toward the vertical direction, written as L L L lim m ^ d (dgm M ; dgm N ); m !0 is the same for all points r in the open segment of V between c and c . Moreover, this convergence is uniform on the open line segment between c and c . Proof. We consider the behavior of the weighted bottleneck distance along lines L(r; m ) with direction m ~ = (m ; 1), and 0 < m < 1, through a point r on the open segment of V between c and c , as m 1 1 1 tends to 0. First of all we observe that, for any two points r and r on such open segment, and for suciently 0 0 small m > 0, Lemma 3.10 ensures that L := L(r; m )  L(r ; m ) =: L . Hence, by Property 1 1 1 C [C [ M N (Q'), if the optimal matching achieving the bottleneck distance along L is , then the optimal matching achieving the bottleneck distance along L is 0 (), and that the matched pair giving the cost of L;L 0 () is the image of the matched pair giving the cost of  under 0 : If p (u); p (v) give the L;L L;L L L bottleneck distance along L, then p 0 (u); p 0 (v) give the bottleneck distance along L . In other words, L L for any line L(; m ) that is steep enough and intersects V between c and c , the bottleneck distance is determined by the same pair u; v of critical points. Now, considering the cases listed in Lemma 3.7 separately, we show that the weighted bottleneck distance along L(r; m ) converges uniformly, independent of the choice of r, as m tends to 0. That is, 1 1 for any  > 0, there exists a m ~ > 0 such that for any 0 < m < m ~ and any point r 2 V in between c 1 1 1 0 0 00 and c , the line L through r with direction (m ; 1) gives 0 0 0 L L L L L L jm ^ d (dgm M ; dgm N ) lim m ^ d (dgm M ; dgm N )j <  : B B m !0 Let  > 0. L(r;m ) L(r;m ) 1 1 1. If both A and A contain 2 = argmax fm g, then the weighted bottleneck dis- v u i i2f1;2g ju v j 2 2 tance is m , which is independent of r and tends to 0 when m tends to 0. In particular, 1 1 ju v j 0  2 2 for any r 2 V between c and c and any 0 < m < , we have that m < , proving 1 1 ju v j 2 2 L(r;m ) L(r;m ) L(r;m ) 1 1 1 the uniform convergence of m ^ d dgm M ; dgm N to 0 as m tends to 0 in this B 1 case. 17 L(r;m ) L(r;m ) 1 1 2. If A and A contain 1 = argmin fm g, then the weighted bottleneck distance is v u i2f1;2g i ju v j 1 1 constantly equal to , independently of both r and m . Hence, it converges uniformly (for ju v j 0 1 1 steep enough lines intersecting V between c and c ) to as m tends to 0 in this case. L(r;m ) L(r;m ) 1 1 3. If A = f1g and A = f2g, and p (u) p (v)  0, then the weighted u v L(r;m ) L(r;m ) 1 1 1 1 bottleneck distance is  (m (v r ) + r u ). As m tends to 0, this tends to  (r u ) 1 2 2 1 1 1 1 1 regardless of r (that is, regardless of r , as r is the same for any point r between c and c ), thanks 2 1 to the fact that r is bounded between c and c . 2 2 In particular, we have that L(r;m ) L(r;m ) 1 1 m d dgm M ; dgm N  (r u ) = m  (v r ) 1 B 1 1 1 2 2 < m  maxfjv c j;jv c jg : 1 2 2 2 Therefore, for any r 2 V between c and c and any 0 < m < , we have 1 0 maxfjv c j;jv c jg 2 2 2 ju v j 2 2 1 m   (r u ) < , proving uniform convergence of the weighted bottleneck distance 1 1 1 along L(r; m ) to  (r u ) as m tends to 0 in this case. 1 1 1 1 L(r;m ) L(r;m ) 1 1 4. If A = f1g and A = f2g, and p (u) p (v)  0, then, by an analogous u v L(r;m ) L(r;m ) 1 1 argument to the one made in the previous case, we can see that the weighted bottleneck distance along L(r; m ) converges uniformly to  (u r ) as m tends to 0 in this case. 1 1 1 1 Theorem 3.12. Given a vertical line V  R , the limit L L L lim m ^ d (dgm M ; dgm N ) m !0 of the weighted bottleneck distance of the line L through the point r 2 V with direction m ~ = (m ; 1), 0 < m < 1, as L rotates toward the vertical direction, does not depend on the point r. Proof. By Lemma 3.11, we already know that, for every two points r; r 2 V such that the closed segment between them does not contain any point of C [C [ , the limit of the weighted bottleneck distance M N P of lines through the point r 2 V as they rotate toward the vertical direction is equal to that of lines through r . Let us now take c to belong to V \ (C [ C [ ) and r 2 V to be arbitrarily close to c. Let L(r) M N and L(c) be parallel lines with positive slope through r and c, respectively. L(r) L(r) L(r) Let us now assume that m ^ d (dgm M ; dgm N ) tends to the value `, which has to be the L(c) L(c) L(c) same for all r suciently close to c by Lemma 3.11, and m ^ d (dgm M ; dgm N ) tends to the value `(c), as L(c) and L(r) tend to V by rotating counterclockwise about r and c, respectively. Let us also assume, by contradiction, that j`(c) `j > 0 and take 0 < " < j`(c) `j=3. By internal stability [20, Thm. 2], we know that the weighted bottleneck distance for L(r) tends to that of L(c) as r tends to c, so L(r) L(r) L(r) L(c) L(c) L(c) m ^ d dgm M ; dgm N m ^ d dgm M ; dgm N  " (10) B B for every r suciently close to c. By Lemma 3.11, we also know that the weighted bottleneck distance for L(r) tends to ` uniformly as we rotate L(r) towards the vertical line and move r on the line. Hence, for all r suciently close to c, there is some real number m ~ > 0 such that for all 0 < m  m ~ , the lines L(r) through r with direction 1 1 1 m ~ = (m ; 1) satisfy L(r) L(r) L(r) m ^ d dgm M ; dgm N `  " (11) We may now take m > 0 to be suciently small so that for the lines L(c) through c with direction m ~ = (m ; 1) we also have L(c) L(c) L(c) m ^ d dgm M ; dgm N `(c)  " (12) By applying the triangle inequality, and using Equations (10) to (12), we get j`(c)`j  3" < j`(c)`j, giving a contradiction. 18 The above preposition guarantees that the following notion for the cost of a vertical line is well de ned. The case of the analogous statement for horizontal lines can be handled in a similar way. De nition 3.13. The cost of a vertical line V is the limit of the weighted bottleneck distance of a line L through a xed point r 2 V as it rotates toward the vertical direction: L L L cost(V ) := lim m ^ d (dgm M ; dgm N ) L + m ^ !0 where L is any line through r 2 V with direction m ~ = (m ; m ) and m ^ = m . 1 2 1 We can now conclude that the matching distance can always be achieved by a non-vertical line. Corollary 3.14. For every vertical line V , there exists a line L of positive slope through a point of C [ C [ and a di erent point of C [ C [ such that M N P M N L L L cost(V )  m ^ d (dgm M ; dgm N ) : Proof. By Theorem 3.12 the cost of a vertical line V does not depend on the point r 2 V used to compute it. Thus, we are free to choose r 2 V such that it is higher than any point in C [C [ . We can now M N P rotate V about r making sure we stop at the rst point of C [C [ we hit. The proof of Lemma 3.8 M N ensures that, if we rotate towards the diagonal, the obtained line L has a weighted bottleneck distance not smaller than cost(V ) except possibly when we are in case 2.(a) or 2.(d). However, since we have chosen r higher than any point in C [ C [ , we cannot be in case 2.(a). M N P Finally, suppose that when we slightly rotate V around r toward the diagonal direction we get lines L that fall in case 2.(d). As usual, we call u; v the two points realizing the weighted bottleneck distance along such lines. Let us recall that, in case 2.(d), u lies strictly to the right of L , v lies strictly to the left 0 0 0 of L , and l:u:b:(u; v) lies to the left of L . However, if l:u:b:(u; v) lies to the left of L and r > v ; u , 2 2 2 it must be the case that r  u , since r is a point on the line L . This would imply that u does not lie 1 1 strictly to the right of V , the vertical line through r, which is a contradiction. So, we cannot be in case 2.(d). If the obtained line L intersects only at a point at in nity, then Lemma 3.5 guarantees that we can translate L until we hit a point of C [ C [ without decreasing the weighted bottleneck distance, M N P and we are done. If the obtained line L is through a point c in C [ C [ , then c must be strictly to the left M N P of V because r was chosen to be strictly higher than C [ C [ . Now we can use c as a new M N P center of rotation applying again Lemma 3.8. If we get again a vertical line, then we can iterate the argument. Eventually, there will be no points strictly to the left of the newly obtained vertical line, and thus the process will terminate with a line L through a point of C [ C [ and a di erent point of M N P C [ C [ M N 3.3 Proof of the main theorem We are now ready to prove Theorem 3.1, thanks to Proposition 2.10, Lemma 3.5, 3.8, and Corollary 3.14, which we restate below for convenience. Theorem 3.1. Let M; N be two 2-parameter persistence modules. If M; N are both trivial, then d (M; N ) = 0. Otherwise, assume that M and N satisfy properties (P) and (Q). Let C and match M C be nite sets of critical values in R as ensured by Property (P), and let (M; N ) be a nite set of points in P as ensured by property (Q). Then the matching distance between M and N is achieved by a line through two (not necessarily unique) points in C [ C [ , with (M; N )[f[0 : 1 : 1]g. M N L L Proof. If M and N are both trivial, then M and N are also trivial for all lines L, yielding that d (M; N ) = 0. Otherwise, if M and N are not both trivial, consider the critical sets C and C match M N L L guaranteed by property (P), and which cannot be both empty. Consider a line L on which M and N are also not both trivial. Clearly, the matching distance between M and N is achieved on one such line L. L L L L If dgm(M ); dgm(N ) have a di erent number of points at in nity, then d dgm(M ); dgm(N ) = 0 L L 1. For any line L  L, the points at in nity of dgm(M ) (resp. dgm(N )) are in bi- C [C [ M N 0 0 L L jection with those of dgm(M ) (resp. dgm(N )) via (resp. ) as de ned in Lemma 3.2. M N 19 0 0 L L So, dgm(M ) and dgm(N ) will also have a di erent number of points at in nity, implying that 0 0 L L d dgm(M ); dgm(N = 1 as well. By internal stability [20, Thm. 2], we have that the bottleneck distance is equal to in nity on any line on the boundary of the equivalence class of L, implying the claim. L L L L If dgm(M ); dgm(N ) have the same number of points at in nity, then d dgm(M ); dgm(N ) is realized by a matching . As in Lemmas 3.3 and 3.7 above, we denote the matched pair that gives the bottleneck distance L L between M and N by u; v, so the corresponding real parameters on the line L are p (u) and p (v), L L respectively. Thus, jp (u) p (v)j L L L L L m ^ d dgm(M ); dgm(N ) = with  2 f1; 2g. 0 L Properties (P ) and (Q ) guarantee that the optimal matching  and the pair of points u; v 2 C [ C whose push onto L gives maximal distance in that matching do not change if we vary the M N line L remaining inside its equivalence class with respect to C [ C [ , i.e. without hitting any new M N points in C [ C [ M N Depending on the following exhaustive list of cases, we see that it is always possible to get a line L that gives no smaller weighted bottleneck distance and that passes through two points of C [ C [ M N Writing as the union of its set of proper points with its set of points at in nity , the list of P 1 cases to analyze is: (i) L is already through at least two distinct points of C [ C [ M N (ii) L is only through one point of C [ (iii) L is only through one point of (iv) L is through no point of C [ C [ M N In case (i) there is nothing to prove. In case (ii), the claim is proved by applying Lemma 3.8, i.e. by rotating the line until we either hit another point in C [C [ [f[0 : 1 : 0]; [0 : 0 : 1]g. If the point we M N hit is in C [ C [ , then we are done, otherwise the line is vertical, and we can apply Corollary 3.14 M N and we are also done. In case (iii), the claim is proven by applying Lemma 3.5, i.e. by translating the line until we get a line hitting a point of C [ C [ and still through the same point of as L. M N P 1 In case (iv), by Lemma 3.5, there exists a direction of translation such that we can translate the line in that direction without decreasing the cost and eventually hit a point in C [ C [ . We are now in M N P case (iii) from which we can reach the conclusion as explained above. 4 Computation of the switch points In this section we explain how to compute the possible switch points that we need to add to generate the set of switch points for the pair M; N . The main result of this section is Theorem 4.3 stating that, if M and N satisfy property (P), then also Property (Q), and hence (Q'), is satis ed. It is important to note that this result is not only an existence result but leads to a constructive procedure to determine the set of switch points (M; N ). Switch points are needed when there exist two matched pairs u; v and w; x with u; v; w; x 2 C [C M N for which, • there is a line L in a particular equivalence class via  on which the cost of matching u C [C M N with v equals the cost of matching w with x, but • the cost of matching u with v does not equal the cost of matching w with x for all lines in that equivalence class. Therefore, we use Lemma 4.1 to determine when, given a quadruple u; v; w; x and equivalence class E via  , it is possible to have a line L 2 E with the aforementioned equality. As the following C [C M N proposition shows, all such lines with this equality in E pass through some point ! 2 P . Spanning over all quadruples and all equivalence classes via  ; we obtain the set of switch points (M; N ): C [C M N To avoid degenerate or trivial cases, we assume that u 6= v and w 6= x. This is because the matched pair u; v with u = v would indicate a simultaneous birth and death, i.e. a point on the diagonal within a 20 persistence diagram or yield null contribution to the matching distance (similarly for w = x). However, it can be the case that one of the 4 points is repeated, such as v = x: In this case, a switch point would indicate a line L for which the cost of matching u to v is the same as matching u to x . u x v = x Figure 8: An example of 4 distinct points Figure 7: An example of 3 points u; w; v = x u; w; v; x which generate an ! point. Here, m that generate an ! point. Here, m denotes the and m denote the midpoints of the line seg- midpoint of the line segment between v and ments between u and w and v and x, respec- x. The pushes of the values u and w to L are tively. The pushes of the values u; w; v; x to denoted as red points, and the push of v is ! L are denoted as red points. Since the two red itself. For any line which passes through !, the segments in the gure have equal length for any cost of matching u with v is the same as the line which passes through !, on those lines the cost of matching w with x. cost of matching u with v is the same as the cost of matching w with x. Recall the notation p (u) for the parameter value of the push of a point u onto the line L, and 2 L L recall that, for every u 2 R , u pushes rightwards if A = f2g, upwards if A = f1g, and to itself if u u A = f1; 2g. We will use the following notation in Lemma 4.1: Let U be a non-empty set of points in the plane and set A(U ) = fA j u 2 U;; =6 A  f1; 2gg: u u The set A(U ) speci es, for each u 2 U , one or more push directions, and there are 3 possible choices for A(U ) if jUj = m. Given such a set A(U ), de ne E(A(U )) to be the set of all lines L such that A = A for each u 2 U . In other words, E(A(U )) is the set of all lines such that all points of U push onto these lines in the directions speci ed by A(U ). Note that it is a union of all equivalence classes in which the points of U push in the speci ed directions. We introduce this notation because switch points are generated based on how the pairs of points u; v and w; x push onto a line. Taking U = fu; v; w; xg, this is only a subset of the set C [ C , M N implying that there may be many equivalence classes via  for which these four points push in C [C M N the same way for each equivalence class. So, we create the union of all such equivalence classes via the set A(u; v; w; x). We can prove the following proposition. Lemma 4.1. Let u; v; w; x 2 C [ C . Assume that u 6= v and w 6= x, and that there are at least M N three distinct points among u; v; w; x. Let ;  2 f1; 2g . Fix a choice for A(u; v; w; x), and set E := E(A(u; v; w; x)) For any line L 2 E, consider the real quantity jp (u) p (v)j jp (w) p (x)j L L L L (u; v; w; x; ; ) := : If  (u; v; w; x; ; ) = 0, then exactly one of the following two possibilities is true, depending on E: 1. We have  (u; v; w; x; ; ) = 0 for every L 2 E. 2 0 2. There exists a point ! 2 P such that, for every L 2 E, we have  (u; v; w; x; ; ) = 0 if and only if ! 2 L . In fact,  and  are determined by the pairs u; v and w; x respectively. For example, if both points u; v are in C , then this pair represents matching point in dgm M to the diagonal for some L 2 E, and  = 2. If u 2 C and v 2 C ; M N L L then this pair represents the cost of matching a point in dgm M to dgm N , and  = 1. The value of  may be computed analogously. However, we prove the proposition for the slightly more general case of arbitrary ;  2 f1; 2g. 21 In other words, Lemma 4.1 says that, xed A(u; v; w; x), either the sign of  is constant on the equivalence classes of lines L onto which the points u; v; w; x push as prescribed by A(u; v; w; x), or otherwise the lines where there is a switch of the sign of  belong to a pencil though a point ! 2 P . Proof. Consider some L 2 E, parameterized as ms ~ + b. The four points can be in various di erent positions with respect to L and we want to nd the lines for which the quantity jp (u) p (v)j jp (w) p (x)j L L L L (u; v; w; x; ; ) := jp (u)p (v)j jp (w)p (x)j L L L L vanishes, as ( , respectively) represents the cost of pairing u with v (w with x, respectively), so to determine when these costs are equal. Recall, saying that a point is on one side of a line includes the possibility that the point is on the line. Saying that a point is strictly on one side of the line excludes the possibility that the point is on the line. Without loss of generality, we have the following possibilities. 1. There exists i 2 f1; 2g such that i 2 A for all p 2 fu; v; w; xg. That is, all four points lie on one side of L or, 2. There exists i 2 f1; 2g such that i 2 A for exactly 3 of p 2 fu; v; w; xg. That is, three points lie on one side of L, and the fourth lies strictly on the other side. 3. Two paired points have A = f1g for both points in the pair, and the other two paired points have A = f2g for both points in the pair. This is the case when two paired points lie strictly on one side of L, and the other two paired points lie strictly on the other side of L. 4. Two unpaired points have A = f1g, and the other two unpaired points have A = f2g. This is p p the case when two paired points lie strictly on opposite sides of L, and the other two paired points also lie strictly on opposite sides of L. We now address each possibility recalling Equations (2) to (4): 1. Suppose, without loss of generality, that p (u)  p (v) and p (w)  p (x): L L L L Then the value of  is given by: 1 u v 1 w x i i i i (u; v; w; x; ; ) = m m  m m i i i i u v w x 1 i i i i = + + + ; (13) with i = 1 (resp. i = 2) when all the points are to the right (resp. left) of L. 2. Suppose without loss of generality that u, v, w lie to the left of L and x lies strictly to the right. Suppose also, without loss of generality, that p (u)  p (v). L L If p (w)  p (x), then L L u v (w b ) (x b ) 2 2 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 2 2 1 x 1 u v w 1 1 b 1 b 1 2 2 2 1 2 = + + + : (14) m    m  m  m 1 2 1 2 If p (x)  p (w), then L L u v (w b ) (x b ) 2 2 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 2 2 1 x 1 u v w 1 1 b 1 b 1 2 2 2 1 2 = + + + + (15) m    m  m  m 1 2 1 2 The case when u, v, w lie to the right of L and x lies strictly to the left can be handled similarly. 22 3. Suppose, without loss of generality, that u and v lie strictly to the left of L and that w and x lie strictly to the right of L. Suppose also, without loss of generality, that p (u)  p (v) and L L p (w)  p (x). Then L L u v w x 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 2 1 1 w x 1 u v 1 1 1 2 2 = + + : (16) m   m 1 2 4. Suppose, without loss of generality, that u and w lie strictly to the left of L and that v and x lie strictly to the right of L. Also suppose, without loss of generality, that p (u)  p (v). If L L p (w)  p (x), then L L u b v b w b x b 2 2 1 1 2 2 1 1 (u; v; w; x; ; ) = + : m m m m 2 1 2 1 v x 1 u w 1 1 1 b 1 1 b 1 1 2 2 1 2 = + + + + + + + : (17) m   m   m   m 1 2 1 2 If p (x)  p (w), then L L u b v b w b x b 2 2 1 1 2 2 1 1 (u; v; w; x; ; ) = + m m m m 2 1 2 1 v x 1 u w 1 1 1 b 1 1 b 1 1 2 2 1 2 = + + + + + + + : (18) m   m   m   m 1 2 1 2 Observe that in all Equations (13) to (18), the expression for  (u; v; w; x; ; ) has the form 1 1 b b 1 2 (u; v; w; x; ; ) =  +  +   ; L 1 2 m m m m 1 2 1 2 where , , and are constants that depend only on u; v; w; x and ;  and not on the line L. 1 2 We now consider the following exhaustive cases: • = 0 and at least one of ; equal 0, 1 2 • = 0 and ; 6= 0, 1 2 • 6= 0: Suppose rst that is zero and one of and is zero. Assume, without loss of generality, that 1 2 = 0. Since m > 0, then the sign of  (u; v; w; x; ; ) is the same as the sign of . In other words, 1 2 L 2 either  (u; v; w; x; ; ) = 0 for every line L 2 E, or it is positive for every line L 2 E, or negative for every line L 2 E. Thus, this case gives rise to no switch point !. This implies that Equation (13) generates no switch points, as in that equation = 0 and one of and is zero. We can conclude that when all points u; v; w; x lie on the same side of any line L in E, no switch point is created. In Equation (16), it is always the case that = 0: If it is also the case that x = w or u = w , then 1 1 2 2 either or is zero. If we assume that x = w , for  to be zero, it must also be the case that u = v : 1 2 1 1 L 2 2 This implies that c (u; v) = c (w; x) = 0 for all lines L 2 E; and therefore  (u; v; w; x; ; ) = 0 for L L L every line L 2 E. (An identical argument works for u = v as well.) Thus, no switch point is created. 2 2 It is also possible to have = 0 in Equation (17), when  = : If, additionally, x = v or u = w , 1 1 2 2 then either or is zero. However, if x = v , then, for  to be zero, it must also be the case that 1 2 1 1 L u = w : This implies that c (u; v) = c (w; x) for all lines L 2 E; and therefore  (u; v; w; x; ; ) = 0 2 2 L L L for every line L 2 E. (An identical argument works for u = w as well.) So no switch point is created. 2 2 To summarize, no switch points are created for: • Equation (13) • Equation (16), when x = w or u = w ; nor 1 1 2 2 • Equation (17), when  =  and x = v or u = w . 1 1 2 2 23 Next suppose that = 0 and 6= 0 and 6= 0. In this case, the quantity  (u; v; w; x; ; ) is zero 1 2 L if and only if m  = m  : 2 1 1 2 This is a condition on the slope of the line, which states that L must have a xed slope. When considering the line L in P , this means that L passes through the point at in nity ! = [0 : : ]. Conversely, 1 2 it is easy to check that any line L 2 E with this slope has  (u; v; w; x; ; ) = 0. It is only possible for = 0 in Equation (16) and in Equation (17), when  = . In both of these, to ensure that 6= 0 and 6= 0, it must be that x 6= w and u 6= v : 1 2 1 1 2 2 Using this formula, we obtain the following ! points on the line at in nity: • From Equation (16), when x 6= w and u 6= v : 1 1 2 2 ! = [0 : (w x ) : (u v )] : 1 1 2 2 • From Equation (17), when  =  and x 6= w and u 6= v : 1 1 2 2 ! = [0 : v x : u w ] : 1 1 2 2 Finally, suppose that 6= 0. In this case,  (u; v; w; x; ; ) = 0 if and only if 1 1 b b 1 2 1 2 +  + = 0: m m m m 1 2 1 2 Again, viewing the lines L in P , it is easy to check that this equation is satis ed if and only if the line L 2 E passes through the point ! = [ : : ] : 1 2 When we require that 6= 0, we only need to consider Equations (14), (15) and (17), and Equa- tion (18) when  6= . Using this formula, we obtain the following proper ! points: • From Equation (14): ! = [ : x : (v u ) + w ] : 1 2 2 2 • From Equation (14), altered to consider the case when u; v; and w lie to the right of lines L 2 E, and x to the left: ! = [ : (v u ) + w : x ] : 1 1 1 2 • From Equation (15): ! = [ : x : (u v ) + w ] : 1 2 2 2 • From Equation (15), altered to consider the case when u; v; and w lie to the right of lines L 2 E, and x to the left: ! = [ : (u v ) + w : x ] : 1 1 1 2 • From Equation (17) when  6= : ! = [  : v x : u w ] : 1 1 2 2 • From Equation (18): ! = [ +  : x + v : w + u ] : 1 1 2 2 Remark 4.2. Lemma 4.1 describes how to construct certain ! points based on a choice of a quadruple u; v; w; x 2 C [ C and sets A for p 2 fu; v; w; xg. In fact, the same ! points can be obtained if we M N p choose fu; v; w; xg from the more restricted set C [ C . M N For example, let u 2 (C [ C n (C [ C ). Then u is the least upper bound either of two M N M N 0 00 points in C , or two points in C . Suppose, without loss of generality, that u ; u 2 C such that M N M 0 00 0 u = ((u ) ; (u ) ). If L is a line such that the push direction of u onto L is upward, then p (u) = p (u ). 1 2 L L Similarly, if L is a line such that the push direction of u onto L is rightward, then p (u) = p (u ). Since L L the formula for ! only depends on the direction in which u pushes, the value of ! does not change. 24 2 The proof of next Theorem 4.3 uses Lemma 4.1 to generate a set (M; N )  P of switch for which property (Q') holds. Theorem 4.3. Given two 2-parameter persistence modules M; N satisfying Property (P), there exists a set of switch points (M; N ) ensuring that M and N also satisfy Property (Q), and hence (Q'). Proof. As usual, let C and C denote the set of critical values of M and N , respectively, as guaranteed M N by Property (P), and let C and C be their closure with respect to the least upper bound. M N For each quadruple u; v; w; x 2 C [C (or simply each quadruple in C [C , thanks to Remark 4.2) M N M N and each set A(u; v; w; x) satisfying the conditions of Lemma 4.1, we may obtain a switch point !, either proper or at in nity. Iterating over all such quadruples and possible push directions for each quadruple, we obtain a nite set of points in P , which we denote (M; N ), or brie y We now show that the set satis es property (Q); then, by Proposition 2.10, it satis es property (Q'). 0 0 Let L and L be two lines such that L  L , and let u; v; w; x be any quadruple, with at C [C [ M N least three distinct points, in C [ C . We want to prove that the signs of  (u; v; w; x; ; ) and M N L 0 (u; v; w; x; ; ) agree. First, by Lemma 4.1, if  (u; v; w; x; ; ) = 0, then either  0 (u; v; w; x; ; ) = 0 for every L in the L L same equivalence class. of all lines in the same equivalence class contain a switch point ! 2 . So, by 0 0 the de nition of the equivalence relation, for any L  L, it must also be the case that ! 2 L . C [C [ M N By Lemma 4.1, this implies that  (u; v; w; x; ; ) = 0 as well. Next suppose, without loss of generality, that (u; v; w; x; ; ) > 0 and  0 (u; v; w; x; ; ) < 0: L L This implies that neither L nor L contain the point ! corresponding to the quadruple u; v; w; x and to the set A(u; v; w; x) which generate !. Note that, for xed  and  and within an equivalence class with respect to C [C [ , the quantity M N (u; v; w; x; ; ) is a continuous function with respect to translation or rotation. Moreover, L may be obtained from L by a translation and/or rotation which is always in the same equivalence class with respect to C [ C [ as both L and L . M N Therefore,  (u; v; w; x; ; ) is a continuous function of L which changes sign within an equivalence class with respect to C [ C [ , and by the Intermediate Value Theorem, there must be a line L , M N 0 00 equivalent to both L and L via  such that  00 (u; v; w; x) = 0. This implies that ! 2 L , C [C [ M N 00 0 00 0 and since L  L and L  L, then ! 2 L \ L. C [C [[ C [C [ M N M N This is a contradiction, and therefore, the sign of  (u; v; w; x; ; ) is the same as the sign of (u; v; w; x; ; ) for all lines L  L . This must be true for all quadruples in C [ C , so L M N C [C [ M N property (Q) holds for 5 Conclusion The main contributions of this paper include: • theoretical methods for computing the matching distance between two 2-parameter persistence modules, which can be used to produce algorithms with comparable time complexity to [17]; • the identi cation of the parameter space of a 2-parameter persistence module as a subset of the projective plane, which allows for an understanding of the importance of diagonal lines in the computation of the matching distance; and • new de nitions for the costs (w.r.t. computing the matching distance) of horizontal and vertical lines of a 2-parameter persistence module. To address the rst bullet, one advantage of the theoretical framework developed in this paper is that it allows us to formulate explicit ways of computing the switch-points from the critical values of two 2-parameter persistence modules. These explicit formulas will allow us to provide an algorithm performing the exact computation of the matching distance in two dimensions as the maximum over a nite set of lines, as guaranteed by Theorem 3.1. More speci cally, we expect to formulate algorithms based on the explicit computations of the switch- points provided in Section 4. An algorithm to compute the matching distance based on the framework in this paper can be described by the following main steps: 25 1. Identify the critical values in C and C ; M N 2. compute the switch-points using the results from Section 4, 3. compute all the lines through pairs of points in C [ C [ and lines of diagonal slope through M N at least one point of C [ C [ M N 4. compute the bottleneck distance together with its corresponding weight for each of the lines com- puted above and take the maximum. The above steps can all be easily implemented, yielding a new tool to compute the matching distance for 2-parameter persistence modules, while also giving an intuition on the lines that attain that distance. We expect the method to also be parallelizable, with computations carried out on di erent lines in parallel. The implementation of this method is currently underway. In addition to the explicit computation of the matching distance, we provide a detailed study of the lines through the parameter space and a geometric interpretation corresponding to a line being vertical, horizontal or diagonal. For instance, we show in Example 2.12 that we need to include diagonal lines through at least one point of the set C [ C [ in order to compute the matching distance, as lines M N through pairs of critical points are not sucient. This agrees with the observation in [12] that diagonal lines play a key role in the study of 2-parameter persistence. We are able to give an explicit formula for the computation of the weighted bottleneck distance along horizontal and vertical lines. Moreover, we are able to conclude that these lines are not important in computing the overall matching distance between two 2-parameter persistence modules | there is always a line with positive but nite slope for which the weighted bottleneck distance is greater than that along a horizontal or vertical line. Viewing the parameter space as a subset of the projective plane helps in understanding the importance of vertical, diagonal, and horizontal lines, and makes the treatment of these lines more consistent. In [17, 6], the authors leverage the duality between points and lines in the plane, performing compu- tations in the dual plane. In comparison, our method is directly de ned in the parameter space of the persistence modules, helping the understanding of the nite set of lines involved in the maximization for the computation of the matching distance. However, our approach to the computation of the matching distance remains possible only in two dimensions. This is due to the fact that, in two dimensions, a line is caged by a set of points, in our case the critical and switch points, whereas this is no longer the case in higher dimension. For this reason, the method we propose does not, for now, generalize to more parameters. Acknowledgements A.B. acknowledges the support of ICERM at Brown University, where part of this work was done. R.B. was supported for travel to BIRS by an AWM-NSF Travel Grant. C.H. is supported by NCCR-Synapsy Phase-3 SNSF grant number 51NF40-185897. C.L. carried out this work under the auspices of INdAM-GNSAGA and partially within the activities of ARCES (University of Bologna). B.I.M. is supported by Digital Futures (dBrain). E.S. is supported by European Research Council (ERC) grant no. 788183, `Alpha Shape Theory Ex- tended'. We are grateful to the Ban International Research Station for hosting us as a Focused Research Group. We also thank Martin Hafskjold Thoresen for help with computation. 26 References [1] Asilata Bapat, Robyn Brooks, Celia Hacker, Claudia Landi, and Barbara I. Mahler. Morse-Based Fibering of the Persistence Rank Invariant, volume 30 of Association for Women in Mathematics Series, pages 27{62. Springer International Publishing, 2022. [2] Paul Bendich, J. S. Marron, Ezra Miller, Alex Pieloch, and Sean Skwerer. Persistent homology analysis of brain artery trees. Ann. Appl. Stat., 10(1):198{218, 2016. [3] Subhrajit Bhattacharya, Robert Ghrist, and Vijay Kumar. Persistent homology for path planning in uncertain environments. IEEE Transactions on Robotics, 31(3):578{590, 2015. [4] S. Biasotti, A. Cerri, P. Frosini, D. Giorgi, and C. Landi. Multidimensional size functions for shape comparison. Journal of Mathematical Imaging and Vision, 32(2):161, 2008. [5] Silvia Biasotti, Andrea Cerri, Patrizio Frosini, and Daniela Giorgi. Approximating the 2-dimensional matching distance, 01 2010. [6] H avard Bakke Bjerkevik and Michael Kerber. Asymptotic improvements on the exact matching distance for 2-parameter persistence, 2021. [7] Andrew J. Blumberg and Michael Lesnick. Stability of 2-parameter persistent homology. 2020. [8] Gunnar Carlsson, Gurjeet Singh, and Afra Zomorodian. Computing multidimensional persistence. In Algorithms and computation, volume 5878 of Lect. Notes Comput. Sci., pages 730{739. Springer, Berlin, 2009. [9] Gunnar Carlsson and Afra Zomorodian. The theory of multidimensional persistence. Discrete Comput. Geom., 42(1):71{93, 2009. [10] Mathieu Carriere and Andrew J Blumberg. Multiparameter Persistence Images for Topological Machine Learning. In NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Vancouver / Virtuel, Canada, December 2020. [11] Andrea Cerri, Barbara Di Fabio, Massimo Ferri, Patrizio Frosini, and Claudia Landi. Betti num- bers in multidimensional persistent homology are stable functions. Math. Methods Appl. Sci., 36(12):1543{1557, 2013. [12] Andrea Cerri, Marc Ethier, and Patrizio Frosini. On the geometrical properties of the coherent matching distance in 2d persistent homology. Journal of Applied and Computational Topology, 3(4):381{422, 2019. [13] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of persistence diagrams. Discrete Comput. Geom., 37(1):103{120, 2007. [14] Patrizio Frosini and Claudia Landi. Persistent betti numbers for a noise tolerant shape-based approach to image retrieval. Pattern Recognition Letters, 34(8):863{872, 2013. Computer Analysis of Images and Patterns. [15] Sheridan B. Green, Abby Mintz, Xin Xu, and Jessi Cisewski-Kehe. Topology of our cosmology with persistent homology. CHANCE, 32(3):6{13, 2019. [16] Bryn Keller, Michael Lesnick, and Ted Willke. Persistent Homology for Virtual Screening, 2018. [17] Michael Kerber, Michael Lesnick, and Steve Oudot. Exact computation of the matching distance on 2-parameter persistence modules. In SoCG 2019, volume 129 of LIPIcs, pages 46:1{46:15, 2019. [18] Michael Kerber, Dmitriy Morozov, and Arnur Nigmetov. Geometry helps to compare persistence diagrams. ACM J. Exp. Algorithmics, 22, sep 2017. [19] Michael Kerber and Arnur Nigmetov. Ecient approximation of the matching distance for 2- parameter persistence, 2019. [20] C. Landi. The Rank Invariant Stability via Interleavings, volume 13 of Association for Women in Mathematics Series, pages 1{10. Springer, 2018. 27 [21] Yongjin Lee, Senja D. Barthel, Pawe c D cotko, S. Mohamad Moosavi, Kathryn Hess, and Berend Smit. Quantifying similarity of pore-geometry in nanoporous materials. Nature Communications, 8(1), 2017. [22] Ezra Miller. Data structures for real multiparameter persistence modules. arXiv: 1709.08155, 2017. [23] Hans Riess, Jakob Hansen, and Robert Ghrist. Multidimensional persistence module classi cation via lattice-theoretic convolutions. 2020. [24] Michael Sinhuber and Nicholas T. Ouellette. Phase coexistence in insect swarms. Phys. Rev. Lett., 119:178003, 2017. [25] Oliver Vipond, Joshua A. Bull, Philip S. Macklin, Ulrike Tillmann, Christopher W. Pugh, He- len M. Byrne, and Heather A. Harrington. Multiparameter persistent homology landscapes iden- tify immune cell spatial patterns in tumors. Proceedings of the National Academy of Sciences, 118(41):e2102166118, 2021. [26] Afra Zomorodian and Gunnar Carlsson. Computing persistent homology. Discrete Comput. Geom., 33(2):249{274, 2004.

Journal

MathematicsarXiv (Cornell University)

Published: Oct 23, 2022

There are no references for this article.