Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

A directed persistent homology theory for dissimilarity functions

A directed persistent homology theory for dissimilarity functions We develop a theory of persistent homology for directed simplicial complexes which detects persistent directed cycles in odd dimensions. We relate directed persistent homology to classical persistent homology, prove some stability results, and discuss the computational challenges of our approach. Our directed persistent homology theory is motivated by homology with semiring coefficients: by explicitly removing additive inverses, we are able to detect directed cycles algebraically. Keywords Persistent homology · Dissimilarity functions · Directed simplicial complexes · Directed cycles Mathematics Subject Classification 55N31 · 55N35 · 55U10 · 16Y60 1 Introduction Persistent homology is one of the most successful tools in Topological Data Analysis (Carlsson 2009), with recent applications in numerous scientific domains such as biol- ogy, medicine, neuroscience, robotics, and many others (Otter et al. 2017). In its most common implementation, persistent homology is used to infer topological properties of the metric space underlying a finite point cloud using two steps (Edelsbrunner and Harer 2010): B David Méndez david.mendez@research.fchampalimaud.org Rubén J. Sánchez-García R.Sanchez-Garcia@soton.ac.uk School of Mathematical Sciences, University of Southampton, SO17 1BJ Southampton, UK Champalimaud Centre for the Unknown, Av. Brasília s/n, 1400-038 Lisbon, Portugal The Alan Turing Institute, London NW1 2DB, UK 123 D. Méndez,R.J.Sánchez-García (1) Build a filtration of simplicial complexes from distances, or similarities, between data points. (2) Compute the singular homology of each of the simplicial complexes in the fil- tration, along with the linear maps induced in homology by their inclusions. The resulting persistence module can normally be represented using a persistence dia- gram or a persistence barcode. A fundamental limitation of persistent homology, and in fact homology, is its inability to incorporate directionality, which can be important in some real-world applications (see examples below). For instance, homology cannot in principle dis- tinguish between directed and undirected cycles (Fig. 1). Although the definition of homology, namely the differential or boundary operator, requires a choice of orienta- tion for the simplices, the resulting homology is independent of this choice. Previous attempts have had some partial success at this issue (see ‘Related work’ below), how- ever, as far as we know, there is no homology theory able to exactly detect directed cycles up to boundary equivalence. The main difficulty occurs at the algebraic level: opposite orientations on a simplex correspond to additive inverse elements in the coefficient ring or field. Our key insight is to retain the submodule of homology generated by those classes that only have non-negative coefficients on elementary chains. In this way, we can define a directed homology module (Definition 3.6) of directed simplicial complexes (Definition 3.1) that detects homology classes of directed 1-cycles, as desired (Fig. 1, Proposition 3.24). This insight arises from the use of homology with semiring coefficients, where additive inverses may not exist. Indeed, the main results in this paper can be understood and reformulated in full generality using homology theory with semiring coefficients, see Appendix 1. Our directed homology theory can be extended to the persistent setting. More concretely, we define two persistence modules associated to filtrations of directed simplicial complexes: an undirected one using the whole homology module (Defini- tion 4.2) and a directed one using the submodule of directed homology (Definition 4.3). We show that the bars in the directed barcodes can be matched one-to-one with bars in the undirected barcodes, with the directed bars possibly having a later birth (Propo- sition 4.7,Figs. 2, 5, 6). Our motivation to develop directed persistent homology is the applications to real-world data sets where asymmetry or directionality, in particular the presence of directed cycles, plays a fundamental role. Examples include biological neural networks (directed synaptic connections), time series data (directed temporal connections) or biological molecular networks such as protein-protein interaction networks (inhibitory and excitatory connections). Mathematically, we can model these situations using a finite set V of data points, and an arbitrary function d : V × V → R, which we call dissimilarity function (Sect. 2.2) and which, crucially, may not be symmetric. Our main example of a directed simplicial complex is the directed Rips complex (Defini- tion 4.12)of (V , d ), which becomes the input of our suggested directed persistent homology pipeline. In the context of dissimilarity functions, we prove a stability result with respect to the bottleneck distance between (directed) persistent diagrams and the correspondence 123 A directed persistent homology theory for dissimilarity… v w 3 3 v v w w 1 2 1 2 Fig. 1 Two examples of directed simplicial complexes X (left) and Y (right). Our directed submodules of Dir Dir homology over the rings R ∈{Z, Q, R} detect directed 1-cycles: H (X , R) ={0}, while H (Y , R) = R 1 1 generated by the directed 1-cycle [w ,w ]+[w ,w ]+[w ,w ] 1 2 2 3 3 1 distortion distance (a generalization of the Gromov-Hausdorff distance) between dis- similarity functions over finite sets (Theorem 4.19). We finish this article by discussing the computational challenges of our approach. Related work A first approach to incorporating directionality would be to encode the asymmetry of a data set in the simplicial complex built from it. In Turner (2019), the author uses ordered tuple complexes or OT-complexes, which are generalizations of simplicial complexes where simplices are ordered tuples of vertices. Similar ideas have been successfully used in the field of neuroscience to show the importance of directed cliques of neurons (Reimann et al. 2017), showing that directionality in neuron con- nectivity plays a crucial role in the structure and function of the brain. In Chowdhury and Mémoli (2018a), the authors use so-called Dowker filtrations to develop persis- tent homology for asymmetric networks, and note (Chowdhury and Mémoli 2018a, Remark 35) that a non-trivial 1-dimensional persistence diagram associated to Dowker filtrations suggests the presence of directed cycles. However, they also note that such persistence diagram may be non-trivial even if directed cycles are not present. Fur- ther progress can be achieved by combining the ideas above with homology theories that are themselves sensitive to asymmetry. In Chowdhury and Mémoli (2018b), the authors develop persistent homology for directed networks using the theory of path homology of graphs (Grigor’yan et al. 2020). This persistent homology theory shows stability with respect to the distance between asymmetric networks (Carlsson et al. 2014), and is indeed able to tell apart digraphs with isomorphic underlying graphs but different orientations on the edges. Nonetheless, cycles in path homology do not corre- spond to directed cycles, see Chowdhury and Mémoli (2018b, Example 10). Another significant contribution can be found in Turner (2019), where four new approaches to persistent homology are developed for asymmetric data, all of which are shown to be stable. Each approach is sensitive to asymmetry in a different way. For example, one of them uses a generalization of poset homology to preorders to detect strongly connected components in digraphs, a feature our implementation does not have (see Proposition 3.18). More interestingly for our purposes, in Turner (2019, Section 5)the author introduces a persistent homology approach that builds a directed Rips filtration of ordered tuple complexes associated to dissimilarity functions d : V × V → R. This is indeed sensitive to asymmetry and yields a persistent homology pipeline that 123 D. Méndez,R.J.Sánchez-García v v v v 3 3 3 3 v v v v v v v v 4 2 4 2 4 2 4 2 v v v v v v v v 5 1 5 1 5 1 5 1 X X X X 0 1 2 3 0123 0123 Fig. 2 A filtration of a directed simplicial complex (top, left to right) and corresponding undirected (bottom, left) and directed (bottom, right) persistence barcodes. Every bar in the directed barcode corresponds to a unique bar in the undirected barcode (shown here by matching colours) (colour figure online) is stable with respect to the correspondence distortion distance, a generalization of the Gromov-Hausdorff distance to dissimilarity functions (see Turner 2019, Section 2 and Sect. 2.2). However, undirected cycles are also detected by this homology theory. In contrast, in this article we produce a theory of persistent homology associated to ordered tuple complexes that only detects directed cycles and which is also stable with respect to the correspondence distortion distance. Overview of results We introduce modules of undirected and directed homology associated to directed simplicial complexes (Definition 3.4) and show that they satisfy some desirable prop- erties. Furthermore, we prove that for certain coefficient rings, closed paths will only have non-trivial directed homology in dimension 1 when their edges form a directed cycle, as we set up to do (Proposition 3.24). Our modules of directed homology can be defined in dimensions greater than one, although they are zero in all even dimen- sions (Proposition 3.26) except in dimension 0 where they detect weakly connected components (Proposition 3.18). Next, we extend our homology theory to the persistence setting. In particular, we introduce, for each dimension, two persistence modules associated to the same fil- tration: an undirected persistence module (Definition 4.2), which is analogous to the one introduced in Turner (2019, Section 5), and the submodule generated by directed classes, which we call the directed persistence module (Definition 4.3). The directed persistence module gives rise to bars associated to homology classes that can be rep- resented by directed cycles, as illustrated in Fig. 2 and Examples 4.8 to 4.11.We also establish a relation between the undirected and directed persistence barcodes of the same filtration. Indeed, in Proposition 4.7 we show that every bar in the directed barcode corresponds to a unique bar in the undirected barcode that dies at the same time. The bar in the directed barcode may, however, be born later, and some bars in the undirected barcode may be left unmatched (see Figs. 2, 5 and 6). Having all the necessary ingredients, we can provide a complete workflow for directed persistent homology: given a dissimilarity function d on a finite set V , we construct its directed Rips filtration (Definition 4.12), and take the correspond- 123 A directed persistent homology theory for dissimilarity… ing undirected and directed n-dimensional persistence modules, respectively denoted Dir H (V , d ) and H (V , d ) (Definition 4.13). Both of these persistence modules have n V V persistence diagrams and barcodes (Definition 4.15) and the associated persistence modules are stable with respect to the correspondence distortion distance (Sect. 2.2 and Theorem 4.19). We finish the article with a discussion of the computational challenges of calculating the directed persistent homology of a dissimilarity function. The Standard Algorithm can easily be adapted to compute the barcodes of the undirected persistent homology of the dissimilarity function, but the computation of the directed persistence barcodes is more challenging. A direct approach results in an extreme ray enumeration problem (Sect. 5), a well-known problem in computational geometry. Developing an algorithm that computes the directed homology classes, rather than all directed cycles, remains an open problem. We include an Appendix on homology with semiring coefficients and on how it gives rise to the ideas introduced in this article. This provides additional motivation and context for our results, and presents them in full generality. Outline of the article In Sect. 2, we introduce the necessary background in persistence modules (2.1) and dissimilarity functions (2.2). In Sect. 3 we introduce undirected and directed homol- ogy modules of directed simplicial complexes. We then use these homology modules to introduce directed and undirected persistent homology in Sect. 4, where we also show our stability results with respect to dissimilarity functions. In Sect. 5, we discuss the computational implementation of directed persistent homology and lay out some future research directions. Finally, Appendix 1 introduces homology with semiring coefficients and shows how it relates to directed homology. 2 Persistence modules and dissimilarity functions Our goal is to extend persistent homology to directed simplicial complexes such as those constructed from a dissimilarity function. In this section, we introduce the neces- sary background regarding persistent homology (Sect. 2.1) and dissimilarity functions (Sect. 2.2). 2.1 Persistence modules, diagrams and barcodes In this section, we introduce persistence modules and their associated persistence diagrams and barcodes. We follow the exposition in Chowdhury and Mémoli (2018b), as it is simple and focused on persistence modules arising in the context we are interested in: that of persistent homology of finite simplicial complexes. Many of the results below hold in more generality (Chazal et al. 2016) but we chose to keep the exposition simple. Let R be a ring with unity and let T ⊆ R be a subset. 123 D. Méndez,R.J.Sánchez-García δ δ Definition 2.1 A persistence R-module over T , written V = {V }, {ν } ,is δ≤δ ∈T δ δ δ δ a family of R-modules {V } and homomorphisms ν : V → V , whenever δ∈T δ ≤ δ ∈ T , such that (1) for every δ ∈ T , ν is the identity map, and δ δ δ (2) for every δ ≤ δ ≤ δ ∈ T , ν ◦ ν = ν . δ δ δ δ δ δ δ Let V = {V }, {ν } and W = {W }, {μ } be two persistence δ δ δ≤δ ∈T δ≤δ ∈T R-modules over T.A morphism of persistence R-modules f : V → W is a family δ δ δ of morphisms of R-modules { f : V → W } such that for every δ ≤ δ ∈ T we δ∈T have a commutative diagram δ δ V V f f W W . δ δ Let us assume that R is a field, thus V = {V }, {ν } is a persistence vector δ δ≤δ ∈T space, and that V is finite-dimensional, for every δ ∈ R. Furthermore, we suppose that there exists a finite subset {δ ,δ ,...,δ }⊆ T for which 0 1 n (1) if δ ∈ T , δ ≤ δ , then V = 0, (2) if δ ∈ T ∩[δ ,δ ) for some 1 ≤ i ≤ n,the map ν is an isomorphism, while i −1 i i −1 ν is not, and δ δ δ (3) if δ, δ ∈ T , δ ≤ δ< δ , then v : V → V is an isomorphism. Remark 2.2 When R is a field, either of these restrictions alone (namely, each V being finite-dimensional or the existence of a finite subset {δ ,δ ,...,δ } as above) 0 1 n is enough to associate a persistence barcode and a persistence diagram to V (Chazal et al. 2016, Theorem 2.8). We nevertheless assume both restrictions since they hold for persistence diagrams associated to finite simplicial complexes, which are precisely the sort of objects that arise from data, and such assumption makes the exposition simpler. Let us now introduce persistence barcodes and diagrams. First, for simplicity, we consider V indexed by the natural numbers: δ i +1 V = {V }, ν , i i ∈N δ δ l k n where V = V for all k ≥ n and ν is the identity whenever k, l ≥ n. (This clearly contains all the information in V.) By Edelsbrunner et al. (2015, Basis Lemma), we can find bases B of the vector spaces V , i ∈ N, such that i +1 (1) ν (B ) ⊆ B ∪{0}, i i +1 123 A directed persistent homology theory for dissimilarity… δ δ i +1 i +1 (2) rank(ν ) =|v (B ) ∩ B |, and i i +1 δ δ i i i +1 (3) each w ∈ Im v ∩ B is the image of exactly one element v ∈ B . i +1 i Such bases are called compatible bases. Elements in B that are mapped to an element in B correspond to linearly independent elements of V that ‘survive’ until the i +1 next step in the persistence vector space. Similarly, elements in a basis B which are not in B are considered to be ‘born’ at index i. Formally, we define i −1 L := (b, i ) | b ∈ B , b ∈ / Im ν , i > 0 ∪{(b, 0) | b ∈ B }. i 0 i −1 Given (b, i ) ∈ L, we call i the birth index of the basis element b. This element ‘survives’ on subsequent bases until it is eventually mapped to zero. When this happens, the number of steps taken until the class dies is its death index of b. Formally, the death index of (b, i ) is j δ δ i +2 i +1 (b, i ) := max j ∈ N | ν ◦ ··· ◦ ν ◦ ν (b) ∈ B . δ δ δ j −1 i +1 i We allow (b, i ) =∞,if b is in every basis B for all j ≥ i, so that the death index takes values in N = N ∪{+∞}. Using this information, we can introduce the persistence barcode of V. We call a pair (X , m) a multiset if X is a set and m : X → N = N∪{+∞} is a function. We call m(x ) the multiplicity of x ∈ X. δ i +1 Definition 2.3 Consider a persistence vector space V = {V }, ν as above. i i ∈N The persistence barcode of V is defined as the multiset of intervals Pers(V) := [δ ,δ ) |∃(b, i ) ∈ L, j ∈ Ns.t.(b, i ) = j i j +1 ∪ [δ , +∞) |∃(b, i ) ∈ Ls.t.(b, i ) =+∞ , with the multiplicity of [δ ,δ ) (respectively [δ , +∞)) being the number of ele- i j +1 i ments (b, i ) ∈ L such that (b, i ) = j (respectively (b, i ) =+∞). Thus, persistence barcodes encode the birth and death of elements in a family of compatible bases. Crucially, and even though compatible bases are not unique, the number of birth and death events at each step is determined by the rank of the i +1 linear maps ν and their compositions, and it is thus independent of the choice of compatible bases. Furthermore, the persistence barcode of a persistence vector space completely determines both the dimension of the vector spaces V and the ranks of the linear maps between them, hence it determines the persistence vector space up to isomorphism (Chazal et al. 2016, Theorem 2.8). Each interval in Pers(V) is called a persistence interval. Persistence barcodes can be represented by stacking horizontal lines, each of which represents a persistence interval or bar. The endpoints of the line in the horizontal axis correspond to the endpoints of the interval it represents, whereas the vertical axis has no significance other than being able to represent every persistence interval at once. Persistence bars can be stacked in any order, although they are usually ordered by their birth. This representation is what gives persistence barcodes their name. 123 D. Méndez,R.J.Sánchez-García An alternative characterization of a persistence vector space is its persistence dia- gram. Let us write R = R ∪{−∞, +∞} for the extended real line. Definition 2.4 The persistence diagram of the persistence vector space V is the mul- tiset 2 2 Dgm(V) := (δ ,δ ) ∈ R |[δ ,δ ) ∈ Pers(V) ∪ (δ , +∞) ∈ R |[δ , +∞) ∈ Pers(V) . i j +1 i j +1 i i The multiplicity of a point in Dgm(V) is the multiplicity of the corresponding interval in Pers(V). A crucial property for applications of persistent homology to real data is that a small perturbation of the input (a data set, encoded as filtration of simplicial complexes) results in a small perturbation of the output (its persistence module). Algebraically, this amounts to proving that a small perturbation of the persistence diagram results in a small perturbation of the associated persistence module. In order to state this stability result, we therefore need to introduce distances between persistence diagrams, and persistence modules, respectively. To measure how far apart two persistence diagrams are, we can use the bottleneck 2 2 distance between multisets of R .Let  denote the multiset of R consisting of every point in the diagonal counted with infinite multiplicity. A bijection of multisets m (x ) m (y) X Y ϕ : (X , m ) → (Y , m ) is a bijection of sets ϕ:∪ x →∪ y, X Y x ∈X y∈Y i =1 i =1 that is, a bijection between the sets obtained when counting each element in both X and Y with its multiplicity. Definition 2.5 Let A and B be two multisets in R .The bottleneck distance between A and B is defined as d (A, B) := inf sup a − ϕ(a) , B ∞ a∈A ∞ ∞ where the infimum is taken over all bijections of multisets ϕ : A ∪  → B ∪ and where · denotes the  -norm in R . Note that the bottleneck distance between two persistence diagrams defined from persistence barcodes can only be finite if the diagrams have the same amount of bars of form [a, ∞) for a ∈ R. We also need a distance between persistence vector spaces. To that purpose, we use the interleaving distance, introduced in Chazal et al. (2009). δ δ δ δ Definition 2.6 Let V = {V }, {ν } and W = {W }, {μ } be two δ δ δ≤δ ∈R δ≤δ ∈R persistence vector spaces, and ε ≥ 0. We say that V and W are ε-interleaved if there exist two families of linear maps δ δ+ε δ δ+ε {ϕ : V → W } , {ψ : W → V } δ δ∈R δ δ∈R such that the following diagrams are commutative for all δ ≥ δ ∈ R: 123 A directed persistent homology theory for dissimilarity… δ δ ν μ δ δ δ δ δ δ V V W W ϕ ϕ ψ ψ δ δ δ δ δ +ε δ +ε μ ν δ+ε δ+ε δ+ε δ +ε δ+ε δ +ε W , V , W V δ+2ε δ+2ε ν μ δ δ δ δ+2ε δ δ+2ε V V W V ψ ψ ϕ δ δ+ε δ δ+ε δ+ε δ+ε W , W . The interleaving distance between V and W is then defined as d (V, W) = inf{ε ≥ 0 | V and W are ε − interleaved}. The authors in Chazal et al. (2009) show that the interleaving distance is a pseudo- metric (a zero distance between distinct points may occur) in the class of persistence vector spaces. Moreover, they show the following Algebraic Stability Theorem. δ δ δ Theorem 2.7 (Chazal et al. 2009) Let V= {V }, {ν } and W= {W }, δ≤δ ∈R {μ } be two persistence vector spaces. Then, δ δ≤δ ∈R d Dgm(V), Dgm(W) ≤ d (V, W). B I 2.2 Dissimilarity functions and the correspondence distortion distance Our objective is to define a theory of persistent homology able to detect directed cycles modulo boundaries. Therefore, instead of building filtrations (of simplicial complexes) from finite metric spaces, we are interested in filtrations (of directed simplicial com- plexes) built from arbitrary dissimilarity functions. We will follow (Turner 2019), where they receive the name of set-function pairs, and Carlsson et al. (2014), Chowd- hury and Mémoli (2018a, b), where they are referred to as dissimilarity networks or asymmetric networks. In this section, we introduce the necessary background on dissimilarity functions, and the correspondence distortion distance between them, which is a generalization of the Gromov-Hausdorff distance to the asymmetric setting. We finish with a refor- mulation of this distance, established in Carlsson et al. (2014), which we will need to prove our persistent homology stability result. Definition 2.8 Let V be a finite set. A dissimilarity function (V , d ) on V is a function d : V × V → R.The valueof d on a pair (v ,v ) is called the distance,or V V 1 2 dissimilarity, from v to v . 1 2 123 D. Méndez,R.J.Sánchez-García Note that no restrictions are imposed on d , thus it may not be symmetric, the triangle inequality may not hold, and the distance from a point to itself may not be zero. These functions are also referred to as asymmetric networks (Carlsson et al. 2014; Chowdhury and Mémoli 2018a, b) since they may be represented as a network with vertex set V and an edge from v to v with weight d (v ,v ) for every (v ,v ) ∈ 1 2 V 1 2 1 2 V × V . Thus, dissimilarity functions are very flexible and allow for the modelling of widely different problems, including any problem that can be modelled using a network. We would like to build (directed) simplicial complexes and, ultimately, persistence diagrams from dissimilarity functions. In order to check the stability of our construc- tions, we need a way to measure how close two such objects are. When comparing networks with the same vertex sets, a natural choice is the  norm. However, we are interested in comparing dissimilarity functions on different vertex sets. To that end, we consider the  norm over all possible pairings (quantified by a binary rela- tion) between vertex sets, following ideas similar to those behind the definition of the Gromov-Hausdorff distance. Definition 2.9 Let (V , d ) and (W , d ) be two dissimilarity functions and let θ be a V W non-empty binary relation between V and W , that is, an arbitrary subset θ ⊆ V × W . The distortion of the relation θ is defined as dis(θ ) := max |d (v ,v ) − d (w ,w )|. V 1 2 W 1 2 (v ,w ),(v ,w )∈θ 1 1 2 2 A correspondence between V and W is a relation θ between these sets such that π (θ ) = V and π (θ ) = W , where π : V × W → V is the projection onto V , and V W V similarly for π . That is, θ is a correspondence if every element of V is related to at least an element of W , and vice-versa. The set of all correspondences between V and W is denoted R(V , W ). The correspondence distortion distance (Turner 2019) between (V , d ) and (W , d ) is defined as d (V , d ), (W , d ) = min dis(θ ). CD V W θ ∈R(V ,W ) This notion agrees with that of the Gromov–Hausdorff distance when (V , d ) and (W , d ) are metric spaces (Burago et al. 2001,Section 7.3.3). We will use a reformulation of the correspondence distortion that can be found in Chowdhury and Mémoli (2018a). In order to introduce it, we need to define the dis- tortion and co-distortion of maps between sets endowed with dissimilarity functions. Definition 2.10 Let (V , d ) and (W , d ) be any two dissimilarity functions and let V W ϕ : V → W and ψ : W → V be maps of sets. The distortion of ϕ (with respect to d and d ) is defined as dis(ϕ) := max d (v ,v ) − d ϕ(v ), ϕ(v ) . V 1 2 W 1 2 v ,v ∈V 1 2 123 A directed persistent homology theory for dissimilarity… The co-distortion of ϕ and ψ (with respect to d and d ) is defined as V W codis(ϕ, ψ ) := max d v, ψ(w) − d ϕ(v), w . V W (v,w)∈V ×W Note that codistortion is not necessarily symmetrical, namely, codis(ϕ, ψ ) and codis(ψ, ϕ) may be different if either of the dissimilarity functions are asymmetric. Finally, we have the following reformulation of the correspondence distortion dis- tance. Proposition 2.11 (Chowdhury and Mémoli 2018a, Proposition 9) Let (V , d ) and (W , d ) be any two dissimilarity functions. Then, d (V , d ), (W , d ) = min max {dis(ϕ), dis(ψ ), codis(ϕ, ψ ), codis(ψ, ϕ)} . CD V W 2 ϕ : V →W , ψ : W →V 3 Directed homology of directed simplicial complexes In this section, we introduce and study a family of subrings of the homology rings of simplicial complexes which encode directionality information and in particular detect directed cycles (Fig. 1). We cannot introduce such a family using (undirected) sim- plicial complexes, as they cannot encode directionality information of the simplices. One approach is to use the so-called ordered-set complexes, where simplices are sets with a total order on the vertices. They generalize simplicial complexes, which can be encoded as fully symmetric ordered-set complexes, that is, ordered-set complexes where if a set of vertices forms a simplex, it must do so with every possible order. However, persistent homology of ordered-set complexes is not stable (Remark 4.20). To achieve stability, we use one further generalization, called ordered tuple com- plexes or OT-complexes in Turner (2019), and called directed simplicial complexes in this article (Definition 3.1). The only difference is that arbitrary repetitions of vertices are allowed in the ordered tuples representing simplices. Clearly, any ordered- set complex is a directed simplicial complex. Furthermore, a (undirected) simplicial complex can be encoded as a fully symmetric (as above) directed simplicial com- plex where every possible repetition of vertices is also included. When doing so, the (undirected) simplicial complex and its associated directed simplicial complex have isomorphic homologies over rings (Remark 3.5), and morphisms of simplicial complexes can be lifted to morphisms between their associated directed simplicial complexes (Remark 3.9). Finally, and crucially for us, we will be able to use the above- mentioned subrings encoding information on directed cycles to introduce a theory of persistent homology for dissimilarity functions which is stable with respect to the correspondence-distortion distance (Sect. 4). Note that, as ordered-set complexes are directed simplicial complexes, the results stated here apply to homology computations on ordered-set complexes as well. 123 D. Méndez,R.J.Sánchez-García 3.1 Chain complexes of directed simplicial complexes In this section, we introduce directed simplicial complexes and their associated chain complexes and homology modules, including modules of directed homology. Definition 3.1 A directed simplicial complex,or ordered tuple complex, is a pair (V , X ) where V is a finite set of vertices, and X is a family of tuples (x , x ,..., x ) 0 1 n of elements of V such that if (x , x ,..., x ) ∈ X, then (x , x ,..., x ,..., x ) ∈ X 0 1 n 0 1 i n for every i = 0, 1,..., n. (Here and throughout the rest of the paper, x indicates that x is omitted from the list.) We will denote the directed simplicial complex (V , X ) just by X, and assume that every vertex belongs to at least one directed simplex (so that V is uniquely determined from X). Elements of X of length n + 1 are called n-simplices, and the subset of the n-simplices of X is denoted by X . Note that arbitrary repetitions of vertices in a tuple are allowed. Note that directed simplicial complexes are more general objects than simplicial complexes, in a way that allows directed simplicial complexes to encode asymmetric information. Note further that it is possible to define a geometric realization for these objects in which simplices with vertex repetitions are collapsed to a simplex with no repetitions, thus simplices with repeated vertices play a role similar to that of the degenerate simplices in simplicial sets. We make use of the following terminology when dealing with directed simplicial complexes. Elements of X of length n + 1 are called n-simplices, and the subset of the n-simplices of X is denoted by X .An n-simplex obtained by removing vertices from an m-simplex, n ≤ m,issaidtobea face of the m-simplex. Such face is called proper if n < m, or equivalently, if at least vertex of the m-simplex is removed. A directed simplicial complex is said to be n-dimensional, denoted dim(X ) = n,if X is trivial (empty) but X is not. A collection Y of simplices of X that is itself a n+1 n directed simplicial complex is said to be a directed simplicial subcomplex (or, simply, subcomplex)of X, denoted Y ⊆ X. Note that the vertex set of Y may be strictly smaller than that of X. We now introduce chain complexes associated to a directed simplicial complex. These definitions are straight generalizations of those for (undirected) simplicial com- plexes. Definition 3.2 Let R be a commutative ring. The n-dimensional chains of X are defined as the elements of the free R-module generated by the n-simplices, C (X , R) = R {[x , x ,..., x ]| (x , x ,..., x ) ∈ X } . n 0 1 n 0 1 n We call the elements [x , x ,..., x ]∈ C (X , R), (x , x ,..., x ) ∈ X, elementary 0 1 n n 0 1 n n-chains. 123 A directed persistent homology theory for dissimilarity… Definition 3.3 Let X be a directed simplicial complex. For each n > 0, we define a morphism of R-modules ∂ : C (X , R) → C (X , R) by n n n−1 ∂ ([x , x ,..., x ]) = (−1) [x , x ,..., x ,..., x ] n 0 1 n 0 1 i n i =0 For n = 0, let ∂ : C (X , R) →{0} be the trivial map. 0 0 A straightforward computation analogous to the one for chain complexes in sim- plicial homology shows that {C (X , R), ∂ } is a chain complex of R-modules. In n n particular, we can define homology groups. Definition 3.4 Let X be a directed simplicial complex, R a commutative ring, and n ≥ 0. The n-dimensional cycles, boundaries, and homology of X are Z (X , R) = {x ∈ C (X , R) | ∂ (x ) = 0}, B (X , R) = Im ∂ and the quotient H (X , R) = n n n n+1 n Z (X , R)/B (X , R), respectively. n n Remark 3.5 If X is an (undirected) simplicial complex, we can define an ordered-set OT OT simplicial complex X where (x , x ,..., x ) ∈ X whenever {x , x ,..., x }, 0 1 n 0 1 n after removing any repetition of vertices, is a simplex in X. The chain complex OT C (X , R) is called the ordered chain complex of X in Munkres (1984), and its homology is isomorphic to the singular homology of X over R. We now discuss how to only retain the information associated to directed cycles. Note that when introducing the differential for the chain complexes of simplicial complexes, an orientation is chosen for each of the simplices. However, undirected cycles can still be produced because the chosen orientation for a simplex can be reverted by changing the sign of the corresponding elementary chain. As such, homology of (undirected) simplicial complexes is independent of these choices of orientations. In the case of directed simplicial complexes, orientations are not chosen, and we can ensure that we are only retaining information regarding directed cycles by making sure that all the chains have positive coefficients, when the coefficient ring is chosen appropriately. However, we also require the introduced object to be a submodule of the entire module of homology. Thus, we arrive at the following definition. Definition 3.6 Let X be a directed simplicial complex and let R ∈{Z, Q, R}.The set of n-dimensional directed cycles of X over R is Dir Z (X , R) = λ c ∈ Z (X , R) | λ ≥ 0, c elementary n-chain . i i n i i Dir The nth dimensional module of directed homology, H (X , R), is the submodule of H (X , R) generated by the directed cycles, Dir Dir H (X , R) = [x]| x ∈ Z (X , R) . n n 123 D. Méndez,R.J.Sánchez-García Remark 3.7 The rings Z, Q and R are used to simplify the exposition: the definition Dir Dir of Z (X , R), and thus of H (X , R), can be made in an analogous way for any ring n n R that is the Grothendieck completion of a cancellative, zerosumfree semiring (see Appendix 1). In view of Definition 3.6 it is clear that, in dimension 1, directed circuits in the directed graph underlying a directed simplicial complex will give rise to directed cycles. In Sect. 3.3 we will show how these directed submodules of homology behave in a desirable way. 3.2 Functoriality of directed homology In this section, we prove that homology and directed homology are functors from the category of directed simplicial complexes to the category of graded R-modules. We also show that two morphisms allowing for the construction of the prism operator must induce the same map on homology, a result we will need to prove our persistent homology stability result. We begin by introducing morphisms of directed simplicial complexes. Definition 3.8 A morphism of directed simplicial complexes f : (V , X ) → (W , Y ) or just f : X → Y,isamapofsets f : V → W such that if (x , x ,..., x ) ∈ X then 0 1 n f (x ), f (x ),..., f (x ) ∈ Y . 0 1 n Remark 3.9 This definition is stricter than the classical notion of morphism of simpli- cial complexes, where morphisms are allowed to take simplices to simplices of lower dimension, as directed simplicial complexes can intrinsically account for vertex rep- etitions. However, note that if X and Y are (undirected) simplicial complexes, a map OT OT f : X → Y is a morphism of simplicial complexes if and only if f : X → Y (see Remark 3.5) is a morphism of directed simplicial complexes. Definition 3.10 Let f : X → Y be a morphism of directed simplicial complexes. Then f induces morphisms of R-modules C ( f ) ={C ( f )}, defined as C ( f ) : C (X , R) −→ C (Y , R) n n n [x , x ,..., x ] −→ [ f (x ), f (x ),..., f (x )]. 0 1 n 0 1 n We will often abbreviate C ( f ) = f . n n Proposition 3.11 If f : X → Y is a morphism of directed simplicial complexes, the family of maps {C ( f )} is a R-homomorphism of chain complexes. Therefore, it induces a map H ( f ) : H (X , R) → H (Y , R). Furthermore, H ( f ) restricts to a n n n n Dir Dir Dir map H ( f ) : H (X , R) → H (Y , R). n n n Proof Let [x , x ,..., x ]∈ C (X , R) be a simplex. Simple computations show that 0 1 n n C ( f )∂ = ∂ C ( f ), thus inducing a map H ( f ) : H (X , R) → H (Y , R). n−1 n n n n n n Dir Now let x ∈ Z (X , R). We can write x = λ x , where λ ≥ 0 and i i i n i x is an elementary n-chain. Then, C ( f )(x ) = λ C ( f )(x ). Clearly, C ( f ) i n i n i n 123 A directed persistent homology theory for dissimilarity… Dir takes elementary n-chains to elementary n-chains, thus C ( f )(x ) ∈ Z (Y , R). It is then clear from the definition of directed homology that H ( f ) restricts to H ( f ) : H (X , R) → H (Y , R). n n n Remark 3.12 Let X and Y be (undirected) simplicial complexes and let R be a ring. If f : X → Y is a morphism of (undirected) simplicial complexes, the map induced on OT homology by f : X → Y is the same as the map induced on homology by f : X → OT Y (see Remarks 3.5 and 3.9). Corollary 3.13 (Directed) Homology is a functor from the category of directed simpli- cial complexes to the category of graded R-modules. In particular, isomorphic directed simplicial complexes have isomorphic (directed) homologies. We finish this section by showing a sufficient condition for two morphisms to induce the same map on homology. We will need this result to prove that our definition of persistent homology is stable. Lemma 3.14 Let R be a commutative ring. Let X and Y be two directed simplicial complexes and let f , g : X → Y be morphisms of directed simplicial complexes such that if (x , x ,..., x ) ∈ X, then f (x ), f (x ),..., f (x ), g(x ), . . . , g(x ) ∈ Y 0 1 n 0 1 i i n for every i = 0, 1,..., n. Then, H ( f ) = H (g) for every n ≥ 0. Consequently, n n Dir Dir H ( f ) = H (g), for every n ≥ 0. n n Proof For x =[x , x ,..., x ]∈ C (X , R) an elementary n-chain, define 0 1 n n s [x , x ,..., x ]= (−1) [ f (x ),..., f (x ), g(x ), ..., g(x )]. n 0 1 n 0 i i n i =0 We show that this family of maps provides a chain homotopy between f and g. Namely, we prove that C (g) − C ( f ) = ∂ s + s ∂ , for all n. On the one hand, n n n+1 n n−1 n n i −1 i j s ∂ (x ) = (−1) (−1) f (x ), f (x ), ..., f (x ), g(x ), ..., g(x ), ..., g(x ) n−1 n 0 1 j j i n i =0 j =0 j +1 + (−1) f (x ), f (x ), ..., f (x ), ..., f (x ), g(x ), ..., g(x ) . 0 1 i j j n j =i +1 (3.1) On the other hand, j i ∂ s (x ) = (−1) (−1) f (x ), f (x ), ..., f (x ), ..., f (x ), g(x ), ..., g(x ) n+1 n 0 1 i j j n j =0 i =0 i +1 + (−1) f (x ), f (x ), ..., f (x ), g(x ), ..., g(x ), ..., g(x ) . 0 1 j j i n i = j 123 D. Méndez,R.J.Sánchez-García By exchanging the roles of the indices in this last equation, we obtain that n i i +1 j ∂ s (x ) = (−1) (−1) f (x ), f (x ), ..., f (x ), g(x ), ..., g(x ), ..., g(x ) n+1 n 0 1 j j i n i =0 j =0 i j + (−1) (−1) f (x ), f (x ), ..., f (x ), ..., f (x ), g(x ), ..., g(x ) . 0 1 i j j n j =i (3.2) Now, adding Eqs. (3.1) and (3.2), ∂ s (x ) + s ∂ (x ) = [ f (x ), f (x ),..., f (x ), g(x ), ..., g(x )] n+1 n n−1 n 0 1 i −1 i n i =0 − f (x ), f (x ), . . . , f (x ), g(x ), . . . , g(x ) . 0 1 i i +1 n i =0 It is now straightforward to check that this sum amounts to [g(x ), g(x ),..., g(x )]−[ f (x ), f (x ),..., f (x )]= C (g)(x ) − C ( f )(x ), 0 1 n 0 1 n n n and we are done. 3.3 Basic computations and examples In this section, we explore some basic properties of this homology theory. We begin by studying the relation between homology and connectivity. Definition 3.15 Let (V , X ) be a directed simplicial complex and v, w ∈ V be two vertices. A path from v to w in X a sequence of vertices v = x , x ,..., x = w such 0 1 n that either (x , x ) or (x , x ) is a simplex, for all 1 ≤ i ≤ n − 1. The (weakly) i i +1 i +1 i connected components of X are the equivalence classes of the equivalence relation v ∼ w if there is a path in X from v to w. We call X (weakly) connected if it has only one connected component, that is, every pair of vertices can be connected by a path. Note that this notion of connectedness ignores the direction of the edges (1- simplices). The next result shows that we can compute the homology of each connected component independently. The proof follows by observing that the chain complex C (X , R) is clearly the direct sum of the chain complexes associated to each of the (weakly) connected components of X. In particular, it is immediate that any directed cycle can be decomposed as a sum of directed cycles in each of the directed compo- nents. Proposition 3.16 The (directed) homology of a directed simplicial complex X is iso- morphic to the direct sum of the (directed) homology modules of its (weakly) connected components. 123 A directed persistent homology theory for dissimilarity… We next show that the 0th (directed) homology counts the number of (weakly) connected components of a directed simplicial complex X. We start with a lemma. Lemma 3.17 Let R be a commutative ring and X be a directed simplicial complex. For any vertex v and any λ ∈ R, λ[v] is homologous to the trivial class if and only if λ = 0. Proof It is clear that λ[v] is a cycle as ∂ is trivial. On the other hand, it is clear that it must be the trivial class if λ = 0. Now assume that λ[v] is the trivial class. We shall prove that λ = 0. Assume that X has n + 1 vertices, namely V ={x = v, x ,..., x }. Then if λ[x ] 0 1 n 0 is trivial, there exists a 1-chain x = λ [x , x ] ij i j i , j =0 such that λ[x ]= ∂(x ), where we are assuming that λ = 0 whenever [x , x ] is not 0 ij i j a 1-chain. By computing the differentials, n n λ[x ]= λ [x ]− λ [x ]. 0 ij j ij i i , j =0 i , j =0 Now, since [x ], [x ],..., [x ] is a basis of C (X , R), the coefficients of each ele- 0 1 n 0 ment in the basis must be equal in both sides of the equality. Namely, for each j = 0, n n 0 = (λ − λ ), and λ = (λ − λ ). Adding these equations, we deduce ij ji i0 0i i =0 i =0 that λ = 0. Proposition 3.18 Let X be a directed simplicial complex and let R be a commutative ring. Then H (X , R) R , where k is the number of (weakly) connected components Dir k of X. Furthermore, if R ∈{Z, Q, R}, then H (X , R) = H (X , R) R . Proof By Proposition 3.16 we only need to show that if X is connected, then H (X , R) = R.Let v, w be any two vertices and let us show that [v] and [w] are homologous 0-cycles. Indeed, since X is connected, we can find vertices v = v ,v ,...,v = w such that either [v ,v ] or [v ,v ] are 1-chains, for all 0 1 m i i +1 i +1 i i = 0, 1,..., n − 1. If either [v ,v ] or [v ,v ] are chains, then [v ] and [v ] are i i +1 i +1 i i i +1 homologous. As this holds for every i = 0, 1,..., m − 1, [v] is homologous to [w]. n n Now, if λ [x ] is any 0-cycle, it is homologous to λ [v]. Then, by Lemma i i i i =1 i =1 3.17, λ[v] is not homologous to μ[v] whenever λ = μ, thus H (X , R) = R. Dir For the second statement, if R ∈{Z, Q, R}, we can compute H (X , R). In such Dir case, [v] is a directed cycle, thus [v] = H (X , R) ⊆ H (X , R). It then follows Dir that H (X , R) = H (X , R). Remark 3.19 The fact that 0th directed homology ignores edge directions seems counter-intuitive, as we set up this framework to detect directed features, namely directed cycles. However, this is a natural consequence of homology being an equiv- alence relation. In particular, as mentioned in the proof, if either [u,v] or [v, u] are 123 D. Méndez,R.J.Sánchez-García 1-chains, [u] and [v] are homologous, thus directionality of the 1-chains does not affect 0th directed homology. However, this only occurs in dimension 0, as a consequence of 0-chains always being cycles. Thus, when working with 1-simplices, the symmetry in the homology relation does not affect the detection of asymmetry in the data, as such asymmetry is encoded in the cycles themselves. It is also worth mentioning that a persistent homology able to detect strongly connected components in directed graphs has been introduced in Turner (2019). We now show that the homology of a point is trivial for positive indices. Example 3.20 Let R be a commutative ring and let X be the directed simplicial complex with vertex set V ={x } and simplices X ={(x )}. Then, 0 0 R, if n = 0, H (X , R) = 0, if n > 0. Indeed, X is connected, thus H (X , R) = R by Proposition 3.18.For n ≥ 1, there are no n-chains, hence H (X , R) = 0. In this case, it is also immediate to check that if Dir R ∈{Z, Q, R}, H (X , R) = H (X , R), for every n. Directed simplicial complexes whose homology is isomorphic to that of the point are called acyclic. Definition 3.21 A directed simplicial complex X is acyclic if R, if n = 0, H (X , R) = 0, if n > 0. If R ∈{Z, Q, R}, we say that X is directionally acyclic if R, if n = 0, Dir H (X , R) = 0, if n > 0. Note that acyclic implies directionally acyclic, as directed homology is a submodule of the module of homology, for every n. Next, we show that a directed simplicial complex consisting of one m-dimensional simplex along with all of its faces (note that this is an ordered-set complex) is acyclic. Proposition 3.22 Let X be the directed simplicial complex consisting of the simplex (x , x ,..., x ) and all of its faces. Then X is acyclic. 0 1 m Proof As a consequence of how X is defined, if x ∈ C (X , R) is a chain so that x n 0 does not participate in any of its elementary n-chains (as there are no repetitions in the simplices of X), then x can be added as the first element of every elementary 123 A directed persistent homology theory for dissimilarity… v w u u 3 3 4 3 v v w w u u 1 2 1 2 1 2 X Y Z Dir Fig. 3 Directed homology detects directed 1-cycles modulo boundaries. In these examples, H (X , R) = 0 Dir Dir while H (Y , R) = H (Z , R) = R respectively generated by [w ,w ]+[w ,w ]+[w ,w ] and 1 2 2 3 3 1 1 1 [u , u ]+[u , u ]+[u , u ] (or [u , u ]+[u , u ]+[u , u ]+[u , u ]). To be a directed 1-cycle, the 1 2 2 4 4 1 1 2 2 3 3 4 4 1 directions of the involved 1-simplices must form circuit. This does not hold for (undirected) homology n-chain in x,givingusan (n + 1)-chain which we denote x x ∈ C (X , R).Simple 0 n+1 computations show that ∂(x x ) = x − x ∂(x ). 0 0 Now take x ∈ Z (X , R) any cycle and decompose it as x = x y + z, where x n 0 0 does not participate in any of the chains in either y or z. Using the formula above, ∂(x ) = ∂(x y + z) = y − x ∂(y) + ∂(z) = 0. 0 0 In particular, the chains in which x does not participate must add up to zero, namely y + ∂(z) = 0. Finally, consider the chain x z. We have that ∂(x z) = z − x ∂(z) = z + x y = x . 0 0 0 Thus, x ∈ Z (X , R) is homologous to zero, and the result follows. We now illustrate the ability of this homology theory to detect directed cycles. We do so through some simple examples. Example 3.23 Let R ∈{Z, Q, R}.Let X, Y and Z be the directed simplicial complexes represented in Fig. 3. These three complexes are connected, so their 0th homologies are R, both directed and undirected. Furthermore, neither X nor Y have k-simplices for k ≥ 2, so H (X , R) and H (Y , R) are trivial (zero) for every k ≥ 2. This in k k Dir Dir particular implies that H (X , R) and H (Y , R) are trivial as well. Although Z has k k one 2-simplex, it is not a 2-cycle, thus H (Z , R) is also trivial. To compute the first homology, we need to consider the 1-simplices and their differentials. We list them below. 123 D. Méndez,R.J.Sánchez-García X ∂ [v ,v ] v − v 1 2 2 1 [v ,v ] v − v 2 3 3 2 [v ,v ] v − v 1 3 3 1 Y ∂ [w ,w ] w − w 1 2 2 1 [w ,w ] w − w 2 3 3 2 [w ,w ] w − w 3 1 1 3 Z ∂ [u , u ] u − u 1 2 2 1 [u , u ] u − u 2 3 3 2 [u , u ] u − u 3 4 4 3 [u , u ] u − u 4 1 1 4 [u , u ] u − u 2 4 4 2 We start with X. Note that λ [v ,v ]+ λ [v ,v ]+ λ [v ,v ], λ ,λ ,λ ∈ R, 1 1 2 2 2 3 3 1 3 1 2 3 is a cycle for X if and only if λ = λ =−λ . Since there are no 2-cycles in X, 1 2 3 different 1-cycles cannot be homologous. We deduce that H (X , R) R. However, no 1-cycle can have non-negative coefficients for the three elementary 1-cycles, thus Dir H (X , R) ={0}. Regarding Y , note that μ [w ,w ]+ μ [w ,w ]+ μ [w ,w ], μ ,μ ,μ ∈ R, 1 1 2 2 2 3 3 3 1 1 2 3 is a cycle if and only if μ = μ = μ . Since there are no 2-simplices in Y , we deduce 1 2 3 that H (Y , R) = R. Furthermore, the coefficients can be simultaneously positive. Namely, [w ,w ]+[w ,w ]+[w ,w ] is a directed cycle, which generates the 1 2 2 3 3 1 Dir entirety of H (X , R). We deduce that H (X , R) = R. Note that X and Y have isomorphic homology rings, but their directed homology rings are different. Finally, for Z, similar computations show that Z (Z , R) is the free R-module generated by x =[u , u ]+[u , u ]+[u , u ]+[u , u ] and x =[u , u ]+ 1 1 2 2 3 3 4 4 1 2 1 2 Dir [u , u ]+[u , u ], both of which are directed cycles. In particular, Z (Z , R) consists 2 4 4 1 of linear combinations of those two cycles, with positive coefficients. Note, however, that generating sets of Z (X , R) containing fewer than two directed cycles exist. This shows that computing a generating set for the cycles may not be enough to compute the full set of directed cycles. Continuing with the homology computations, note that in this case we have a 2-simplex, y =[u , u , u ],for which ∂(y) =[u , u ]−[u , u ]+[u , u ]= 2 3 4 3 4 2 4 2 3 Dir x −x . Namely, x and x are homologous. Therefore, H (Z , R) = H (Z , R) = R, 1 2 1 2 1 generated by either x or x . In particular, both the directed and undirected homologies 1 2 of Y and Z are isomorphic, and we can see how 2-simplices can make directed cycles equivalent, as expected. 123 A directed persistent homology theory for dissimilarity… v v 1 2 Fig. 4 A directed simplicial complex with just two vertices can have a non-trivial 1-cycle More generally, if R ∈{Z, Q, R},a polygon will give rise to a non-trivial directed homology class in dimension 1 if, and only if, the cycle can be traversed following the direction of the edges. Proposition 3.24 Let X be a 1-dimensional directed simplicial complex with vertices v ,v ,...,v ,n ≥ 1, and with 1-simplices e ,e ,..., e where either e = (v ,v ) 0 1 n 0 1 n i i i +1 or e = (v ,v ),i = 0, 1,..., n, and we write v = v . Then, H (X , R) = R. If i i +1 i n+1 0 1 Dir R ∈{Z, Q, R}, then H (X , R) is non-trivial if and only if either e = (v ,v ) for i i i +1 Dir all i or e = (v ,v ) for all i, in which case H (X , R) R. i i +1 i Proof We first prove that H (X , R) = R.Let x = λ [e ] be a cycle. If x is 1 i i i =0 non-trivial, we can assume without loss of generality that λ = 0. We can further assume that e = (v ,v ). Thus, ∂(λ [e ]) = λ [v ]− λ [v ]. The only other edge in 0 0 1 0 0 0 1 0 0 which e participates is e . We deduce that λ = λ ,if e = (v ,v ),or λ =−λ ,if 1 1 1 0 1 1 2 1 0 e = (v ,v ). By proceeding iteratively, we deduce that λ = λ ,if e = (v ,v ), 1 2 1 i 0 i i i +1 or that λ =−λ ,if e = (v ,v ). It is now straightforward to check that, with λ i 0 i i +1 i i defined in terms of λ in such way, i = 1,..., n, x = λ [e ] is a non-trivial 0 i i i =0 cycle. Since there are no 2-chains, H (X , R) = R. Regarding the directed homology, note that if e = (v ,v ), the coefficients of 0 0 1 the cycle x can only be all positive if and only if e = (v ,v ). In such case, i i i +1 Dir Dir H (X , R) = H (X , R) R, and otherwise H (X , R) = 0. The case in which 1 1 e = (v ,v ) is analogous. 0 1 0 Remark 3.25 Note that Proposition 3.24 applies to polygons with only two vertices v and v (see Fig. 4), which are allowed in a directed simplicial complex. Indeed, an 1 2 immediate computation shows that [v ,v ]+[v ,v ] is a directed 1-cycle. This result 1 2 2 1 shows that directed homology is not a homotopy invariant for the geometric realization of directed simplicial complexes, as all of the polygons considered in Proposition 3.24 give rise to homotopic geometric realizations. We end this section by noting that, although our initial aim was a directed homology group able to detect directed 1-cycles, we have a directed homology theory defined in all dimensions. However, we can show that the directed homology is trivial in even dimensions other than 0. Proposition 3.26 Let X be a directed simplicial complex R ∈{Z, Q, R}. Then Dir H (X , R) ={0}, for every n ≥ 1. 2n Proof We will show that no non-trivial directed cycles may exist in such dimensions. In order to do so, consider the morphism of R-modules ϕ : C (X , R) −→ R n n [v ,v ,...,v ] −→ 1. 0 1 n 123 D. Méndez,R.J.Sánchez-García k k Thus, if x = λ x where x is an elementary n-chain, ϕ (x ) = λ . i i i n i i =0 i =0 Now assume that x is an elementary 2n-chain, for some n ≥ 1. In this case, 2n ∂(λ x ) = λ ∂(x ), where ∂(x ) consists of the sum of  = n elementary n-chains i i i i i 2n−1 with positive sign and  = n − 1 elementary n-chains with negative sign. Thus, ϕ (λ x ) = λ (n − (n − 1)) = λ . 2n−1 i i i i To prove our claim, let x = λ x ∈ Z (X , R) be a cycle where each x is i i 2n i i =0 an elementary 2n-cycle. Then, given that ∂(x ) = 0, ϕ ∂(x ) = λ = 0. If 2n−1 i i =0 x is directed, then λ ≥ 0, for all i, thus we deduce that λ = 0, for all i. Namely, the i i Dir trivial cycle is the only directed cycle, and H (X , R) ={0}. 2n Remark 3.27 The proof of this result is based on the fact that the number of positive coefficients in the boundary of an elementary n-chain is larger than the number of negative coefficients, if n is even. This may also be related to the fact that there is no obvious way to define directed n-cycles for n > 1. For the purposes of this article, it suffices to consider 1-cycles and H (X , R). However, note that non-trivial (homological) cycles do exist in all odd dimensions. For instance, the elementary (2n − 1)-chain obtained by repeating a vertex 2n times is always a cycle. 4 Persistent directed homology In this section, we introduce a theory of persistent homology for directed simplicial complexes which comes in two flavours: one takes into account the directionality of the complex, whereas the other one is analogous to persistent homology in the classical setting. We thus have, associated to the same filtration of directed simpli- cial complexes, two persistence modules which produce two different barcodes. Both persistent homology theories show stability (see Theorem 4.19) and they are closely related. Indeed, directed cycles are undirected cycles as well, thus every bar in a directed persistence barcode can be uniquely matched with a bar in the corresponding undirected one, although the undirected bar may be born sooner, see Proposition 4.7. 4.1 Persistence modules associated to a directed simplicial complex Let us begin by introducing filtrations of directed simplicial complexes. Definition 4.1 Let X be a directed simplicial complex (Definition 3.1). A filtration of X is a family of subcomplexes (X ) , T ⊆ R, such that if δ ≤ δ ∈ T , then X is δ δ∈T δ a subcomplex of X , and such that X =∪ X . Note that for δ ≤ δ , the inclusion δ δ∈T δ i : X → X is a morphism of directed simplicial complexes. δ δ We now introduce undirected persistence modules. Definition 4.2 Let (X ) be a filtration of a directed simplicial complex X and let δ δ∈T R be a commutative ring. The n-dimensional undirected persistence R-module of X is the persistence R-module ({H (X , R)}, {H (i )} . n δ n δ≤δ ∈T 123 A directed persistent homology theory for dissimilarity… The functoriality of H makes this a persistence module. Now, in order to retain the information on directionality, we take the submodule of directed classes. Definition 4.3 Let (X ) be a filtration of a directed simplicial complex X and δ δ∈T let R ∈{Z, Q, R}.The n-dimensional directed persistence R-module of X is the persistence R-module Dir Dir δ {H (X , R)}, {H (i )} , n n δ δ≤δ ∈T Dir δ where H (X , R) are defined as in Definition 3.6, i : X → X  are the inclusion δ δ δ δ δ Dir maps, and H (i ) are the restrictions of the maps H (i ) to directed homology, as n δ δ introduced in Proposition 3.11. Remark 4.4 Let (X ) be a filtration of a directed simplicial complex X and let δ δ∈T δ ≤ δ ∈ T.If R ∈{Z, Q, R}, by applying Proposition 3.11 to the map i : X → X , we have a commutative diagram H (i ) H (X , R) H (X  , R) n δ n δ Dir Dir H (X , R) H (X  , R). δ δ n  n Dir δ H (i ) n δ Dir In particular, the injections H (X , R)→ H (X , R) give rise to a monomor- δ n δ δ∈T phism of persistence modules. We can now introduce persistence diagrams and barcodes associated to filtrations of directed simplicial complexes. As these were only introduced for fields in Sect. 2.1, from this point on, we will assume that R is a field. In particular, when taking directed homology, we will assume that R ∈{Q, R}. Definition 4.5 Let (X ) be a filtration of a directed simplicial complex X where δ δ∈T T ⊆ R is finite. The persistence diagrams associated to the n-dimensional undi- rected and directed persistence R-modules of X are respectively denoted Dgm (X , R) Dir and Dgm (X , R). Similarly, the respective barcodes are denoted Pers (X , R) and Dir Pers (X , R). Dir Remark 4.6 Note that H (X , R) H (X , R) for every δ ∈ T , as shown in Propo- δ 0 δ sition 3.18. As a consequence, the 0-dimensional directed and undirected persistence R-modules of a directed simplicial complex X are isomorphic. In particular, they have the same persistence barcodes and diagrams, and they measure the connectivity of the simplicial complex at each stage of the filtration. 123 D. Méndez,R.J.Sánchez-García v v v v v v v 4 4 3 4 3 4 3 v v v v v v v v 1 2 1 2 1 2 1 2 Z Z Z Z 0 1 2 3 0123 0123 Fig. 5 A filtration of the directed simplicial complex Z in Example 3.23 (Fig. 3) and its associated undirected (bottom, left) and directed (bottom, right) 1-dimensional persistence barcodes. The shorter undirected bar (blue) is also directed, while the longer undirected bar (red) becomes directed at δ = 2 (colour figure online) The next result establishes the relation between the undirected and directed per- sistence barcodes and diagrams of a persistence module, and it is thus key to understanding the directed persistence barcodes. Proposition 4.7 Let (X ) be a filtration of directed simplicial complexes. Then, δ δ∈T Dir there is an injective matching sending each bar in Pers (X , R) to a bar in Pers (X , R) which contains it. Furthermore, matched bars must die at the same time. Proof This result is an immediate consequence of Remark 4.4 and Bauer and Lesnick (2014, Proposition 6.1). Note that a bar in the directed persistence barcode of a filtration may be born after the one it is matched with in the undirected one, and some bars in the undirected barcode may remain unmatched (see Examples below and Figs. 5 and 6). We now introduce some examples to illustrate Proposition 4.7 and the general behaviour of the directed persistence barcodes. Example 4.8 Let us illustrate how undirected and directed persistence modules and their barcodes can be different. First, consider the directed simplicial complexes X and Y in Fig. 3 (see Example 3.23). Regardless of the filtration chosen for X, the lack of directed cycles means that the 1-dimensional directed persistence module is trivial. However, at the end of the filtration, there is a cycle in homology, thus there is a bar in the undirected persistence barcode. In the case of Y , the only 1-cycle is directed, so the undirected and directed persistence modules associated to any filtration of Y are isomorphic, and different from those of X. Example 4.9 Consider now the directed simplicial complex Z in Fig. 3 (see Example 3.23). Let T ={0, 1, 2,... } and consider the filtration of Z given by (Z ) ,as δ δ∈T illustrated in Fig. 5, where Z = Z for every δ ≥ 3. The undirected and directed 1-dimensional persistence modules of this filtration are Dir not isomorphic. Indeed, in Z there is clearly an undirected cycle, whereas Z (Z , R) 1 1 Dir Dir is trivial, thus H (Z , R) ={0}. However, H (Z , R) H (Z , R), as both 1 1 2 2 1 1 vector spaces are generated by the classes of [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ] 1 2 2 3 3 4 4 1 and [v ,v ]+[v ,v ]+[v ,v ]. These two classes become equivalent in Z .The 2 4 4 1 1 2 3 undirected and directed 1-dimensional persistence barcodes of this filtration, shown in Fig. 5, illustrate how undirected classes may become directed. 123 A directed persistent homology theory for dissimilarity… v v v v v 4 4 3 4 3 v v v v v v 1 2 1 2 1 2 X X X 0 1 2 0123 0123 Fig. 6 A filtration of directed simplicial complexes and its associated undirected (bottom, left) and directed (bottom, right) 1-dimensional persistence barcodes. The shorter undirected bar (blue) is also directed, while the longer undirected bar (red) never becomes directed (colour figure online) The last example also shows an important difference between classical and directed persistence modules and barcodes. Namely, in the undirected setting, the addition of one simplex to the filtration can either cause the birth of a class in the dimension of the added simplex, or kill a class in the preceding dimension. This simple idea is in fact the basis of the Standard Algorithm for computing persistent homology. However, the addition of only one simplex to a directed simplicial complex can cause the birth of several classes in directed homology, as shown in the previous example at δ = 2, and also in the following example. Example 4.10 Let T ={0, 1, 2,... } and consider the filtration of simplicial complexes (X ) illustrated in Fig. 2 in the Introduction, where X = X for every δ ≥ 3. δ δ∈T δ 3 Dir It is clear that Z (X , R) is trivial for j = 0, 1, 2, whereas undirected cycles appear as early as X . However, by adding the edge from v to v , several directed 1 5 1 Dir cycles are born at once. Namely, Z (X , R) contains the cycles [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 2 2 3 3 4 4 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 2 2 4 4 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 2 2 3 3 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 3 3 4 4 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]. 1 3 3 5 5 1 Straightforward computations show that H (X , R) = R , and four of the cycles 1 3 Dir 4 above are linearly independent, thus H (X , R) = H (X , R) = R . Consequently, 3 1 3 at δ = 3 every class (including the birthing one) becomes directed. Our last example shows an undirected homology class that never becomes directed. Example 4.11 Let T ={0, 1, 2,... } and consider the filtration of simplicial complexes (X ) illustrated in Fig. 6, where X = X for δ ≥ 2. δ δ∈T δ 2 Dir Dir Clearly, Z (X , R) ={0} for j = 0, 1, whereas Z (X , R) contains the j 2 1 1 cycle [v ,v ]+[v ,v ]+[v ,v ]. The undirected class represented by this element 1 2 2 4 4 1 is thus directed, but there is a linearly independent class in undirected homology, [v ,v ]+[v ,v ]−[v ,v ], which never becomes directed. Its bar in the barcode is 3 2 2 4 3 4 thus unmatched. 123 D. Méndez,R.J.Sánchez-García 4.2 Directed persistent homology of dissimilarity functions In this section, we introduce the undirected and directed persistence diagrams and barcodes associated to dissimilarity functions (Definition 4.13) and prove their stability with respect to the bottleneck distance (Theorem 4.19). Let us begin by introducing the directed Rips filtration of directed simplicial com- plexes associated to a dissimilarity function (Turner 2019, Definition 16). Definition 4.12 Let (V , d ) be a dissimilarity function. The directed Rips filtration Dir of (V , d ) is the filtration of directed simplicial complexes R (V , d ) where V V δ∈R Dir (v ,v ,...,v ) ∈ R (V , d ) if and only if d (v ,v ) ≤ δ, for all 0 ≤ i ≤ j ≤ n. 0 1 n V δ V i j δ Dir Dir It is clearly a filtration with the inclusion maps i : R (V , d ) → R (V , d ) V δ V δ for all δ ≤ δ . Let us now introduce the persistent homology modules associated to such a filtra- tion. Assume that R ∈{Q, R}. Definition 4.13 Let (V , d ) be a dissimilarity function and consider its associated Dir directed Rips filtration R (V , d ) . For each n ≥ 0, the n-dimensional undi- δ∈R rected persistence R-module of (V , d ) is Dir δ H (V , d ) := {H (R (V , d ) , R)}, {H (i )} . n V n V δ n δ≤δ ∈R Similarly, the n-dimensional directed persistence R-module of (V , d ) is defined as Dir Dir Dir Dir δ H (V , d ) := {H (R (V , d ) , R)}, {H (i )} . V V δ n n n δ δ≤δ ∈R Remark 4.14 The persistence module H (V , d ) associated to the directed Rips n V filtration of (V , d ) is precisely the persistence module studied in Turner (2019, Section 5), hence the remarks made there hold for the undirected persistence mod- Dir ule. In particular, if (V , d ) is a (finite) metric space, R (V , d ) is the (classical) V V Vietoris-Rips filtration of (V , d ). Furthermore, in this case, it can easily be seen that Dir H (V , d ) = H (V , d ). Thus, these persistence modules generalize the persis- n V V tence modules associated to the Vietoris-Rips filtration of a metric space. Also note that Dir R (V , d ) is closed under adjacent repeats (Turner 2019, Definition 18). Namely, Dir Dir given (v ,v ,...,v ) ∈ R (V , d ) , (v ,v ,...,v ,v ,...,v ) ∈ R (V , d ) , 0 1 n V δ 0 1 i i n V δ for any i = 0, 1,..., n. Using this, it can be proven that the homology of this object must be trivial in dimensions larger than |V |+ 1. As R is a field and since V is finite, both of these persistence modules fulfil the assumptions in Sect. 2.1. Namely, their indexing sets can be chosen to be finite, cor- responding to the threshold values where new simplices are added to the simplicial complex. Furthermore, no simplex is added to the filtration until the threshold value reaches the minimum of the images of the dissimilarity function. Finally, and even Dir though the directed simplicial complex R (V , d ) may have infinite simplices due V δ to arbitrary repetitions of vertices being allowed, it always has a finite number of simplices in a given dimension n, thus its n-dimensional homology is always finite- dimensional. As a consequence, we can introduce the following. 123 A directed persistent homology theory for dissimilarity… Definition 4.15 Let (V , d ) be a dissimilarity function. For each n ≥ 0, the n- dimensional persistence diagrams associated to the persistence R-modules H (V , d ) n V Dir Dir and H (V , d ) are respectively denoted by Dgm (V , d ) and Dgm (V , d ). V V V n n n Similarly, their associated persistence barcodes are denoted by Pers (V , d ) and n V Dir Pers (V , d ). Of course, Proposition 4.7 holds for these barcodes, namely, every bar in Dir Pers (V , d ) can be uniquely matched with one in Pers (V , d ) which dies at the V n V same time, although the directed bar may be born later. We now use results from Sect. 2.2 to show that both these persistent homology constructions are stable. The proof is split in several lemmas. Let (V , d ) and (W , d ) be two dissimilarity functions on respective sets V and W and define η = 2d (V , d ), (W , d ) , where d is the correspondence distortion distance CD V W CD (Definition 2.9). By Proposition 2.11, we can find maps ϕ : V → W and ψ : W → V such that dis(ϕ), dis(ψ ), codis(ϕ, ψ ), codis(ψ, ϕ) ≤ η. To simplify notation in the Dir δ Dir δ proofs below, denote R (V , d ) = X and R (W , d ) = X , for all δ ∈ R. V δ W δ V W Lemma 4.16 For each δ ∈ R, the maps ϕ and ψ induce morphisms of directed sim- plicial complexes δ+η δ+η δ δ ϕ : X −→ X ψ : X −→ X δ δ V W W V x −→ ϕ(x ), x −→ ψ(x ). Proof Let us prove the statement for ϕ (the proof is analogous for ψ ). Let δ δ (x , x ,..., x ) be an n-simplex in X . Then d (x , x ) ≤ δ for all 1 ≤ i ≤ j ≤ n. 0 1 n V i j Since dis(ϕ) ≤ η, we have that, for all v ,v ∈ V , 1 2 d (v ,v ) − d ϕ(v ), ϕ(v ) ≤ η. V 1 2 W 1 2 Choosing v = x and v = x ,wehave 1 i 2 j d ϕ(x ), ϕ(x ) ≤ η + d (x , x ) ≤ δ + η for all 1 ≤ i ≤ j ≤ n. W i j V i j δ+η Consequently, ϕ(x ), ϕ(x ),...,ϕ(x ) ∈ X and the result follows. 0 1 n δ δ δ Lemma 4.17 For δ ≤ δ ∈ R consider the inclusion maps i : X → X and δ V V δ δ δ j : X → X . The following are commutative diagrams of morphisms of directed W W simplicial complexes: δ δ i j δ δ δ δ δ δ X X X X V V W W ϕ ϕ ψ ψ δ δ δ δ δ +η δ +η j i δ+η  δ+η δ+η δ +η δ+η δ +η X X , X X . W V W V 123 D. Méndez,R.J.Sánchez-García Proof We prove that the first diagram is commutative (the proof for the second diagram δ δ is analogous). Let x ∈ V . Since i is an inclusion, (ϕ  ◦ i )(x ) = ϕ  (x ) = ϕ(x ). δ δ δ δ δ +η δ +η δ +η Similarly, since j is an inclusion, ( j ◦ ϕ )(x ) = j ϕ(x ) = ϕ(x ). δ+η δ+η δ+η Lemma 4.18 With the same notation as in Lemmas 4.16 and 4.17, for every δ ∈ R,the following diagrams of morphisms of directed simplicial complexes induce commutative diagrams on homology. δ+2η δ+2η i j δ δ+2η δ δ+2η δ δ X X X X V W V V ϕ ψ ψ ϕ δ δ+η δ δ+η δ+η δ+η X , X . W W Proof Again, we only prove the result for the first diagram, as the proof for the second diagram is analogous. We show that it is commutative up to homotopy by showing δ+2η that the maps i and ψ ◦ ϕ satisfy the hypothesis of Lemma 3.14. δ+η δ Take a simplex σ = (x , x ,..., x ) ∈ X , thus d (x , x ) ≤ δ, for all 0 1 n V i j δ+2η δ+2η 1 ≤ i ≤ j ≤ n. On the one hand, i is an inclusion, so i (σ ) = σ.On δ δ the other hand, since ψ ◦ ϕ is a morphism of directed simplicial complexes δ+η δ δ+2η (Definition 3.8), ψ(ϕ(x )), ψ (ϕ(x )),...,ψ(ϕ(x )) ∈ X . This implies that 0 1 n d ψ(ϕ(x )), ψ (ϕ(x )) ≤ δ + 2η, for all 1 ≤ i ≤ j ≤ n. V i j Now recall that codis(ϕ, ψ ) ≤ η, thus for all v ∈ V and w ∈ W , d v, ψ(w) − d ϕ(v), w ≤ η. V W Then, for 1 ≤ i ≤ j ≤ n, by taking v = x and w = ϕ(x ), i j d x ,ψ(ϕ(x )) ≤ η + d ϕ(x ), ϕ(x ) ≤ δ + 2η. V i j W i j As a consequence of the inequalities above, we have shown that for every 0 ≤ i ≤ n, δ+2η x , x ,..., x ,ψ(ϕ(x )), ψ (ϕ(x )),...,ψ(ϕ(x )) ∈ X . 0 1 i i i +1 n δ+2η Therefore, the maps i and ψ ◦ ϕ satisfy the hypothesis of Lemma 3.14, δ+η δ thus they induce the same map on homology. The result follows. We now have everything we need to prove the stability results. Theorem 4.19 Let R ∈{Q, R}. Let (V , d ) and (W , d ) be two dissimilarity func- V W tions on finite sets V and W . Then, for all n ≥ 0, d Dgm (V , d ), Dgm (W , d ) ≤ 2d (V , d ), (W , d ) B V V CD V W n n 123 A directed persistent homology theory for dissimilarity… and Dir Dir d Dgm (V , d ), Dgm (W , d ) ≤ 2d (V , d ), (W , d ) . B V V CD V W n n Proof Define η = 2d (V , d ), (W , d ) . By Theorem 2.7, it suffices to show that CD V W Dir the persistence modules H (V , d ) and H (W , d ) (respectively H (V , d ) and n V n W V Dir H (W , d ))are η-interleaved. Comparing Definition 2.5 (for ε = η) and Lemmas 4.16, 4.17 and 4.18, the result follows by using the functoriality of homology and, in the directed case, Proposition 3.11. Remark 4.20 Recall from Remark 4.14 that the persistence module H (V , d ) asso- n V ciated to the directed Rips filtration of (V , d ) is the persistence module studied in Turner (2019, Section 5). Thus, the remarks made in Turner (2019, Section 5.2) hold for these persistence modules, meaning that a result analogous to Theorem 4.19 would not hold if we were using ordered-set complexes instead of directed simplicial com- plexes, as mentioned at the beginning of Sect. 3. This justifies our definition of directed simplicial complex (Definition 3.1). 5 Computational challenges Let (V , d ) be a dissimilarity function in a finite set V . In this section, we study the Dir algorithmic aspects of computing both H (V , d ) and H (V , d ). For simplicity, n V V we assume that R = R. The computation of the persistence barcodes for the undirected persistent homology H (V , d ) can be made using the Standard Algorithm (Edelsbrunner et al. 2001; n V Zomorodian and Carlsson 2005), which also applies to ordered simplicial complexes. Dir δ Write R (V , d ) = X , δ ∈ R for the directed Rips filtration (Definition 4.12). V δ As V is finite, the number of simplices in X up to dimension n + 1 is finite, say N . We can list them {σ ,σ ,...,σ } in such a way that i < j if σ is a (proper) face of 1 2 N i σ ,orif σ appears ‘later’ in the filtration, thus having a compatible ordering. Then, j j we can represent the differential using a sparse N × N matrix M over R, where the (i , j )-entry M is the coefficient of σ in the differential of σ . Such matrix M is an ij j i upper-triangular matrix. At this stage, the Standard Algorithm can be used to compute Dgm (V , d ) by reducing M at once using column operations. Dir We cannot directly adapt the Standard Algorithm to compute Dgm (V , d ):we can try to find reduction operations that generate directed cycles using operations with positive coefficients only, but this is not enough to guarantee that we find all the possible directed cycles. For instance, once one column has been eliminated, we may be preventing the participation of the simplex it represents in directed cycles not involving the columns we are using in the elimination. To avoid such a problem, we would need to keep track of every possible way in which column eliminations can be produced using only positive coefficients. This is very similar to the double description algorithm for the computation of extreme rays of a polyhedral cone (Fukuda et al. 1995; Motzkin Dir et al. 1953). Indeed, the directed cycles Z (X , R) are the non-negative solutions to 123 D. Méndez,R.J.Sánchez-García the linear equation Mx = 0, that is, x = (x , x ,..., x ) ∈ R | Mx = 0, x ≥ 0 , (5.1) 1 2 N where a vector x = (x , x ,..., x ) ∈ R represents the cycle x σ , M is the 1 2 N i i i =1 matrix corresponding to the differential, and x ≥ 0 means that every entry in x is non-negative, that is, x ≥ 0 for all i. Equivalently, directed cycles are the extreme rays of the unbounded polyhedral cone consisting of the intersection of the solution space to the linear system Mx = 0 with the first quadrant of R . The enumeration of the extreme rays (generators of the solution set) of a polyhedral cone such as Eq. (5.1) is a well-known albeit hard problem in computational geometry. Several algorithms, including the double description algorithm, and modifications thereof, exist (Terzer 2009). Note, however, that we do not require the computation of all the directed cycles. Indeed, it would be enough to calculate them up to homology, which suggests that efficient computations may be possible. In summary, although important challenges remain on the computation of the per- sistence diagrams associated to directed persistence modules, our objective was to provide the necessary groundwork for the study of directed persistent homology of asymmetric data sets, and we are hopeful that we are opening up an exciting new approach for future research into the topological properties of asymmetric data sets. Acknowledgements This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1 The first author was partially supported by Ministerio de Economía y Competitividad (Spain) Grants Nos. MTM2016-79661-P and MTM2016-78647-P and by Ministerio de Ciencia, Innovación y Uni- versidades (Spain) Grant No. PGC2018-095448-B-I00. Funding Open access funding provided by FCT|FCCN (b-on). Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Appendix A: Directed persistent homology via semiring homology As mentioned in the Introduction, our ideas of directed homology are based on homology with semiring coefficients, which can be introduced for directed simplicial complexes using chain complexes of semimodules and their associated homologies, building on work by Patchkoria (1977, 2014). In this appendix, we introduce this homology theory and explain how it relates to the modules of directed homology introduced in Definition 3.6. We remark that other perspectives of directionality in topological spaces could be used instead. One such important perspective is that of directed algebraic topology, 123 A directed persistent homology theory for dissimilarity… which is extensively used in the study of concurrency in computer science (Fajstrup et al. 2016). In this approach, directionality information is encoded by means of distin- guished paths, forming a category dTop where objects are topological spaces together with their directed paths (Fajstrup et al. 2016, Definition 4.1). These spaces can be modelled using cubical complexes, and their homology including directional informa- tion can be computed using chain complexes over preordered abelian groups (Grandis 2009, Chapter 2). In this theory, the homology groups of a space are those of ordinary homology, but directionality information is recorded in the preorder given to those groups. Therefore, it would be interesting for future work to study potential definitions of persistent homology over preordered abelian groups. Nonetheless, we focus here on homology over semimodules, since it provides us with the information we require while keeping the exposition simpler. A.1 Semirings, semimodules and their completions Let us begin by introducing the algebraic structures that we need, following (Golan 1999). Definition A.1 A semiring = ( , +, ·) is a set together with two operations such that • ( , +) is an abelian monoid whose identity element we denote 0 , • ( , ·) is a monoid whose identity element we denote 1 , •· is distributive with respect to + from either side, • 0 · λ = λ · 0 = 0 , for all λ ∈ . A semiring is commutative if ( , ·) is a commutative monoid, and cancellative if ( , +) is a cancellative monoid, that is, λ + λ = λ + λ implies λ = λ for all λ, λ ,λ ∈ .Asemiring is a semifield if every 0 = λ ∈ has a multiplicative inverse. A semiring is zerosumfree if no element other than 0 has an additive inverse. Example A.2 Every ring is a semiring. The non-negative integers, rationals and reals + + with their usual operations, respectively denoted N, Q and R , are cancellative zerosumfree commutative semirings which are not rings. Definition A.3 Let be a semiring. A (left) -semimodule is an abelian monoid (A, +) with identity element 0 together with a map × A → A which we denote (λ, a) → λa and such that for all λ, λ ∈ and a, a ∈ A, • (λλ )a = λ(λ a), • λ(a + a ) = λa + λa , • (λ + λ )a = λa + λ a, • 1 a = a, • λ0 = 0 = 0 λ. A A A non-empty subset B of a left -semimodule A is a subsemimodule of A if B is closed under addition and scalar multiplication, which implies that B is a left -semimodule with identity element 0 ∈ B.If A and B are -semimodules, a -homomorphism is a map f : A → B such that for all a, a ∈ A and for all λ ∈ , 123 D. Méndez,R.J.Sánchez-García • f (a + a ) = f (a) + f (a ), • f (λa) = λ f (a). Clearly, f (0 ) = 0 . A B A -semimodule A is cancellative if a + a = a + a implies a = a , and zerosumfree if a + a = 0 implies a = a = 0, for all a, a , a ∈ A. The Cartesian product of -semimodules is a -semimodule. The definition of direct sum of modules can be generalized to the framework of -semimodules, and a finite direct sum of -semimodules is isomorphic to the Cartesian product of the same -semimodules. Quotient -semimodules can be defined using congruence relations. Definition A.4 Let A be a left -semimodule. An equivalence relation ρ on A is a -congruence if, for all a, a ∈ , and all λ ∈ , • if a ∼ a and b ∼ b , then (a + b) ∼ (a + b ), and ρ ρ ρ • if a ∼ a , then λa ∼ λa . ρ ρ If ρ is a -congruence relation on A and we write a/ρ for the class of an element a ∈ A, then A/ρ ={a/ρ | a ∈ A} inherits a -semimodule structure by setting (a/ρ) + (a /ρ) = (a + a )/ρ and λ(a/ρ) = (λa)/ρ, for all a, a ∈ A and λ ∈ . The left -semimodule A/ρ is called the factor semimodule of A by ρ. Note that the quotient map A → A/ρ is a surjective -homomorphism. If B is a subsemimodule of a -semimodule A, then it determines a -congruence ∼ by setting a ∼ a if there exist b, b ∈ B such that a + b = a + b . Classes in B B this quotient are denoted a/B, and the factor semimodule is denoted A/B.If is a ring, this is just the usual quotient module of A by B. Example A.5 The following are examples of factor semimodules. • If is a ring and B ≤ A are -modules, the factor semimodule A/B is the usual module quotient of A by B. • In the N-module N×N, consider the relation ρ such that (u,v) ∼ (x , y) if u +y = v + x. It is immediate to show that ρ is a N-congruence, and the factor semimodule (N × N)/ρ is isomorphic to Z. Indeed, consider the map (N × N)/ρ → Z taking (u,v) → u −v. It is clearly a map of N-semimodules, (u,v) ∼ (x , y) ⇔ u + y = v + x ⇔ u − v = x − y shows that it is well-defined and injective, and it is clearly surjective. • Also in N×N, consider the submodule B ={0}×N. Then (u,v) ∼ (x , y) if there exist (0, n), (0, m) ∈{0}×N such that (u,v)+(0, n) = (x , y)+(0, m).Fromthis, we deduce that (u,v) ∼ (x , y) if and only if u = x. Namely, (N × N)/({0}× N) is isomorphic to N. Clearly, if is a semiring then it is a -semimodule over itself. The idea of free -semimodules comes about naturally just as in the case of rings, and we will make extensive use of these objects. Definition A.6 Let be a semiring, A aleft -semimodule and V ={v ,v ,...,v } 1 2 n a finite subset of A.The set V is a generating set for A if every element in A is a linear combination of elements of V.The rank of a -semimodule A, denoted rank(A),is 123 A directed persistent homology theory for dissimilarity… the least n for which there is a set of generators of A with cardinality n, or infinity, if not such n exists. The set V is linearly independent if for λ ,λ ,...,λ ∈ and μ ,μ ,...,μ ∈ 1 2 n 1 2 n n n such that λ v = μ v , then λ = μ ,for i = 1, 2,..., n, and it is called i i i i i i i =1 i =1 linearly dependent otherwise. We call V is a basis of A if it is a linearly independent generating set of A. The -semimodule A is a free -semimodule if it admits a basis V . We denote a free -semimodule with basis {v ,v ,...,v } by (v ,v ,...,v ). 1 2 n 1 2 n Remark A.7 It is easy to check that free -semimodules over a cancellative semiring are themselves cancellative, as are subsemimodules of cancellative -semimodules. The quotient of a cancellative -semimodule over any of its subsemimodules is also cancellative, see Golan (1999, Proposition 15.24). Note that if is a ring, free -semimodules are just free -modules. Thus, as not every module over a ring is free, clearly not every -semimodule is free. On the other hand, (the direct sum, or product, of n copies of ) is clearly a free - semimodule, for every n ≥ 1. Another key property is that if A is a free -semimodule with basis V and B is another -semimodule, each map V → B uniquely extends to a -homomorphism A → B,see Golan(1999, Proposition 17.12). Remark A.8 If is a commutative semiring and A is a -semimodule admitting a finite basis, every basis has the same cardinality, which coincides with the rank of A (Tan 2014, Theorem 3.4). In fact, for every integer n > 0, every finitely generated n + n subsemimodule of N has a unique basis, whereas basis for semimodules of (R ) are unique up to non-zero multiples, see e.g. Kim and Roush (1980, Theorem 2.1). We finish this section by extending the Grothendieck construction from abelian monoids to semirings and semimodules. Definition A.9 Let M be an abelian monoid. Consider the congruence relation ρ in M × M defined as (u,v) ∼ (x , y) ⇔ there exists z ∈ M such that u + y + z = v + x + z. Let [u,v] denote the equivalence class of (u,v). Then M × M /ρ becomes a group with the componentwise addition. This group is called the Grothendieck group or group completion of M. There is a canonical homomorphism of monoids k : M → K (M ) defined as k (x ) =[x , 0].If M is a cancellative monoid, then k is injective. M M Given an element [u,v]∈ K (M ) we can interpret u as its positive part and v as its negative part, and the relation ∼ becomes the obvious one. The identity element is then 0 =[x , x ] for any element x ∈ M, and the inverse of [x , y] is [y, x ] for any x , y ∈ M. If f : M → N is a morphism of monoids, there is a morphism of groups K ( f ) : K (M ) → K (N ) that takes [u,v] to [ f (u), f (v)]. In fact, K is a functor from the category of abelian monoids to the category of abelian groups. Furthermore, K ( f ) ◦ k = k ◦ f . M N 123 D. Méndez,R.J.Sánchez-García ∼ ∼ As shown in Example A.5, K (N) Z. Similarly, one can show that K (R ) R. = = Crucially for us, this construction can be extended to semirings and semimodules. Definition A.10 Let be a semiring. The group completion K ( ) becomes a ring with the operation [x , x ]·[y , y ]=[x y + x y , x y + x y ], and k : → K ( ) 1 2 1 2 1 1 2 2 2 1 1 2 is in fact a morphism of semirings, which is injective if is cancellative. We call K ( ) the ring completion of .If f : → is a morphism of semirings, the map K ( f ) : K ( ) → K ( ) as introduced in Definition A.9 is a morphism of rings, and K is a functor from the category of semirings to the category of rings. If A is a -semimodule, then the abelian group K (A) together with the operation K ( ) × K (A) → K (A) given by [λ ,λ ][a , a ]=[λ a + λ a ,λ a + λ a ] is 1 2 1 2 1 1 2 2 1 2 2 1 a K ( )-module, the K ( )-module completion of A.Again,if f : A → B is a - morphism, the map K ( f ) : K (A) → K (B) becomes a morphism of K ( )-modules. Furthermore, if A is a free -semimodule with basis {v | i ∈ I }, it becomes immediate that K (A) is a free K ( )-module with basis [v , 0]| i ∈ I }. A.2 Chain complexes of semimodules Now that we have introduced the necessary algebraic structures, we move on to chain complexes of semimodules over semirings, following Patchkoria (2014). This theory is a natural generalization of the classical theory of chain complexes of modules and, in fact, they give rise to the same cycles, boundaries and homologies when the semimodules are modules over a ring. In order to introduce chain complexes in the context of semimodules an immediate problem arises: alternating sums cannot be defined as elements in a semimodule may not have inverses. The solution is to use two maps, a positive and negative part, for the differentials. Definition A.11 Let be a semiring and consider a sequence of -semimodules and homomorphisms indexed by n ∈ Z n+1 n − − − X : ··· ⇒ X ⇒ X ⇒ X ⇒ ··· . n+1 − − n − n−1 − − ∂ ∂ n+1 + − We say that X ={X ,∂ ,∂ } is a chain complex of -semimodules if n n + + − − + − − + ∂ ∂ + ∂ ∂ = ∂ ∂ + ∂ ∂ . n n+1 n n+1 n n+1 n n+1 As in the classical case, chain complexes of -semimodules give rise to a - semimodule of homology. + − Definition A.12 Let X ={X ,∂ ,∂ } be a chain complex of -semimodules. The n n -semimodule of cycles of X is + − Z (X, ) ={x ∈ X | ∂ (x ) = ∂ (x )}. n n n n 123 A directed persistent homology theory for dissimilarity… The nth homology of X is then the quotient -semimodule H (X, ) = Z (X, )/ρ (X, ) n n n where ρ (X, ) is the following -congruence relation on Z (X, ): n n + − x ∼ y ⇔∃u,v ∈ X s.t. x + ∂ (u) + ∂ (v) ρ (X , ) n+1 n n+1 n+1 + − = y + ∂ (v) + ∂ (u). n+1 n+1 We will omit from now on the coefficient semiring from the notation when it is clear from the context. Remark A.13 The definition of cycle is a direct generalization of the classical defini- tion. For the boundary relation, note that we may need two different chains u and v in order to establish two classes as homologous. Intuitively, these two classes are the ‘positive’ and ‘negative’ part of w, where x = y + ∂(w) in the classical setting. + − Remark A.14 If X ={X ,∂ ,∂ } is a chain complex of -semimodules, then n n + − + − K (∂ )−K (∂ ) K (∂ )−K (∂ ) n+1 n+1 n n ... −→ K (X ) − −−−−−−−−−→ K (X ) − −−−−−−−→ K (X ) −→ ··· n+1 n n−1 is a chain complex of K ( )-modules. If furthermore the -semimodules X are cancellative for all n, the converse is also true. If is a ring, then K ( ) = and the functor K acts as the identity on -modules. + − Therefore, X ={X ,∂ ,∂ } is a chain complex of -semimodules if and only if n n + − + − ∂ −∂ ∂ −∂ n+1 n+1 n n ··· −→ X − −−−−− → X − −−− → X −→ ··· n+1 n n−1 is a chain complex of -modules. In this case, it is clear that the homology semi- modules introduced in Definition A.12 are precisely the usual homology modules of X. We remark that a similar idea of decomposition of differentials in a positive and negative components has been used by Steiner to relate ω-categories with the so-called augmented directed complexes (Steiner 2004). These structures share, in fact, a deep similarity with Patchkoria’s chain complexes with semiring coefficients. We now turn our attention to maps between complexes. + −   + − Definition A.15 Let X ={X ,∂ ,∂ } and X ={X ,∂ ,∂ } be two chain com- n n n n n plexes of -semimodules. A sequence f ={ f } of -homomorphisms f : X → X n n n is said to be a morphism from X to X if + − − + ∂ f + f ∂ = ∂ f + f ∂ . n n−1 n n−1 n n n n It is clear that for such a map f Z (X ) is a -subsemimodule of Z (X ). Fur- n n n thermore, if X and X are cancellative -semimodules, f is also compatible with 123 D. Méndez,R.J.Sánchez-García the congruence relations ρ (X ) and ρ (X ), so it induces a map n n H ( f ) : H (X ) → H (X ). n n n It is then easy to check that homology H is a functor from the category of chain complexes of cancellative -semimodules and morphisms to the category of graded -semimodules. + − Remark A.16 Let be a semiring and let {X ,∂ ,∂ } be a chain complex of - n n semimodules. The family of canonical maps k : X → K (X ) givesrisetoa X n n + − + − morphism from {X ,∂ ,∂ } to K (X ), K (∂ ), K (∂ ) and, therefore, to a mor- n n n n n n phism of -semimodules H (k ) : H (X ) → H K (X ) , X n n which takes the class of x to the class of [x , 0]. Furthermore, if the -semimodules X are cancellative, H (k ) is injective, which in particular implies that H (X ) is n n X n a cancellative -semimodule. Also note that, by Remark A.14, the homology of + − + − K (X ), K (∂ ), K (∂ ) and the usual homology of K (X ), K (∂ ) − K (∂ ) are n n n n n n isomorphic as K ( )-modules. Finally, we discuss chain homotopies. + − Definition A.17 Let f ={ f } and g ={g } be morphisms from X ={X ,∂ ,∂ } to n n n n n + − X ={X ,∂ ,∂ }. We say that f is homotopic to g if there exist -homomorphisms n n n + − s , s : X → X such that n n n+1 + − − + − + + − ∂ s + ∂ s + s ∂ + s ∂ + g n n n n n+1 n+1 n−1 n−1 + + − − + + − − = ∂ s + ∂ s + s ∂ + s ∂ + f , n+1 n n+1 n n−1 n n−1 n + − for all n. The family {s , s } is called a chain homotopy from f to g, and we write n n + − (s , s ) : f  g. We then have the following result. Proposition A.18 (Patchkoria 2014, Proposition 3.3) Let f , g : X → X be morphisms between chain complexes of cancellative -semimodules. If f is homotopic to g, then H ( f ) = H (g). n n Homotopy equivalences are defined in the usual way and they induce isomorphisms on homology. We finish with a remark that the homotopy of maps behaves well with respect to semimodule completion. Remark A.19 If a morphism f : X → X is homotopic to g : X → X , then K ( f ) : K (X ) → K (X ) is homotopic to K (g) : K (X ) → K (X ). Furthermore, if both X and X are cancellative -semimodules, for all n, and X is a free - n n semimodule for all n, the converse is also true. 123 A directed persistent homology theory for dissimilarity… A.3 Chain complexes of semimodules of directed simplicial complexes In this section, we introduce chain complexes of semimodules and homology semi- modules of directed simplicial complexes, Definition 3.1. The results introduced here directly generalize those in Sect. 3. Definition A.20 The n-dimensional chains of a directed simplicial complex X are defined as the elements of the free -semimodule generated by (i.e. with basis) the n-simplices, C (X, ) = {[x , x ,..., x ]| (x , x ,..., x ) ∈ X } . n 0 1 n 0 1 n We call the elements [x , x ,..., x ]∈ C (X, ), (x , x ,..., x ) ∈ X, elementary 0 1 n n 0 1 n n-chains. Remark A.21 Note that if is a cancellative semiring, the -semimodule C (X, ) is cancellative for every n, as it is a free semimodule over a cancellative semiring. We now define the positive and negative differentials on C (X, ). Definition A.22 Let X be a directed simplicial complex. For each n > 0, we define + − morphisms of -semimodules ∂ ,∂ : C (X, ) → C (X, ) by n n−1 n n ∂ ([x , x ,..., x ]) = [x , x ,..., x ,..., x ], and 0 1 n 0 1 2i n i =0 n−1 ∂ ([x , x ,..., x ]) = [x , x ,..., x  ,..., x ]. 0 1 n 0 1 2i +1 n i =0 + − For n = 0, let ∂ ,∂ : C (X, ) →{0} be the trivial maps, by definition. 0 0 Proposition A.23 Let X be a directed simplicial complex and be a cancellative + − semiring. Then {C (X, ), ∂ ,∂ } is a chain complex of modules of -semimodules. n n Proof By Remark A.21,the -semimodule C (X, ) is cancellative for every n. + − Thus, by Remark A.14, it is enough to prove that K C (X, ) , K (∂ ) − K (∂ ) n n is a chain complex of K ( )-modules. Note that K C (X, ) is a free K ( )-module whose basis is given by elements [x , x ,..., x ], 0 such that [x , x ,..., x ] is an 0 1 n 0 1 n + − elementary n-chain. Thus, it suffices to show that the composition K (∂ ) − K (∂ ) ◦ n n + − K (∂ ) − K (∂ ) is trivial (zero) on these elements. Since n−1 n−1 + − i K (∂ ) − K (∂ ) [x , x ,..., x ,..., x ], 0 = (−1) [x , x ,..., x ], 0 , 0 1 i n 0 1 n n n i =0 the proof is now a straightforward computation analogous to the standard one for chain complexes in simplicial homology. 123 D. Méndez,R.J.Sánchez-García Proposition A.23 allows us to define the homology of a directed simplicial complex with coefficients on a semiring. Definition A.24 Let X be a directed simplicial complex and n ≥ 0 and a cancellative semiring. The n-dimensional homology of X, written H (X, ),isthe nth homology + − -semimodule of the chain complex {C (X, ), ∂ ,∂ }. n n + + Remark A.25 If is a ring, by defining ∂ = ∂ −∂ , {C (X, ), ∂} is a chain complex in the usual sense. In fact, it is precisely the chain complex introduced in Definition 3.3. Thus, as a consequence of Remark A.14, the homology theory introduced in Definition A.24 coincides with the one introduced in Definition 3.4. Therefore, homology with semiring coefficients of directed simplicial complexes is a non-trivial generalization of homology with ring coefficients. The following result will become useful later on, so we recorded it here. Proposition A.26 Let be a cancellative semiring and X a directed simplicial com- + − plex. The chain complexes of K ( )-semimodules K C (X, ) , K (∂ ), K (∂ ) n n + − and C X , K ( ) ,∂ ,∂ are isomorphic. n n Proof Take n ≥ 0 an integer. By definition, C X , K ( ) is the free K ( )-module over the elementary n-chains [x , x ,..., x ]. Similarly, K C (X, ) is a free K ( )- 0 1 n n module with a basis given by the elements [x , x ,..., x ], 0 where [x , x ,..., x ] 0 1 n 0 1 n is an elementary n-chain. Consider the K ( )-morphisms α : C X , K ( ) −→ K C (X, ) β : K C (X, ) −→ C X , K ( ) n n n n n n [x , x ,..., x ] −→ [x , x ,..., x ], 0 , [x , x ,..., x ], 0 −→ [x , x ,..., x ]. 0 1 n 0 1 n 0 1 n 0 1 n Immediate computations show that the families of K ( )-morphisms {α } and {β } n n are morphisms of K ( )-semimodules which are inverses of each other, and the claim follows. A.4 Functoriality of semiring homology In this section, we prove that homology is a functor from the category of directed simplicial complexes to the category of graded -semimodules. We also show that two morphisms allowing for the construction of the prism operator must induce the same map on homology, a result we will need to prove our persistent homology stability result. Recall the definition of morphism of directed simplicial complexes, Definition 3.8. Definition A.27 Let f : X → Y be a morphism of directed simplicial complexes. Then f induces morphisms of -semimodules C ( f ) ={C ( f )}, C ( f ) : C (X, ) −→ C (Y , ) n n n [x , x ,..., x ] −→ [ f (x ), f (x ), . . . , f (x )]. 0 1 n 0 1 n We will often abbreviate C ( f ) = f . n n 123 A directed persistent homology theory for dissimilarity… Proposition A.28 If f : X → Y is a morphism of directed simplicial complexes, the family of maps {C ( f )} is a -homomorphism of chain complexes. Therefore, it induces a map H ( f ) : H (X, ) → H (Y , ). n n n + + − − Proof Simple computations show that f ∂ = ∂ f and that f ∂ = ∂ f . n−1 n n−1 n n n n n Remark A.29 As a consequence of Proposition A.26,themap C ( f ) : C X , K ( ) → n n C Y , K ( ) is precisely K C ( f ) : K C (X, ) → K C (Y , ) . n n n n Corollary A.30 Homology is a functor from the category of directed simplicial com- plexes to the category of graded -semimodules. In particular, isomorphic directed simplicial complexes have isomorphic homologies. Note further that when is a ring, the homology functor is precisely the functor introduced in Sect. 3.2. We finish this section by showing a sufficient condition for two morphisms to induce the same map on homology. We will need this result to prove that our definition of persistent homology is stable. Lemma A.31 Let be a cancellative semiring. Let X and Y be two directed simplicial complexes and let f , g : X → Y be morphisms of directed simplicial complexes such that if (x , x ,..., x ) ∈ X, then f (x ), f (x ),..., f (x ), g(x ), . . . , g(x ) ∈ Y 0 1 n 0 1 i i n for every i = 0, 1,..., n. Then, H ( f ) = H (g) for every n ≥ 0. n n Proof For x =[x , x ,..., x ]∈ C (X, ) an elementary n-chain, define 0 1 n n s [x , x ,..., x ]= [ f (x ), . . . , f (x ), g(x ),..., g(x )], and 0 1 n 0 2i 2i n i =0 n−1 s [x , x ,..., x ]= [ f (x ),..., f (x ), g(x ). ..., g(x )]. 0 1 n 0 2i +1 2i +1 n i =0 Now recall that since is cancellative, the canonical map k : C (X, ) → n n + − K C (X, ) is injective for all n. We will show that (s , s ) : C ( f )  C (g) by proving that both sides of the equality in Definition A.17 have the same images through k . We will also make use of the isomorphism K C (X, ) = C X , K ( ) estab- n n n lished in Proposition A.26, and that for any -morphism h : C (X, ) → C (X, ), n n k ◦ h = K (h) ◦ k . n n By abuse of notation, we write k (x ) = x =[x , x ,..., x ]∈ C X , K ( ) . n 0 1 n n + − + − Denote ∂ = K (∂ ) − K (∂ ) and s = K (s ) − K (s ). Then, n n n n n n ∂ (x ) = (−1) [x , x ,..., x ,..., x ], s (x ) n 0 1 i n n i =0 = (−1) f (x ), f (x ),..., f (x ), g(x ),..., g(x ) . 0 1 j j n j =0 123 D. Méndez,R.J.Sánchez-García Note that, by appropriately grouping the terms in the equality in Definition A.17,it suffices to show that K (g ) − K ( f ) and ∂ s + s ∂ have the same image on x. n n n+1 n n−1 n These computations are analogous to the ones in Lemma 3.14. A.5 Comparison to directed (persistent) homology In Sect. 3, directed homology modules were introduced by taking submodules of homology generated by cycles with only non-negative coefficients on the elementary chains. This was done for the rings Z, Q, and R. These three rings have in common that they are the Grothendieck’s completion of a cancellative, zerosumfree semiring. + + Namely, Z = K (N), Q = K (Q ), R = K (R ). It turns out that homology with coefficients in this class of semirings has the ability to detect directed cycles, and what we have introduced in the main body of this article is a particular case of this fact. + + Proposition A.32 Let ∈{N, Q , R } and let X be a directed simplicial complex. Dir Consider Z X , K ( ) the set of directed cycles introduced in Definition 3.6. Then, Dir Z X , K ( ) = Z (X, ) as -semimodules. Proof First note that for the considered semirings, can be seen as the non- negative elements of K ( ). Using this identification, it is straightforward to show Dir that Z X , K ( ) is a -semimodule. In particular, linear combinations of directed cycles with non-negative coefficients are directed cycles. By abuse of notation, we will denote elementary n-chains in the same way in C (X, ) and in C X , K ( ) . Assume that λ c ∈ Z (X, ). Then clearly n n i i n i =1 λ c ∈ Z X , K ( ) (see Proposition A.26). Furthermore, since the involved i i n i =1 Dir coefficients are non-negative, λ c ∈ Z X , K ( ) . Consequently, we can i i i =1 n introduce a map Dir ϕ : Z (X, ) −→ Z X , K ( ) n n λ c −→ λ c . i i i i i =1 i =1 It is immediate to check that ϕ is a morphism of -semimodules. It is also injective. We only need to prove that it is surjective. For that, it would be enough to prove that if n n Dir λ c ∈ Z X , K ( ) , then λ c ∈ Z (X, ). To that purpose, note that i i i i n i =1 n i =1 x = λ c can be regarded as an element of C (X, ) as the involved coefficients i i n i =1 + − are non-negative, and thus, in . We only need to prove that ∂ (x ) = ∂ (x ). Since + − is cancellative, it is enough to prove that K ∂ (x ) = K ∂ (x ) , or equivalently, + − that K ∂ (x ) − K ∂ (x ) = 0. But this follows from Proposition A.26 and from the fact that x ∈ Z X , K ( ) . In general, we could use homology with coefficients in a cancellative, zerosumfree semiring to detect directed cycles. Namely, we can prove an analogous to Proposition 3.24 for homology with semiring coefficients. 123 A directed persistent homology theory for dissimilarity… Proposition A.33 Let be a cancellative, zerosumfree semiring. Let X be a 1- dimensional directed simplicial complex with vertex set {v ,v ,...,v },n ≥ 1, 0 1 n and with 1-simplices e ,e ,..., e where either e = (v ,v ) or e = (v ,v ), 0 1 n i i i +1 i i +1 i i = 0, 1,..., n, and where v = v . Then, H (X, ) = is non-trivial if and only n+1 0 1 if either e = (v ,v ) for all i or e = (v ,v ) for all i, and H (X, ) ={0} in i i i +1 i i +1 i 1 any other case. Proof Let [e ] denote the elementary 1-chain associated to e .Let x = λ [e ] i i i i i =0 be any non-trivial 1-cycle. We can assume without loss of generality that λ = 0. If e = (v ,v ), then λ [v ] is a non-trivial summand in ∂ (e ). Since is zerosumfree, 0 0 1 0 1 0 such summand does not have an inverse, thus λ [v ] must be a summand in ∂ (x ). 0 1 Now note that e is the only other simplex involving the vertex v . Furthermore, [v ] 1 1 1 − − appears in ∂ [e ] if and only if e = (v ,v ), in which case ∂ [e ]= λ v .We 1 1 1 2 1 1 1 further deduce that λ = λ . 1 0 By proceeding iteratively, we deduce that if x is non-trivial, necessarily e = (v ,v ) and λ = λ = ··· = λ = λ, in which case x = λ[v ,v ]. i i +1 0 1 n i i +1 i =0 Simple computations show that such x is indeed a cycle, and since there are no 2- simplices, it cannot be homologous to any other cycle. Thus, H (X, ) = .The case where e = (v ,v ) is analogous. 0 1 0 The question would then be how to use this information in the setting of per- sistent homology. Recall from Remark A.16 that the family of canonical maps k : C (X, ) → K C (X, ) gives rise to a morphism of -semimodules C (X , ) n n k : H (X, ) → H K (C (X, )) , which takes the class of x to the class of C (X , ) n n [x , 0]. As a consequence of Proposition A.26, we can also rewrite H K (C (X, )) = H C (X , K ( )) . Therefore, the map k above can be regarded as a morphism n C (X , ) k : H (X, ) → H X , K ( ) . We denote this map by k . We then have the C (X , ) n n X following result. + + Proposition A.34 Let X be a directed simplicial complex and let ∈{N, Q , R }. Dir The directed homology module introduced in Definition 3.6,H X , K ( ) ,isthe submodule of H X , K ( ) generated by Im k . n X Proof Note that all representatives of classes in H (X, ) are in Z (X, ). Thus, the n n image through k corresponds to a subset of the classes generated by positive cycles, Dir as a consequence of Proposition A.32. Namely, Im k ⊆ H X , K ( ) . To prove the result, it would be enough to prove that every class of a directed cycle is in Im k . But this follows from the definition of H (X, ). More generally, given a cancellative, zerosumfree semiring, we can define Dir H X , K ( ) as the submodule of H X , K ( ) generated by Im k , and such n X module would detect directed cycles as exhibited by Proposition A.34.If K ( ) is a field, we can also use these homology modules to define persistence vector spaces for which we can prove stability results in a way analogous to the results in Sect. 4. The results of this paper arose from the ideas in the constructions above, which are introduced in their full generality using homology with semiring coefficients. This shows that the idea of introducing a submodule of directed homology by taking cycles with non-negative coefficients has a deeper algebraic meaning, providing more context 123 D. Méndez,R.J.Sánchez-García for the results in this paper and presenting them in their full generality. Furthermore, similar ideas could be considered for semirings which are not zerosumfree, perhaps allowing for the detection of additional information that persistent homology with ring coefficients may miss. The matter of whether a theory of persistence semimodules could provide more information could also be looked into. References Bauer, U., Lesnick, M.: Induced matchings of barcodes and the algebraic stability of persistence. In: Pro- ceedings of the Thirtieth Annual Symposium on Computational Geometry (New York, NY, USA), SOCG’14, Association for Computing Machinery. pp. 355–364 (2014) Burago, D., Burago, Y., Ivanov, S.: A Course in Metric Geometry. American Mathematical Society, Provi- dence, RI (2001) Carlsson, G.: Topology and data. Bull. Am. Math. Soc. (N.S.) 46(2), 255–308 (2009) Carlsson, G., Mémoli, F., Ribeiro, A., Segarra, S.: Hierarchical quasi-clustering methods for asymmet- ric networks. In: Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML’14, JMLR.org. pp. II-352–II-360 (2014) Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L.J., Oudot, S.: Proximity of persistence modules and their diagrams. In: Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry (New York, NY, USA), SCG ’09, Association for Computing Machinery. pp. 237–246 (2009) Chazal, F., de Silva, V., Glisse, M., Oudot, S.: The Structure and Stability of Persistence Modules. Springer- Briefs in Mathematics. Springer, Cham (2016) Chowdhury, S., Mémoli, F.: A functorial Dowker theorem and persistent homology of asymmetric networks. J. Appl. Comput. Topol. 2(1–2), 115–175 (2018) Chowdhury, S., Mémoli, F.: Persistent path homology of directed networks. In: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (USA), SODA’18, Society for Industrial and Applied Mathematics, pp. 1152–1169 (2018) Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological Persistence and Simplification, Discrete and Computational Geometry and Graph Drawing (Columbia, SC, 2001), vol. 28, pp. 511–533 (2002) Edelsbrunner, H., Harer, J.L.: Computational Topology: An Introduction. American Mathematical Society, Providence, RI (2010) Edelsbrunner, H., Jabłonski, ´ G., Mrozek, M.: The persistent homology of a self-map. Found. Comput. Math. 15(5), 1213–1244 (2015) Fajstrup, L., Goubault, E., Haucourt, E., Mimram, S., Raussen, M.: Directed Algebraic Topology and Concurrency. Springer, Berlin (2016) Fukuda, K., Prodon, A.: Double description method revisited, Combinatorics and computer science (Brest: Lecture Notes in Computer Science vol. 1120. Springer, Berlin 1996), pp. 91–111 (1995) Golan, J.S.: Semirings and Their Applications. Kluwer Academic Publishers, Dordrecht (1999) Grandis, M.: Directed Algebraic Topology: Models of Non-reversible Worlds. Cambridge University Press, Cambridge (2009) Grigor’yan, A., Lin, Y., Muranov, Y., Yau, S.-T.: Path Complexes and their Homologies. J. Math. Sci. 248, 564–599 (2020) Kim, K.H., Roush, F.W.: Generalized fuzzy matrices. Fuzzy Sets Syst. 4(3), 293–315 (1980) Motzkin, T.S., Raiffa, H., Thompson, G.L., Thrall, R.M.: The Double Description Method, Contributions to the Theory of Games, Annals of Mathematics Studies, no. 28, Princeton University Press, Princeton, NJ, , vol. 2, pp. 51–73 (1953) Munkres, J.R.: Elements of Algebraic Topology. Addison-Wesley Publishing Company, Menlo Park, CA (1984) Otter, N., Porter, M.A., Tillmann, U., Grindrod, P., Harrington, H.A.: A roadmap for the computation of persistent homology. EPJ Data Sci. 6(1), 17 (2017) Patchkoria, A.: Cohomology of monoids with coefficients in semimodules. Bull. Georgian Acad. Sci. 86(3), 545–548 (1977) Patchkoria, A.: Cohomology monoids of monoids with coefficients in semimodules I. J. Homotopy Relat. Struct. 9(1), 239–255 (2014) 123 A directed persistent homology theory for dissimilarity… Reimann, M.W., Nolte, M., Scolamiero, M., Turner, K., Perin, R., Chindemi, G., Dłotko, P., Levi, R., Hess, K., Markram, H.: Cliques of neurons bound into cavities provide a missing link between structure and function. Front. Comput. Neurosci. 11, 48 (2017) Steiner, R.: Omega-categories and chain complexes. Homol. Homotopy Appl. 6(1), 175–200 (2004) Tan, Y.-J.: Bases in semimodules over commutative semirings. Linear Algebra Appl. 443, 139–152 (2014) Terzer, M.: Large scale methods to enumerate extreme rays and elementary modes, Ph.D. thesis (2009), ETH Zürich, Switzerland Turner, K.: Rips filtrations for quasimetric spaces and asymmetric functions with stability results. Algebr. Geom. Topol. 19(3), 1135–1170 (2019) Zomorodian, A., Carlsson, G.: Computing persistent homology. Discrete Comput. Geom. 33(2), 249–274 (2005) Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Journal of Applied and Computational Topology Springer Journals

A directed persistent homology theory for dissimilarity functions

Loading next page...
 
/lp/springer-journals/a-directed-persistent-homology-theory-for-dissimilarity-functions-wokL0iwaYB

References (38)

Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2023
ISSN
2367-1726
eISSN
2367-1734
DOI
10.1007/s41468-023-00124-x
Publisher site
See Article on Publisher Site

Abstract

We develop a theory of persistent homology for directed simplicial complexes which detects persistent directed cycles in odd dimensions. We relate directed persistent homology to classical persistent homology, prove some stability results, and discuss the computational challenges of our approach. Our directed persistent homology theory is motivated by homology with semiring coefficients: by explicitly removing additive inverses, we are able to detect directed cycles algebraically. Keywords Persistent homology · Dissimilarity functions · Directed simplicial complexes · Directed cycles Mathematics Subject Classification 55N31 · 55N35 · 55U10 · 16Y60 1 Introduction Persistent homology is one of the most successful tools in Topological Data Analysis (Carlsson 2009), with recent applications in numerous scientific domains such as biol- ogy, medicine, neuroscience, robotics, and many others (Otter et al. 2017). In its most common implementation, persistent homology is used to infer topological properties of the metric space underlying a finite point cloud using two steps (Edelsbrunner and Harer 2010): B David Méndez david.mendez@research.fchampalimaud.org Rubén J. Sánchez-García R.Sanchez-Garcia@soton.ac.uk School of Mathematical Sciences, University of Southampton, SO17 1BJ Southampton, UK Champalimaud Centre for the Unknown, Av. Brasília s/n, 1400-038 Lisbon, Portugal The Alan Turing Institute, London NW1 2DB, UK 123 D. Méndez,R.J.Sánchez-García (1) Build a filtration of simplicial complexes from distances, or similarities, between data points. (2) Compute the singular homology of each of the simplicial complexes in the fil- tration, along with the linear maps induced in homology by their inclusions. The resulting persistence module can normally be represented using a persistence dia- gram or a persistence barcode. A fundamental limitation of persistent homology, and in fact homology, is its inability to incorporate directionality, which can be important in some real-world applications (see examples below). For instance, homology cannot in principle dis- tinguish between directed and undirected cycles (Fig. 1). Although the definition of homology, namely the differential or boundary operator, requires a choice of orienta- tion for the simplices, the resulting homology is independent of this choice. Previous attempts have had some partial success at this issue (see ‘Related work’ below), how- ever, as far as we know, there is no homology theory able to exactly detect directed cycles up to boundary equivalence. The main difficulty occurs at the algebraic level: opposite orientations on a simplex correspond to additive inverse elements in the coefficient ring or field. Our key insight is to retain the submodule of homology generated by those classes that only have non-negative coefficients on elementary chains. In this way, we can define a directed homology module (Definition 3.6) of directed simplicial complexes (Definition 3.1) that detects homology classes of directed 1-cycles, as desired (Fig. 1, Proposition 3.24). This insight arises from the use of homology with semiring coefficients, where additive inverses may not exist. Indeed, the main results in this paper can be understood and reformulated in full generality using homology theory with semiring coefficients, see Appendix 1. Our directed homology theory can be extended to the persistent setting. More concretely, we define two persistence modules associated to filtrations of directed simplicial complexes: an undirected one using the whole homology module (Defini- tion 4.2) and a directed one using the submodule of directed homology (Definition 4.3). We show that the bars in the directed barcodes can be matched one-to-one with bars in the undirected barcodes, with the directed bars possibly having a later birth (Propo- sition 4.7,Figs. 2, 5, 6). Our motivation to develop directed persistent homology is the applications to real-world data sets where asymmetry or directionality, in particular the presence of directed cycles, plays a fundamental role. Examples include biological neural networks (directed synaptic connections), time series data (directed temporal connections) or biological molecular networks such as protein-protein interaction networks (inhibitory and excitatory connections). Mathematically, we can model these situations using a finite set V of data points, and an arbitrary function d : V × V → R, which we call dissimilarity function (Sect. 2.2) and which, crucially, may not be symmetric. Our main example of a directed simplicial complex is the directed Rips complex (Defini- tion 4.12)of (V , d ), which becomes the input of our suggested directed persistent homology pipeline. In the context of dissimilarity functions, we prove a stability result with respect to the bottleneck distance between (directed) persistent diagrams and the correspondence 123 A directed persistent homology theory for dissimilarity… v w 3 3 v v w w 1 2 1 2 Fig. 1 Two examples of directed simplicial complexes X (left) and Y (right). Our directed submodules of Dir Dir homology over the rings R ∈{Z, Q, R} detect directed 1-cycles: H (X , R) ={0}, while H (Y , R) = R 1 1 generated by the directed 1-cycle [w ,w ]+[w ,w ]+[w ,w ] 1 2 2 3 3 1 distortion distance (a generalization of the Gromov-Hausdorff distance) between dis- similarity functions over finite sets (Theorem 4.19). We finish this article by discussing the computational challenges of our approach. Related work A first approach to incorporating directionality would be to encode the asymmetry of a data set in the simplicial complex built from it. In Turner (2019), the author uses ordered tuple complexes or OT-complexes, which are generalizations of simplicial complexes where simplices are ordered tuples of vertices. Similar ideas have been successfully used in the field of neuroscience to show the importance of directed cliques of neurons (Reimann et al. 2017), showing that directionality in neuron con- nectivity plays a crucial role in the structure and function of the brain. In Chowdhury and Mémoli (2018a), the authors use so-called Dowker filtrations to develop persis- tent homology for asymmetric networks, and note (Chowdhury and Mémoli 2018a, Remark 35) that a non-trivial 1-dimensional persistence diagram associated to Dowker filtrations suggests the presence of directed cycles. However, they also note that such persistence diagram may be non-trivial even if directed cycles are not present. Fur- ther progress can be achieved by combining the ideas above with homology theories that are themselves sensitive to asymmetry. In Chowdhury and Mémoli (2018b), the authors develop persistent homology for directed networks using the theory of path homology of graphs (Grigor’yan et al. 2020). This persistent homology theory shows stability with respect to the distance between asymmetric networks (Carlsson et al. 2014), and is indeed able to tell apart digraphs with isomorphic underlying graphs but different orientations on the edges. Nonetheless, cycles in path homology do not corre- spond to directed cycles, see Chowdhury and Mémoli (2018b, Example 10). Another significant contribution can be found in Turner (2019), where four new approaches to persistent homology are developed for asymmetric data, all of which are shown to be stable. Each approach is sensitive to asymmetry in a different way. For example, one of them uses a generalization of poset homology to preorders to detect strongly connected components in digraphs, a feature our implementation does not have (see Proposition 3.18). More interestingly for our purposes, in Turner (2019, Section 5)the author introduces a persistent homology approach that builds a directed Rips filtration of ordered tuple complexes associated to dissimilarity functions d : V × V → R. This is indeed sensitive to asymmetry and yields a persistent homology pipeline that 123 D. Méndez,R.J.Sánchez-García v v v v 3 3 3 3 v v v v v v v v 4 2 4 2 4 2 4 2 v v v v v v v v 5 1 5 1 5 1 5 1 X X X X 0 1 2 3 0123 0123 Fig. 2 A filtration of a directed simplicial complex (top, left to right) and corresponding undirected (bottom, left) and directed (bottom, right) persistence barcodes. Every bar in the directed barcode corresponds to a unique bar in the undirected barcode (shown here by matching colours) (colour figure online) is stable with respect to the correspondence distortion distance, a generalization of the Gromov-Hausdorff distance to dissimilarity functions (see Turner 2019, Section 2 and Sect. 2.2). However, undirected cycles are also detected by this homology theory. In contrast, in this article we produce a theory of persistent homology associated to ordered tuple complexes that only detects directed cycles and which is also stable with respect to the correspondence distortion distance. Overview of results We introduce modules of undirected and directed homology associated to directed simplicial complexes (Definition 3.4) and show that they satisfy some desirable prop- erties. Furthermore, we prove that for certain coefficient rings, closed paths will only have non-trivial directed homology in dimension 1 when their edges form a directed cycle, as we set up to do (Proposition 3.24). Our modules of directed homology can be defined in dimensions greater than one, although they are zero in all even dimen- sions (Proposition 3.26) except in dimension 0 where they detect weakly connected components (Proposition 3.18). Next, we extend our homology theory to the persistence setting. In particular, we introduce, for each dimension, two persistence modules associated to the same fil- tration: an undirected persistence module (Definition 4.2), which is analogous to the one introduced in Turner (2019, Section 5), and the submodule generated by directed classes, which we call the directed persistence module (Definition 4.3). The directed persistence module gives rise to bars associated to homology classes that can be rep- resented by directed cycles, as illustrated in Fig. 2 and Examples 4.8 to 4.11.We also establish a relation between the undirected and directed persistence barcodes of the same filtration. Indeed, in Proposition 4.7 we show that every bar in the directed barcode corresponds to a unique bar in the undirected barcode that dies at the same time. The bar in the directed barcode may, however, be born later, and some bars in the undirected barcode may be left unmatched (see Figs. 2, 5 and 6). Having all the necessary ingredients, we can provide a complete workflow for directed persistent homology: given a dissimilarity function d on a finite set V , we construct its directed Rips filtration (Definition 4.12), and take the correspond- 123 A directed persistent homology theory for dissimilarity… ing undirected and directed n-dimensional persistence modules, respectively denoted Dir H (V , d ) and H (V , d ) (Definition 4.13). Both of these persistence modules have n V V persistence diagrams and barcodes (Definition 4.15) and the associated persistence modules are stable with respect to the correspondence distortion distance (Sect. 2.2 and Theorem 4.19). We finish the article with a discussion of the computational challenges of calculating the directed persistent homology of a dissimilarity function. The Standard Algorithm can easily be adapted to compute the barcodes of the undirected persistent homology of the dissimilarity function, but the computation of the directed persistence barcodes is more challenging. A direct approach results in an extreme ray enumeration problem (Sect. 5), a well-known problem in computational geometry. Developing an algorithm that computes the directed homology classes, rather than all directed cycles, remains an open problem. We include an Appendix on homology with semiring coefficients and on how it gives rise to the ideas introduced in this article. This provides additional motivation and context for our results, and presents them in full generality. Outline of the article In Sect. 2, we introduce the necessary background in persistence modules (2.1) and dissimilarity functions (2.2). In Sect. 3 we introduce undirected and directed homol- ogy modules of directed simplicial complexes. We then use these homology modules to introduce directed and undirected persistent homology in Sect. 4, where we also show our stability results with respect to dissimilarity functions. In Sect. 5, we discuss the computational implementation of directed persistent homology and lay out some future research directions. Finally, Appendix 1 introduces homology with semiring coefficients and shows how it relates to directed homology. 2 Persistence modules and dissimilarity functions Our goal is to extend persistent homology to directed simplicial complexes such as those constructed from a dissimilarity function. In this section, we introduce the neces- sary background regarding persistent homology (Sect. 2.1) and dissimilarity functions (Sect. 2.2). 2.1 Persistence modules, diagrams and barcodes In this section, we introduce persistence modules and their associated persistence diagrams and barcodes. We follow the exposition in Chowdhury and Mémoli (2018b), as it is simple and focused on persistence modules arising in the context we are interested in: that of persistent homology of finite simplicial complexes. Many of the results below hold in more generality (Chazal et al. 2016) but we chose to keep the exposition simple. Let R be a ring with unity and let T ⊆ R be a subset. 123 D. Méndez,R.J.Sánchez-García δ δ Definition 2.1 A persistence R-module over T , written V = {V }, {ν } ,is δ≤δ ∈T δ δ δ δ a family of R-modules {V } and homomorphisms ν : V → V , whenever δ∈T δ ≤ δ ∈ T , such that (1) for every δ ∈ T , ν is the identity map, and δ δ δ (2) for every δ ≤ δ ≤ δ ∈ T , ν ◦ ν = ν . δ δ δ δ δ δ δ Let V = {V }, {ν } and W = {W }, {μ } be two persistence δ δ δ≤δ ∈T δ≤δ ∈T R-modules over T.A morphism of persistence R-modules f : V → W is a family δ δ δ of morphisms of R-modules { f : V → W } such that for every δ ≤ δ ∈ T we δ∈T have a commutative diagram δ δ V V f f W W . δ δ Let us assume that R is a field, thus V = {V }, {ν } is a persistence vector δ δ≤δ ∈T space, and that V is finite-dimensional, for every δ ∈ R. Furthermore, we suppose that there exists a finite subset {δ ,δ ,...,δ }⊆ T for which 0 1 n (1) if δ ∈ T , δ ≤ δ , then V = 0, (2) if δ ∈ T ∩[δ ,δ ) for some 1 ≤ i ≤ n,the map ν is an isomorphism, while i −1 i i −1 ν is not, and δ δ δ (3) if δ, δ ∈ T , δ ≤ δ< δ , then v : V → V is an isomorphism. Remark 2.2 When R is a field, either of these restrictions alone (namely, each V being finite-dimensional or the existence of a finite subset {δ ,δ ,...,δ } as above) 0 1 n is enough to associate a persistence barcode and a persistence diagram to V (Chazal et al. 2016, Theorem 2.8). We nevertheless assume both restrictions since they hold for persistence diagrams associated to finite simplicial complexes, which are precisely the sort of objects that arise from data, and such assumption makes the exposition simpler. Let us now introduce persistence barcodes and diagrams. First, for simplicity, we consider V indexed by the natural numbers: δ i +1 V = {V }, ν , i i ∈N δ δ l k n where V = V for all k ≥ n and ν is the identity whenever k, l ≥ n. (This clearly contains all the information in V.) By Edelsbrunner et al. (2015, Basis Lemma), we can find bases B of the vector spaces V , i ∈ N, such that i +1 (1) ν (B ) ⊆ B ∪{0}, i i +1 123 A directed persistent homology theory for dissimilarity… δ δ i +1 i +1 (2) rank(ν ) =|v (B ) ∩ B |, and i i +1 δ δ i i i +1 (3) each w ∈ Im v ∩ B is the image of exactly one element v ∈ B . i +1 i Such bases are called compatible bases. Elements in B that are mapped to an element in B correspond to linearly independent elements of V that ‘survive’ until the i +1 next step in the persistence vector space. Similarly, elements in a basis B which are not in B are considered to be ‘born’ at index i. Formally, we define i −1 L := (b, i ) | b ∈ B , b ∈ / Im ν , i > 0 ∪{(b, 0) | b ∈ B }. i 0 i −1 Given (b, i ) ∈ L, we call i the birth index of the basis element b. This element ‘survives’ on subsequent bases until it is eventually mapped to zero. When this happens, the number of steps taken until the class dies is its death index of b. Formally, the death index of (b, i ) is j δ δ i +2 i +1 (b, i ) := max j ∈ N | ν ◦ ··· ◦ ν ◦ ν (b) ∈ B . δ δ δ j −1 i +1 i We allow (b, i ) =∞,if b is in every basis B for all j ≥ i, so that the death index takes values in N = N ∪{+∞}. Using this information, we can introduce the persistence barcode of V. We call a pair (X , m) a multiset if X is a set and m : X → N = N∪{+∞} is a function. We call m(x ) the multiplicity of x ∈ X. δ i +1 Definition 2.3 Consider a persistence vector space V = {V }, ν as above. i i ∈N The persistence barcode of V is defined as the multiset of intervals Pers(V) := [δ ,δ ) |∃(b, i ) ∈ L, j ∈ Ns.t.(b, i ) = j i j +1 ∪ [δ , +∞) |∃(b, i ) ∈ Ls.t.(b, i ) =+∞ , with the multiplicity of [δ ,δ ) (respectively [δ , +∞)) being the number of ele- i j +1 i ments (b, i ) ∈ L such that (b, i ) = j (respectively (b, i ) =+∞). Thus, persistence barcodes encode the birth and death of elements in a family of compatible bases. Crucially, and even though compatible bases are not unique, the number of birth and death events at each step is determined by the rank of the i +1 linear maps ν and their compositions, and it is thus independent of the choice of compatible bases. Furthermore, the persistence barcode of a persistence vector space completely determines both the dimension of the vector spaces V and the ranks of the linear maps between them, hence it determines the persistence vector space up to isomorphism (Chazal et al. 2016, Theorem 2.8). Each interval in Pers(V) is called a persistence interval. Persistence barcodes can be represented by stacking horizontal lines, each of which represents a persistence interval or bar. The endpoints of the line in the horizontal axis correspond to the endpoints of the interval it represents, whereas the vertical axis has no significance other than being able to represent every persistence interval at once. Persistence bars can be stacked in any order, although they are usually ordered by their birth. This representation is what gives persistence barcodes their name. 123 D. Méndez,R.J.Sánchez-García An alternative characterization of a persistence vector space is its persistence dia- gram. Let us write R = R ∪{−∞, +∞} for the extended real line. Definition 2.4 The persistence diagram of the persistence vector space V is the mul- tiset 2 2 Dgm(V) := (δ ,δ ) ∈ R |[δ ,δ ) ∈ Pers(V) ∪ (δ , +∞) ∈ R |[δ , +∞) ∈ Pers(V) . i j +1 i j +1 i i The multiplicity of a point in Dgm(V) is the multiplicity of the corresponding interval in Pers(V). A crucial property for applications of persistent homology to real data is that a small perturbation of the input (a data set, encoded as filtration of simplicial complexes) results in a small perturbation of the output (its persistence module). Algebraically, this amounts to proving that a small perturbation of the persistence diagram results in a small perturbation of the associated persistence module. In order to state this stability result, we therefore need to introduce distances between persistence diagrams, and persistence modules, respectively. To measure how far apart two persistence diagrams are, we can use the bottleneck 2 2 distance between multisets of R .Let  denote the multiset of R consisting of every point in the diagonal counted with infinite multiplicity. A bijection of multisets m (x ) m (y) X Y ϕ : (X , m ) → (Y , m ) is a bijection of sets ϕ:∪ x →∪ y, X Y x ∈X y∈Y i =1 i =1 that is, a bijection between the sets obtained when counting each element in both X and Y with its multiplicity. Definition 2.5 Let A and B be two multisets in R .The bottleneck distance between A and B is defined as d (A, B) := inf sup a − ϕ(a) , B ∞ a∈A ∞ ∞ where the infimum is taken over all bijections of multisets ϕ : A ∪  → B ∪ and where · denotes the  -norm in R . Note that the bottleneck distance between two persistence diagrams defined from persistence barcodes can only be finite if the diagrams have the same amount of bars of form [a, ∞) for a ∈ R. We also need a distance between persistence vector spaces. To that purpose, we use the interleaving distance, introduced in Chazal et al. (2009). δ δ δ δ Definition 2.6 Let V = {V }, {ν } and W = {W }, {μ } be two δ δ δ≤δ ∈R δ≤δ ∈R persistence vector spaces, and ε ≥ 0. We say that V and W are ε-interleaved if there exist two families of linear maps δ δ+ε δ δ+ε {ϕ : V → W } , {ψ : W → V } δ δ∈R δ δ∈R such that the following diagrams are commutative for all δ ≥ δ ∈ R: 123 A directed persistent homology theory for dissimilarity… δ δ ν μ δ δ δ δ δ δ V V W W ϕ ϕ ψ ψ δ δ δ δ δ +ε δ +ε μ ν δ+ε δ+ε δ+ε δ +ε δ+ε δ +ε W , V , W V δ+2ε δ+2ε ν μ δ δ δ δ+2ε δ δ+2ε V V W V ψ ψ ϕ δ δ+ε δ δ+ε δ+ε δ+ε W , W . The interleaving distance between V and W is then defined as d (V, W) = inf{ε ≥ 0 | V and W are ε − interleaved}. The authors in Chazal et al. (2009) show that the interleaving distance is a pseudo- metric (a zero distance between distinct points may occur) in the class of persistence vector spaces. Moreover, they show the following Algebraic Stability Theorem. δ δ δ Theorem 2.7 (Chazal et al. 2009) Let V= {V }, {ν } and W= {W }, δ≤δ ∈R {μ } be two persistence vector spaces. Then, δ δ≤δ ∈R d Dgm(V), Dgm(W) ≤ d (V, W). B I 2.2 Dissimilarity functions and the correspondence distortion distance Our objective is to define a theory of persistent homology able to detect directed cycles modulo boundaries. Therefore, instead of building filtrations (of simplicial complexes) from finite metric spaces, we are interested in filtrations (of directed simplicial com- plexes) built from arbitrary dissimilarity functions. We will follow (Turner 2019), where they receive the name of set-function pairs, and Carlsson et al. (2014), Chowd- hury and Mémoli (2018a, b), where they are referred to as dissimilarity networks or asymmetric networks. In this section, we introduce the necessary background on dissimilarity functions, and the correspondence distortion distance between them, which is a generalization of the Gromov-Hausdorff distance to the asymmetric setting. We finish with a refor- mulation of this distance, established in Carlsson et al. (2014), which we will need to prove our persistent homology stability result. Definition 2.8 Let V be a finite set. A dissimilarity function (V , d ) on V is a function d : V × V → R.The valueof d on a pair (v ,v ) is called the distance,or V V 1 2 dissimilarity, from v to v . 1 2 123 D. Méndez,R.J.Sánchez-García Note that no restrictions are imposed on d , thus it may not be symmetric, the triangle inequality may not hold, and the distance from a point to itself may not be zero. These functions are also referred to as asymmetric networks (Carlsson et al. 2014; Chowdhury and Mémoli 2018a, b) since they may be represented as a network with vertex set V and an edge from v to v with weight d (v ,v ) for every (v ,v ) ∈ 1 2 V 1 2 1 2 V × V . Thus, dissimilarity functions are very flexible and allow for the modelling of widely different problems, including any problem that can be modelled using a network. We would like to build (directed) simplicial complexes and, ultimately, persistence diagrams from dissimilarity functions. In order to check the stability of our construc- tions, we need a way to measure how close two such objects are. When comparing networks with the same vertex sets, a natural choice is the  norm. However, we are interested in comparing dissimilarity functions on different vertex sets. To that end, we consider the  norm over all possible pairings (quantified by a binary rela- tion) between vertex sets, following ideas similar to those behind the definition of the Gromov-Hausdorff distance. Definition 2.9 Let (V , d ) and (W , d ) be two dissimilarity functions and let θ be a V W non-empty binary relation between V and W , that is, an arbitrary subset θ ⊆ V × W . The distortion of the relation θ is defined as dis(θ ) := max |d (v ,v ) − d (w ,w )|. V 1 2 W 1 2 (v ,w ),(v ,w )∈θ 1 1 2 2 A correspondence between V and W is a relation θ between these sets such that π (θ ) = V and π (θ ) = W , where π : V × W → V is the projection onto V , and V W V similarly for π . That is, θ is a correspondence if every element of V is related to at least an element of W , and vice-versa. The set of all correspondences between V and W is denoted R(V , W ). The correspondence distortion distance (Turner 2019) between (V , d ) and (W , d ) is defined as d (V , d ), (W , d ) = min dis(θ ). CD V W θ ∈R(V ,W ) This notion agrees with that of the Gromov–Hausdorff distance when (V , d ) and (W , d ) are metric spaces (Burago et al. 2001,Section 7.3.3). We will use a reformulation of the correspondence distortion that can be found in Chowdhury and Mémoli (2018a). In order to introduce it, we need to define the dis- tortion and co-distortion of maps between sets endowed with dissimilarity functions. Definition 2.10 Let (V , d ) and (W , d ) be any two dissimilarity functions and let V W ϕ : V → W and ψ : W → V be maps of sets. The distortion of ϕ (with respect to d and d ) is defined as dis(ϕ) := max d (v ,v ) − d ϕ(v ), ϕ(v ) . V 1 2 W 1 2 v ,v ∈V 1 2 123 A directed persistent homology theory for dissimilarity… The co-distortion of ϕ and ψ (with respect to d and d ) is defined as V W codis(ϕ, ψ ) := max d v, ψ(w) − d ϕ(v), w . V W (v,w)∈V ×W Note that codistortion is not necessarily symmetrical, namely, codis(ϕ, ψ ) and codis(ψ, ϕ) may be different if either of the dissimilarity functions are asymmetric. Finally, we have the following reformulation of the correspondence distortion dis- tance. Proposition 2.11 (Chowdhury and Mémoli 2018a, Proposition 9) Let (V , d ) and (W , d ) be any two dissimilarity functions. Then, d (V , d ), (W , d ) = min max {dis(ϕ), dis(ψ ), codis(ϕ, ψ ), codis(ψ, ϕ)} . CD V W 2 ϕ : V →W , ψ : W →V 3 Directed homology of directed simplicial complexes In this section, we introduce and study a family of subrings of the homology rings of simplicial complexes which encode directionality information and in particular detect directed cycles (Fig. 1). We cannot introduce such a family using (undirected) sim- plicial complexes, as they cannot encode directionality information of the simplices. One approach is to use the so-called ordered-set complexes, where simplices are sets with a total order on the vertices. They generalize simplicial complexes, which can be encoded as fully symmetric ordered-set complexes, that is, ordered-set complexes where if a set of vertices forms a simplex, it must do so with every possible order. However, persistent homology of ordered-set complexes is not stable (Remark 4.20). To achieve stability, we use one further generalization, called ordered tuple com- plexes or OT-complexes in Turner (2019), and called directed simplicial complexes in this article (Definition 3.1). The only difference is that arbitrary repetitions of vertices are allowed in the ordered tuples representing simplices. Clearly, any ordered- set complex is a directed simplicial complex. Furthermore, a (undirected) simplicial complex can be encoded as a fully symmetric (as above) directed simplicial com- plex where every possible repetition of vertices is also included. When doing so, the (undirected) simplicial complex and its associated directed simplicial complex have isomorphic homologies over rings (Remark 3.5), and morphisms of simplicial complexes can be lifted to morphisms between their associated directed simplicial complexes (Remark 3.9). Finally, and crucially for us, we will be able to use the above- mentioned subrings encoding information on directed cycles to introduce a theory of persistent homology for dissimilarity functions which is stable with respect to the correspondence-distortion distance (Sect. 4). Note that, as ordered-set complexes are directed simplicial complexes, the results stated here apply to homology computations on ordered-set complexes as well. 123 D. Méndez,R.J.Sánchez-García 3.1 Chain complexes of directed simplicial complexes In this section, we introduce directed simplicial complexes and their associated chain complexes and homology modules, including modules of directed homology. Definition 3.1 A directed simplicial complex,or ordered tuple complex, is a pair (V , X ) where V is a finite set of vertices, and X is a family of tuples (x , x ,..., x ) 0 1 n of elements of V such that if (x , x ,..., x ) ∈ X, then (x , x ,..., x ,..., x ) ∈ X 0 1 n 0 1 i n for every i = 0, 1,..., n. (Here and throughout the rest of the paper, x indicates that x is omitted from the list.) We will denote the directed simplicial complex (V , X ) just by X, and assume that every vertex belongs to at least one directed simplex (so that V is uniquely determined from X). Elements of X of length n + 1 are called n-simplices, and the subset of the n-simplices of X is denoted by X . Note that arbitrary repetitions of vertices in a tuple are allowed. Note that directed simplicial complexes are more general objects than simplicial complexes, in a way that allows directed simplicial complexes to encode asymmetric information. Note further that it is possible to define a geometric realization for these objects in which simplices with vertex repetitions are collapsed to a simplex with no repetitions, thus simplices with repeated vertices play a role similar to that of the degenerate simplices in simplicial sets. We make use of the following terminology when dealing with directed simplicial complexes. Elements of X of length n + 1 are called n-simplices, and the subset of the n-simplices of X is denoted by X .An n-simplex obtained by removing vertices from an m-simplex, n ≤ m,issaidtobea face of the m-simplex. Such face is called proper if n < m, or equivalently, if at least vertex of the m-simplex is removed. A directed simplicial complex is said to be n-dimensional, denoted dim(X ) = n,if X is trivial (empty) but X is not. A collection Y of simplices of X that is itself a n+1 n directed simplicial complex is said to be a directed simplicial subcomplex (or, simply, subcomplex)of X, denoted Y ⊆ X. Note that the vertex set of Y may be strictly smaller than that of X. We now introduce chain complexes associated to a directed simplicial complex. These definitions are straight generalizations of those for (undirected) simplicial com- plexes. Definition 3.2 Let R be a commutative ring. The n-dimensional chains of X are defined as the elements of the free R-module generated by the n-simplices, C (X , R) = R {[x , x ,..., x ]| (x , x ,..., x ) ∈ X } . n 0 1 n 0 1 n We call the elements [x , x ,..., x ]∈ C (X , R), (x , x ,..., x ) ∈ X, elementary 0 1 n n 0 1 n n-chains. 123 A directed persistent homology theory for dissimilarity… Definition 3.3 Let X be a directed simplicial complex. For each n > 0, we define a morphism of R-modules ∂ : C (X , R) → C (X , R) by n n n−1 ∂ ([x , x ,..., x ]) = (−1) [x , x ,..., x ,..., x ] n 0 1 n 0 1 i n i =0 For n = 0, let ∂ : C (X , R) →{0} be the trivial map. 0 0 A straightforward computation analogous to the one for chain complexes in sim- plicial homology shows that {C (X , R), ∂ } is a chain complex of R-modules. In n n particular, we can define homology groups. Definition 3.4 Let X be a directed simplicial complex, R a commutative ring, and n ≥ 0. The n-dimensional cycles, boundaries, and homology of X are Z (X , R) = {x ∈ C (X , R) | ∂ (x ) = 0}, B (X , R) = Im ∂ and the quotient H (X , R) = n n n n+1 n Z (X , R)/B (X , R), respectively. n n Remark 3.5 If X is an (undirected) simplicial complex, we can define an ordered-set OT OT simplicial complex X where (x , x ,..., x ) ∈ X whenever {x , x ,..., x }, 0 1 n 0 1 n after removing any repetition of vertices, is a simplex in X. The chain complex OT C (X , R) is called the ordered chain complex of X in Munkres (1984), and its homology is isomorphic to the singular homology of X over R. We now discuss how to only retain the information associated to directed cycles. Note that when introducing the differential for the chain complexes of simplicial complexes, an orientation is chosen for each of the simplices. However, undirected cycles can still be produced because the chosen orientation for a simplex can be reverted by changing the sign of the corresponding elementary chain. As such, homology of (undirected) simplicial complexes is independent of these choices of orientations. In the case of directed simplicial complexes, orientations are not chosen, and we can ensure that we are only retaining information regarding directed cycles by making sure that all the chains have positive coefficients, when the coefficient ring is chosen appropriately. However, we also require the introduced object to be a submodule of the entire module of homology. Thus, we arrive at the following definition. Definition 3.6 Let X be a directed simplicial complex and let R ∈{Z, Q, R}.The set of n-dimensional directed cycles of X over R is Dir Z (X , R) = λ c ∈ Z (X , R) | λ ≥ 0, c elementary n-chain . i i n i i Dir The nth dimensional module of directed homology, H (X , R), is the submodule of H (X , R) generated by the directed cycles, Dir Dir H (X , R) = [x]| x ∈ Z (X , R) . n n 123 D. Méndez,R.J.Sánchez-García Remark 3.7 The rings Z, Q and R are used to simplify the exposition: the definition Dir Dir of Z (X , R), and thus of H (X , R), can be made in an analogous way for any ring n n R that is the Grothendieck completion of a cancellative, zerosumfree semiring (see Appendix 1). In view of Definition 3.6 it is clear that, in dimension 1, directed circuits in the directed graph underlying a directed simplicial complex will give rise to directed cycles. In Sect. 3.3 we will show how these directed submodules of homology behave in a desirable way. 3.2 Functoriality of directed homology In this section, we prove that homology and directed homology are functors from the category of directed simplicial complexes to the category of graded R-modules. We also show that two morphisms allowing for the construction of the prism operator must induce the same map on homology, a result we will need to prove our persistent homology stability result. We begin by introducing morphisms of directed simplicial complexes. Definition 3.8 A morphism of directed simplicial complexes f : (V , X ) → (W , Y ) or just f : X → Y,isamapofsets f : V → W such that if (x , x ,..., x ) ∈ X then 0 1 n f (x ), f (x ),..., f (x ) ∈ Y . 0 1 n Remark 3.9 This definition is stricter than the classical notion of morphism of simpli- cial complexes, where morphisms are allowed to take simplices to simplices of lower dimension, as directed simplicial complexes can intrinsically account for vertex rep- etitions. However, note that if X and Y are (undirected) simplicial complexes, a map OT OT f : X → Y is a morphism of simplicial complexes if and only if f : X → Y (see Remark 3.5) is a morphism of directed simplicial complexes. Definition 3.10 Let f : X → Y be a morphism of directed simplicial complexes. Then f induces morphisms of R-modules C ( f ) ={C ( f )}, defined as C ( f ) : C (X , R) −→ C (Y , R) n n n [x , x ,..., x ] −→ [ f (x ), f (x ),..., f (x )]. 0 1 n 0 1 n We will often abbreviate C ( f ) = f . n n Proposition 3.11 If f : X → Y is a morphism of directed simplicial complexes, the family of maps {C ( f )} is a R-homomorphism of chain complexes. Therefore, it induces a map H ( f ) : H (X , R) → H (Y , R). Furthermore, H ( f ) restricts to a n n n n Dir Dir Dir map H ( f ) : H (X , R) → H (Y , R). n n n Proof Let [x , x ,..., x ]∈ C (X , R) be a simplex. Simple computations show that 0 1 n n C ( f )∂ = ∂ C ( f ), thus inducing a map H ( f ) : H (X , R) → H (Y , R). n−1 n n n n n n Dir Now let x ∈ Z (X , R). We can write x = λ x , where λ ≥ 0 and i i i n i x is an elementary n-chain. Then, C ( f )(x ) = λ C ( f )(x ). Clearly, C ( f ) i n i n i n 123 A directed persistent homology theory for dissimilarity… Dir takes elementary n-chains to elementary n-chains, thus C ( f )(x ) ∈ Z (Y , R). It is then clear from the definition of directed homology that H ( f ) restricts to H ( f ) : H (X , R) → H (Y , R). n n n Remark 3.12 Let X and Y be (undirected) simplicial complexes and let R be a ring. If f : X → Y is a morphism of (undirected) simplicial complexes, the map induced on OT homology by f : X → Y is the same as the map induced on homology by f : X → OT Y (see Remarks 3.5 and 3.9). Corollary 3.13 (Directed) Homology is a functor from the category of directed simpli- cial complexes to the category of graded R-modules. In particular, isomorphic directed simplicial complexes have isomorphic (directed) homologies. We finish this section by showing a sufficient condition for two morphisms to induce the same map on homology. We will need this result to prove that our definition of persistent homology is stable. Lemma 3.14 Let R be a commutative ring. Let X and Y be two directed simplicial complexes and let f , g : X → Y be morphisms of directed simplicial complexes such that if (x , x ,..., x ) ∈ X, then f (x ), f (x ),..., f (x ), g(x ), . . . , g(x ) ∈ Y 0 1 n 0 1 i i n for every i = 0, 1,..., n. Then, H ( f ) = H (g) for every n ≥ 0. Consequently, n n Dir Dir H ( f ) = H (g), for every n ≥ 0. n n Proof For x =[x , x ,..., x ]∈ C (X , R) an elementary n-chain, define 0 1 n n s [x , x ,..., x ]= (−1) [ f (x ),..., f (x ), g(x ), ..., g(x )]. n 0 1 n 0 i i n i =0 We show that this family of maps provides a chain homotopy between f and g. Namely, we prove that C (g) − C ( f ) = ∂ s + s ∂ , for all n. On the one hand, n n n+1 n n−1 n n i −1 i j s ∂ (x ) = (−1) (−1) f (x ), f (x ), ..., f (x ), g(x ), ..., g(x ), ..., g(x ) n−1 n 0 1 j j i n i =0 j =0 j +1 + (−1) f (x ), f (x ), ..., f (x ), ..., f (x ), g(x ), ..., g(x ) . 0 1 i j j n j =i +1 (3.1) On the other hand, j i ∂ s (x ) = (−1) (−1) f (x ), f (x ), ..., f (x ), ..., f (x ), g(x ), ..., g(x ) n+1 n 0 1 i j j n j =0 i =0 i +1 + (−1) f (x ), f (x ), ..., f (x ), g(x ), ..., g(x ), ..., g(x ) . 0 1 j j i n i = j 123 D. Méndez,R.J.Sánchez-García By exchanging the roles of the indices in this last equation, we obtain that n i i +1 j ∂ s (x ) = (−1) (−1) f (x ), f (x ), ..., f (x ), g(x ), ..., g(x ), ..., g(x ) n+1 n 0 1 j j i n i =0 j =0 i j + (−1) (−1) f (x ), f (x ), ..., f (x ), ..., f (x ), g(x ), ..., g(x ) . 0 1 i j j n j =i (3.2) Now, adding Eqs. (3.1) and (3.2), ∂ s (x ) + s ∂ (x ) = [ f (x ), f (x ),..., f (x ), g(x ), ..., g(x )] n+1 n n−1 n 0 1 i −1 i n i =0 − f (x ), f (x ), . . . , f (x ), g(x ), . . . , g(x ) . 0 1 i i +1 n i =0 It is now straightforward to check that this sum amounts to [g(x ), g(x ),..., g(x )]−[ f (x ), f (x ),..., f (x )]= C (g)(x ) − C ( f )(x ), 0 1 n 0 1 n n n and we are done. 3.3 Basic computations and examples In this section, we explore some basic properties of this homology theory. We begin by studying the relation between homology and connectivity. Definition 3.15 Let (V , X ) be a directed simplicial complex and v, w ∈ V be two vertices. A path from v to w in X a sequence of vertices v = x , x ,..., x = w such 0 1 n that either (x , x ) or (x , x ) is a simplex, for all 1 ≤ i ≤ n − 1. The (weakly) i i +1 i +1 i connected components of X are the equivalence classes of the equivalence relation v ∼ w if there is a path in X from v to w. We call X (weakly) connected if it has only one connected component, that is, every pair of vertices can be connected by a path. Note that this notion of connectedness ignores the direction of the edges (1- simplices). The next result shows that we can compute the homology of each connected component independently. The proof follows by observing that the chain complex C (X , R) is clearly the direct sum of the chain complexes associated to each of the (weakly) connected components of X. In particular, it is immediate that any directed cycle can be decomposed as a sum of directed cycles in each of the directed compo- nents. Proposition 3.16 The (directed) homology of a directed simplicial complex X is iso- morphic to the direct sum of the (directed) homology modules of its (weakly) connected components. 123 A directed persistent homology theory for dissimilarity… We next show that the 0th (directed) homology counts the number of (weakly) connected components of a directed simplicial complex X. We start with a lemma. Lemma 3.17 Let R be a commutative ring and X be a directed simplicial complex. For any vertex v and any λ ∈ R, λ[v] is homologous to the trivial class if and only if λ = 0. Proof It is clear that λ[v] is a cycle as ∂ is trivial. On the other hand, it is clear that it must be the trivial class if λ = 0. Now assume that λ[v] is the trivial class. We shall prove that λ = 0. Assume that X has n + 1 vertices, namely V ={x = v, x ,..., x }. Then if λ[x ] 0 1 n 0 is trivial, there exists a 1-chain x = λ [x , x ] ij i j i , j =0 such that λ[x ]= ∂(x ), where we are assuming that λ = 0 whenever [x , x ] is not 0 ij i j a 1-chain. By computing the differentials, n n λ[x ]= λ [x ]− λ [x ]. 0 ij j ij i i , j =0 i , j =0 Now, since [x ], [x ],..., [x ] is a basis of C (X , R), the coefficients of each ele- 0 1 n 0 ment in the basis must be equal in both sides of the equality. Namely, for each j = 0, n n 0 = (λ − λ ), and λ = (λ − λ ). Adding these equations, we deduce ij ji i0 0i i =0 i =0 that λ = 0. Proposition 3.18 Let X be a directed simplicial complex and let R be a commutative ring. Then H (X , R) R , where k is the number of (weakly) connected components Dir k of X. Furthermore, if R ∈{Z, Q, R}, then H (X , R) = H (X , R) R . Proof By Proposition 3.16 we only need to show that if X is connected, then H (X , R) = R.Let v, w be any two vertices and let us show that [v] and [w] are homologous 0-cycles. Indeed, since X is connected, we can find vertices v = v ,v ,...,v = w such that either [v ,v ] or [v ,v ] are 1-chains, for all 0 1 m i i +1 i +1 i i = 0, 1,..., n − 1. If either [v ,v ] or [v ,v ] are chains, then [v ] and [v ] are i i +1 i +1 i i i +1 homologous. As this holds for every i = 0, 1,..., m − 1, [v] is homologous to [w]. n n Now, if λ [x ] is any 0-cycle, it is homologous to λ [v]. Then, by Lemma i i i i =1 i =1 3.17, λ[v] is not homologous to μ[v] whenever λ = μ, thus H (X , R) = R. Dir For the second statement, if R ∈{Z, Q, R}, we can compute H (X , R). In such Dir case, [v] is a directed cycle, thus [v] = H (X , R) ⊆ H (X , R). It then follows Dir that H (X , R) = H (X , R). Remark 3.19 The fact that 0th directed homology ignores edge directions seems counter-intuitive, as we set up this framework to detect directed features, namely directed cycles. However, this is a natural consequence of homology being an equiv- alence relation. In particular, as mentioned in the proof, if either [u,v] or [v, u] are 123 D. Méndez,R.J.Sánchez-García 1-chains, [u] and [v] are homologous, thus directionality of the 1-chains does not affect 0th directed homology. However, this only occurs in dimension 0, as a consequence of 0-chains always being cycles. Thus, when working with 1-simplices, the symmetry in the homology relation does not affect the detection of asymmetry in the data, as such asymmetry is encoded in the cycles themselves. It is also worth mentioning that a persistent homology able to detect strongly connected components in directed graphs has been introduced in Turner (2019). We now show that the homology of a point is trivial for positive indices. Example 3.20 Let R be a commutative ring and let X be the directed simplicial complex with vertex set V ={x } and simplices X ={(x )}. Then, 0 0 R, if n = 0, H (X , R) = 0, if n > 0. Indeed, X is connected, thus H (X , R) = R by Proposition 3.18.For n ≥ 1, there are no n-chains, hence H (X , R) = 0. In this case, it is also immediate to check that if Dir R ∈{Z, Q, R}, H (X , R) = H (X , R), for every n. Directed simplicial complexes whose homology is isomorphic to that of the point are called acyclic. Definition 3.21 A directed simplicial complex X is acyclic if R, if n = 0, H (X , R) = 0, if n > 0. If R ∈{Z, Q, R}, we say that X is directionally acyclic if R, if n = 0, Dir H (X , R) = 0, if n > 0. Note that acyclic implies directionally acyclic, as directed homology is a submodule of the module of homology, for every n. Next, we show that a directed simplicial complex consisting of one m-dimensional simplex along with all of its faces (note that this is an ordered-set complex) is acyclic. Proposition 3.22 Let X be the directed simplicial complex consisting of the simplex (x , x ,..., x ) and all of its faces. Then X is acyclic. 0 1 m Proof As a consequence of how X is defined, if x ∈ C (X , R) is a chain so that x n 0 does not participate in any of its elementary n-chains (as there are no repetitions in the simplices of X), then x can be added as the first element of every elementary 123 A directed persistent homology theory for dissimilarity… v w u u 3 3 4 3 v v w w u u 1 2 1 2 1 2 X Y Z Dir Fig. 3 Directed homology detects directed 1-cycles modulo boundaries. In these examples, H (X , R) = 0 Dir Dir while H (Y , R) = H (Z , R) = R respectively generated by [w ,w ]+[w ,w ]+[w ,w ] and 1 2 2 3 3 1 1 1 [u , u ]+[u , u ]+[u , u ] (or [u , u ]+[u , u ]+[u , u ]+[u , u ]). To be a directed 1-cycle, the 1 2 2 4 4 1 1 2 2 3 3 4 4 1 directions of the involved 1-simplices must form circuit. This does not hold for (undirected) homology n-chain in x,givingusan (n + 1)-chain which we denote x x ∈ C (X , R).Simple 0 n+1 computations show that ∂(x x ) = x − x ∂(x ). 0 0 Now take x ∈ Z (X , R) any cycle and decompose it as x = x y + z, where x n 0 0 does not participate in any of the chains in either y or z. Using the formula above, ∂(x ) = ∂(x y + z) = y − x ∂(y) + ∂(z) = 0. 0 0 In particular, the chains in which x does not participate must add up to zero, namely y + ∂(z) = 0. Finally, consider the chain x z. We have that ∂(x z) = z − x ∂(z) = z + x y = x . 0 0 0 Thus, x ∈ Z (X , R) is homologous to zero, and the result follows. We now illustrate the ability of this homology theory to detect directed cycles. We do so through some simple examples. Example 3.23 Let R ∈{Z, Q, R}.Let X, Y and Z be the directed simplicial complexes represented in Fig. 3. These three complexes are connected, so their 0th homologies are R, both directed and undirected. Furthermore, neither X nor Y have k-simplices for k ≥ 2, so H (X , R) and H (Y , R) are trivial (zero) for every k ≥ 2. This in k k Dir Dir particular implies that H (X , R) and H (Y , R) are trivial as well. Although Z has k k one 2-simplex, it is not a 2-cycle, thus H (Z , R) is also trivial. To compute the first homology, we need to consider the 1-simplices and their differentials. We list them below. 123 D. Méndez,R.J.Sánchez-García X ∂ [v ,v ] v − v 1 2 2 1 [v ,v ] v − v 2 3 3 2 [v ,v ] v − v 1 3 3 1 Y ∂ [w ,w ] w − w 1 2 2 1 [w ,w ] w − w 2 3 3 2 [w ,w ] w − w 3 1 1 3 Z ∂ [u , u ] u − u 1 2 2 1 [u , u ] u − u 2 3 3 2 [u , u ] u − u 3 4 4 3 [u , u ] u − u 4 1 1 4 [u , u ] u − u 2 4 4 2 We start with X. Note that λ [v ,v ]+ λ [v ,v ]+ λ [v ,v ], λ ,λ ,λ ∈ R, 1 1 2 2 2 3 3 1 3 1 2 3 is a cycle for X if and only if λ = λ =−λ . Since there are no 2-cycles in X, 1 2 3 different 1-cycles cannot be homologous. We deduce that H (X , R) R. However, no 1-cycle can have non-negative coefficients for the three elementary 1-cycles, thus Dir H (X , R) ={0}. Regarding Y , note that μ [w ,w ]+ μ [w ,w ]+ μ [w ,w ], μ ,μ ,μ ∈ R, 1 1 2 2 2 3 3 3 1 1 2 3 is a cycle if and only if μ = μ = μ . Since there are no 2-simplices in Y , we deduce 1 2 3 that H (Y , R) = R. Furthermore, the coefficients can be simultaneously positive. Namely, [w ,w ]+[w ,w ]+[w ,w ] is a directed cycle, which generates the 1 2 2 3 3 1 Dir entirety of H (X , R). We deduce that H (X , R) = R. Note that X and Y have isomorphic homology rings, but their directed homology rings are different. Finally, for Z, similar computations show that Z (Z , R) is the free R-module generated by x =[u , u ]+[u , u ]+[u , u ]+[u , u ] and x =[u , u ]+ 1 1 2 2 3 3 4 4 1 2 1 2 Dir [u , u ]+[u , u ], both of which are directed cycles. In particular, Z (Z , R) consists 2 4 4 1 of linear combinations of those two cycles, with positive coefficients. Note, however, that generating sets of Z (X , R) containing fewer than two directed cycles exist. This shows that computing a generating set for the cycles may not be enough to compute the full set of directed cycles. Continuing with the homology computations, note that in this case we have a 2-simplex, y =[u , u , u ],for which ∂(y) =[u , u ]−[u , u ]+[u , u ]= 2 3 4 3 4 2 4 2 3 Dir x −x . Namely, x and x are homologous. Therefore, H (Z , R) = H (Z , R) = R, 1 2 1 2 1 generated by either x or x . In particular, both the directed and undirected homologies 1 2 of Y and Z are isomorphic, and we can see how 2-simplices can make directed cycles equivalent, as expected. 123 A directed persistent homology theory for dissimilarity… v v 1 2 Fig. 4 A directed simplicial complex with just two vertices can have a non-trivial 1-cycle More generally, if R ∈{Z, Q, R},a polygon will give rise to a non-trivial directed homology class in dimension 1 if, and only if, the cycle can be traversed following the direction of the edges. Proposition 3.24 Let X be a 1-dimensional directed simplicial complex with vertices v ,v ,...,v ,n ≥ 1, and with 1-simplices e ,e ,..., e where either e = (v ,v ) 0 1 n 0 1 n i i i +1 or e = (v ,v ),i = 0, 1,..., n, and we write v = v . Then, H (X , R) = R. If i i +1 i n+1 0 1 Dir R ∈{Z, Q, R}, then H (X , R) is non-trivial if and only if either e = (v ,v ) for i i i +1 Dir all i or e = (v ,v ) for all i, in which case H (X , R) R. i i +1 i Proof We first prove that H (X , R) = R.Let x = λ [e ] be a cycle. If x is 1 i i i =0 non-trivial, we can assume without loss of generality that λ = 0. We can further assume that e = (v ,v ). Thus, ∂(λ [e ]) = λ [v ]− λ [v ]. The only other edge in 0 0 1 0 0 0 1 0 0 which e participates is e . We deduce that λ = λ ,if e = (v ,v ),or λ =−λ ,if 1 1 1 0 1 1 2 1 0 e = (v ,v ). By proceeding iteratively, we deduce that λ = λ ,if e = (v ,v ), 1 2 1 i 0 i i i +1 or that λ =−λ ,if e = (v ,v ). It is now straightforward to check that, with λ i 0 i i +1 i i defined in terms of λ in such way, i = 1,..., n, x = λ [e ] is a non-trivial 0 i i i =0 cycle. Since there are no 2-chains, H (X , R) = R. Regarding the directed homology, note that if e = (v ,v ), the coefficients of 0 0 1 the cycle x can only be all positive if and only if e = (v ,v ). In such case, i i i +1 Dir Dir H (X , R) = H (X , R) R, and otherwise H (X , R) = 0. The case in which 1 1 e = (v ,v ) is analogous. 0 1 0 Remark 3.25 Note that Proposition 3.24 applies to polygons with only two vertices v and v (see Fig. 4), which are allowed in a directed simplicial complex. Indeed, an 1 2 immediate computation shows that [v ,v ]+[v ,v ] is a directed 1-cycle. This result 1 2 2 1 shows that directed homology is not a homotopy invariant for the geometric realization of directed simplicial complexes, as all of the polygons considered in Proposition 3.24 give rise to homotopic geometric realizations. We end this section by noting that, although our initial aim was a directed homology group able to detect directed 1-cycles, we have a directed homology theory defined in all dimensions. However, we can show that the directed homology is trivial in even dimensions other than 0. Proposition 3.26 Let X be a directed simplicial complex R ∈{Z, Q, R}. Then Dir H (X , R) ={0}, for every n ≥ 1. 2n Proof We will show that no non-trivial directed cycles may exist in such dimensions. In order to do so, consider the morphism of R-modules ϕ : C (X , R) −→ R n n [v ,v ,...,v ] −→ 1. 0 1 n 123 D. Méndez,R.J.Sánchez-García k k Thus, if x = λ x where x is an elementary n-chain, ϕ (x ) = λ . i i i n i i =0 i =0 Now assume that x is an elementary 2n-chain, for some n ≥ 1. In this case, 2n ∂(λ x ) = λ ∂(x ), where ∂(x ) consists of the sum of  = n elementary n-chains i i i i i 2n−1 with positive sign and  = n − 1 elementary n-chains with negative sign. Thus, ϕ (λ x ) = λ (n − (n − 1)) = λ . 2n−1 i i i i To prove our claim, let x = λ x ∈ Z (X , R) be a cycle where each x is i i 2n i i =0 an elementary 2n-cycle. Then, given that ∂(x ) = 0, ϕ ∂(x ) = λ = 0. If 2n−1 i i =0 x is directed, then λ ≥ 0, for all i, thus we deduce that λ = 0, for all i. Namely, the i i Dir trivial cycle is the only directed cycle, and H (X , R) ={0}. 2n Remark 3.27 The proof of this result is based on the fact that the number of positive coefficients in the boundary of an elementary n-chain is larger than the number of negative coefficients, if n is even. This may also be related to the fact that there is no obvious way to define directed n-cycles for n > 1. For the purposes of this article, it suffices to consider 1-cycles and H (X , R). However, note that non-trivial (homological) cycles do exist in all odd dimensions. For instance, the elementary (2n − 1)-chain obtained by repeating a vertex 2n times is always a cycle. 4 Persistent directed homology In this section, we introduce a theory of persistent homology for directed simplicial complexes which comes in two flavours: one takes into account the directionality of the complex, whereas the other one is analogous to persistent homology in the classical setting. We thus have, associated to the same filtration of directed simpli- cial complexes, two persistence modules which produce two different barcodes. Both persistent homology theories show stability (see Theorem 4.19) and they are closely related. Indeed, directed cycles are undirected cycles as well, thus every bar in a directed persistence barcode can be uniquely matched with a bar in the corresponding undirected one, although the undirected bar may be born sooner, see Proposition 4.7. 4.1 Persistence modules associated to a directed simplicial complex Let us begin by introducing filtrations of directed simplicial complexes. Definition 4.1 Let X be a directed simplicial complex (Definition 3.1). A filtration of X is a family of subcomplexes (X ) , T ⊆ R, such that if δ ≤ δ ∈ T , then X is δ δ∈T δ a subcomplex of X , and such that X =∪ X . Note that for δ ≤ δ , the inclusion δ δ∈T δ i : X → X is a morphism of directed simplicial complexes. δ δ We now introduce undirected persistence modules. Definition 4.2 Let (X ) be a filtration of a directed simplicial complex X and let δ δ∈T R be a commutative ring. The n-dimensional undirected persistence R-module of X is the persistence R-module ({H (X , R)}, {H (i )} . n δ n δ≤δ ∈T 123 A directed persistent homology theory for dissimilarity… The functoriality of H makes this a persistence module. Now, in order to retain the information on directionality, we take the submodule of directed classes. Definition 4.3 Let (X ) be a filtration of a directed simplicial complex X and δ δ∈T let R ∈{Z, Q, R}.The n-dimensional directed persistence R-module of X is the persistence R-module Dir Dir δ {H (X , R)}, {H (i )} , n n δ δ≤δ ∈T Dir δ where H (X , R) are defined as in Definition 3.6, i : X → X  are the inclusion δ δ δ δ δ Dir maps, and H (i ) are the restrictions of the maps H (i ) to directed homology, as n δ δ introduced in Proposition 3.11. Remark 4.4 Let (X ) be a filtration of a directed simplicial complex X and let δ δ∈T δ ≤ δ ∈ T.If R ∈{Z, Q, R}, by applying Proposition 3.11 to the map i : X → X , we have a commutative diagram H (i ) H (X , R) H (X  , R) n δ n δ Dir Dir H (X , R) H (X  , R). δ δ n  n Dir δ H (i ) n δ Dir In particular, the injections H (X , R)→ H (X , R) give rise to a monomor- δ n δ δ∈T phism of persistence modules. We can now introduce persistence diagrams and barcodes associated to filtrations of directed simplicial complexes. As these were only introduced for fields in Sect. 2.1, from this point on, we will assume that R is a field. In particular, when taking directed homology, we will assume that R ∈{Q, R}. Definition 4.5 Let (X ) be a filtration of a directed simplicial complex X where δ δ∈T T ⊆ R is finite. The persistence diagrams associated to the n-dimensional undi- rected and directed persistence R-modules of X are respectively denoted Dgm (X , R) Dir and Dgm (X , R). Similarly, the respective barcodes are denoted Pers (X , R) and Dir Pers (X , R). Dir Remark 4.6 Note that H (X , R) H (X , R) for every δ ∈ T , as shown in Propo- δ 0 δ sition 3.18. As a consequence, the 0-dimensional directed and undirected persistence R-modules of a directed simplicial complex X are isomorphic. In particular, they have the same persistence barcodes and diagrams, and they measure the connectivity of the simplicial complex at each stage of the filtration. 123 D. Méndez,R.J.Sánchez-García v v v v v v v 4 4 3 4 3 4 3 v v v v v v v v 1 2 1 2 1 2 1 2 Z Z Z Z 0 1 2 3 0123 0123 Fig. 5 A filtration of the directed simplicial complex Z in Example 3.23 (Fig. 3) and its associated undirected (bottom, left) and directed (bottom, right) 1-dimensional persistence barcodes. The shorter undirected bar (blue) is also directed, while the longer undirected bar (red) becomes directed at δ = 2 (colour figure online) The next result establishes the relation between the undirected and directed per- sistence barcodes and diagrams of a persistence module, and it is thus key to understanding the directed persistence barcodes. Proposition 4.7 Let (X ) be a filtration of directed simplicial complexes. Then, δ δ∈T Dir there is an injective matching sending each bar in Pers (X , R) to a bar in Pers (X , R) which contains it. Furthermore, matched bars must die at the same time. Proof This result is an immediate consequence of Remark 4.4 and Bauer and Lesnick (2014, Proposition 6.1). Note that a bar in the directed persistence barcode of a filtration may be born after the one it is matched with in the undirected one, and some bars in the undirected barcode may remain unmatched (see Examples below and Figs. 5 and 6). We now introduce some examples to illustrate Proposition 4.7 and the general behaviour of the directed persistence barcodes. Example 4.8 Let us illustrate how undirected and directed persistence modules and their barcodes can be different. First, consider the directed simplicial complexes X and Y in Fig. 3 (see Example 3.23). Regardless of the filtration chosen for X, the lack of directed cycles means that the 1-dimensional directed persistence module is trivial. However, at the end of the filtration, there is a cycle in homology, thus there is a bar in the undirected persistence barcode. In the case of Y , the only 1-cycle is directed, so the undirected and directed persistence modules associated to any filtration of Y are isomorphic, and different from those of X. Example 4.9 Consider now the directed simplicial complex Z in Fig. 3 (see Example 3.23). Let T ={0, 1, 2,... } and consider the filtration of Z given by (Z ) ,as δ δ∈T illustrated in Fig. 5, where Z = Z for every δ ≥ 3. The undirected and directed 1-dimensional persistence modules of this filtration are Dir not isomorphic. Indeed, in Z there is clearly an undirected cycle, whereas Z (Z , R) 1 1 Dir Dir is trivial, thus H (Z , R) ={0}. However, H (Z , R) H (Z , R), as both 1 1 2 2 1 1 vector spaces are generated by the classes of [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ] 1 2 2 3 3 4 4 1 and [v ,v ]+[v ,v ]+[v ,v ]. These two classes become equivalent in Z .The 2 4 4 1 1 2 3 undirected and directed 1-dimensional persistence barcodes of this filtration, shown in Fig. 5, illustrate how undirected classes may become directed. 123 A directed persistent homology theory for dissimilarity… v v v v v 4 4 3 4 3 v v v v v v 1 2 1 2 1 2 X X X 0 1 2 0123 0123 Fig. 6 A filtration of directed simplicial complexes and its associated undirected (bottom, left) and directed (bottom, right) 1-dimensional persistence barcodes. The shorter undirected bar (blue) is also directed, while the longer undirected bar (red) never becomes directed (colour figure online) The last example also shows an important difference between classical and directed persistence modules and barcodes. Namely, in the undirected setting, the addition of one simplex to the filtration can either cause the birth of a class in the dimension of the added simplex, or kill a class in the preceding dimension. This simple idea is in fact the basis of the Standard Algorithm for computing persistent homology. However, the addition of only one simplex to a directed simplicial complex can cause the birth of several classes in directed homology, as shown in the previous example at δ = 2, and also in the following example. Example 4.10 Let T ={0, 1, 2,... } and consider the filtration of simplicial complexes (X ) illustrated in Fig. 2 in the Introduction, where X = X for every δ ≥ 3. δ δ∈T δ 3 Dir It is clear that Z (X , R) is trivial for j = 0, 1, 2, whereas undirected cycles appear as early as X . However, by adding the edge from v to v , several directed 1 5 1 Dir cycles are born at once. Namely, Z (X , R) contains the cycles [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 2 2 3 3 4 4 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 2 2 4 4 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 2 2 3 3 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]+[v ,v ], 1 3 3 4 4 5 5 1 [v ,v ]+[v ,v ]+[v ,v ]. 1 3 3 5 5 1 Straightforward computations show that H (X , R) = R , and four of the cycles 1 3 Dir 4 above are linearly independent, thus H (X , R) = H (X , R) = R . Consequently, 3 1 3 at δ = 3 every class (including the birthing one) becomes directed. Our last example shows an undirected homology class that never becomes directed. Example 4.11 Let T ={0, 1, 2,... } and consider the filtration of simplicial complexes (X ) illustrated in Fig. 6, where X = X for δ ≥ 2. δ δ∈T δ 2 Dir Dir Clearly, Z (X , R) ={0} for j = 0, 1, whereas Z (X , R) contains the j 2 1 1 cycle [v ,v ]+[v ,v ]+[v ,v ]. The undirected class represented by this element 1 2 2 4 4 1 is thus directed, but there is a linearly independent class in undirected homology, [v ,v ]+[v ,v ]−[v ,v ], which never becomes directed. Its bar in the barcode is 3 2 2 4 3 4 thus unmatched. 123 D. Méndez,R.J.Sánchez-García 4.2 Directed persistent homology of dissimilarity functions In this section, we introduce the undirected and directed persistence diagrams and barcodes associated to dissimilarity functions (Definition 4.13) and prove their stability with respect to the bottleneck distance (Theorem 4.19). Let us begin by introducing the directed Rips filtration of directed simplicial com- plexes associated to a dissimilarity function (Turner 2019, Definition 16). Definition 4.12 Let (V , d ) be a dissimilarity function. The directed Rips filtration Dir of (V , d ) is the filtration of directed simplicial complexes R (V , d ) where V V δ∈R Dir (v ,v ,...,v ) ∈ R (V , d ) if and only if d (v ,v ) ≤ δ, for all 0 ≤ i ≤ j ≤ n. 0 1 n V δ V i j δ Dir Dir It is clearly a filtration with the inclusion maps i : R (V , d ) → R (V , d ) V δ V δ for all δ ≤ δ . Let us now introduce the persistent homology modules associated to such a filtra- tion. Assume that R ∈{Q, R}. Definition 4.13 Let (V , d ) be a dissimilarity function and consider its associated Dir directed Rips filtration R (V , d ) . For each n ≥ 0, the n-dimensional undi- δ∈R rected persistence R-module of (V , d ) is Dir δ H (V , d ) := {H (R (V , d ) , R)}, {H (i )} . n V n V δ n δ≤δ ∈R Similarly, the n-dimensional directed persistence R-module of (V , d ) is defined as Dir Dir Dir Dir δ H (V , d ) := {H (R (V , d ) , R)}, {H (i )} . V V δ n n n δ δ≤δ ∈R Remark 4.14 The persistence module H (V , d ) associated to the directed Rips n V filtration of (V , d ) is precisely the persistence module studied in Turner (2019, Section 5), hence the remarks made there hold for the undirected persistence mod- Dir ule. In particular, if (V , d ) is a (finite) metric space, R (V , d ) is the (classical) V V Vietoris-Rips filtration of (V , d ). Furthermore, in this case, it can easily be seen that Dir H (V , d ) = H (V , d ). Thus, these persistence modules generalize the persis- n V V tence modules associated to the Vietoris-Rips filtration of a metric space. Also note that Dir R (V , d ) is closed under adjacent repeats (Turner 2019, Definition 18). Namely, Dir Dir given (v ,v ,...,v ) ∈ R (V , d ) , (v ,v ,...,v ,v ,...,v ) ∈ R (V , d ) , 0 1 n V δ 0 1 i i n V δ for any i = 0, 1,..., n. Using this, it can be proven that the homology of this object must be trivial in dimensions larger than |V |+ 1. As R is a field and since V is finite, both of these persistence modules fulfil the assumptions in Sect. 2.1. Namely, their indexing sets can be chosen to be finite, cor- responding to the threshold values where new simplices are added to the simplicial complex. Furthermore, no simplex is added to the filtration until the threshold value reaches the minimum of the images of the dissimilarity function. Finally, and even Dir though the directed simplicial complex R (V , d ) may have infinite simplices due V δ to arbitrary repetitions of vertices being allowed, it always has a finite number of simplices in a given dimension n, thus its n-dimensional homology is always finite- dimensional. As a consequence, we can introduce the following. 123 A directed persistent homology theory for dissimilarity… Definition 4.15 Let (V , d ) be a dissimilarity function. For each n ≥ 0, the n- dimensional persistence diagrams associated to the persistence R-modules H (V , d ) n V Dir Dir and H (V , d ) are respectively denoted by Dgm (V , d ) and Dgm (V , d ). V V V n n n Similarly, their associated persistence barcodes are denoted by Pers (V , d ) and n V Dir Pers (V , d ). Of course, Proposition 4.7 holds for these barcodes, namely, every bar in Dir Pers (V , d ) can be uniquely matched with one in Pers (V , d ) which dies at the V n V same time, although the directed bar may be born later. We now use results from Sect. 2.2 to show that both these persistent homology constructions are stable. The proof is split in several lemmas. Let (V , d ) and (W , d ) be two dissimilarity functions on respective sets V and W and define η = 2d (V , d ), (W , d ) , where d is the correspondence distortion distance CD V W CD (Definition 2.9). By Proposition 2.11, we can find maps ϕ : V → W and ψ : W → V such that dis(ϕ), dis(ψ ), codis(ϕ, ψ ), codis(ψ, ϕ) ≤ η. To simplify notation in the Dir δ Dir δ proofs below, denote R (V , d ) = X and R (W , d ) = X , for all δ ∈ R. V δ W δ V W Lemma 4.16 For each δ ∈ R, the maps ϕ and ψ induce morphisms of directed sim- plicial complexes δ+η δ+η δ δ ϕ : X −→ X ψ : X −→ X δ δ V W W V x −→ ϕ(x ), x −→ ψ(x ). Proof Let us prove the statement for ϕ (the proof is analogous for ψ ). Let δ δ (x , x ,..., x ) be an n-simplex in X . Then d (x , x ) ≤ δ for all 1 ≤ i ≤ j ≤ n. 0 1 n V i j Since dis(ϕ) ≤ η, we have that, for all v ,v ∈ V , 1 2 d (v ,v ) − d ϕ(v ), ϕ(v ) ≤ η. V 1 2 W 1 2 Choosing v = x and v = x ,wehave 1 i 2 j d ϕ(x ), ϕ(x ) ≤ η + d (x , x ) ≤ δ + η for all 1 ≤ i ≤ j ≤ n. W i j V i j δ+η Consequently, ϕ(x ), ϕ(x ),...,ϕ(x ) ∈ X and the result follows. 0 1 n δ δ δ Lemma 4.17 For δ ≤ δ ∈ R consider the inclusion maps i : X → X and δ V V δ δ δ j : X → X . The following are commutative diagrams of morphisms of directed W W simplicial complexes: δ δ i j δ δ δ δ δ δ X X X X V V W W ϕ ϕ ψ ψ δ δ δ δ δ +η δ +η j i δ+η  δ+η δ+η δ +η δ+η δ +η X X , X X . W V W V 123 D. Méndez,R.J.Sánchez-García Proof We prove that the first diagram is commutative (the proof for the second diagram δ δ is analogous). Let x ∈ V . Since i is an inclusion, (ϕ  ◦ i )(x ) = ϕ  (x ) = ϕ(x ). δ δ δ δ δ +η δ +η δ +η Similarly, since j is an inclusion, ( j ◦ ϕ )(x ) = j ϕ(x ) = ϕ(x ). δ+η δ+η δ+η Lemma 4.18 With the same notation as in Lemmas 4.16 and 4.17, for every δ ∈ R,the following diagrams of morphisms of directed simplicial complexes induce commutative diagrams on homology. δ+2η δ+2η i j δ δ+2η δ δ+2η δ δ X X X X V W V V ϕ ψ ψ ϕ δ δ+η δ δ+η δ+η δ+η X , X . W W Proof Again, we only prove the result for the first diagram, as the proof for the second diagram is analogous. We show that it is commutative up to homotopy by showing δ+2η that the maps i and ψ ◦ ϕ satisfy the hypothesis of Lemma 3.14. δ+η δ Take a simplex σ = (x , x ,..., x ) ∈ X , thus d (x , x ) ≤ δ, for all 0 1 n V i j δ+2η δ+2η 1 ≤ i ≤ j ≤ n. On the one hand, i is an inclusion, so i (σ ) = σ.On δ δ the other hand, since ψ ◦ ϕ is a morphism of directed simplicial complexes δ+η δ δ+2η (Definition 3.8), ψ(ϕ(x )), ψ (ϕ(x )),...,ψ(ϕ(x )) ∈ X . This implies that 0 1 n d ψ(ϕ(x )), ψ (ϕ(x )) ≤ δ + 2η, for all 1 ≤ i ≤ j ≤ n. V i j Now recall that codis(ϕ, ψ ) ≤ η, thus for all v ∈ V and w ∈ W , d v, ψ(w) − d ϕ(v), w ≤ η. V W Then, for 1 ≤ i ≤ j ≤ n, by taking v = x and w = ϕ(x ), i j d x ,ψ(ϕ(x )) ≤ η + d ϕ(x ), ϕ(x ) ≤ δ + 2η. V i j W i j As a consequence of the inequalities above, we have shown that for every 0 ≤ i ≤ n, δ+2η x , x ,..., x ,ψ(ϕ(x )), ψ (ϕ(x )),...,ψ(ϕ(x )) ∈ X . 0 1 i i i +1 n δ+2η Therefore, the maps i and ψ ◦ ϕ satisfy the hypothesis of Lemma 3.14, δ+η δ thus they induce the same map on homology. The result follows. We now have everything we need to prove the stability results. Theorem 4.19 Let R ∈{Q, R}. Let (V , d ) and (W , d ) be two dissimilarity func- V W tions on finite sets V and W . Then, for all n ≥ 0, d Dgm (V , d ), Dgm (W , d ) ≤ 2d (V , d ), (W , d ) B V V CD V W n n 123 A directed persistent homology theory for dissimilarity… and Dir Dir d Dgm (V , d ), Dgm (W , d ) ≤ 2d (V , d ), (W , d ) . B V V CD V W n n Proof Define η = 2d (V , d ), (W , d ) . By Theorem 2.7, it suffices to show that CD V W Dir the persistence modules H (V , d ) and H (W , d ) (respectively H (V , d ) and n V n W V Dir H (W , d ))are η-interleaved. Comparing Definition 2.5 (for ε = η) and Lemmas 4.16, 4.17 and 4.18, the result follows by using the functoriality of homology and, in the directed case, Proposition 3.11. Remark 4.20 Recall from Remark 4.14 that the persistence module H (V , d ) asso- n V ciated to the directed Rips filtration of (V , d ) is the persistence module studied in Turner (2019, Section 5). Thus, the remarks made in Turner (2019, Section 5.2) hold for these persistence modules, meaning that a result analogous to Theorem 4.19 would not hold if we were using ordered-set complexes instead of directed simplicial com- plexes, as mentioned at the beginning of Sect. 3. This justifies our definition of directed simplicial complex (Definition 3.1). 5 Computational challenges Let (V , d ) be a dissimilarity function in a finite set V . In this section, we study the Dir algorithmic aspects of computing both H (V , d ) and H (V , d ). For simplicity, n V V we assume that R = R. The computation of the persistence barcodes for the undirected persistent homology H (V , d ) can be made using the Standard Algorithm (Edelsbrunner et al. 2001; n V Zomorodian and Carlsson 2005), which also applies to ordered simplicial complexes. Dir δ Write R (V , d ) = X , δ ∈ R for the directed Rips filtration (Definition 4.12). V δ As V is finite, the number of simplices in X up to dimension n + 1 is finite, say N . We can list them {σ ,σ ,...,σ } in such a way that i < j if σ is a (proper) face of 1 2 N i σ ,orif σ appears ‘later’ in the filtration, thus having a compatible ordering. Then, j j we can represent the differential using a sparse N × N matrix M over R, where the (i , j )-entry M is the coefficient of σ in the differential of σ . Such matrix M is an ij j i upper-triangular matrix. At this stage, the Standard Algorithm can be used to compute Dgm (V , d ) by reducing M at once using column operations. Dir We cannot directly adapt the Standard Algorithm to compute Dgm (V , d ):we can try to find reduction operations that generate directed cycles using operations with positive coefficients only, but this is not enough to guarantee that we find all the possible directed cycles. For instance, once one column has been eliminated, we may be preventing the participation of the simplex it represents in directed cycles not involving the columns we are using in the elimination. To avoid such a problem, we would need to keep track of every possible way in which column eliminations can be produced using only positive coefficients. This is very similar to the double description algorithm for the computation of extreme rays of a polyhedral cone (Fukuda et al. 1995; Motzkin Dir et al. 1953). Indeed, the directed cycles Z (X , R) are the non-negative solutions to 123 D. Méndez,R.J.Sánchez-García the linear equation Mx = 0, that is, x = (x , x ,..., x ) ∈ R | Mx = 0, x ≥ 0 , (5.1) 1 2 N where a vector x = (x , x ,..., x ) ∈ R represents the cycle x σ , M is the 1 2 N i i i =1 matrix corresponding to the differential, and x ≥ 0 means that every entry in x is non-negative, that is, x ≥ 0 for all i. Equivalently, directed cycles are the extreme rays of the unbounded polyhedral cone consisting of the intersection of the solution space to the linear system Mx = 0 with the first quadrant of R . The enumeration of the extreme rays (generators of the solution set) of a polyhedral cone such as Eq. (5.1) is a well-known albeit hard problem in computational geometry. Several algorithms, including the double description algorithm, and modifications thereof, exist (Terzer 2009). Note, however, that we do not require the computation of all the directed cycles. Indeed, it would be enough to calculate them up to homology, which suggests that efficient computations may be possible. In summary, although important challenges remain on the computation of the per- sistence diagrams associated to directed persistence modules, our objective was to provide the necessary groundwork for the study of directed persistent homology of asymmetric data sets, and we are hopeful that we are opening up an exciting new approach for future research into the topological properties of asymmetric data sets. Acknowledgements This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1 The first author was partially supported by Ministerio de Economía y Competitividad (Spain) Grants Nos. MTM2016-79661-P and MTM2016-78647-P and by Ministerio de Ciencia, Innovación y Uni- versidades (Spain) Grant No. PGC2018-095448-B-I00. Funding Open access funding provided by FCT|FCCN (b-on). Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Appendix A: Directed persistent homology via semiring homology As mentioned in the Introduction, our ideas of directed homology are based on homology with semiring coefficients, which can be introduced for directed simplicial complexes using chain complexes of semimodules and their associated homologies, building on work by Patchkoria (1977, 2014). In this appendix, we introduce this homology theory and explain how it relates to the modules of directed homology introduced in Definition 3.6. We remark that other perspectives of directionality in topological spaces could be used instead. One such important perspective is that of directed algebraic topology, 123 A directed persistent homology theory for dissimilarity… which is extensively used in the study of concurrency in computer science (Fajstrup et al. 2016). In this approach, directionality information is encoded by means of distin- guished paths, forming a category dTop where objects are topological spaces together with their directed paths (Fajstrup et al. 2016, Definition 4.1). These spaces can be modelled using cubical complexes, and their homology including directional informa- tion can be computed using chain complexes over preordered abelian groups (Grandis 2009, Chapter 2). In this theory, the homology groups of a space are those of ordinary homology, but directionality information is recorded in the preorder given to those groups. Therefore, it would be interesting for future work to study potential definitions of persistent homology over preordered abelian groups. Nonetheless, we focus here on homology over semimodules, since it provides us with the information we require while keeping the exposition simpler. A.1 Semirings, semimodules and their completions Let us begin by introducing the algebraic structures that we need, following (Golan 1999). Definition A.1 A semiring = ( , +, ·) is a set together with two operations such that • ( , +) is an abelian monoid whose identity element we denote 0 , • ( , ·) is a monoid whose identity element we denote 1 , •· is distributive with respect to + from either side, • 0 · λ = λ · 0 = 0 , for all λ ∈ . A semiring is commutative if ( , ·) is a commutative monoid, and cancellative if ( , +) is a cancellative monoid, that is, λ + λ = λ + λ implies λ = λ for all λ, λ ,λ ∈ .Asemiring is a semifield if every 0 = λ ∈ has a multiplicative inverse. A semiring is zerosumfree if no element other than 0 has an additive inverse. Example A.2 Every ring is a semiring. The non-negative integers, rationals and reals + + with their usual operations, respectively denoted N, Q and R , are cancellative zerosumfree commutative semirings which are not rings. Definition A.3 Let be a semiring. A (left) -semimodule is an abelian monoid (A, +) with identity element 0 together with a map × A → A which we denote (λ, a) → λa and such that for all λ, λ ∈ and a, a ∈ A, • (λλ )a = λ(λ a), • λ(a + a ) = λa + λa , • (λ + λ )a = λa + λ a, • 1 a = a, • λ0 = 0 = 0 λ. A A A non-empty subset B of a left -semimodule A is a subsemimodule of A if B is closed under addition and scalar multiplication, which implies that B is a left -semimodule with identity element 0 ∈ B.If A and B are -semimodules, a -homomorphism is a map f : A → B such that for all a, a ∈ A and for all λ ∈ , 123 D. Méndez,R.J.Sánchez-García • f (a + a ) = f (a) + f (a ), • f (λa) = λ f (a). Clearly, f (0 ) = 0 . A B A -semimodule A is cancellative if a + a = a + a implies a = a , and zerosumfree if a + a = 0 implies a = a = 0, for all a, a , a ∈ A. The Cartesian product of -semimodules is a -semimodule. The definition of direct sum of modules can be generalized to the framework of -semimodules, and a finite direct sum of -semimodules is isomorphic to the Cartesian product of the same -semimodules. Quotient -semimodules can be defined using congruence relations. Definition A.4 Let A be a left -semimodule. An equivalence relation ρ on A is a -congruence if, for all a, a ∈ , and all λ ∈ , • if a ∼ a and b ∼ b , then (a + b) ∼ (a + b ), and ρ ρ ρ • if a ∼ a , then λa ∼ λa . ρ ρ If ρ is a -congruence relation on A and we write a/ρ for the class of an element a ∈ A, then A/ρ ={a/ρ | a ∈ A} inherits a -semimodule structure by setting (a/ρ) + (a /ρ) = (a + a )/ρ and λ(a/ρ) = (λa)/ρ, for all a, a ∈ A and λ ∈ . The left -semimodule A/ρ is called the factor semimodule of A by ρ. Note that the quotient map A → A/ρ is a surjective -homomorphism. If B is a subsemimodule of a -semimodule A, then it determines a -congruence ∼ by setting a ∼ a if there exist b, b ∈ B such that a + b = a + b . Classes in B B this quotient are denoted a/B, and the factor semimodule is denoted A/B.If is a ring, this is just the usual quotient module of A by B. Example A.5 The following are examples of factor semimodules. • If is a ring and B ≤ A are -modules, the factor semimodule A/B is the usual module quotient of A by B. • In the N-module N×N, consider the relation ρ such that (u,v) ∼ (x , y) if u +y = v + x. It is immediate to show that ρ is a N-congruence, and the factor semimodule (N × N)/ρ is isomorphic to Z. Indeed, consider the map (N × N)/ρ → Z taking (u,v) → u −v. It is clearly a map of N-semimodules, (u,v) ∼ (x , y) ⇔ u + y = v + x ⇔ u − v = x − y shows that it is well-defined and injective, and it is clearly surjective. • Also in N×N, consider the submodule B ={0}×N. Then (u,v) ∼ (x , y) if there exist (0, n), (0, m) ∈{0}×N such that (u,v)+(0, n) = (x , y)+(0, m).Fromthis, we deduce that (u,v) ∼ (x , y) if and only if u = x. Namely, (N × N)/({0}× N) is isomorphic to N. Clearly, if is a semiring then it is a -semimodule over itself. The idea of free -semimodules comes about naturally just as in the case of rings, and we will make extensive use of these objects. Definition A.6 Let be a semiring, A aleft -semimodule and V ={v ,v ,...,v } 1 2 n a finite subset of A.The set V is a generating set for A if every element in A is a linear combination of elements of V.The rank of a -semimodule A, denoted rank(A),is 123 A directed persistent homology theory for dissimilarity… the least n for which there is a set of generators of A with cardinality n, or infinity, if not such n exists. The set V is linearly independent if for λ ,λ ,...,λ ∈ and μ ,μ ,...,μ ∈ 1 2 n 1 2 n n n such that λ v = μ v , then λ = μ ,for i = 1, 2,..., n, and it is called i i i i i i i =1 i =1 linearly dependent otherwise. We call V is a basis of A if it is a linearly independent generating set of A. The -semimodule A is a free -semimodule if it admits a basis V . We denote a free -semimodule with basis {v ,v ,...,v } by (v ,v ,...,v ). 1 2 n 1 2 n Remark A.7 It is easy to check that free -semimodules over a cancellative semiring are themselves cancellative, as are subsemimodules of cancellative -semimodules. The quotient of a cancellative -semimodule over any of its subsemimodules is also cancellative, see Golan (1999, Proposition 15.24). Note that if is a ring, free -semimodules are just free -modules. Thus, as not every module over a ring is free, clearly not every -semimodule is free. On the other hand, (the direct sum, or product, of n copies of ) is clearly a free - semimodule, for every n ≥ 1. Another key property is that if A is a free -semimodule with basis V and B is another -semimodule, each map V → B uniquely extends to a -homomorphism A → B,see Golan(1999, Proposition 17.12). Remark A.8 If is a commutative semiring and A is a -semimodule admitting a finite basis, every basis has the same cardinality, which coincides with the rank of A (Tan 2014, Theorem 3.4). In fact, for every integer n > 0, every finitely generated n + n subsemimodule of N has a unique basis, whereas basis for semimodules of (R ) are unique up to non-zero multiples, see e.g. Kim and Roush (1980, Theorem 2.1). We finish this section by extending the Grothendieck construction from abelian monoids to semirings and semimodules. Definition A.9 Let M be an abelian monoid. Consider the congruence relation ρ in M × M defined as (u,v) ∼ (x , y) ⇔ there exists z ∈ M such that u + y + z = v + x + z. Let [u,v] denote the equivalence class of (u,v). Then M × M /ρ becomes a group with the componentwise addition. This group is called the Grothendieck group or group completion of M. There is a canonical homomorphism of monoids k : M → K (M ) defined as k (x ) =[x , 0].If M is a cancellative monoid, then k is injective. M M Given an element [u,v]∈ K (M ) we can interpret u as its positive part and v as its negative part, and the relation ∼ becomes the obvious one. The identity element is then 0 =[x , x ] for any element x ∈ M, and the inverse of [x , y] is [y, x ] for any x , y ∈ M. If f : M → N is a morphism of monoids, there is a morphism of groups K ( f ) : K (M ) → K (N ) that takes [u,v] to [ f (u), f (v)]. In fact, K is a functor from the category of abelian monoids to the category of abelian groups. Furthermore, K ( f ) ◦ k = k ◦ f . M N 123 D. Méndez,R.J.Sánchez-García ∼ ∼ As shown in Example A.5, K (N) Z. Similarly, one can show that K (R ) R. = = Crucially for us, this construction can be extended to semirings and semimodules. Definition A.10 Let be a semiring. The group completion K ( ) becomes a ring with the operation [x , x ]·[y , y ]=[x y + x y , x y + x y ], and k : → K ( ) 1 2 1 2 1 1 2 2 2 1 1 2 is in fact a morphism of semirings, which is injective if is cancellative. We call K ( ) the ring completion of .If f : → is a morphism of semirings, the map K ( f ) : K ( ) → K ( ) as introduced in Definition A.9 is a morphism of rings, and K is a functor from the category of semirings to the category of rings. If A is a -semimodule, then the abelian group K (A) together with the operation K ( ) × K (A) → K (A) given by [λ ,λ ][a , a ]=[λ a + λ a ,λ a + λ a ] is 1 2 1 2 1 1 2 2 1 2 2 1 a K ( )-module, the K ( )-module completion of A.Again,if f : A → B is a - morphism, the map K ( f ) : K (A) → K (B) becomes a morphism of K ( )-modules. Furthermore, if A is a free -semimodule with basis {v | i ∈ I }, it becomes immediate that K (A) is a free K ( )-module with basis [v , 0]| i ∈ I }. A.2 Chain complexes of semimodules Now that we have introduced the necessary algebraic structures, we move on to chain complexes of semimodules over semirings, following Patchkoria (2014). This theory is a natural generalization of the classical theory of chain complexes of modules and, in fact, they give rise to the same cycles, boundaries and homologies when the semimodules are modules over a ring. In order to introduce chain complexes in the context of semimodules an immediate problem arises: alternating sums cannot be defined as elements in a semimodule may not have inverses. The solution is to use two maps, a positive and negative part, for the differentials. Definition A.11 Let be a semiring and consider a sequence of -semimodules and homomorphisms indexed by n ∈ Z n+1 n − − − X : ··· ⇒ X ⇒ X ⇒ X ⇒ ··· . n+1 − − n − n−1 − − ∂ ∂ n+1 + − We say that X ={X ,∂ ,∂ } is a chain complex of -semimodules if n n + + − − + − − + ∂ ∂ + ∂ ∂ = ∂ ∂ + ∂ ∂ . n n+1 n n+1 n n+1 n n+1 As in the classical case, chain complexes of -semimodules give rise to a - semimodule of homology. + − Definition A.12 Let X ={X ,∂ ,∂ } be a chain complex of -semimodules. The n n -semimodule of cycles of X is + − Z (X, ) ={x ∈ X | ∂ (x ) = ∂ (x )}. n n n n 123 A directed persistent homology theory for dissimilarity… The nth homology of X is then the quotient -semimodule H (X, ) = Z (X, )/ρ (X, ) n n n where ρ (X, ) is the following -congruence relation on Z (X, ): n n + − x ∼ y ⇔∃u,v ∈ X s.t. x + ∂ (u) + ∂ (v) ρ (X , ) n+1 n n+1 n+1 + − = y + ∂ (v) + ∂ (u). n+1 n+1 We will omit from now on the coefficient semiring from the notation when it is clear from the context. Remark A.13 The definition of cycle is a direct generalization of the classical defini- tion. For the boundary relation, note that we may need two different chains u and v in order to establish two classes as homologous. Intuitively, these two classes are the ‘positive’ and ‘negative’ part of w, where x = y + ∂(w) in the classical setting. + − Remark A.14 If X ={X ,∂ ,∂ } is a chain complex of -semimodules, then n n + − + − K (∂ )−K (∂ ) K (∂ )−K (∂ ) n+1 n+1 n n ... −→ K (X ) − −−−−−−−−−→ K (X ) − −−−−−−−→ K (X ) −→ ··· n+1 n n−1 is a chain complex of K ( )-modules. If furthermore the -semimodules X are cancellative for all n, the converse is also true. If is a ring, then K ( ) = and the functor K acts as the identity on -modules. + − Therefore, X ={X ,∂ ,∂ } is a chain complex of -semimodules if and only if n n + − + − ∂ −∂ ∂ −∂ n+1 n+1 n n ··· −→ X − −−−−− → X − −−− → X −→ ··· n+1 n n−1 is a chain complex of -modules. In this case, it is clear that the homology semi- modules introduced in Definition A.12 are precisely the usual homology modules of X. We remark that a similar idea of decomposition of differentials in a positive and negative components has been used by Steiner to relate ω-categories with the so-called augmented directed complexes (Steiner 2004). These structures share, in fact, a deep similarity with Patchkoria’s chain complexes with semiring coefficients. We now turn our attention to maps between complexes. + −   + − Definition A.15 Let X ={X ,∂ ,∂ } and X ={X ,∂ ,∂ } be two chain com- n n n n n plexes of -semimodules. A sequence f ={ f } of -homomorphisms f : X → X n n n is said to be a morphism from X to X if + − − + ∂ f + f ∂ = ∂ f + f ∂ . n n−1 n n−1 n n n n It is clear that for such a map f Z (X ) is a -subsemimodule of Z (X ). Fur- n n n thermore, if X and X are cancellative -semimodules, f is also compatible with 123 D. Méndez,R.J.Sánchez-García the congruence relations ρ (X ) and ρ (X ), so it induces a map n n H ( f ) : H (X ) → H (X ). n n n It is then easy to check that homology H is a functor from the category of chain complexes of cancellative -semimodules and morphisms to the category of graded -semimodules. + − Remark A.16 Let be a semiring and let {X ,∂ ,∂ } be a chain complex of - n n semimodules. The family of canonical maps k : X → K (X ) givesrisetoa X n n + − + − morphism from {X ,∂ ,∂ } to K (X ), K (∂ ), K (∂ ) and, therefore, to a mor- n n n n n n phism of -semimodules H (k ) : H (X ) → H K (X ) , X n n which takes the class of x to the class of [x , 0]. Furthermore, if the -semimodules X are cancellative, H (k ) is injective, which in particular implies that H (X ) is n n X n a cancellative -semimodule. Also note that, by Remark A.14, the homology of + − + − K (X ), K (∂ ), K (∂ ) and the usual homology of K (X ), K (∂ ) − K (∂ ) are n n n n n n isomorphic as K ( )-modules. Finally, we discuss chain homotopies. + − Definition A.17 Let f ={ f } and g ={g } be morphisms from X ={X ,∂ ,∂ } to n n n n n + − X ={X ,∂ ,∂ }. We say that f is homotopic to g if there exist -homomorphisms n n n + − s , s : X → X such that n n n+1 + − − + − + + − ∂ s + ∂ s + s ∂ + s ∂ + g n n n n n+1 n+1 n−1 n−1 + + − − + + − − = ∂ s + ∂ s + s ∂ + s ∂ + f , n+1 n n+1 n n−1 n n−1 n + − for all n. The family {s , s } is called a chain homotopy from f to g, and we write n n + − (s , s ) : f  g. We then have the following result. Proposition A.18 (Patchkoria 2014, Proposition 3.3) Let f , g : X → X be morphisms between chain complexes of cancellative -semimodules. If f is homotopic to g, then H ( f ) = H (g). n n Homotopy equivalences are defined in the usual way and they induce isomorphisms on homology. We finish with a remark that the homotopy of maps behaves well with respect to semimodule completion. Remark A.19 If a morphism f : X → X is homotopic to g : X → X , then K ( f ) : K (X ) → K (X ) is homotopic to K (g) : K (X ) → K (X ). Furthermore, if both X and X are cancellative -semimodules, for all n, and X is a free - n n semimodule for all n, the converse is also true. 123 A directed persistent homology theory for dissimilarity… A.3 Chain complexes of semimodules of directed simplicial complexes In this section, we introduce chain complexes of semimodules and homology semi- modules of directed simplicial complexes, Definition 3.1. The results introduced here directly generalize those in Sect. 3. Definition A.20 The n-dimensional chains of a directed simplicial complex X are defined as the elements of the free -semimodule generated by (i.e. with basis) the n-simplices, C (X, ) = {[x , x ,..., x ]| (x , x ,..., x ) ∈ X } . n 0 1 n 0 1 n We call the elements [x , x ,..., x ]∈ C (X, ), (x , x ,..., x ) ∈ X, elementary 0 1 n n 0 1 n n-chains. Remark A.21 Note that if is a cancellative semiring, the -semimodule C (X, ) is cancellative for every n, as it is a free semimodule over a cancellative semiring. We now define the positive and negative differentials on C (X, ). Definition A.22 Let X be a directed simplicial complex. For each n > 0, we define + − morphisms of -semimodules ∂ ,∂ : C (X, ) → C (X, ) by n n−1 n n ∂ ([x , x ,..., x ]) = [x , x ,..., x ,..., x ], and 0 1 n 0 1 2i n i =0 n−1 ∂ ([x , x ,..., x ]) = [x , x ,..., x  ,..., x ]. 0 1 n 0 1 2i +1 n i =0 + − For n = 0, let ∂ ,∂ : C (X, ) →{0} be the trivial maps, by definition. 0 0 Proposition A.23 Let X be a directed simplicial complex and be a cancellative + − semiring. Then {C (X, ), ∂ ,∂ } is a chain complex of modules of -semimodules. n n Proof By Remark A.21,the -semimodule C (X, ) is cancellative for every n. + − Thus, by Remark A.14, it is enough to prove that K C (X, ) , K (∂ ) − K (∂ ) n n is a chain complex of K ( )-modules. Note that K C (X, ) is a free K ( )-module whose basis is given by elements [x , x ,..., x ], 0 such that [x , x ,..., x ] is an 0 1 n 0 1 n + − elementary n-chain. Thus, it suffices to show that the composition K (∂ ) − K (∂ ) ◦ n n + − K (∂ ) − K (∂ ) is trivial (zero) on these elements. Since n−1 n−1 + − i K (∂ ) − K (∂ ) [x , x ,..., x ,..., x ], 0 = (−1) [x , x ,..., x ], 0 , 0 1 i n 0 1 n n n i =0 the proof is now a straightforward computation analogous to the standard one for chain complexes in simplicial homology. 123 D. Méndez,R.J.Sánchez-García Proposition A.23 allows us to define the homology of a directed simplicial complex with coefficients on a semiring. Definition A.24 Let X be a directed simplicial complex and n ≥ 0 and a cancellative semiring. The n-dimensional homology of X, written H (X, ),isthe nth homology + − -semimodule of the chain complex {C (X, ), ∂ ,∂ }. n n + + Remark A.25 If is a ring, by defining ∂ = ∂ −∂ , {C (X, ), ∂} is a chain complex in the usual sense. In fact, it is precisely the chain complex introduced in Definition 3.3. Thus, as a consequence of Remark A.14, the homology theory introduced in Definition A.24 coincides with the one introduced in Definition 3.4. Therefore, homology with semiring coefficients of directed simplicial complexes is a non-trivial generalization of homology with ring coefficients. The following result will become useful later on, so we recorded it here. Proposition A.26 Let be a cancellative semiring and X a directed simplicial com- + − plex. The chain complexes of K ( )-semimodules K C (X, ) , K (∂ ), K (∂ ) n n + − and C X , K ( ) ,∂ ,∂ are isomorphic. n n Proof Take n ≥ 0 an integer. By definition, C X , K ( ) is the free K ( )-module over the elementary n-chains [x , x ,..., x ]. Similarly, K C (X, ) is a free K ( )- 0 1 n n module with a basis given by the elements [x , x ,..., x ], 0 where [x , x ,..., x ] 0 1 n 0 1 n is an elementary n-chain. Consider the K ( )-morphisms α : C X , K ( ) −→ K C (X, ) β : K C (X, ) −→ C X , K ( ) n n n n n n [x , x ,..., x ] −→ [x , x ,..., x ], 0 , [x , x ,..., x ], 0 −→ [x , x ,..., x ]. 0 1 n 0 1 n 0 1 n 0 1 n Immediate computations show that the families of K ( )-morphisms {α } and {β } n n are morphisms of K ( )-semimodules which are inverses of each other, and the claim follows. A.4 Functoriality of semiring homology In this section, we prove that homology is a functor from the category of directed simplicial complexes to the category of graded -semimodules. We also show that two morphisms allowing for the construction of the prism operator must induce the same map on homology, a result we will need to prove our persistent homology stability result. Recall the definition of morphism of directed simplicial complexes, Definition 3.8. Definition A.27 Let f : X → Y be a morphism of directed simplicial complexes. Then f induces morphisms of -semimodules C ( f ) ={C ( f )}, C ( f ) : C (X, ) −→ C (Y , ) n n n [x , x ,..., x ] −→ [ f (x ), f (x ), . . . , f (x )]. 0 1 n 0 1 n We will often abbreviate C ( f ) = f . n n 123 A directed persistent homology theory for dissimilarity… Proposition A.28 If f : X → Y is a morphism of directed simplicial complexes, the family of maps {C ( f )} is a -homomorphism of chain complexes. Therefore, it induces a map H ( f ) : H (X, ) → H (Y , ). n n n + + − − Proof Simple computations show that f ∂ = ∂ f and that f ∂ = ∂ f . n−1 n n−1 n n n n n Remark A.29 As a consequence of Proposition A.26,themap C ( f ) : C X , K ( ) → n n C Y , K ( ) is precisely K C ( f ) : K C (X, ) → K C (Y , ) . n n n n Corollary A.30 Homology is a functor from the category of directed simplicial com- plexes to the category of graded -semimodules. In particular, isomorphic directed simplicial complexes have isomorphic homologies. Note further that when is a ring, the homology functor is precisely the functor introduced in Sect. 3.2. We finish this section by showing a sufficient condition for two morphisms to induce the same map on homology. We will need this result to prove that our definition of persistent homology is stable. Lemma A.31 Let be a cancellative semiring. Let X and Y be two directed simplicial complexes and let f , g : X → Y be morphisms of directed simplicial complexes such that if (x , x ,..., x ) ∈ X, then f (x ), f (x ),..., f (x ), g(x ), . . . , g(x ) ∈ Y 0 1 n 0 1 i i n for every i = 0, 1,..., n. Then, H ( f ) = H (g) for every n ≥ 0. n n Proof For x =[x , x ,..., x ]∈ C (X, ) an elementary n-chain, define 0 1 n n s [x , x ,..., x ]= [ f (x ), . . . , f (x ), g(x ),..., g(x )], and 0 1 n 0 2i 2i n i =0 n−1 s [x , x ,..., x ]= [ f (x ),..., f (x ), g(x ). ..., g(x )]. 0 1 n 0 2i +1 2i +1 n i =0 Now recall that since is cancellative, the canonical map k : C (X, ) → n n + − K C (X, ) is injective for all n. We will show that (s , s ) : C ( f )  C (g) by proving that both sides of the equality in Definition A.17 have the same images through k . We will also make use of the isomorphism K C (X, ) = C X , K ( ) estab- n n n lished in Proposition A.26, and that for any -morphism h : C (X, ) → C (X, ), n n k ◦ h = K (h) ◦ k . n n By abuse of notation, we write k (x ) = x =[x , x ,..., x ]∈ C X , K ( ) . n 0 1 n n + − + − Denote ∂ = K (∂ ) − K (∂ ) and s = K (s ) − K (s ). Then, n n n n n n ∂ (x ) = (−1) [x , x ,..., x ,..., x ], s (x ) n 0 1 i n n i =0 = (−1) f (x ), f (x ),..., f (x ), g(x ),..., g(x ) . 0 1 j j n j =0 123 D. Méndez,R.J.Sánchez-García Note that, by appropriately grouping the terms in the equality in Definition A.17,it suffices to show that K (g ) − K ( f ) and ∂ s + s ∂ have the same image on x. n n n+1 n n−1 n These computations are analogous to the ones in Lemma 3.14. A.5 Comparison to directed (persistent) homology In Sect. 3, directed homology modules were introduced by taking submodules of homology generated by cycles with only non-negative coefficients on the elementary chains. This was done for the rings Z, Q, and R. These three rings have in common that they are the Grothendieck’s completion of a cancellative, zerosumfree semiring. + + Namely, Z = K (N), Q = K (Q ), R = K (R ). It turns out that homology with coefficients in this class of semirings has the ability to detect directed cycles, and what we have introduced in the main body of this article is a particular case of this fact. + + Proposition A.32 Let ∈{N, Q , R } and let X be a directed simplicial complex. Dir Consider Z X , K ( ) the set of directed cycles introduced in Definition 3.6. Then, Dir Z X , K ( ) = Z (X, ) as -semimodules. Proof First note that for the considered semirings, can be seen as the non- negative elements of K ( ). Using this identification, it is straightforward to show Dir that Z X , K ( ) is a -semimodule. In particular, linear combinations of directed cycles with non-negative coefficients are directed cycles. By abuse of notation, we will denote elementary n-chains in the same way in C (X, ) and in C X , K ( ) . Assume that λ c ∈ Z (X, ). Then clearly n n i i n i =1 λ c ∈ Z X , K ( ) (see Proposition A.26). Furthermore, since the involved i i n i =1 Dir coefficients are non-negative, λ c ∈ Z X , K ( ) . Consequently, we can i i i =1 n introduce a map Dir ϕ : Z (X, ) −→ Z X , K ( ) n n λ c −→ λ c . i i i i i =1 i =1 It is immediate to check that ϕ is a morphism of -semimodules. It is also injective. We only need to prove that it is surjective. For that, it would be enough to prove that if n n Dir λ c ∈ Z X , K ( ) , then λ c ∈ Z (X, ). To that purpose, note that i i i i n i =1 n i =1 x = λ c can be regarded as an element of C (X, ) as the involved coefficients i i n i =1 + − are non-negative, and thus, in . We only need to prove that ∂ (x ) = ∂ (x ). Since + − is cancellative, it is enough to prove that K ∂ (x ) = K ∂ (x ) , or equivalently, + − that K ∂ (x ) − K ∂ (x ) = 0. But this follows from Proposition A.26 and from the fact that x ∈ Z X , K ( ) . In general, we could use homology with coefficients in a cancellative, zerosumfree semiring to detect directed cycles. Namely, we can prove an analogous to Proposition 3.24 for homology with semiring coefficients. 123 A directed persistent homology theory for dissimilarity… Proposition A.33 Let be a cancellative, zerosumfree semiring. Let X be a 1- dimensional directed simplicial complex with vertex set {v ,v ,...,v },n ≥ 1, 0 1 n and with 1-simplices e ,e ,..., e where either e = (v ,v ) or e = (v ,v ), 0 1 n i i i +1 i i +1 i i = 0, 1,..., n, and where v = v . Then, H (X, ) = is non-trivial if and only n+1 0 1 if either e = (v ,v ) for all i or e = (v ,v ) for all i, and H (X, ) ={0} in i i i +1 i i +1 i 1 any other case. Proof Let [e ] denote the elementary 1-chain associated to e .Let x = λ [e ] i i i i i =0 be any non-trivial 1-cycle. We can assume without loss of generality that λ = 0. If e = (v ,v ), then λ [v ] is a non-trivial summand in ∂ (e ). Since is zerosumfree, 0 0 1 0 1 0 such summand does not have an inverse, thus λ [v ] must be a summand in ∂ (x ). 0 1 Now note that e is the only other simplex involving the vertex v . Furthermore, [v ] 1 1 1 − − appears in ∂ [e ] if and only if e = (v ,v ), in which case ∂ [e ]= λ v .We 1 1 1 2 1 1 1 further deduce that λ = λ . 1 0 By proceeding iteratively, we deduce that if x is non-trivial, necessarily e = (v ,v ) and λ = λ = ··· = λ = λ, in which case x = λ[v ,v ]. i i +1 0 1 n i i +1 i =0 Simple computations show that such x is indeed a cycle, and since there are no 2- simplices, it cannot be homologous to any other cycle. Thus, H (X, ) = .The case where e = (v ,v ) is analogous. 0 1 0 The question would then be how to use this information in the setting of per- sistent homology. Recall from Remark A.16 that the family of canonical maps k : C (X, ) → K C (X, ) gives rise to a morphism of -semimodules C (X , ) n n k : H (X, ) → H K (C (X, )) , which takes the class of x to the class of C (X , ) n n [x , 0]. As a consequence of Proposition A.26, we can also rewrite H K (C (X, )) = H C (X , K ( )) . Therefore, the map k above can be regarded as a morphism n C (X , ) k : H (X, ) → H X , K ( ) . We denote this map by k . We then have the C (X , ) n n X following result. + + Proposition A.34 Let X be a directed simplicial complex and let ∈{N, Q , R }. Dir The directed homology module introduced in Definition 3.6,H X , K ( ) ,isthe submodule of H X , K ( ) generated by Im k . n X Proof Note that all representatives of classes in H (X, ) are in Z (X, ). Thus, the n n image through k corresponds to a subset of the classes generated by positive cycles, Dir as a consequence of Proposition A.32. Namely, Im k ⊆ H X , K ( ) . To prove the result, it would be enough to prove that every class of a directed cycle is in Im k . But this follows from the definition of H (X, ). More generally, given a cancellative, zerosumfree semiring, we can define Dir H X , K ( ) as the submodule of H X , K ( ) generated by Im k , and such n X module would detect directed cycles as exhibited by Proposition A.34.If K ( ) is a field, we can also use these homology modules to define persistence vector spaces for which we can prove stability results in a way analogous to the results in Sect. 4. The results of this paper arose from the ideas in the constructions above, which are introduced in their full generality using homology with semiring coefficients. This shows that the idea of introducing a submodule of directed homology by taking cycles with non-negative coefficients has a deeper algebraic meaning, providing more context 123 D. Méndez,R.J.Sánchez-García for the results in this paper and presenting them in their full generality. Furthermore, similar ideas could be considered for semirings which are not zerosumfree, perhaps allowing for the detection of additional information that persistent homology with ring coefficients may miss. The matter of whether a theory of persistence semimodules could provide more information could also be looked into. References Bauer, U., Lesnick, M.: Induced matchings of barcodes and the algebraic stability of persistence. In: Pro- ceedings of the Thirtieth Annual Symposium on Computational Geometry (New York, NY, USA), SOCG’14, Association for Computing Machinery. pp. 355–364 (2014) Burago, D., Burago, Y., Ivanov, S.: A Course in Metric Geometry. American Mathematical Society, Provi- dence, RI (2001) Carlsson, G.: Topology and data. Bull. Am. Math. Soc. (N.S.) 46(2), 255–308 (2009) Carlsson, G., Mémoli, F., Ribeiro, A., Segarra, S.: Hierarchical quasi-clustering methods for asymmet- ric networks. In: Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML’14, JMLR.org. pp. II-352–II-360 (2014) Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L.J., Oudot, S.: Proximity of persistence modules and their diagrams. In: Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry (New York, NY, USA), SCG ’09, Association for Computing Machinery. pp. 237–246 (2009) Chazal, F., de Silva, V., Glisse, M., Oudot, S.: The Structure and Stability of Persistence Modules. Springer- Briefs in Mathematics. Springer, Cham (2016) Chowdhury, S., Mémoli, F.: A functorial Dowker theorem and persistent homology of asymmetric networks. J. Appl. Comput. Topol. 2(1–2), 115–175 (2018) Chowdhury, S., Mémoli, F.: Persistent path homology of directed networks. In: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (USA), SODA’18, Society for Industrial and Applied Mathematics, pp. 1152–1169 (2018) Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological Persistence and Simplification, Discrete and Computational Geometry and Graph Drawing (Columbia, SC, 2001), vol. 28, pp. 511–533 (2002) Edelsbrunner, H., Harer, J.L.: Computational Topology: An Introduction. American Mathematical Society, Providence, RI (2010) Edelsbrunner, H., Jabłonski, ´ G., Mrozek, M.: The persistent homology of a self-map. Found. Comput. Math. 15(5), 1213–1244 (2015) Fajstrup, L., Goubault, E., Haucourt, E., Mimram, S., Raussen, M.: Directed Algebraic Topology and Concurrency. Springer, Berlin (2016) Fukuda, K., Prodon, A.: Double description method revisited, Combinatorics and computer science (Brest: Lecture Notes in Computer Science vol. 1120. Springer, Berlin 1996), pp. 91–111 (1995) Golan, J.S.: Semirings and Their Applications. Kluwer Academic Publishers, Dordrecht (1999) Grandis, M.: Directed Algebraic Topology: Models of Non-reversible Worlds. Cambridge University Press, Cambridge (2009) Grigor’yan, A., Lin, Y., Muranov, Y., Yau, S.-T.: Path Complexes and their Homologies. J. Math. Sci. 248, 564–599 (2020) Kim, K.H., Roush, F.W.: Generalized fuzzy matrices. Fuzzy Sets Syst. 4(3), 293–315 (1980) Motzkin, T.S., Raiffa, H., Thompson, G.L., Thrall, R.M.: The Double Description Method, Contributions to the Theory of Games, Annals of Mathematics Studies, no. 28, Princeton University Press, Princeton, NJ, , vol. 2, pp. 51–73 (1953) Munkres, J.R.: Elements of Algebraic Topology. Addison-Wesley Publishing Company, Menlo Park, CA (1984) Otter, N., Porter, M.A., Tillmann, U., Grindrod, P., Harrington, H.A.: A roadmap for the computation of persistent homology. EPJ Data Sci. 6(1), 17 (2017) Patchkoria, A.: Cohomology of monoids with coefficients in semimodules. Bull. Georgian Acad. Sci. 86(3), 545–548 (1977) Patchkoria, A.: Cohomology monoids of monoids with coefficients in semimodules I. J. Homotopy Relat. Struct. 9(1), 239–255 (2014) 123 A directed persistent homology theory for dissimilarity… Reimann, M.W., Nolte, M., Scolamiero, M., Turner, K., Perin, R., Chindemi, G., Dłotko, P., Levi, R., Hess, K., Markram, H.: Cliques of neurons bound into cavities provide a missing link between structure and function. Front. Comput. Neurosci. 11, 48 (2017) Steiner, R.: Omega-categories and chain complexes. Homol. Homotopy Appl. 6(1), 175–200 (2004) Tan, Y.-J.: Bases in semimodules over commutative semirings. Linear Algebra Appl. 443, 139–152 (2014) Terzer, M.: Large scale methods to enumerate extreme rays and elementary modes, Ph.D. thesis (2009), ETH Zürich, Switzerland Turner, K.: Rips filtrations for quasimetric spaces and asymmetric functions with stability results. Algebr. Geom. Topol. 19(3), 1135–1170 (2019) Zomorodian, A., Carlsson, G.: Computing persistent homology. Discrete Comput. Geom. 33(2), 249–274 (2005) Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Journal

Journal of Applied and Computational TopologySpringer Journals

Published: Dec 1, 2023

Keywords: Persistent homology; Dissimilarity functions; Directed simplicial complexes; Directed cycles; 55N31; 55N35; 55U10; 16Y60

There are no references for this article.