Sampling networks of ecological interactions

bioRxiv preprint first posted online Sep. 1, 2015; doi: http://dx.doi.org/10.1101/025734. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Pedro Jordano (jordano@ebd.csic.es)
Integrative Ecology Group, Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas (EBD-CSIC), Avenida Americo Vespucio s/n, E–41092 Sevilla, Spain
Sevilla, September 5, 2015

Summary

1. Sampling ecological interactions presents similar challenges, problems, potential biases, and constraints as sampling individuals and species in biodiversity inventories. Interactions are just pairwise relationships among individuals of two different species, such as those among plants and their seed dispersers in frugivory interactions or those among plants and their pollinators. Sampling interactions is a fundamental step to build robustly estimated interaction networks, yet few analyses have attempted a formal approach to their sampling protocols.
2. Robust estimates of the actual number of interactions (links) within diversified ecological networks require adequate sampling effort that needs to be explicitly gauged. Yet we still lack a sampling theory explicitly focusing on ecological interactions.
3. While the complete inventory of interactions is likely impossible, a robust characterization of its main patterns and metrics is probably realistic.
We must acknowledge that a sizable fraction of the maximum number of interactions $I_{max}$ among, say, A animal species and P plant species (i.e., $I_{max} = AP$) is impossible to record due to forbidden links, i.e., life-history restrictions. Thus, the number of observed interactions I in robustly sampled networks is typically $I \ll I_{max}$, resulting in extremely sparse interaction matrices with low connectance.
4. Reasons for forbidden links are multiple but mainly stem from spatial and temporal uncoupling, size mismatches, and intrinsically low probabilities of interspecific encounter for most potential interactions of partner species. Adequately assessing the completeness of a network of ecological interactions thus needs knowledge of the natural history details embedded, so that forbidden links can be “discounted” when addressing sampling effort.
5. Here I provide a review and outline a conceptual framework for interaction sampling by building an explicit analogue to individuals and species sampling, thus extending diversity-monitoring approaches to the characterization of complex networks of ecological interactions. This is crucial to assess the fast-paced and devastating effects of defaunation-driven loss of key ecological interactions and the services they provide, and the analogous losses related to interaction gains due to invasive species and biotic homogenization.
Keywords

complex networks, food webs, frugivory, mutualism, plant-animal interactions, pollination, seed dispersal

Introduction

Biodiversity sampling is a labour-intensive activity, and sampling is often not sufficient to detect all or even most of the species present in an assemblage.
Gotelli & Colwell (2011).

Biodiversity species assessment aims at sampling individuals in collections and determining the number of species represented. Given that, by definition, samples are incomplete, these collections do not enumerate all the species actually present. The ecological literature dealing with robust estimators of species richness and diversity in collections of individuals is immense, and a number of useful approaches have been used to obtain such estimates (Magurran, 1988; Gotelli & Colwell, 2001; Colwell, Mao & Chang, 2004; Hortal, Borges & Gaspar, 2006; Colwell, 2009; Gotelli & Colwell, 2011; Chao et al., 2014). Recent effort has also focused on defining essential biodiversity variables (EBVs) (Pereira et al., 2013) that can be sampled and measured repeatedly to complement biodiversity estimates. Yet sampling species or taxa-specific EBVs is just probing a single component of biodiversity; interactions among species are another fundamental component, one that supports the existence, but in some cases also the extinction, of species. For example, the extinction of interactions represents a dramatic loss of biodiversity because it entails the loss of fundamental ecological functions (Valiente-Banuet et al., 2014).
This missed component of biodiversity loss, the extinction of ecological interactions, very often accompanies, or even precedes, species disappearance. Interactions among species are a key component of biodiversity, and here we aim to show that most problems associated with sampling interactions in natural communities mirror those associated with sampling species diversity, only in a more severe form. We consider pairwise interactions among species at the habitat level, in the context of alpha diversity and the estimation of local interaction richness from sampling data (Chao et al., 2014). In the first part we provide a succinct overview of previous work addressing sampling issues for ecological interaction networks. In the second part, after a short overview of asymptotic diversity estimates (Gotelli & Colwell, 2001), we discuss specific rationales for sampling the biodiversity of ecological interactions. Most of the examples come from the analysis of plant-animal interaction networks, yet are applicable to other types of species-species interactions.

Interactions can be a much better indicator of the richness and diversity of ecosystem functions than a simple list of taxa and their abundances and/or related essential biodiversity variables (EBVs). Thus, sampling interactions should be a central issue when identifying and diagnosing ecosystem services (e.g., pollination, natural seeding by frugivores, etc.). Fortunately, the whole battery of biodiversity-related tools used by ecologists to sample biodiversity (species, sensu stricto) can be extended and applied to the sampling of interactions. Analogs are evident between these approaches (see Table 2 in Colwell, Mao & Chang, 2004). Monitoring interactions is a form of biodiversity sampling and is subject to similar methodological shortcomings, especially under-sampling (Jordano, 1987; Jordano, Vázquez & Bascompte, 2009; Coddington et al., 2009; Vázquez, Chacoff & Cagnolo, 2009; Dorado et al., 2011; Rivera-Hutinel et al., 2012). For example, when we study mutualistic networks, our goal is to make an inventory of the distinct pairwise interactions that make up the network. We are interested in having a complete list of all the pairwise interactions among species (e.g., all the distinct, species-species interactions, or links, among the pollinators and flowering plants) that do actually exist in a given community. Sampling these interactions thus entails exactly the same problems, limitations, constraints, and potential biases as sampling individual organisms and species diversity. As Mao & Colwell (2005) put it, these are the workings of Preston’s demon, the moving “veil line” (Preston, 1948) between the detected and the undetected interactions as sample size increases.

Early efforts to recognize and solve sampling problems in analyses of interactions stem from research on food webs and from attempts to determine how undersampling biases food web metrics (Martinez, 1991; Cohen et al., 1993; Martinez, 1993; Bersier, Banasek-Richter & Cattin, 2002; Brose, Martinez & Williams, 2003; Banasek-Richter, Cattin & Bersier, 2004; Wells & O’Hara, 2012). In addition, the myriad classic natural history studies documenting animal diets, host-pathogen infection records, plant herbivory records, etc., represent efforts to document interactions occurring in nature. All of them share the problem of sampling incompleteness influencing the patterns and metrics reported.
Yet, despite the early recognition that incomplete sampling may seriously bias the analysis of ecological networks (Jordano, 1987), only recent studies have explicitly acknowledged it and attempted to determine its influence (Ollerton & Cranmer, 2002; Nielsen & Bascompte, 2007; Vázquez, Chacoff & Cagnolo, 2009; Gibson et al., 2011; Olesen et al., 2011; Chacoff et al., 2012; Rivera-Hutinel et al., 2012; Olito & Fox, 2014; Bascompte & Jordano, 2014; Vizentin-Bugoni, Maruyama & Sazima, 2014; Fründ, McCann & Williams, 2015). The sampling approaches have been extended to predict patterns of coextinctions in interaction assemblages (e.g., hosts-parasites) (Colwell, Dunn & Harris, 2012). Most empirical studies provide no estimate of sampling effort, implicitly assuming that the reported network patterns and metrics are robust. Yet recent evidence indicates that the number of partner species detected, the number of actual links, and some aggregate statistics describing network patterns are prone to sampling bias (Nielsen & Bascompte, 2007; Dorado et al., 2011; Olesen et al., 2011; Chacoff et al., 2012; Rivera-Hutinel et al., 2012; Olito & Fox, 2014; Fründ, McCann & Williams, 2015). Most of this evidence, however, comes either from simulation studies (Fründ, McCann & Williams, 2015) or from relatively species-poor assemblages.
Most certainly, sampling limitations pervade biodiversity inventories in tropical areas (Coddington et al., 2009) and we might rightly expect that frequent interactions may be over-represented and rare interactions may be missed entirely in studies of mega-diverse assemblages (Bascompte & Jordano, 2014); but to what extent?

Sampling interactions: methods

When we sample interactions in the field we record the presence of two species that interact in some way. For example, Snow and Snow (1988) recorded an interaction whenever they saw a bird “touching” a fruit on a plant. We observe and record feeding observations, visitation, occupancy, presence in pollen loads or in fecal samples, etc., of individual animals or plants and accumulate pairwise interactions, i.e., lists of species partners and the frequencies with which we observe them. Therefore, estimating the sampling completeness of pairwise interactions for a whole network requires some gauging of how the number (richness) of distinct pairwise interactions accumulates as sampling effort is increased, and/or estimating the uncertainty around the missed links (Wells & O’Hara, 2012).

Most types of ecological interactions can be illustrated with bipartite graphs, with two or more distinct groups of interacting partners (Bascompte & Jordano, 2014); for illustration purposes I will focus more specifically on plant-animal interactions. Sampling interactions requires filling the cells of an interaction matrix with data.
The A × P matrix (the adjacency matrix for the graph representation of the network) is a 2D inventory of the interactions among, say, A animal species (rows) and P plant species (columns) (Jordano, 1987; Bascompte & Jordano, 2014). The matrix entries give the values of the pairwise interactions, and can be 0 or 1, for presence-absence of a given pairwise interaction, or take a quantitative weight $w_{ji}$ to represent the interaction intensity or unidirectional effect of species j on species i (Bascompte & Jordano, 2014; Vázquez et al., 2015). The outcomes of most ecological interactions are dependent on frequency of encounters (e.g., visit rate of pollinators, number of records of ant defenders, frequency of seeds in fecal samples). Thus, a frequently used proxy for interaction intensities $w_{ji}$ is just how frequent new interspecific encounters are, whether or not appropriately weighted to estimate interaction effectiveness (Vázquez, Morris & Jordano, 2005).

We need to define two basic steps in the sampling of interactions: 1) which type of interactions we sample; and 2) which type of record we get to document the existence of an interaction. In step #1 we need to take into account whether we are sampling the whole community of interactor species (all the animals, all the plants) or just a subset of them, i.e., a submatrix of m < A animal species and n < P plant species of the adjacency matrix (i.e., the A × P matrix representation of interactions among the partner species). Subsets can be: a) all the potential plants interacting with a subset of the animals (Fig. 1a); b) all the potential animal species interacting with a subset of the plant species (Fig. 1b); c) a subset of all the potential animal species interacting with a subset of all the plant species (Fig. 1c). While some discussion has considered how to establish the limits of what represents a network (Strogatz, 2001) (in analogy to discussion on food-web limits; Cohen, 1978), it must be noted that situations a-c in Fig. 1 do not represent complete interaction networks. As vividly stated by Cohen et al. (1993): “As more comprehensive, more detailed, more explicit webs become available, smaller, highly aggregated, incompletely described webs may progressively be dropped from analyses of web structure (though such webs may remain useful for other purposes, such as pedagogy)”. Subnetwork sampling is widespread in studies of biological networks (e.g., protein interactions, gene regulation), yet it is important to recognize that most properties of subnetworks (even random subsamples) do not represent properties of whole networks (Stumpf, Wiuf & May, 2005).

In step #2 above we face the problem of the type of record we take to sample interactions. This is important because it defines whether we approach the problem of filling up the interaction matrix in a “zoo-centric” way or in a “phyto-centric” way. Zoo-centric studies directly sample animal activity and document the plants ‘touched’ by the animal. Examples include analysis of pollen samples recovered from the body of pollinators, analysis of fecal samples of frugivores, radio-tracking data, etc. Phyto-centric studies take samples of focal individual plant species and document which animals ‘arrive’ at or ‘touch’ the plants. Examples include focal watches of fruiting or flowering plants to record visitation by animals, raising insect herbivores from seed samples, identifying herbivory marks in samples of leaves, etc.

Most recent analyses of plant-animal interaction networks are phyto-centric; just 3.5% of available plant-pollinator (N = 58) or 36.6% of plant-frugivore (N = 22) interaction datasets are zoo-centric (see Schleuning et al., 2012). Moreover, most available datasets on host-parasite (parasitoid) or plant-herbivore interactions are “host-centric” or phyto-centric (e.g., Thébault & Fontaine, 2010; Morris et al., 2013; Eklöf et al., 2013). This may be related to a variety of causes, like preferred methodologies by researchers working with a particular group or system, logistic limitations, or inherent taxonomic focus of the research questions. A likely result of phyto-centric sampling would be adjacency matrices with large A : P ratios. In any case we don’t have a clear view of the potential biases that taxa-focused sampling may generate in observed network patterns, for example by generating consistently asymmetric interaction matrices (Dormann et al., 2009). System symmetry has been suggested to influence estimations of generalization levels in plants and animals when measured as $I_A$ and $I_P$ (Elberling & Olesen, 1999); thus, differences in $I_A$ and $I_P$ between networks may arise from different A : P ratios rather than other ecological factors (Olesen & Jordano, 2002).
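As a minimal sketch of the matrix-filling step described in this section, the following Python fragment tallies field records into a weighted animal × plant matrix, derives its presence-absence version, and extracts a submatrix for a subset of species. The species codes, the record list, and the function names (`interaction_matrix`, `binarize`, `submatrix`) are illustrative assumptions, not part of the original study.

```python
from collections import Counter

# Hypothetical field records: one (animal, plant) pair per observed encounter.
records = [
    ("Tm", "Hh"), ("Tm", "Cm"), ("Sa", "Hh"),
    ("Tm", "Hh"), ("Sa", "Hh"), ("Sa", "Hh"),
]

def interaction_matrix(records):
    """Return sorted animal labels, sorted plant labels, and a weighted
    matrix (animals in rows, plants in columns) of encounter counts."""
    animals = sorted({a for a, _ in records})
    plants = sorted({p for _, p in records})
    counts = Counter(records)
    weights = [[counts[(a, p)] for p in plants] for a in animals]
    return animals, plants, weights

def binarize(weights):
    """Presence-absence (0/1) version of a weighted matrix."""
    return [[1 if w > 0 else 0 for w in row] for row in weights]

def submatrix(animals, plants, weights, keep_animals, keep_plants):
    """Rows and columns restricted to a subset of species (a subnetwork)."""
    rows = [i for i, a in enumerate(animals) if a in keep_animals]
    cols = [j for j, p in enumerate(plants) if p in keep_plants]
    return [[weights[i][j] for j in cols] for i in rows]

animals, plants, W = interaction_matrix(records)
B = binarize(W)
# Zoo-centric style subset: all plants recorded for a single animal species.
W_tm = submatrix(animals, plants, W, {"Tm"}, set(plants))
```

Here the encounter counts stand in for the interaction weights; any weighting by interaction effectiveness would have to be applied separately.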
Reasonably complete analyses of interaction networks can be obtained when combining both phyto-centric and zoo-centric sampling. For example, Bosch et al. (2009) showed that the addition of pollen load data on top of focal-plant sampling of pollinators unveiled a significant number of interactions, resulting in important network structural changes. Connectance increased 1.43-fold, mean plant connectivity went from 18.5 to 26.4, and mean pollinator connectivity from 2.9 to 4.1; moreover, extreme specialist pollinator species (singletons in the adjacency matrix) decreased 0.6-fold. Olesen et al. (2011) identified pollen loads on sampled insects and added the new links to an observation-based visitation matrix, with an extra 5% of links representing the estimated number of missing links in the pollination network. The overlap between observational and pollen-load recorded links was only 33%, underscoring the value of combining methodological approaches. Zoo-centric sampling has recently been extended with the use of DNA-barcoding, for example with plant-herbivore (Jurado-Rivera et al., 2009), host-parasitoid (Wirta et al., 2014), and plant-frugivore interactions (González-Varo, Arroyo & Jordano, 2014). For mutualistic networks we would expect that zoo-centric sampling could help unveil interactions of the animals with rare plant species or with relatively common plant species which are difficult to sample by direct observation.
Future methodological work may provide significant advances showing how mixing different sampling strategies strengthens the completeness of network data. These mixed strategies may combine, for instance, timed watches at focal plants, spot censuses along walked transects, pollen load or seed contents analyses, monitoring with camera traps, and DNA barcoding records. We might expect increased power of these mixed sampling approaches when combining different methods from both phyto- and zoo-centric perspectives (Bosch et al., 2009; Blüthgen, 2010). Note also that the different methods could be applied in different combinations to the two distinct sets of species. However, there are no tested protocols and/or sampling designs for ecological interaction studies to suggest an optimum combination of approaches. Ideally, pilot studies would provide adequate information for each specific study setting.

Sampling interactions: rationale

The number of distinct pairwise interactions that we can record in a landscape (an area of relatively homogeneous vegetation, analogous to the one we would use to monitor species diversity) is equivalent to the number of distinct classes in which we can classify the recorded encounters among individuals of two different species. Yet, individual-based interaction networks have been only recently studied (Dupont, Trøjelsgaard & Olesen, 2011; Wells & O’Hara, 2012).
The most usual approach has been to pool individual-based interaction data into species-based summaries, an approach that ignores the fact that only a fraction of individuals may actually interact given a per capita interaction effect (Wells & O’Hara, 2012). Wells & O’Hara (2012) illustrate the pros and cons of the approach. We walk in the forest and see a blackbird Tm picking an ivy Hh fruit and ingesting it: we have a record for a Tm-Hh interaction. We keep advancing and record again a blackbird feeding on hawthorn Cm fruits, so we record a Tm-Cm interaction; as we advance we encounter another ivy plant and record a blackcap swallowing a fruit, so we now have a new Sa-Hh interaction, and so on. At the end we have a series of classes (e.g., Sa-Hh, Tm-Hh, Tm-Cm, etc.), along with their observed frequencies. Bunge & Fitzpatrick (1993) provide an early review of the main aspects and approaches to estimate the number of distinct classes C in a sample of observations.

Our sampling above would have resulted in a vector $n = [n_1 \ldots n_C]$ where $n_i$ is the number of records in the $i$th class. As stressed by Bunge & Fitzpatrick (1993), however, the $i$th class would appear in the sample if and only if $n_i > 0$, and we don’t know a priori which $n_i$ are zero. So, $n$ is not observable. Rather, what we get is a vector $c = [c_1 \ldots c_n]$ where $c_j$ is the number of classes represented $j$ times in our sampling: $c_1$ is the number of singletons (interactions recorded once), $c_2$ is the number of twin pairs (interactions with just two records), $c_3$ the number of triplets, etc. The problem thus becomes one of estimating the number of distinct classes C from the vector of $c_j$ values and the frequency of unobserved interactions (see “The real missing links” below).

More specifically, we usually obtain a type of reference sample (Chao et al., 2014) for interactions: a series of replicated samples (e.g., observation days, 1h watches, etc.) with quantitative information, i.e., recording the number of instances of each interaction type on each day. These replicated abundance data can be treated in three ways: 1) Abundance data within replicates: the counts of interactions, separately for each day; 2) Pooled abundance data: the counts of interactions, summed over all days (the most usual approach); and 3) Replicated incidence data: the number of days on which we recorded each interaction. Assuming a reasonable number of replicates, replicated incidence data are considered the most robust statistically, as they take account of heterogeneity among days (Colwell, Mao & Chang, 2004; Colwell, Dunn & Harris, 2012; Chao et al., 2014). Thus, both presence-absence and weighted information on interactions can be accommodated for this purpose.

The species assemblage

When we consider an observed and recorded sample of interactions on a particular assemblage of $A_{obs}$ and $P_{obs}$ species (or a set of replicated samples) as a reference sample (Chao et al., 2014) we may have three sources of undersampling error that are ignored by treating a reference sample as a true representation of the interactions in a well-defined assemblage: 1) some animal species are actually present but not observed (zero abundance or incidence in the interactions in the reference sample), $A_0$; 2) some plant species are actually present but not observed (zero abundance or incidence in the interactions in the reference sample), $P_0$; 3) some unobserved links (the zeroes in the adjacency matrix, UL) may actually occur but not be recorded. Thus a first problem is determining if $A_{obs}$ and $P_{obs}$ truly represent the actual species richness interacting in the assemblage. To this end we might use the replicated reference samples to estimate the true number of interacting animal $A_{est}$ and plant $P_{est}$ species as in traditional diversity estimation analysis (Chao et al., 2014). If there are no uniques (species seen on only one day), then $A_0$ and $P_0$ will be zero, and we have $A_{obs}$ and $P_{obs}$ as robust estimates of the actual species richness of the assemblage. If $A_0$ and $P_0$ are not zero they estimate the minimum number of undetected animal and plant species that can be expected with a sufficiently large number of replicates, taken from the same assemblage/locality by the same methods in the same time period. We can use extrapolation methods (Colwell, Dunn & Harris, 2012) to estimate how many additional replicate surveys it would take to reach a specified proportion g of $A_{est}$ and $P_{est}$.

The interactions

We are then faced with assessing the sampling of interactions I. Table 1 summarizes the main components and targets for estimation of interaction richness.
In contrast with traditional species diversity estimates, sampling networks has the paradox that despite the potentially interacting species being present in the sampled assemblage (i.e., included in the $A_{obs}$ and $P_{obs}$ species lists), some of their pairwise interactions are impossible to record. The reason is forbidden links. Independently of whether we sample full communities or subset communities we face a problem: some of the interactions that we can visualize in the empty adjacency matrix will simply not occur. With a total of $A_{obs} P_{obs}$ “potential” interactions (eventually augmented to $A_{est} P_{est}$ in case we have undetected species), a fraction of them are impossible to record, because they are forbidden (Jordano, Bascompte & Olesen, 2003; Olesen et al., 2011).

Our goal is to estimate the true number of non-null AP interactions, including interactions that actually occur but have not been observed ($I_0$), from the replicated incidence frequencies of interaction types: $I_{est} = I_{obs} + I_0$. Note that $I_0$ estimates the minimum number of undetected plant-animal interactions that can be expected with a sufficiently large number of replicates, taken from the same assemblage/locality by the same methods in the same time period. Therefore we have two types of non-observed links (UL), corresponding to the real assemblage species richness and to the observed assemblage species richness, respectively (Table 1).
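The relation $I_{est} = I_{obs} + I_0$ can be illustrated with a short Python sketch. The text does not name a specific estimator for $I_0$; the fragment below assumes the classic incidence-based Chao2 lower bound (with its usual fallback when no interaction is recorded on exactly two days), applied to interaction types instead of species. The daily samples, the species code "Er", and the function names are hypothetical.

```python
from collections import Counter

def incidence_frequencies(replicates):
    """Number of replicates (e.g., days) on which each interaction was recorded.
    `replicates` is a list of collections of (animal, plant) pairs."""
    freq = Counter()
    for day in replicates:
        freq.update(set(day))  # count each interaction at most once per day
    return freq

def chao2_interactions(replicates):
    """Return (I_obs, I_0, I_est) under the assumed Chao2 lower-bound estimator."""
    m = len(replicates)
    freq = incidence_frequencies(replicates)
    i_obs = len(freq)
    q1 = sum(1 for f in freq.values() if f == 1)  # uniques: seen on one day
    q2 = sum(1 for f in freq.values() if f == 2)  # duplicates: seen on two days
    if q2 > 0:
        i0 = ((m - 1) / m) * q1 ** 2 / (2 * q2)
    else:
        # Bias-corrected form used when there are no duplicates.
        i0 = ((m - 1) / m) * q1 * (q1 - 1) / 2
    return i_obs, i0, i_obs + i0

# Hypothetical replicated incidence data: four observation days.
days = [
    {("Tm", "Hh"), ("Sa", "Hh")},
    {("Tm", "Hh"), ("Tm", "Cm")},
    {("Tm", "Hh"), ("Sa", "Hh")},
    {("Tm", "Hh"), ("Er", "Cm")},
]
i_obs, i0, i_est = chao2_interactions(days)
```

Because this estimator knows nothing about forbidden links, its output should be read as a lower bound on detectable interactions, not as a count of biologically possible ones.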
Forbidden links are non-occurrences of pairwise interactions that can be accounted for by biological constraints, such as spatio-temporal uncoupling (Jordano, 1987), size or reward mismatching, foraging constraints (e.g., accessibility) (Moré et al., 2012), and physiological-biochemical constraints (Jordano, 1987). We still have very limited information about the frequency of forbidden links in natural communities (Jordano, Bascompte & Olesen, 2003; Stang et al., 2009; Vázquez, Chacoff & Cagnolo, 2009; Olesen et al., 2011; Ibanez, 2012; Maruyama et al., 2014; Vizentin-Bugoni, Maruyama & Sazima, 2014) (Table 1). Forbidden links are thus represented as structural zeroes in the interaction matrix, i.e., matrix cells that cannot get a non-zero value.

We might expect different types of FL to occupy different parts of the matrix, with missing cells due to phenological uncoupling largely distributed in the lower-right half of the matrix and actually missed links ML distributed in its central part (Olesen et al., 2010). Yet, most of these aspects remain understudied. Therefore, we need to account for the frequency of these structural zeros in our matrix before proceeding. For example, most measurements of connectance $C = I/(AP)$ implicitly ignore the fact that by taking the full product AP in the denominator they are underestimating the actual connectance value, i.e., the fraction of actual interactions I relative to the biologically possible ones, not to the total maximum $I_{max} = AP$.

Our main problem then turns to estimating the number of true missed links, i.e., those that can’t be accounted for by biological constraints and that might suggest undersampling. The sampling of interactions in nature, like the sampling of species, is a cumulative process. In our analysis, we are not re-sampling individuals, but interactions, so we build interaction-based accumulation curves. If an interaction-based curve suggests a robust sampling, it does mean that no new interactions are likely to be recorded, irrespective of the species, as it is a whole-network sampling approach (N. Gotelli, pers. comm.). We add new, distinct interactions as we increase sampling effort (Fig. 2). We can obtain an Interaction Accumulation Curve (IAC) analogous to a Species Accumulation Curve (SAC) (see Supplementary Online Material): the observed number of distinct pairwise interactions in a survey or collection as a function of the accumulated number of observations or samples (Colwell, 2009).

Empirical data on Forbidden Links

Adjacency matrices are frequently sparse, i.e., they are densely populated with zeroes, with a fraction of them being structural (unobservable interactions) (Bascompte & Jordano, 2014). Thus, it would be a serious interpretation error to attribute the sparseness of adjacency matrices for bipartite networks to undersampling alone. The actual typology of link types in ecological interaction networks is thus more complex than just the two categories of observed and unobserved interactions (Table 1).
Unobserved interactions are represented by zeroes and belong to two categories. Missing interactions may actually exist but require additional sampling or a variety of methods to be observed. Forbidden links, on the other hand, arise from biological constraints limiting interactions and remain unobservable in nature, irrespective of sampling effort (Table 1). Forbidden links, FL, may actually account for a relatively large fraction of unobserved interactions, UL, when sampling taxonomically restricted subnetworks (e.g., plant-hummingbird pollination networks) (Table 1). Phenological uncoupling is also prevalent in most networks, and may explain ca. 25–40% of the forbidden links, especially in highly seasonal habitats, and up to 20% when estimated relative to the total number of unobserved interactions (Table 2). In any case, we might expect that a fraction of the missing links, ML, would eventually be explained by further biological reasons, depending on how well the natural history of the particular system is known. Our goal as naturalists would be to reduce the fraction of UL that remain as missing links; to this end we might search for additional biological constraints or increase sampling effort. For instance, habitat use patterns by hummingbirds in the Arima Valley network (Table 2; Snow & Snow, 1972) impose a marked pattern of microhabitat mismatches causing up to 44.5% of the forbidden links.

A myriad of biological causes beyond those included as FL in Table 2 may contribute explanations for UL: limits of color perception and/or partial preferences, presence of secondary metabolites in fruit pulp and leaves, toxins and combinations of monosaccharides in nectar, etc. For example, aside from FL, some pairwise interactions may simply have an asymptotically zero probability of interspecific encounter between the partner species, if they are very rare. However, it is surprising that just the limited set of forbidden link types considered in Table 1 explains 24.6–77.2% of the unobserved links. Notably, the Arima Valley, Santa Virgínia, and Hato Ratón networks have > 60% of the unobserved links explained, which might be related to the fact that they are subnetworks (Arima Valley, Santa Virgínia) or relatively small networks (Hato Ratón). All this means that empirical networks may have sizable fractions of structural zeroes. Ignoring this biological fact may contribute to wrongly inferring undersampling of interactions in real-world assemblages.

To sum up, two elements of inference are required in the analysis of unobserved interactions in ecological interaction networks. First, detailed natural history information on the participant species that allows the inference of biological constraints imposing forbidden links, so that structural zeroes can be identified in the adjacency matrix. Second, a critical analysis of sampling robustness and a robust estimate of the actual fraction of missing links, M, resulting in a robust estimate of I. In the next sections I explore these elements of inference, using IACs to assess the robustness of interaction sampling.
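
The typology of observed, forbidden, and missing links described above can be illustrated with a small numerical sketch. The matrices below are invented toy data (3 animal species by 4 plant species), and the function name `link_typology` is hypothetical; the aim is only to show how structural zeroes change the denominator of connectance, not to reproduce any analysis in the paper.

```python
import numpy as np

def link_typology(observed, forbidden):
    """Partition the cells of an A x P adjacency matrix into link types.

    observed  : boolean matrix, True where an interaction was recorded.
    forbidden : boolean matrix, True where natural history indicates a
                structural zero (assumed known for this illustration).
    """
    observed = np.asarray(observed, dtype=bool)
    forbidden = np.asarray(forbidden, dtype=bool)
    if observed.shape != forbidden.shape:
        raise ValueError("matrices must have the same shape")
    if np.any(observed & forbidden):
        raise ValueError("a recorded interaction cannot be forbidden")

    i_max = observed.size            # I_max = A * P
    i_obs = int(observed.sum())      # recorded links
    ul = i_max - i_obs               # unobserved links
    fl = int(forbidden.sum())        # forbidden links (structural zeroes)
    ml = ul - fl                     # missing links left unexplained
    return {
        "I_max": i_max, "I_obs": i_obs, "UL": ul, "FL": fl, "ML": ml,
        "C_raw": i_obs / i_max,              # connectance relative to A * P
        "C_feasible": i_obs / (i_max - fl),  # relative to feasible links only
        "FL_share_of_UL": fl / ul if ul else float("nan"),
    }

# Toy example: rows are animals, columns are plants.
observed = [[1, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 0, 0, 1]]
forbidden = [[0, 0, 0, 1],
             [0, 0, 0, 1],
             [1, 1, 0, 0]]
result = link_typology(observed, forbidden)
print(result)
```

In this toy case 4 of the 7 unobserved cells are structural zeroes, so connectance rises from 5/12 to 5/8 once the denominator is restricted to biologically feasible links.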

Asymptotic diversity estimates

Let's assume a sampling of the diversity in a specific locality, over a relatively homogeneous landscape, where we aim at determining the number of species present for a particular group of organisms. To do that we carry out transects or plot samplings across the landscape, or use any other type of direct or indirect recording method, adequately replicated so that we obtain a number of samples. Briefly, $S_{obs}$ is the total number of species observed in a sample, or in a set of samples. $S_{est}$ is the estimated number of species in the community represented by the sample, or by the set of samples, where est indicates an estimator. With abundance data, let $S_k$ be the number of species each represented by exactly $k$ individuals in a single sample. Thus, $S_0$ is the number of undetected species (species present in the community but not included in the sample), $S_1$ is the number of singleton species (represented by just one individual), $S_2$ is the number of doubleton species (species with two individuals), etc. The total number of individuals in the sample would be:

$$n = \sum_{k \ge 1} k\,S_k$$

A frequently used asymptotic, bias-corrected, non-parametric estimator is $S_{Chao1}$ (Hortal, Borges & Gaspar, 2006; Chao, 2005; Colwell, 2013):

$$S_{Chao1} = S_{obs} + \frac{S_1 (S_1 - 1)}{2 (S_2 + 1)}$$

Another frequently used alternative is the Chao2 estimator, $S_{Chao2}$ (Gotelli & Colwell, 2001), which has been reported to have a limited bias for small sample sizes (Colwell & Coddington, 1994; Chao, 2005). Instead of counts it uses incidence frequencies ($Q_k$) among samples (number of species present in just one sample, in two samples, etc.):

$$S_{Chao2} = S_{obs} + \frac{Q_1 (Q_1 - 1)}{2 (Q_2 + 1)}$$

A plot of the cumulative number of species recorded, $S_n$, as a function of some measure of sampling effort (say, $n$ samples taken) yields the species accumulation curve (SAC) or collector's curve (Colwell & Coddington, 1994). Similarly, interaction accumulation curves (IAC), analogous to SACs, can be used to assess the robustness of interaction sampling for plant-animal community datasets (Jordano, 1987; Jordano, Vázquez & Bascompte, 2009; Olesen et al., 2011), as discussed in the next section.

Assessing sampling effort when recording interactions

The basic method we can propose to estimate sampling effort, and to explicitly show the analogies with rarefaction analysis in biodiversity research, is to vectorize the interaction matrix $AP$ so that we get a vector of all the potential pairwise interactions ($I_{max}$, Table 1) that can occur in the observed assemblage with $A_{obs}$ animal species and $P_{obs}$ plant species. The new "species" we aim to sample are the pairwise interactions (Table 3). So, if we have in our community Turdus merula (Tm), Rosa canina (Rc), and Prunus mahaleb (Pm), our problem will be to sample 2 new "species": TmRc and TmPm. In general, if we have $A = 1 \ldots i$ animal species and $P = 1 \ldots j$ plant species (assuming a complete list of species in the assemblage), we will have a vector of "new" species to sample: $A_1P_1, A_1P_2, \ldots, A_2P_1, A_2P_2, \ldots, A_iP_j$.

We can represent the successive samples in which we can potentially record these interactions as a matrix whose rows are the entries of the vectorized interaction matrix and whose columns are the successive samples we take (Table 3). This is analogous to a biodiversity sampling matrix with species as rows and sampling units (e.g., quadrats) as columns (Jordano, Vázquez & Bascompte, 2009). The package EstimateS (Colwell, 2013) includes a complete set of functions for estimating the mean IAC and its unconditional standard deviation from random permutations of the data, or subsampling without replacement (Gotelli & Colwell, 2001), and the asymptotic estimators for the expected number of distinct pairwise interactions included in a given reference sample of interaction records (see also the specaccum function in the vegan library for R) (R Development Core Team, 2010; Jordano, Vázquez & Bascompte, 2009; Olesen et al., 2011). In particular, we may take advantage of replicated incidence data, as it takes account of heterogeneity among samples (days, censuses, etc.; R.K. Colwell, pers. comm.) (see also Colwell, Mao & Chang, 2004; Colwell, Dunn & Harris, 2012; Chao et al., 2014).

In this way we effectively extend sampling theory developed for species diversity to the sampling of ecological interactions. Yet future theoretical work will be needed to formally assess the similarities and differences between the two approaches and to develop biologically meaningful null models of expected interaction richness with added sampling effort.
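
The vectorization step and the resulting interaction accumulation curve can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the EstimateS or vegan implementation: the function names are hypothetical, the four samples are invented toy data for 2 animal by 2 plant species, the spread reported is simply the standard deviation across random sample orders (not the unconditional standard deviation computed by EstimateS), and the Chao2 function follows the incidence-based formula as written in the previous section.

```python
import numpy as np

def vectorize_samples(samples):
    """Stack A x P presence/absence matrices (one per sample) into an
    (A*P) x n_samples incidence matrix; rows are pairwise interactions."""
    stacked = np.stack([np.asarray(s, dtype=bool) for s in samples], axis=-1)
    return stacked.reshape(-1, stacked.shape[-1])

def accumulation_curve(incidence, n_perm=200, seed=0):
    """Mean and SD of the number of distinct interactions accumulated
    over random orderings of the samples (columns)."""
    rng = np.random.default_rng(seed)
    n_samples = incidence.shape[1]
    curves = np.empty((n_perm, n_samples))
    for p in range(n_perm):
        order = rng.permutation(n_samples)
        # An interaction counts as recorded once it appears in any sample so far.
        seen = np.cumsum(incidence[:, order], axis=1) > 0
        curves[p] = seen.sum(axis=0)
    return curves.mean(axis=0), curves.std(axis=0)

def chao2(incidence):
    """Chao2 estimate of interaction richness from incidence frequencies."""
    freq = incidence.sum(axis=1)          # samples in which each interaction occurs
    i_obs = int((freq > 0).sum())
    q1 = int((freq == 1).sum())           # interactions found in exactly one sample
    q2 = int((freq == 2).sum())           # interactions found in exactly two samples
    return i_obs + q1 * (q1 - 1) / (2 * (q2 + 1))

# Toy data: four sampling days, rows = animals, columns = plants.
samples = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[1, 0], [0, 1]],
    [[1, 0], [0, 0]],
]
incidence = vectorize_samples(samples)
mean_curve, sd_curve = accumulation_curve(incidence)
i_est = chao2(incidence)
print(mean_curve, sd_curve, i_est)
```

With these toy samples three of the four possible interactions are recorded, two of them in a single sample only, so the estimator adds one undetected interaction to the observed total.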

Diversity-accumulation analysis (Magurran, 1988; Hortal, Borges & Gaspar, 2006) follows naturally from this type of dataset. This procedure plots the accumulation curve for the expected number of distinct pairwise interactions recorded with increasing sampling effort (Jordano, Vázquez & Bascompte, 2009; Olesen et al., 2011). Asymptotic estimates of interaction richness and their associated standard errors and confidence intervals can thus be obtained (Hortal, Borges & Gaspar, 2006) (see Supplementary Online Material). It should be noted that the asymptotic estimate of interaction richness explicitly ignores the fact that, due to forbidden links, a number of pairwise interactions among the $I_{max}$ specified in the adjacency matrix $\Delta$ cannot be recorded, irrespective of sampling effort.

We may expect undersampling especially in moderate to large networks with multiple modules (i.e., species subsets requiring different sampling strategies) (Jordano, 1987; Olesen et al., 2011; Chacoff et al., 2012); adequate sampling may be feasible when interaction subwebs are studied (Olesen et al., 2011; Vizentin-Bugoni, Maruyama & Sazima, 2014), typically with more homogeneous subsets of species (e.g., bumblebee-pollinated flowers). In any case, the sparseness of the $\Delta$ matrix is by no means an indication of undersampling whenever the issue of structural zeroes in the interaction matrices is effectively incorporated in the estimates.

For example, mixture models incorporating detectabilities have been proposed to effectively account for rare species (Mao & Colwell, 2005). In an analogous line, mixture models could be extended to samples of pairwise interactions, also with specific detectability values. These detection rates/odds could vary among groups of interactions, depending on their specific detectability. For example, flower-pollinator interactions involving bumblebees could have a higher detectability than flower-pollinator interactions involving, say, nitidulid beetles. These more homogeneous groupings of pairwise interactions within a network define modules (Bascompte & Jordano, 2014), so we might expect that interactions of a given module (e.g., plants and their hummingbird pollinators; Fig. 1a) share similar detectability values, in an analogous way to species groups receiving homogeneous detectability values in mixture models (Mao & Colwell, 2005). In its simplest form, this would result in a sample with multiple pairwise interactions detected, in which the number of interaction events is recorded for each distinct interaction found in the sample (i.e., a column vector in Table 3, corresponding to, say, a sampling day). The number of interactions recorded for the $i$-th pairwise interaction (i.e., $A_iP_j$ in Table 3), $Y_i$, could be treated as a Poisson random variable with mean parameter $\lambda_i$, its detection rate. Mixture models (Mao & Colwell, 2005) include estimates for abundance-based data (their analog in interaction sampling would be weighted data), where $Y_i$ is a Poisson random variable with detection rate $\lambda_i$. This is combined with the incidence-based model, where $Y_i$ is a binomial random variable (its analog in interaction sampling would be presence/absence records of interactions) with detection odds $\lambda_i$. Let $T$ be the number of samples in an incidence-based data set. A Poisson/binomial density can be written as (Mao & Colwell, 2005):

$$g(y; \lambda) = \begin{cases} \dfrac{\lambda^{y}}{y!\,e^{\lambda}} & [1] \\[2ex] \dbinom{T}{y} \dfrac{\lambda^{y}}{(1+\lambda)^{T}} & [2] \end{cases}$$

where [1] corresponds to a weighted network, and [2] to a qualitative network. The detection rates $\lambda_i$ depend on the relative abundances $\phi_i$ of the interactions, the probability of a pairwise interaction being detected when it is present, and the sample size (the number of interactions recorded), which, in turn, is a function of the sampling effort. Unfortunately, no specific sampling model has been developed along these lines for species interactions and their characteristic features. For example, a complicating factor might be that interaction abundances, $\phi_i$, in real assemblages are a function of the abundances of the interacting species, which determine interspecific encounter rates; yet they also depend on biological factors that ultimately determine whether the interaction occurs when the partner species are present. For example, $\phi_i$ should be set to zero for all forbidden links. In its simplest form, $\phi_i$ could be estimated from just the product of partner species abundances, an approach recently used as a null model to assess the role of biological constraints in generating forbidden links and explaining interaction patterns (Vizentin-Bugoni, Maruyama & Sazima, 2014). Yet more complex models (e.g., Wells & O'Hara, 2012) should incorporate not only interspecific encounter probabilities, but also interaction detectabilities, phenotypic matching, and incidence of forbidden links. Mixture models are certainly complex, and for most situations of evaluating sampling effort better alternatives include the simpler incidence-based rarefaction and extrapolation (Colwell, Dunn & Harris, 2012; Chao et al., 2014).

The real missing links

Given that a fraction of unobserved interactions can be accounted for by forbidden links, what about the remaining missing interactions? We have already discussed that some of these could still be related to unaccounted constraints, and still others would certainly be attributable to insufficient sampling. Would this always be the case? Multispecific assemblages of distinct taxonomic relatedness, whose interactions can be represented as bipartite networks (e.g., host-parasite, plant-animal mutualisms, or plant-herbivore interactions, with two distinct sets of unrelated higher taxa), are shaped by interspecific encounters among individuals of the partner species (Fig. 2).
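
The neutral expectation mentioned earlier, in which the abundance of each pairwise interaction is approximated by the product of the partner species abundances, can be sketched numerically as follows. Everything here is an assumption for illustration: the abundances are simulated log-normal draws rather than field data, the assemblage sizes (30 animal and 20 plant species) are arbitrary, and the function name is hypothetical.

```python
import numpy as np

def neutral_encounter_probabilities(animal_abund, plant_abund):
    """Neutral expectation for every animal-plant pair: the product of the
    relative abundances of the two partner species (an A x P matrix)."""
    a = np.asarray(animal_abund, dtype=float)
    p = np.asarray(plant_abund, dtype=float)
    return np.outer(a / a.sum(), p / p.sum())

# Simulated (not empirical) log-normal abundances for both species sets.
rng = np.random.default_rng(42)
animals = rng.lognormal(mean=2.0, sigma=1.5, size=30)
plants = rng.lognormal(mean=2.0, sigma=1.5, size=20)

pie = neutral_encounter_probabilities(animals, plants)
pie_vector = pie.ravel()   # same ordering as the vectorized interaction matrix

summary = {
    "max": float(pie_vector.max()),
    "median": float(np.median(pie_vector)),
    "q75": float(np.quantile(pie_vector, 0.75)),
}
print(summary)
```

Because the expected values are products of two skewed distributions, most pairs end up with very small values and only a handful of pairs between abundant partners carry most of the expected encounters.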
A crucial ecological aspect limiting these interactions is the probability of interspecific encounter, i.e., the probability that two individuals of the partner species actually encounter each other in nature.

Given log-normally distributed abundances of the two species groups, the expected probabilities of interspecific encounter (PIE) would be simply the product of the two lognormal distributions. Thus, we might expect that for low PIE values, pairwise interactions would be either extremely difficult to sample or simply not occurring in nature. Consider the Nava de las Correhuelas interaction web (NCH, Table 2), with A = 36, P = 25, I = 181, and roughly half of the unobserved interactions not accounted for by forbidden links, thus M = 53.1%. Given the robust sampling of this network (Jordano, Vázquez & Bascompte, 2009), a sizable fraction of these possible but missing links would simply not occur in nature, most likely because of an extremely low PIE, in fact asymptotically zero. Given the vectorized list of pairwise interactions for NCH, I computed the PIE value for each one by multiplying element-wise the two species abundance distributions. The maximum value is $PIE_{max} = 0.0597$; this is a neutral estimate, based on the assumption that interactions occur in proportion to the species-specific local abundances. With $PIE_{median} < 1.4 \times 10^{-4}$ (note the quantile estimate $Q_{75\%} = 3.27 \times 10^{-4}$), we may safely expect that a sizable fraction of these missing interactions may not occur according to this neutral expectation (Jordano, 1987; Olesen et al., 2011) (neutral forbidden links, sensu Canard et al., 2012).

When we consider the vectorized interaction matrix, enumerating all pairwise interactions for the $AP$ combinations, the expected probabilities of finding a given interaction can be estimated with a Good-Turing approximation (Good, 1953). The technique, developed by Alan Turing and I.J. Good with applications to linguistics and word analysis (Gale & Sampson, 1995), has recently been extended in novel ways for ecological analyses (Chao et al., 2015). It estimates the probability of recording an interaction of a hitherto unseen pair of partners, given a set of past records of interactions between other species pairs. Consider a sample of $N$ interactions in which $n_r$ distinct pairwise interactions have exactly $r$ records. All Good-Turing estimators obtain the underlying frequencies of events as:

$$P(X) = \frac{N_X + 1}{T} \left(1 - \frac{E(1)}{T}\right) \qquad (1)$$

where $X$ is the pairwise interaction, $N_X$ is the number of times interaction $X$ is recorded, $T$ is the sample size (number of distinct interactions recorded), and $E(1)$ is an estimate of how many different interactions were recorded exactly once. Strictly speaking, Equation (1) gives the probability that the next interaction type recorded will be $X$, after sampling a given assemblage of interacting species. In other words, we scale down the maximum-likelihood estimator by a factor of $1 - E(1)/T$. This reduces all the probabilities for interactions we have recorded, and makes room for interactions we have not seen. If we sum over the interactions we have seen, then the sum of $P(X)$ is $1 - E(1)/T$. Because probabilities sum to one, we have the left-over probability $P_{new} = E(1)/T$ of seeing something new, where new means that we sample a new pairwise interaction.
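
The left-over probability of recording a new interaction can be illustrated with a simplified sketch. Two assumptions are made here and are not taken from the text: $T$ is treated as the total number of interaction records in the sample, and the probabilities of already-seen interactions are the plain maximum-likelihood frequencies scaled by $1 - E(1)/T$, so that seen and unseen probabilities sum exactly to one. The records are invented; the species codes Tm, Rc, and Pm follow the example used earlier, while Sa and Cm are hypothetical.

```python
from collections import Counter

def good_turing_summary(records):
    """Simple Good-Turing style summary of a list of interaction records.

    Returns the estimated probability that the next record is a pairwise
    interaction not seen so far, plus scaled probabilities for seen pairs.
    """
    counts = Counter(records)
    total = sum(counts.values())              # total number of records
    if total == 0:
        raise ValueError("at least one interaction record is required")
    singletons = sum(1 for c in counts.values() if c == 1)
    p_new = singletons / total                # mass reserved for unseen pairs
    # Scale the maximum-likelihood frequencies down to make room for p_new.
    p_seen = {pair: (c / total) * (1 - p_new) for pair, c in counts.items()}
    return p_new, p_seen

# Invented records: (animal, plant) pairs, one tuple per recorded event.
records = (
    [("Tm", "Rc")] * 5
    + [("Tm", "Pm")] * 2
    + [("Sa", "Pm")]
    + [("Cm", "Rc")]
)
p_new, p_seen = good_turing_summary(records)
print(round(p_new, 3), {k: round(v, 3) for k, v in p_seen.items()})
```

Here two of the nine records are singletons, so roughly 22% of the probability mass is set aside for pairwise interactions that have not yet been recorded.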

Note, however, that Good-Turing estimators, like the traditional asymptotic estimators, do not account in our case for the forbidden interactions.

Discussion

Recent work has inferred that most data available for interaction networks are incomplete due to undersampling, resulting in a variety of biased parameters and network patterns (Chacoff et al., 2012). It is important to note, however, that in practice many networks surveyed to date have been subnets of much larger networks. This is also true for protein interaction, gene regulation, and metabolic networks, where only a subset of the molecular entities in a cell have been sampled (Stumpf, Wiuf & May, 2005). Despite recent attempts to document whole-ecosystem meta-networks (Pocock, Evans & Memmott, 2012), it is likely that most ecological interaction networks will illustrate just major ecosystem compartments. Due to their high generalization, high temporal and spatial turnover, and high complexity of association patterns, adequate sampling of ecological interaction networks is challenging and requires extremely large sampling effort. Undersampling of ecological networks may originate from the analysis of assemblage subsets (e.g., taxonomically or functionally defined), and/or from logistically limited sampling effort. It is extremely hard to robustly sample the set of biotic interactions even for relatively simple, species-poor assemblages; thus, we need to assess how robust the characterization of the adjacency matrix $\Delta$ is.
Concluding that an ecological network dataset is undersampled just from its sparseness would be unrealistic. The reason stems from a biological fact: a sizable fraction of the maximum potential links that can be recorded between two distinct sets of species is simply unobservable, irrespective of sampling effort (Jordano, 1987). In addition, sampling effort needs to be explicitly gauged because of its potential influence on parameter estimates for the network.

Missing links are a characteristic feature of all plant-animal interaction networks, and likely pervade other ecological interactions. Important natural history details explain a fraction of them, resulting in unrealizable interactions (i.e., forbidden interactions) that define structural zeroes in the interaction matrices and contribute to their extreme sparseness. Sampling interactions is a way to monitor biodiversity beyond the simple enumeration of component species and to develop efficient and robust inventories of functional interactions. Yet no sampling theory for interactions is available. Some key components of this sampling are analogous to species sampling and traditional biodiversity inventories; however, there are important differences. Focusing just on the realized interactions, or treating missing interactions as solely the result of sampling bias, would miss important components needed to understand how mutualisms coevolve within complex webs of interdependence among species.

Contrary to species inventories, a sizable fraction of non-observed pairwise interactions cannot be sampled, due to biological constraints that forbid their occurrence. Moreover, recent implementations of inference methods for unobserved species (Chao et al., 2015) or for individual-based data (Wells & O'Hara, 2012) can be combined with the forbidden link approach. These methods do not account for the existence of such ecological constraints either, but they can help in estimating their relative importance, simply by the difference between the asymptotic estimate of interaction richness in a robustly sampled assemblage and the maximum richness of interactions, $I_{max}$.

Ecological interactions provide the wireframe supporting the lives of species, and they also embed crucial ecosystem functions which are fundamental for supporting the Earth system. We still have limited knowledge of the biodiversity of ecological interactions, and they are being lost (going extinct) at a very fast pace, frequently preceding species extinctions (Valiente-Banuet et al., 2014). We urgently need robust techniques to assess the completeness of ecological interaction networks, because this knowledge will allow the identification of the minimal components of their ecological complexity that need to be restored to rebuild functional ecosystems after perturbations.

Acknowledgements

I am indebted to Jens M. Olesen, Alfredo Valido, Jordi Bascompte, Thomas Lewinsohn, John N. Thompson, Nick Gotelli, Carsten Dormann, and Paulo R. Guimarães Jr. for useful and thoughtful discussion at different stages of this manuscript. Jeferson Vizentin-Bugoni kindly helped with the Sta Virgínia data. Jens M. Olesen kindly made available the Grundvad dataset; together with Robert K. Colwell, Néstor Pérez-Méndez, JuanPe González-Varo, and Paco Rodríguez, he provided most useful comments on a final version of the manuscript. Robert Colwell shared a number of crucial suggestions that clarified my vision of sampling ecological interactions. The study was supported by a Junta de Andalucía Excellence Grant (RNM–5731), as well as a Severo Ochoa Excellence Award from the Ministerio de Economía y Competitividad (SEV–2012–0262). The Agencia de Medio Ambiente, Junta de Andalucía, provided generous facilities that made possible my long-term field work in different natural parks.

Data accessibility

This review does not use new raw data, but includes some re-analyses of previously published material. All the original data supporting the paper, R code, supplementary figures, and summaries of analytical protocols are available at the author's GitHub repository (https://github.com/pedroj/MS_Network-Sampling), with DOI: 10.5281/zenodo.29437.

References

Banasek-Richter, C., Cattin, M. & Bersier, L. (2004) Sampling effects and the robustness of quantitative and qualitative food-web descriptors. Journal of Theoretical Biology, 226, 23–32.

Bascompte, J. & Jordano, P. (2014) Mutualistic networks. Monographs in Population Biology, No. 53. Princeton University Press, Princeton, NJ.
612 Bersier, L., Banasek-Richter, C. & Cattin, M. (2002) Quantitative descriptors of 613 food-web matrices. Ecology, 83, 2394–2407. 614 Blüthgen, N. (2010) Why network analysis is often disconnected from community 615 ecology: A critique and an ecologist’s guide. Basic And Applied Ecology, 11, 616 185–195. 617 Bosch, J., Martín González, A.M., Rodrigo, A. & Navarro, D. (2009) Plant- 618 pollinator networks: adding the pollinator’s perspective. Ecology Letters, 12, 619 409–419. 29 bioRxiv preprint first posted online Sep. 1, 2015; doi: http://dx.doi.org/10.1101/025734. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. Jordano - Sampling networks 620 Brose, U., Martinez, N. & Williams, R. (2003) Estimating species richness: Sen- 621 sitivity to sample coverage and insensitivity to spatial patterns. Ecology, 84, 622 2364–2377. 623 Bunge, J. & Fitzpatrick, M. (1993) Estimating the number of species: a review. 624 Journal of the American Statistical Association, 88, 364–373. 625 Canard, E., Mouquet, N., Marescot, L., Gaston, K.J., Gravel, D. & Mouillot, 626 D. (2012) Emergence of structural patterns in neutral trophic networks. PLoS 627 ONE, 7, e38295. 628 Chacoff, N.P., Vazquez, D.P., Lomascolo, S.B., Stevani, E.L., Dorado, J. & Padrón, 629 B. (2012) Evaluating sampling completeness in a desert plant-pollinator network. 630 Journal of Animal Ecology, 81, 190–200. 631 Chao, A. (2005) Species richness estimation. Encyclopedia of Statistical Sciences, 632 pp. 7909–7916. Oxford University Press, New York, USA. 633 Chao, A., Gotelli, N.J., Hsieh, T.C., Sander, E.L., Ma, K.H., Colwell, R.K. & Elli- 634 son, A.M. (2014) Rarefaction and extrapolation with Hill numbers: a framework 635 for sampling and estimation in species diversity studies. Ecological Monographs, 636 84, 45–67. 
Chao, A., Hsieh, T.C., Chazdon, R.L., Colwell, R.K. & Gotelli, N.J. (2015) Unveiling the species-rank abundance distribution by generalizing the Good-Turing sample coverage theory. Ecology, 96, 1189–1201.
Coddington, J.A., Agnarsson, I., Miller, J.A., Kuntner, M. & Hormiga, G. (2009) Undersampling bias: the null hypothesis for singleton species in tropical arthropod surveys. Journal of Animal Ecology, 78, 573–584.
Cohen, J.E. (1978) Food webs and niche space. Princeton University Press, Princeton, New Jersey, US.
Cohen, J.E., Beaver, R.A., Cousins, S.H., DeAngelis, D.L., Goldwasser, L., Heong, K.L., Holt, R.D., Kohn, A.J., Lawton, J.H., Martinez, N., O’Malley, R., Page, L.M., Patten, B.C., Pimm, S.L., Polis, G., Rejmanek, M., Schoener, T.W., Schoenly, K., Sprules, W.G., Teal, J.M., Ulanowicz, R., Warren, P.H., Wilbur, H.M. & Yodzis, P. (1993) Improving food webs. Ecology, 74, 252–258.
Colwell, R. & Coddington, J. (1994) Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions Of The Royal Society Of London Series B-Biological Sciences, 345, 101–118.
Colwell, R.K. (2009) Biodiversity: concepts, patterns, and measurement. The Princeton Guide to Ecology (ed. S.A. Levin), pp. 257–263. Princeton University Press, Princeton.
Colwell, R.K. (2013) EstimateS: Biodiversity Estimation. -, pp. 1–33.
Colwell, R.K., Dunn, R.R. & Harris, N.C. (2012) Coextinction and persistence of dependent species in a changing world. Annual Review of Ecology Evolution and Systematics, 43, 183–203.
Colwell, R.K., Mao, C.X. & Chang, J. (2004) Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology, 85, 2717–2727.
Dorado, J., Vazquez, D.P., Stevani, E.L. & Chacoff, N.P. (2011) Rareness and specialization in plant-pollinator networks. Ecology, 92, 19–25.
Dormann, C.F., Frund, J., Blüthgen, N. & Gruber, B. (2009) Indices, graphs and null models: Analyzing bipartite ecological networks. Open Ecology Journal, 2, 7–24.
Dupont, Y.L., Trøjelsgaard, K. & Olesen, J.M. (2011) Scaling down from species to individuals: a flower–visitation network between individual honeybees and thistle plants. Oikos, 120, 170–177.
Dupont, Y.L., Trøjelsgaard, K., Hagen, M., Henriksen, M.V., Olesen, J.M., Pedersen, N.M.E. & Kissling, W.D. (2014) Spatial structure of an individual-based plant-pollinator network. Oikos, 123, 1301–1310.
Eklöf, A., Jacob, U., Kopp, J., Bosch, J., Castro-Urgal, R., Chacoff, N.P., Dalsgaard, B., de Sassi, C., Galetti, M., Guimaraes, P.R., Lomáscolo, S.B., Martín González, A.M., Pizo, M.A., Rader, R., Rodrigo, A., Tylianakis, J.M., Vazquez, D.P. & Allesina, S. (2013) The dimensionality of ecological networks. Ecology Letters, 16, 577–583.
Elberling, H. & Olesen, J.M. (1999) The structure of a high latitude plant-flower visitor system: the dominance of flies. Ecography, 22, 314–323.
Frund, J., McCann, K.S. & Williams, N.M. (2015) Sampling bias is a challenge for quantifying specialization and network structure: lessons from a quantitative niche model. Oikos, pp. n/a–n/a.
Gale, W.A. & Sampson, G. (1995) Good-Turing frequency estimation without tears. Journal of Quantitative Linguistics, 2, 217–237.
Gibson, R.H., Knott, B., Eberlein, T. & Memmott, J. (2011) Sampling method influences the structure of plant–pollinator networks. Oikos, 120, 822–831.
González-Varo, J.P., Arroyo, J.M. & Jordano, P. (2014) Who dispersed the seeds? The use of DNA barcoding in frugivory and seed dispersal studies. Methods in Ecology and Evolution, 5, 806–814.
Good, I.J. (1953) The population frequencies of species and the estimation of population parameters. Biometrika, 40, 237–264.
Gotelli, N.J. & Colwell, R.K. (2011) Estimating species richness. Biological Diversity Frontiers in Measurement and Assessment (eds. A.E. Magurran & B.J. McGill), pp. 39–54. Oxford University Press, Oxford, UK.
Gotelli, N. & Colwell, R. (2001) Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecology Letters, 4, 379–391.
Hortal, J., Borges, P. & Gaspar, C. (2006) Evaluating the performance of species richness estimators: sensitivity to sample grain size. Journal of Animal Ecology, 75, 274–287.
Ibanez, S. (2012) Optimizing size thresholds in a plant–pollinator interaction web: towards a mechanistic understanding of ecological networks. Oecologia, 170, 233–242.
Jordano, P. (1987) Patterns of mutualistic interactions in pollination and seed dispersal: connectance, dependence asymmetries, and coevolution. The American Naturalist, 129, 657–677.
Jordano, P., Bascompte, J. & Olesen, J. (2003) Invariant properties in coevolutionary networks of plant-animal interactions. Ecology Letters, 6, 69–81.
Jordano, P., Vázquez, D. & Bascompte, J. (2009) Redes complejas de interacciones planta-animal. Ecología y evolución de interacciones planta-animal (eds. R. Medel, R. Dirzo & R. Zamora), pp. 17–41. Editorial Universitaria, Santiago, Chile.
Jurado-Rivera, J.A., Vogler, A.P., Reid, C.A.M., Petitpierre, E. & Gomez-Zurita, J. (2009) DNA barcoding insect-host plant associations. Proceedings Of The Royal Society B-Biological Sciences, 276, 639–648.
Magurran, A. (1988) Ecological diversity and its measurement. Princeton University Press, Princeton, US.
Mao, C. & Colwell, R.K. (2005) Estimation of species richness: mixture models, the role of rare species, and inferential challenges. Ecology, 86, 1143–1153.
Martinez, N.D. (1993) Effects of resolution on food web structure. Oikos, 66, 403–412.
Martinez, N. (1991) Artifacts or attributes? Effects of resolution on food-web patterns in Little Rock Lake food web. Ecological Monographs, 61, 367–392.
Maruyama, P.K., Vizentin-Bugoni, J., Oliveira, G.M., Oliveira, P.E. & Dalsgaard, B. (2014) Morphological and spatio-temporal mismatches shape a neotropical savanna plant-hummingbird network. Biotropica, 46, 740–747.
Moré, M., Amorim, F.W., Benitez-Vieyra, S., Medina, A.M., Sazima, M. & Cocucci, A.A. (2012) Armament Imbalances: Match and Mismatch in Plant-Pollinator Traits of Highly Specialized Long-Spurred Orchids. PLoS ONE, 7, e41878.
Morris, R.J., Gripenberg, S., Lewis, O.T. & Roslin, T. (2013) Antagonistic interaction networks are structured independently of latitude and host guild. Ecology Letters, 17, 340–349.
Nielsen, A. & Bascompte, J. (2007) Ecological networks, nestedness and sampling effort. Journal of Ecology, 95, 1134–1141.
Olesen, J.M., Bascompte, J., Dupont, Y.L., Elberling, H. & Jordano, P. (2011) Missing and forbidden links in mutualistic networks. Proceedings Of The Royal Society B-Biological Sciences, 278, 725–732.
Olesen, J.M., Dupont, Y.L., O’Gorman, E., Ings, T.C., Layer, K., Melin, C.J., Trøjelsgaard, K., Pichler, D.E., Rasmussen, C. & Woodward, G. (2010) From Broadstone to Zackenberg. Advances in Ecological Research, 42, 1–69.
Olesen, J. & Jordano, P. (2002) Geographic patterns in plant-pollinator mutualistic networks. Ecology, 83, 2416–2424.
Olito, C. & Fox, J.W. (2014) Species traits and abundances predict metrics of plant-pollinator network structure, but not pairwise interactions. Oikos, 124, 428–436.
Ollerton, J. & Cranmer, L. (2002) Latitudinal trends in plant-pollinator interactions: are tropical plants more specialised? Oikos, 98, 340–350.
Pereira, H.M., Ferrier, S., Walters, M., Geller, G.N., Jongman, R.H.G., Scholes, R.J., Bruford, M.W., Brummitt, N., Butchart, S.H.M., Cardoso, A.C., Coops, N., Dulloo, E., Faith, D., Freyhof, J., Gregory, R.D., Heip, C., Hoft, R., Hurtt, G., Jetz, W., Karp, D.S., Mcgeoch, M., Obura, D., Onoda, Y., Pettorelli, N., Reyers, B., Sayre, R., Scharlemann, J.P.W., Stuart, S., Turak, E., Walpole, M. & Wegmann, M. (2013) Essential biodiversity variables. Science, 339, 277–278.
Pocock, M.J.O., Evans, D.M. & Memmott, J. (2012) The Robustness and Restoration of a Network of Ecological Networks. Science, 335, 973–977.
Preston, F. (1948) The commonness, and rarity, of species. Ecology, 29, 254–283.
R Development Core Team (2010) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org
Rivera-Hutinel, A., Bustamante, R.O., Marín, V.H. & Medel, R. (2012) Effects of sampling completeness on the structure of plant-pollinator networks. Ecology, 93, 1593–1603.
Schleuning, M., Frund, J., Klein, A.M., Abrahamczyk, S., Alarcón, R., Albrecht, M., Andersson, G.K.S., Bazarian, S., Böhning-Gaese, K., Bommarco, R., Dalsgaard, B., Dehling, D.M., Gotlieb, A., Hagen, M., Hickler, T., Holzschuh, A., Kaiser-Bunbury, C.N., Kreft, H., Morris, R.J., Sandel, B., Sutherland, W.J., Svenning, J.C., Tscharntke, T., Watts, S., Weiner, C.N., Werner, M., Williams, N.M., Winqvist, C., Dormann, C.F. & Blüthgen, N. (2012) Specialization of mutualistic interaction networks decreases toward tropical latitudes. Current Biology, 22, 1925–1931.
Snow, B. & Snow, D. (1972) Feeding niches of hummingbirds in a Trinidad valley. Journal of Animal Ecology, 41, 471–485.
Snow, B. & Snow, D. (1988) Birds and berries. Poyser, Calton, UK.
Stang, M., Klinkhamer, P., Waser, N.M., Stang, I. & van der Meijden, E. (2009) Size-specific interaction patterns and size matching in a plant-pollinator interaction web. Annals Of Botany, 103, 1459–1469.
Strogatz, S. (2001) Exploring complex networks. Nature, 410, 268–276.
Stumpf, M.P.H., Wiuf, C. & May, R.M. (2005) Subnets of scale-free networks are not scale-free: Sampling properties of networks. Proceedings of the National Academy of Sciences USA, 102, 4221–4224.
Thébault, E. & Fontaine, C. (2010) Stability of ecological communities and the architecture of mutualistic and trophic networks. Science, 329, 853–856.
Valiente-Banuet, A., Aizen, M.A., Alcántara, J.M., Arroyo, J., Cocucci, A., Galetti, M., García, M.B., García, D., Gomez, J.M., Jordano, P., Medel, R., Navarro, L., Obeso, J.R., Oviedo, R., Ramírez, N., Rey, P.J., Traveset, A., Verdú, M. & Zamora, R. (2014) Beyond species loss: the extinction of ecological interactions in a changing world. Functional Ecology, 29, 299–307.
Vázquez, D.P., Chacoff, N.P. & Cagnolo, L. (2009) Evaluating multiple determinants of the structure of plant-animal mutualistic networks. Ecology, 90, 2039–2046.
Vazquez, D.P., Ramos-Jiliberto, R., Urbani, P. & Valdovinos, F.S. (2015) A conceptual framework for studying the strength of plant-animal mutualistic interactions. Ecology Letters, 18, 385–400.
Vazquez, D., Morris, W. & Jordano, P. (2005) Interaction frequency as a surrogate for the total effect of animal mutualists on plants. Ecology Letters, 8, 1088–1094.
Vizentin-Bugoni, J., Maruyama, P.K. & Sazima, M. (2014) Processes entangling interactions in communities: forbidden links are more important than abundance in a hummingbird-plant network. Proceedings Of The Royal Society B-Biological Sciences, 281, 20132397.
Wells, K. & O’Hara, R.B. (2012) Species interactions: estimating per-individual interaction strength and covariates before simplifying data into per-species ecological networks. Methods in Ecology and Evolution, 4, 1–8.
Wirta, H.K., Hebert, P.D.N., Kaartinen, R., Prosser, S.W., Várkonyi, G. & Roslin, T. (2014) Complementary molecular information changes our perception of food web structure. Proceedings of the National Academy of Sciences USA, 111, 1885–1890.

Figure captions

Figure 1. Sampling ecological interaction networks (e.g., plant-animal interactions) usually focuses on different types of subsampling of the full network, yielding submatrices [m, n] of the full interaction matrix with A and P animal and plant species. a) all the potential plants interacting with a subset of the animals (e.g., studying just the hummingbird-pollinated flower species in a community); b) all the potential animal species interacting with a subset of the plant species (e.g., studying the frugivore species feeding on figs Ficus in a community); and c) sampling a subset of all the potential animal species interacting with a subset of all the plant species (e.g., studying the plant-frugivore interactions of the rainforest understory).

Figure 2. Sampling species interactions in natural communities. Suppose an assemblage with A = 3 animal species (red, species 1–3 with three, two, and one individuals, respectively) and P = 3 plant species (green, species a–c with three individuals each) (colored balls), sampled with increasing effort in steps 1 to 6 (panels). In Step 1 we record animal species 1 and plant species a and b with a total of three interactions (black lines) represented as two distinct interactions: 1–a and 1–b. As we advance our sampling (panels 1 to 6, illustrating e.g., additional sampling days) we record new distinct interactions. Note that we actually sample and record interactions among individuals, yet we pool the data across species to get a species by species interaction matrix. Few network analyses have been carried out on individual data (Dupont et al., 2014).

Figures

Figure 1: [image not reproduced; interaction matrix with axes Plants and Animals, panels a, b, c]

Figure 2: [image not reproduced; six panels, sampling steps 1 to 6]

Table captions

Table 1. A taxonomy of link types for ecological interactions (Olesen et al. 2011). A, number of animal species; P, number of plant species; I, number of observed links; $C = 100\,I/(AP)$, connectance; FL, number of forbidden links; and ML, number of missing links. As natural scientists, our ultimate goal is to eliminate ML from the equation $FL = AP - I - ML$, which probably is not feasible given logistical sampling limitations. When we, during our study, estimate ML to be negligible, we cease observing and estimate I and FL.

Table 2. Frequencies of different types of forbidden links in natural plant-animal interaction assemblages. AP, maximum potential links, $I_{max}$; I, number of observed links; UL, number of unobserved links; FL, number of forbidden links; $FL_P$, phenology; $FL_S$, size restrictions; $FL_A$, accessibility; $FL_O$, other types of restrictions; ML, unknown causes (missing links). Relative frequencies (in parentheses) calculated over $I_{max} = AP$ for I, ML, and FL; for all forbidden link types, calculated over FL. References, from left to right: Olesen et al. 2008; Olesen & Myrthue unpubl.; Snow & Snow 1972 and Jordano et al. 2006; Vizentin-Bugoni et al. 2014; Jordano et al. 2009; Olesen et al. 2011.

Table 3. A vectorized interaction matrix.

Table 4. Sampling statistics for three plant-animal interaction networks (Olesen et al. 2011). Symbols as in Table 1; N, number of records; Chao1 and ACE are asymptotic estimators for the number of distinct pairwise interactions I (Hortal et al. 2006), and their standard errors; C, sample coverage for rare interactions (Chao & Jost 2012). Scaled asymptotic estimators and their confidence intervals (CI) were calculated by weighting Chao1 and ACE with the observed frequencies of forbidden links.

Tables

Table 1:

| Link type | Formulation | Definition |
|---|---|---|
| Potential links | $I_{max} = A_{obs} P_{obs}$ | Size of observed network matrix, i.e. maximum number of potentially observable interactions; $A_{obs}$ and $P_{obs}$, numbers of interacting animal and plant species, respectively. These might be below the real numbers of animal and plant species, $A_{est}$ and $P_{est}$. |
| Observed links | $I_{obs}$ | Total number of observed links in the network given a sufficient sampling effort. Number of ones in the adjacency matrix. |
| True links | $I_{est}$ | Total number of links in the network given a sufficient sampling effort; expected for the augmented $A_{est} P_{est}$ matrix. |
| Unobserved links | $UL = I_{max} - I_{obs}$ | Number of zeroes in the adjacency matrix. |
| True unobserved links | $UL = I_{max} - I_{obs}$ | Number of zeroes in the augmented adjacency matrix that, eventually, includes unobserved species. |
| Forbidden links | $FL$ | Number of links which remain unobserved because of linkage constraints, irrespective of sufficient sampling effort. |
| Observed missing links | $ML = A_{obs} P_{obs} - I_{obs} - FL$ | Number of links which may exist in nature but need more sampling effort and/or additional sampling methods to be observed. |
| True missing links | $ML = A_{est} P_{est} - I_{est} - FL$ | Number of links which may exist in nature but need more sampling effort and/or additional sampling methods to be observed. Augments ML for the $A_{est} P_{est}$ matrix. |

Table 2:

Pollination: Zackenberg, Grundvad, Arima Valley, Sta. Virginia. Seed dispersal: Hato Ratón, Nava Correhuelas.

| Link type | Zackenberg | Grundvad | Arima Valley | Sta. Virginia | Hato Ratón | Nava Correhuelas |
|---|---|---|---|---|---|---|
| $I_{max}$ | 1891 | 646 | 522 | 423 | 272 | 825 |
| $I$ | 268 (0.1417) | 212 (0.3282) | 185 (0.3544) | 86 (0.1042) | 151 (0.4719) | 181 (0.2194) |
| $UL$ | 1507 (0.7969) | 434 (0.6718) | 337 (0.6456) | 337 (0.4085) | 169 (0.5281) | 644 (0.7806) |
| $FL$ | 530 (0.3517) | 107 (0.2465) | 218 (0.6469) | 260 (0.7715) | 118 (0.6982) | 302 (0.4689) |
| $FL_P$ | 530 (1.0000) | 94 (0.2166) | 0 (0.0000) | 120 (0.1624) | 67 (0.3964) | 195 (0.3028) |
| $FL_S$ | … | 8 (0.0184) | 30 (0.0890) | 140 (0.1894) | 31 (0.1834) | 46 (0.0714) |
| $FL_A$ | … | 5 (0.0115) | 150 (0.445) | … | 20 (0.1183) | 61 (0.0947) |
| $FL_O$ | … | … | 38 (0.1128) | … | … | 363 (0.5637) |
| $ML$ | 977 (0.6483) | 327 (0.7535) | 119 (0.3531) | 77 (0.1042) | 51 (0.3018) | 342 (0.5311) |

Footnotes: Lack of accessibility due to habitat uncoupling, i.e., canopy-foraging species vs. understory species. Colour restrictions, and reward per flower too small relative to the size of the bird.

Table 3:

| Interaction | Sample 1 | Sample 2 | Sample 3 | ... | Sample i |
|---|---|---|---|---|---|
| A1 - P2 | 12 | 2 | 0 | ... | 6 |
| A1 - P2 | 0 | 0 | 0 | ... | 1 |
| ... | ... | ... | ... | ... | ... |
| A5 - P3 | 5 | 0 | 1 | ... | 18 |
| A5 - P4 | 1 | 0 | 1 | ... | 3 |
| ... | ... | ... | ... | ... | ... |
| $A_i$ - $P_i$ | 1 | 0 | 1 | ... | 2 |

Table 4:

| | Hato Ratón | Nava Correhuelas | Zackenberg |
|---|---|---|---|
| A | 17 | 33 | 65 |
| P | 16 | 25 | 31 |
| $I_{max}$ | 272 | 825 | 1891 |
| N | 3340 | 8378 | 1245 |
| I | 151 | 181 | 268 |
| C | 0.917 | 0.886 | 0.707 |
| Chao1 | 263.1 ± 70.9 | 231.4 ± 14.2 | 509.6 ± 54.7 |
| ACE | 240.3 ± 8.9 | 241.3 ± 7.9 | 566.1 ± 14.8 |
| % unobserved | 8.33 | 15.38 | 47.80 |

Footnote: estimated with library Jade (R Core Development Team 2010, Chao et al. 2015).
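The quantities in Tables 3 and 4 can be illustrated with a short Python sketch. It is not the author's code (the paper's analyses are in R, in the repository cited under Data accessibility). The function names and the example counts are hypothetical. The classic Chao1 formula and Good's (1953) coverage estimate 1 − f1/n are used as simple stand-ins, and may differ from the exact estimators behind Table 4.

```python
from collections import Counter

def pooled_counts(samples_by_pair):
    """Sum the records of each distinct interaction across samples (the rows of Table 3)."""
    return {pair: sum(counts) for pair, counts in samples_by_pair.items()}

def chao1(counts):
    """Classic Chao1 lower bound for the number of distinct interactions."""
    observed = [c for c in counts if c > 0]
    freq = Counter(observed)
    s_obs = len(observed)            # distinct interactions actually recorded
    f1, f2 = freq[1], freq[2]        # interactions recorded exactly once / twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2  # bias-corrected form when there are no doubletons

def good_turing_coverage(counts):
    """Good's estimate of sample coverage: 1 - f1 / n."""
    counts = list(counts)
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)
    return 1 - f1 / n if n else 0.0

# Hypothetical vectorized interaction matrix: one row per animal-plant pair,
# one column per sample. These numbers are invented for illustration only.
records = {
    ("A1", "P1"): [12, 2, 0, 6],
    ("A1", "P2"): [0, 0, 0, 1],
    ("A5", "P3"): [5, 0, 1, 18],
    ("A5", "P4"): [1, 0, 1, 3],
    ("A2", "P1"): [1, 0, 1, 0],
    ("A3", "P2"): [0, 1, 0, 0],
    ("A4", "P4"): [0, 0, 0, 0],   # pair never recorded
}

totals = list(pooled_counts(records).values())
estimate = chao1(totals)
coverage = good_turing_coverage(totals)
print(estimate, round(coverage, 3))
```

With these invented counts, six distinct interactions are recorded, two of them once and one twice, so the estimator adds two undetected interactions to the observed six.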

Sampling networks of ecological interactions

bioRxiv, Sep 6, 2015


References (90)

Publisher
bioRxiv
Copyright
© 2015, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at http://creativecommons.org/licenses/by-nc-nd/4.0/
DOI
10.1101/025734

Abstract

Pedro Jordano
Integrative Ecology Group, Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas (EBD-CSIC), Avenida Americo Vespucio s/n, E–41092 Sevilla, Spain (jordano@ebd.csic.es)
Sevilla, September 5, 2015

Summary

1. Sampling ecological interactions presents similar challenges, problems, potential biases, and constraints as sampling individuals and species in biodiversity inventories. Interactions are just pairwise relationships among individuals of two different species, such as those among plants and their seed dispersers in frugivory interactions or those among plants and their pollinators. Sampling interactions is a fundamental step to build robustly estimated interaction networks, yet few analyses have attempted a formal approach to their sampling protocols.

2. Robust estimates of the actual number of interactions (links) within diversified ecological networks require adequate sampling effort that needs to be explicitly gauged. Yet we still lack a sampling theory explicitly focusing on ecological interactions.

3. While the complete inventory of interactions is likely impossible, a robust characterization of its main patterns and metrics is probably realistic.
We must acknowledge that a sizable fraction of the maximum number of interactions $I_{max}$ among, say, A animal species and P plant species (i.e., $I_{max} = AP$) is impossible to record due to forbidden links, i.e., life-history restrictions. Thus, the number of observed interactions I in robustly sampled networks is typically $I \ll I_{max}$, resulting in extremely sparse interaction matrices with low connectance.

4. Reasons for forbidden links are multiple but mainly stem from spatial and temporal uncoupling, size mismatches, and intrinsically low probabilities of interspecific encounter for most potential interactions of partner species. Adequately assessing the completeness of a network of ecological interactions thus needs knowledge of the natural history details embedded, so that forbidden links can be “discounted” when addressing sampling effort.

5. Here I provide a review and outline a conceptual framework for interaction sampling by building an explicit analogue to individuals and species sampling, thus extending diversity-monitoring approaches to the characterization of complex networks of ecological interactions. This is crucial to assess the fast-paced and devastating effects of defaunation-driven loss of key ecological interactions and the services they provide and the analogous losses related to interaction gains due to invasive species and biotic homogenization.
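The bookkeeping behind points 3 and 4 can be sketched in Python as a minimal illustration, not the author's own code. The function and key names are hypothetical. The example numbers are those reported for Nava Correhuelas in Tables 2 and 4 (A = 33, P = 25, I = 181, FL = 302). The last ratio is only one possible way to "discount" forbidden links and is an assumption of this sketch.

```python
def link_accounting(n_animals, n_plants, n_observed, n_forbidden):
    """Partition an A x P interaction matrix into link types.

    Argument and key names are illustrative, not taken from the paper's R code.
    """
    i_max = n_animals * n_plants            # I_max = A * P
    unobserved = i_max - n_observed         # UL = I_max - I
    missing = unobserved - n_forbidden      # ML = A * P - I - FL
    if missing < 0:
        raise ValueError("observed + forbidden links exceed A * P")
    feasible = i_max - n_forbidden          # cells not ruled out by forbidden links
    return {
        "i_max": i_max,
        "connectance_pct": 100 * n_observed / i_max,   # C = 100 * I / (A * P)
        "unobserved": unobserved,
        "missing": missing,
        # Assumed "discounted" completeness: observed links over feasible links.
        "observed_over_feasible": n_observed / feasible if feasible else float("nan"),
    }

nava = link_accounting(n_animals=33, n_plants=25, n_observed=181, n_forbidden=302)
print(nava)
```

For this network the sketch recovers the tabulated values of 825 potential, 644 unobserved and 342 missing links, with a connectance of about 21.9%.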
Keywords

complex networks, food webs, frugivory, mutualism, plant-animal interactions, pollination, seed dispersal

Introduction

Biodiversity sampling is a labour-intensive activity, and sampling is often not sufficient to detect all or even most of the species present in an assemblage. Gotelli & Colwell (2011).

Biodiversity species assessment aims at sampling individuals in collections and determining the number of species represented. Given that, by definition, samples are incomplete, these collections do not enumerate all the species actually present. The ecological literature dealing with robust estimators of species richness and diversity in collections of individuals is immense, and a number of useful approaches have been used to obtain such estimates (Magurran, 1988; Gotelli & Colwell, 2001; Colwell, Mao & Chang, 2004; Hortal, Borges & Gaspar, 2006; Colwell, 2009; Gotelli & Colwell, 2011; Chao et al., 2014). Recent effort has also focused on defining essential biodiversity variables (EBV) (Pereira et al., 2013) that can be sampled and measured repeatedly to complement biodiversity estimates. Yet sampling species or taxa-specific EBVs is just probing a single component of biodiversity; interactions among species are another fundamental component, one that supports the existence, but in some cases also the extinction, of species. For example, the extinction of interactions represents a dramatic loss of biodiversity because it entails the loss of fundamental ecological functions (Valiente-Banuet et al., 2014).
This missed component of biodiversity loss, the extinction of ecological interactions, very often accompanies, or even precedes, species disappearance. Interactions among species are a key component of biodiversity and here we aim to show that most problems associated with sampling interactions in natural communities mirror those associated with sampling species diversity, and are often even worse. We consider pairwise interactions among species at the habitat level, in the context of alpha diversity and the estimation of local interaction richness from sampling data (Chao et al., 2014). In the first part we provide a succinct overview of previous work addressing sampling issues for ecological interaction networks. In the second part, after a short overview of asymptotic diversity estimates (Gotelli & Colwell, 2001), we discuss specific rationales for sampling the biodiversity of ecological interactions. Most of the examples come from the analysis of plant-animal interaction networks, yet are applicable to other types of species-species interactions.

Interactions can be a much better indicator of the richness and diversity of ecosystem functions than a simple list of taxa and their abundances and/or related biodiversity indicator variables (EBVs). Thus, sampling interactions should be a central issue when identifying and diagnosing ecosystem services (e.g., pollination, natural seeding by frugivores, etc.). Fortunately, the whole battery of biodiversity-related tools used by ecologists to sample biodiversity (species, sensu stricto) can be extended and applied to the sampling of interactions. Analogs are evident between these approaches (see Table 2 in Colwell, Mao & Chang, 2004). Monitoring interactions is a form of biodiversity sampling and is subject to similar methodological shortcomings, especially under-sampling (Jordano, 1987; Jordano, Vázquez & Bascompte, 2009; Coddington et al., 2009; Vázquez, Chacoff & Cagnolo, 2009; Dorado et al., 2011; Rivera-Hutinel et al., 2012). For example, when we study mutualistic networks, our goal is to make an inventory of the distinct pairwise interactions that make up the network. We are interested in having a complete list of all the pairwise interactions among species (e.g., all the distinct, species-species interactions, or links, among the pollinators and flowering plants) that do actually exist in a given community. Sampling these interactions thus entails exactly the same problems, limitations, constraints, and potential biases as sampling individual organisms and species diversity. As Mao & Colwell (2005) put it, these are the workings of Preston’s demon, the moving “veil line” (Preston, 1948) between the detected and the undetected interactions as sample size increases.

Early efforts to recognize and solve sampling problems in analyses of interactions stem from research on food webs and on how undersampling biases food web metrics (Martinez, 1991; Cohen et al., 1993; Martinez, 1993; Bersier, Banasek-Richter & Cattin, 2002; Brose, Martinez & Williams, 2003; Banasek-Richter, Cattin & Bersier, 2004; Wells & O’Hara, 2012). In addition, the myriad classic natural history studies documenting animal diets, host-pathogen infection records, plant herbivory records, etc., represent efforts to document interactions occurring in nature. All of them share the problem of sampling incompleteness influencing the patterns and metrics reported.
Yet, despite the early recognition that incomplete sampling may seriously bias the analysis of ecological networks (Jordano, 1987), only recent studies have explicitly acknowledged it and attempted to determine its influence (Ollerton & Cranmer, 2002; Nielsen & Bascompte, 2007; Vázquez, Chacoff & Cagnolo, 2009; Gibson et al., 2011; Olesen et al., 2011; Chacoff et al., 2012; Rivera-Hutinel et al., 2012; Olito & Fox, 2014; Bascompte & Jordano, 2014; Vizentin-Bugoni, Maruyama & Sazima, 2014; Frund, McCann & Williams, 2015). The sampling approaches have been extended to predict patterns of coextinctions in interaction assemblages (e.g., hosts-parasites) (Colwell, Dunn & Harris, 2012). Most empirical studies provide no estimate of sampling effort, implicitly assuming that the reported network patterns and metrics are robust. Yet recent evidence indicates that the number of partner species detected, the number of actual links, and some aggregate statistics describing network patterns are prone to sampling bias (Nielsen & Bascompte, 2007; Dorado et al., 2011; Olesen et al., 2011; Chacoff et al., 2012; Rivera-Hutinel et al., 2012; Olito & Fox, 2014; Frund, McCann & Williams, 2015). Most of this evidence, however, comes either from simulation studies (Frund, McCann & Williams, 2015) or from relatively species-poor assemblages.
Most certainly, sampling limitations pervade biodiversity inventories in tropical areas (Coddington et al., 2009) and we might rightly expect that frequent interactions may be over-represented and rare interactions may be missed entirely in studies of mega-diverse assemblages (Bascompte & Jordano, 2014); but, to what extent?

Sampling interactions: methods

When we sample interactions in the field we record the presence of two species that interact in some way. For example, Snow and Snow (1988) recorded an interaction whenever they saw a bird “touching” a fruit on a plant. We observe and record feeding observations, visitation, occupancy, presence in pollen loads or in fecal samples, etc., of individual animals or plants and accumulate pairwise interactions, i.e., lists of species partners and the frequencies with which we observe them. Therefore, estimating the sampling completeness of pairwise interactions for a whole network requires some gauging of how the number (richness) of distinct pairwise interactions accumulates as sampling effort is increased and/or estimating the uncertainty around the missed links (Wells & O’Hara, 2012).

Most types of ecological interactions can be illustrated with bipartite graphs, with two or more distinct groups of interacting partners (Bascompte & Jordano, 2014); for illustration purposes I’ll focus more specifically on plant-animal interactions. Sampling interactions requires filling the cells of an interaction matrix with data.
The matrix, $\Delta = AP$ (the adjacency matrix for the graph representation of the network), is a 2D inventory of the interactions among, say, $A$ animal species (rows) and $P$ plant species (columns) (Jordano, 1987; Bascompte & Jordano, 2014). The matrix entries illustrate the values of the pairwise interactions visualized in the $\Delta$ matrix, and can be 0 or 1, for presence-absence of a given pairwise interaction, or take a quantitative weight $w_{ji}$ to represent the interaction intensity or unidirectional effect of species $j$ on species $i$ (Bascompte & Jordano, 2014; Vazquez et al., 2015). The outcomes of most ecological interactions are dependent on frequency of encounters (e.g., visit rate of pollinators, number of records of ant defenders, frequency of seeds in fecal samples). Thus, a frequently used proxy for interaction intensities $w_{ji}$ is just how frequent new interspecific encounters are, whether or not appropriately weighted to estimate interaction effectiveness (Vazquez, Morris & Jordano, 2005).

We need to define two basic steps in the sampling of interactions: 1) which type of interactions we sample; and 2) which type of record we get to document the existence of an interaction. In step #1 we need to take into account whether we are sampling the whole community of interactor species (all the animals, all the plants) or just a subset of them, i.e., a submatrix $\Delta_{m,n}$ of $m < A$ animal species and $n < P$ plant species of the adjacency matrix $\Delta_{AP}$ (i.e., the matrix representation of interactions among the partner species).
Subsets can be: a) all the potential plants interacting with a subset of the animals (Fig. 1a); b) all the potential animal species interacting with a subset of the plant species (Fig. 1b); c) a subset of all the potential animal species interacting with a subset of all the plant species (Fig. 1c). While some discussion has considered how to establish the limits of what represents a network (Strogatz, 2001) (in analogy to discussion on food-web limits; Cohen, 1978), it must be noted that situations a-c in Fig. 1 do not represent complete interaction networks. As vividly stated by Cohen et al. (1993): “As more comprehensive, more detailed, more explicit webs become available, smaller, highly aggregated, incompletely described webs may progressively be dropped from analyses of web structure (though such webs may remain useful for other purposes, such as pedagogy)”. Subnet sampling is widespread in studies of biological networks (e.g., protein interactions, gene regulation), yet it is important to recognize that most properties of subnetworks (even random subsamples) do not represent properties of whole networks (Stumpf, Wiuf & May, 2005).

In step #2 above we face the problem of the type of record we take to sample interactions. This is important because it defines whether we approach the problem of filling up the interaction matrix in a “zoo-centric” way or in a “phyto-centric” way. Zoo-centric studies directly sample animal activity and document the plants ‘touched’ by the animal. Examples include the analysis of pollen samples recovered from the body of pollinators, analysis of fecal samples of frugivores, radio-tracking data, etc. Phyto-centric studies take samples of focal individual plant species and document which animals ‘arrive’ or ‘touch’ the plants. Examples include focal watches of fruiting or flowering plants to record visitation by animals, raising insect herbivores from seed samples, identifying herbivory marks in samples of leaves, etc.

Most recent analyses of plant-animal interaction networks are phyto-centric; just 3.5% of available plant-pollinator (N = 58) or 36.6% plant-frugivore (N = 22) interaction datasets are zoo-centric (see Schleuning et al., 2012). Moreover, most available datasets on host-parasite (parasitoid) or plant-herbivore interactions are “host-centric” or phyto-centric (e.g., Thébault & Fontaine, 2010; Morris et al., 2013; Eklöf et al., 2013). This may be related to a variety of causes, like preferred methodologies by researchers working with a particular group or system, logistic limitations, or inherent taxonomic focus of the research questions. A likely result of phyto-centric sampling would be adjacency matrices with large A : P ratios. In any case we don’t have a clear view of the potential biases that taxa-focused sampling may generate in observed network patterns, for example by generating consistently asymmetric interaction matrices (Dormann et al., 2009). System symmetry has been suggested to influence estimations of generalization levels in plants and animals when measured as $I_A$ and $I_P$ (Elberling & Olesen, 1999); thus, differences in $I_A$ and $I_P$ between networks may arise from different A : P ratios rather than other ecological factors (Olesen & Jordano, 2002).
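As a minimal illustration of the interaction matrix described in this section, the following Python sketch fills a weighted animal-by-plant matrix from a list of pairwise records and reports the A : P ratio of the sampled matrix. The record list, the species code "Er", and the function name `build_interaction_matrix` are hypothetical and serve only as an example; the layout (animals as rows, plants as columns, entries as encounter counts) follows the text.

```python
from collections import Counter

def build_interaction_matrix(records):
    """Return (animals, plants, W), where W[a][p] counts records of animal a on plant p."""
    counts = Counter(records)                     # (animal, plant) -> number of records
    animals = sorted({a for a, _ in counts})      # row labels
    plants = sorted({p for _, p in counts})       # column labels
    W = [[counts.get((a, p), 0) for p in plants] for a in animals]
    return animals, plants, W

# Hypothetical field records: (animal code, plant code)
records = [("Tm", "Hh"), ("Tm", "Cm"), ("Sa", "Hh"), ("Tm", "Hh"), ("Er", "Cm")]
animals, plants, W = build_interaction_matrix(records)

# Presence-absence version of the same matrix
B = [[1 if w > 0 else 0 for w in row] for row in W]

n_links = sum(map(sum, B))           # number of distinct pairwise interactions recorded
ratio = len(animals) / len(plants)   # A : P ratio of the sampled matrix
```

Keeping both the weighted matrix `W` and its binary version `B` mirrors the two kinds of matrix entries discussed above (presence-absence versus frequency of encounters).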
Reasonably complete analyses of interaction networks can be obtained when combining both phyto-centric and zoo-centric sampling. For example, Bosch et al. (2009) showed that the addition of pollen load data on top of focal-plant sampling of pollinators unveiled a significant number of interactions, resulting in important network structural changes. Connectance increased 1.43-fold, mean plant connectivity went from 18.5 to 26.4, and mean pollinator connectivity from 2.9 to 4.1; moreover, extreme specialist pollinator species (singletons in the adjacency matrix) decreased 0.6-fold. Olesen et al. (2011) identified pollen loads on sampled insects and added the new links to an observation-based visitation matrix, with an extra 5% of links representing the estimated number of missing links in the pollination network. The overlap between observational and pollen-load recorded links was only 33%, underscoring the value of combining methodological approaches. Zoo-centric sampling has recently been extended with the use of DNA-barcoding, for example with plant-herbivore (Jurado-Rivera et al., 2009), host-parasitoid (Wirta et al., 2014), and plant-frugivore interactions (González-Varo, Arroyo & Jordano, 2014). For mutualistic networks we would expect that zoo-centric sampling could help unveil interactions of the animals with rare plant species or with relatively common plant species which are difficult to sample by direct observation.
Future methodological work may provide significant advances showing how mixing different sampling strategies strengthens the completeness of network data. These mixed strategies may combine, for instance, timed watches at focal plants, spot censuses along walked transects, pollen load or seed contents analyses, monitoring with camera traps, and DNA barcoding records. We might expect increased power of these mixed sampling approaches when combining different methods from both phyto- and zoo-centric perspectives (Bosch et al., 2009; Blüthgen, 2010). Note also that the different methods could be applied in different combinations to the two distinct sets of species. However, there are no tested protocols and/or sampling designs for ecological interaction studies to suggest an optimum combination of approaches. Ideally, pilot studies would provide adequate information for each specific study setting.

Sampling interactions: rationale

The number of distinct pairwise interactions that we can record in a landscape (an area of relatively homogeneous vegetation, analogous to the one we would use to monitor species diversity) is equivalent to the number of distinct classes in which we can classify the recorded encounters among individuals of two different species. Yet, individual-based interaction networks have been only recently studied (Dupont, Trøjelsgaard & Olesen, 2011; Wells & O’Hara, 2012).
The most usual approach has been to pool individual-based interaction data into species-based summaries, an approach that ignores the fact that only a fraction of individuals may actually interact given a per capita interaction effect (Wells & O’Hara, 2012). Wells & O’Hara (2012) illustrate the pros and cons of the approach. We walk in the forest and see a blackbird Tm picking an ivy Hh fruit and ingesting it: we have a record for a Tm–Hh interaction. We keep advancing and record again a blackbird feeding on hawthorn Cm fruits so we record a Tm–Cm interaction; as we advance we encounter another ivy plant and record a blackcap swallowing a fruit so we now have a new Sa–Hh interaction, and so on. At the end we have a series of classes (e.g., Sa–Hh, Tm–Hh, Tm–Cm, etc.), along with their observed frequencies. Bunge & Fitzpatrick (1993) provide an early review of the main aspects and approaches to estimate the number of distinct classes $C$ in a sample of observations.

Our sampling above would have resulted in a vector $n = [n_1 \ldots n_C]$ where $n_i$ is the number of records in the $i$th class. As stressed by Bunge & Fitzpatrick (1993), however, the $i$th class would appear in the sample if and only if $n_i > 0$, and we don’t know a priori which $n_i$ are zero. So, $n$ is not observable. Rather, what we get is a vector $c = [c_1 \ldots c_n]$ where $c_j$ is the number of classes represented $j$ times in our sampling: $c_1$ is the number of singletons (interactions recorded once), $c_2$ is the number of twin pairs (interactions with just two records), $c_3$ the number of triplets, etc. The problem thus becomes estimating the number of distinct classes $C$ from the vector of $c$ values and the frequency of unobserved interactions (see “The real missing links” below).

More specifically, we usually obtain a type of reference sample (Chao et al., 2014) for interactions: a series of replicated samples (e.g., observation days, 1h watches, etc.) with quantitative information, i.e., recording the number of instances of each interaction type on each day. This replicated abundance data can be treated in three ways: 1) Abundance data within replicates: the counts of interactions, separately for each day; 2) Pooled abundance data: the counts of interactions, summed over all days (the most usual approach); and 3) Replicated incidence data: the number of days on which we recorded each interaction. Assuming a reasonable number of replicates, replicated incidence data is considered the most robust statistically, as it takes account of heterogeneity among days (Colwell, Mao & Chang, 2004; Colwell, Dunn & Harris, 2012; Chao et al., 2014). Thus, both presence-absence and weighted information on interactions can be accommodated for this purpose.

The species assemblage

When we consider an observed and recorded sample of interactions on a particular assemblage of $A_{obs}$ and $P_{obs}$ species (or a set of replicated samples) as a reference sample (Chao et al., 2014) we may have three sources of undersampling error that are ignored by treating a reference sample as a true representation of the interactions in a well-defined assemblage: 1) some animal species are actually present but not observed (zero abundance or incidence in the interactions in the reference sample), $A_0$; 2) some plant species are actually present but not observed (zero abundance or incidence in the interactions in the reference sample), $P_0$; 3) some unobserved links (the zeroes in the adjacency matrix, UL) may actually occur but not be recorded. Thus a first problem is determining if $A_{obs}$ and $P_{obs}$ truly represent the actual species richness interacting in the assemblage. To this end we might use the replicated reference samples to estimate the true number of interacting animal $A_{est}$ and plant $P_{est}$ species as in traditional diversity estimation analysis (Chao et al., 2014). If there are no uniques (species seen on only one day), then $A_0$ and $P_0$ will be zero, and we have $A_{obs}$ and $P_{obs}$ as robust estimates of the actual species richness of the assemblage. If $A_0$ and $P_0$ are not zero they estimate the minimum number of undetected animal and plant species that can be expected with a sufficiently large number of replicates, taken from the same assemblage/locality by the same methods in the same time period. We can use extrapolation methods (Colwell, Dunn & Harris, 2012) to estimate how many additional replicate surveys it would take to reach a specified proportion $g$ of $A_{est}$ and $P_{est}$.

The interactions

We are then faced with assessing the sampling of interactions $I$. Table 1 summarizes the main components and targets for estimation of interaction richness.
In contrast with traditional species diversity estimates, sampling networks has the paradox that despite the potentially interacting species being present in the sampled assemblage (i.e., included in the $A_{obs}$ and $P_{obs}$ species lists), some of their pairwise interactions are impossible to record. The reason is forbidden links. Independently of whether we sample full communities or subset communities we face a problem: some of the interactions that we can visualize in the empty adjacency matrix $\Delta$ will simply not occur. With a total of $A_{obs} P_{obs}$ “potential” interactions (eventually augmented to $A_{est} P_{est}$ in case we have undetected species), a fraction of them are impossible to record, because they are forbidden (Jordano, Bascompte & Olesen, 2003; Olesen et al., 2011).

Our goal is to estimate the true number of non-null $AP$ interactions, including interactions that actually occur but have not been observed ($I_0$), from the replicated incidence frequencies of interaction types: $I_{est} = I_{obs} + I_0$. Note that $I_0$ estimates the minimum number of undetected plant-animal interactions that can be expected with a sufficiently large number of replicates, taken from the same assemblage/locality by the same methods in the same time period. Therefore we have two types of non-observed links: $UL^{*}$ and UL, corresponding to the real assemblage species richness and to the observed assemblage species richness, respectively (Table 1).
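The bookkeeping of observed, unobserved and forbidden links can be made concrete with a short Python sketch. It assumes a small hypothetical presence-absence matrix and a hypothetical mask of forbidden cells; all variable names are illustrative, and the split of unobserved links into forbidden and remaining (candidate missing) links is taken here as a simple subtraction. $I_0$ is not computed, since it requires an asymptotic estimator applied to replicated samples.

```python
# Hypothetical observed presence-absence matrix: rows = animals, columns = plants
B = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]
# Hypothetical forbidden cells (True = interaction biologically impossible)
forbidden = [
    [False, False, False, True],
    [False, True,  True,  True],
    [False, False, False, False],
]

a_obs, p_obs = len(B), len(B[0])
i_max = a_obs * p_obs        # all cells of the matrix: A_obs x P_obs
i_obs = sum(map(sum, B))     # recorded links
ul = i_max - i_obs           # unobserved links (zeroes in the matrix)

# Forbidden links: zero cells that are flagged as biologically impossible
fl = sum(f and not b for row_b, row_f in zip(B, forbidden)
         for b, f in zip(row_b, row_f))
ml = ul - fl                 # zeroes left unexplained: candidate missing links
```

In this toy matrix, 4 of the 7 zeroes are structural, so only 3 cells remain as candidates for links missed by undersampling.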
Forbidden links are non-occurrences of pairwise interactions that can be accounted for by biological constraints, such as spatio-temporal uncoupling (Jordano, 1987), size or reward mismatching, foraging constraints (e.g., accessibility) (Moré et al., 2012), and physiological-biochemical constraints (Jordano, 1987). We still have very limited information about the frequency of forbidden links in natural communities (Jordano, Bascompte & Olesen, 2003; Stang et al., 2009; Vázquez, Chacoff & Cagnolo, 2009; Olesen et al., 2011; Ibanez, 2012; Maruyama et al., 2014; Vizentin-Bugoni, Maruyama & Sazima, 2014) (Table 1). Forbidden links are thus represented as structural zeroes in the interaction matrix, i.e., matrix cells that cannot get a non-zero value.

We might expect different types of FL to occupy different parts of the $\Delta$ matrix, with missing cells due to phenological uncoupling, $FL_P$, largely distributed in the lower-right half of the $\Delta$ matrix and actually missed links ML distributed in its central part (Olesen et al., 2010). Yet, most of these aspects remain understudied. Therefore, we need to account for the frequency of these structural zeros in our matrix before proceeding. For example, most measurements of connectance $C = I/(AP)$ implicitly ignore the fact that by taking the full product $AP$ in the denominator they are underestimating the actual connectance value, i.e., the fraction of actual interactions $I$ relative to the biologically possible ones, not to the total maximum $I_{max} = AP$.

Our main problem then becomes estimating the number of true missed links, i.e., those that can’t be accounted for by biological constraints and that might suggest undersampling. Thus, the sampling of interactions in nature, as the sampling of species, is a cumulative process. In our analysis, we are not re-sampling individuals, but interactions, so we build interaction-based accumulation curves. If an interaction-based curve suggests a robust sampling, it does mean that no new interactions are likely to be recorded, irrespectively of the species, as it is a whole-network sampling approach (N. Gotelli, pers. com.). We add new, distinct, interactions recorded as we increase sampling effort (Fig. 2). We can obtain an Interaction Accumulation Curve (IAC) analogous to a Species Accumulation Curve (SAC) (see Supplementary Online Material): the observed number of distinct pairwise interactions in a survey or collection as a function of the accumulated number of observations or samples (Colwell, 2009).

Empirical data on Forbidden Links

Adjacency matrices are frequently sparse, i.e., they are densely populated with zeroes, with a fraction of them being structural (unobservable interactions) (Bascompte & Jordano, 2014). Thus, it would be a serious interpretation error to attribute the sparseness of adjacency matrices for bipartite networks to undersampling alone. The actual typology of link types in ecological interaction networks is thus more complex than just the two categories of observed and unobserved interactions (Table 1).
Unobserved interactions are represented by zeroes and belong to two categories. Missing interactions may actually exist but require additional sampling or a variety of methods to be observed. Forbidden links, on the other hand, arise due to biological constraints limiting interactions and remain unobservable in nature, irrespectively of sampling effort (Table 1). Forbidden links FL may actually account for a relatively large fraction of unobserved interactions UL when sampling taxonomically-restricted subnetworks (e.g., plant-hummingbird pollination networks) (Table 1). Phenological uncoupling is also prevalent in most networks, and may add up to explain ca. 25–40% of the forbidden links, especially in highly seasonal habitats, and up to 20% when estimated relative to the total number of unobserved interactions (Table 2). In any case, we might expect that a fraction of the missing links ML would be eventually explained by further biological reasons, depending on the knowledge of natural details of the particular systems. Our goal as naturalists would be to reduce the fraction of UL which remain as missing links; to this end we might search for additional biological constraints or increase sampling effort. For instance, habitat use patterns by hummingbirds in the Arima Valley network (Table 2; Snow & Snow, 1972) impose a marked pattern of microhabitat mismatches causing up to 44.5% of the forbidden links.
A myriad of biological causes beyond those included as FL in Table 2 may contribute explanations for UL: limits of color perception and/or partial preferences, presence of secondary metabolites in fruit pulp and leaves, toxins and combinations of monosaccharides in nectar, etc. For example, aside from FL, some pairwise interactions may simply have an asymptotically-zero probability of interspecific encounter between the partner species, if they are very rare. However, it is surprising that just the limited set of forbidden link types considered in Table 1 explains between 24.6% and 77.2% of the unobserved links. Notably, the Arima Valley, Santa Virgínia, and Hato Ratón networks have > 60% of the unobserved links explained, which might be related to the fact that they are subnetworks (Arima Valley, Santa Virgínia) or relatively small networks (Hato Ratón). All this means that empirical networks may have sizable fractions of structural zeroes. Ignoring this biological fact may contribute to wrongly inferring undersampling of interactions in real-world assemblages.

To sum up, two elements of inference are required in the analysis of unobserved interactions in ecological interaction networks. First, detailed natural history information on the participant species that allows the inference of biological constraints imposing forbidden links, so that structural zeroes can be identified in the adjacency matrix. Second, a critical analysis of sampling robustness and a robust estimate of the actual fraction of missing links, M, resulting in a robust estimate of I. In the next sections I explore these elements of inference, using IACs to assess the robustness of interaction sampling.

Asymptotic diversity estimates

Let’s assume a sampling of the diversity in a specific locality, over a relatively homogeneous landscape, where we aim at determining the number of species present for a particular group of organisms. To do that we carry out transects or plot samplings across the landscape, or use any other type of direct or indirect recording method, adequately replicated so that we obtain a number of samples. Briefly, $S_{obs}$ is the total number of species observed in a sample, or in a set of samples. $S_{est}$ is the estimated number of species in the community represented by the sample, or by the set of samples, where est indicates an estimator. With abundance data, let $S_k$ be the number of species each represented by exactly k individuals in a single sample. Thus, $S_0$ is the number of undetected species (species present in the community but not included in the sample), $S_1$ is the number of singleton species (represented by just one individual), $S_2$ is the number of doubleton species (species with two individuals), etc. The total number of individuals in the sample would be:

$$n = \sum_{k \geq 1} k\,S_k$$

A frequently used asymptotic, bias-corrected, non-parametric estimator is $S_{Chao1}$ (Hortal, Borges & Gaspar, 2006; Chao, 2005; Colwell, 2013):

$$S_{Chao1} = S_{obs} + \frac{S_1 (S_1 - 1)}{2 (S_2 + 1)}$$

Another frequently used alternative is the Chao2 estimator, $S_{Chao2}$ (Gotelli & Colwell, 2001), which has been reported to have a limited bias for small sample sizes (Colwell & Coddington, 1994; Chao, 2005). Instead of using counts it uses incidence frequencies ($Q_k$) among samples (number of species present in just one sample, in two samples, etc.):

$$S_{Chao2} = S_{obs} + \frac{Q_1 (Q_1 - 1)}{2 (Q_2 + 1)}$$

A plot of the cumulative number of species recorded, S, as a function of some measure of sampling effort (say, n samples taken) yields the species accumulation curve (SAC) or collector’s curve (Colwell & Coddington, 1994). Similarly, interaction accumulation curves (IAC), analogous to SACs, can be used to assess the robustness of interactions sampling for plant-animal community datasets (Jordano, 1987; Jordano, Vázquez & Bascompte, 2009; Olesen et al., 2011), as discussed in the next section.

Assessing sampling effort when recording interactions

The basic method we can propose to estimate sampling effort, and to explicitly show the analogues with rarefaction analysis in biodiversity research, is to vectorize the interaction matrix AP so that we get a vector of all the potential pairwise interactions ($I_{max}$, Table 1) that can occur in the observed assemblage with $A_{obs}$ animal species and $P_{obs}$ plant species. The new “species” we aim to sample are the pairwise interactions (Table 3). So, if we have in our community Turdus merula (Tm) and Rosa canina (Rc) and Prunus mahaleb (Pm), our problem will be to sample 2 new “species”: TmRc and TmPm. In general, if we have A = 1…i animal species and P = 1…j plant species (assuming a complete list of species in the assemblage), we’ll have a vector of “new” species to sample: $A_1P_1, A_1P_2, \ldots, A_2P_1, A_2P_2, \ldots, A_iP_j$.

We can represent the successive samples where we can potentially get records of these interactions in a matrix with the vectorized pairwise interactions as rows and the successive samples we take as columns (Table 3). This is simply a vectorized version of the interaction matrix, and it is analogous to a biodiversity sampling matrix with species as rows and sampling units (e.g., quadrats) as columns (Jordano, Vázquez & Bascompte, 2009). The package EstimateS (Colwell, 2013) includes a complete set of functions for estimating the mean IAC and its unconditional standard deviation from random permutations of the data, or subsampling without replacement (Gotelli & Colwell, 2001), and the asymptotic estimators for the expected number of distinct pairwise interactions included in a given reference sample of interaction records (see also the specaccum function in the vegan package for R) (R Development Core Team, 2010; Jordano, Vázquez & Bascompte, 2009; Olesen et al., 2011). In particular, we may take advantage of replicated incidence data, as it takes account of heterogeneity among samples (days, censuses, etc.; R.K. Colwell, pers. comm.) (see also Colwell, Mao & Chang, 2004; Colwell, Dunn & Harris, 2012; Chao et al., 2014).

In this way we effectively extend sampling theory developed for species diversity to the sampling of ecological interactions. Yet future theoretical work will be needed to formally assess the similarities and differences in the two approaches and to develop biologically meaningful null models of expected interaction richness with added sampling effort.
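As a minimal sketch of the vectorization described above, the following Python code builds the vector of potential pairwise interactions, the interaction-by-sample incidence matrix, and a permutation-based interaction accumulation curve. The function names, the hypothetical second animal code `Sa`, and the toy samples are illustrative assumptions, not data or code from this study; EstimateS and vegan remain the tools referenced in the text.

```python
import random
from itertools import product


def vectorize_pairs(animals, plants):
    """Return every potential pairwise interaction label (A x P entries)."""
    return [a + p for a, p in product(animals, plants)]


def incidence_matrix(pairs, samples):
    """Rows are pairwise interactions, columns are samples (1 = recorded)."""
    return [[1 if pair in sample else 0 for sample in samples] for pair in pairs]


def accumulation_curve(samples, n_perm=200, seed=1):
    """Mean number of distinct interactions after 1..T samples,
    averaged over random orderings of the samples (no replacement)."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    totals = [0.0] * len(samples)
    for _ in range(n_perm):
        rng.shuffle(order)
        seen = set()
        for step, idx in enumerate(order):
            seen |= samples[idx]          # add interactions found in this sample
            totals[step] += len(seen)     # distinct interactions so far
    return [total / n_perm for total in totals]


# Example from the text: one animal (Tm) and two plants (Rc, Pm).
pairs_text = vectorize_pairs(["Tm"], ["Rc", "Pm"])

# Toy assemblage; "Sa" is a hypothetical second animal species code.
pairs = vectorize_pairs(["Tm", "Sa"], ["Rc", "Pm"])
samples = [{"TmRc"}, {"TmRc", "TmPm"}, {"TmRc"}, {"SaPm", "TmRc"}]
matrix = incidence_matrix(pairs, samples)
curve = accumulation_curve(samples)
```

Each sample is stored as the set of interactions recorded in it, which keeps the curve a presence/absence (incidence) curve; a weighted version would need counts per sample instead.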
Diversity-accumulation analysis (Magurran, 1988; Hortal, Borges & Gaspar, 2006) comes up immediately with this type of dataset. This procedure plots the accumulation curve for the expected number of distinct pairwise interactions recorded with increasing sampling effort (Jordano, Vázquez & Bascompte, 2009; Olesen et al., 2011). Asymptotic estimates of interaction richness and its associated standard errors and confidence intervals can thus be obtained (Hortal, Borges & Gaspar, 2006) (see Supplementary Online Material). It should be noted that the asymptotic estimate of interaction richness explicitly ignores the fact that, due to forbidden links, a number of pairwise interactions among the $I_{max}$ number specified in the adjacency matrix cannot be recorded, irrespective of sampling effort.

We may expect undersampling especially in moderate to large sized networks with multiple modules (i.e., species subsets requiring different sampling strategies) (Jordano, 1987; Olesen et al., 2011; Chacoff et al., 2012); adequate sampling may be feasible when interaction subwebs are studied (Olesen et al., 2011; Vizentin-Bugoni, Maruyama & Sazima, 2014), typically with more homogeneous subsets of species (e.g., bumblebee-pollinated flowers). In any case the sparseness of the adjacency matrix is by no means an indication of undersampling whenever the issue of structural zeroes in the interaction matrices is effectively incorporated in the estimates.
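To make the asymptotic estimate concrete for interaction data, the sketch below applies the Chao2 formula to an interaction-by-sample incidence matrix and sets the result beside the ceiling that remains once forbidden links are subtracted from the maximum number of potential interactions. The function name, the toy matrix and the forbidden-link count are illustrative assumptions; this is not the EstimateS implementation.

```python
def chao2(incidence_rows):
    """Chao2 estimate, S_obs + Q1 (Q1 - 1) / (2 (Q2 + 1)), from a 0/1 matrix
    with pairwise interactions as rows and samples as columns."""
    freqs = [sum(row) for row in incidence_rows]
    s_obs = sum(1 for f in freqs if f > 0)   # interactions recorded at least once
    q1 = sum(1 for f in freqs if f == 1)     # recorded in exactly one sample
    q2 = sum(1 for f in freqs if f == 2)     # recorded in exactly two samples
    return s_obs + q1 * (q1 - 1) / (2 * (q2 + 1))


# Toy data: 8 potential interactions (I_max = 8) recorded over 4 samples.
rows = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],  # assumed forbidden link
    [0, 0, 0, 0],  # assumed forbidden link
]
i_max = len(rows)
n_forbidden = 2                  # hypothetical count of structural zeroes
estimate = chao2(rows)
ceiling = i_max - n_forbidden    # interactions that could ever be recorded
```

The estimator itself knows nothing about the forbidden cells, so any comparison with the ceiling has to be made outside it, as in the last two lines.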
For example, mixture models incorporating detectabilities have been proposed to effectively account for rare species (Mao & Colwell, 2005). In an analogous line, mixture models could be extended to samples of pairwise interactions, also with specific detectability values. These detection rates/odds could vary among groups of interactions, depending on their specific detectability. For example, flower-pollinator interactions involving bumblebees could have a higher detectability than flower-pollinator pairwise interactions involving, say, nitidulid beetles. These more homogeneous groupings of pairwise interactions within a network define modules (Bascompte & Jordano, 2014), so we might expect that interactions of a given module (e.g., plants and their hummingbird pollinators; Fig. 1a) may share similar detectability values, in an analogous way to species groups receiving homogeneous detectability values in mixture models (Mao & Colwell, 2005). In its simplest form, this would result in a sample with multiple pairwise interactions detected, in which the number of interaction events recorded for each distinct interaction found in the sample is recorded (i.e., a column vector in Table 3, corresponding to, say, a sampling day). The number of interactions recorded for the $i$th pairwise interaction (i.e., $A_iP_j$ in Table 3), $Y_i$, could be treated as a Poisson random variable with a mean parameter $\lambda_i$, its detection rate.
Mixture models (Mao & Colwell, 2005) include estimates for abundance-based data (their analogs in interaction sampling would be weighted data), where $Y_i$ is a Poisson random variable with detection rate $\lambda_i$. This is combined with the incidence-based model, where $Y_i$ is a binomial random variable (its analog in interaction sampling would be presence/absence records of interactions) with detection odds $\lambda_i$. Let T be the number of samples in an incidence-based data set. A Poisson/binomial density can be written as (Mao & Colwell, 2005):

$$g(y; \lambda) = \begin{cases} \dfrac{\lambda^{y}}{y!\,e^{\lambda}} & [1] \\[2ex] \dbinom{T}{y} \dfrac{\lambda^{y}}{(1+\lambda)^{T}} & [2] \end{cases}$$

where [1] corresponds to a weighted network, and [2] to a qualitative network. The detection rates $\lambda_i$ depend on the relative abundances $\phi_i$ of the interactions, the probability of a pairwise interaction being detected when it is present, and the sample size (the number of interactions recorded), which, in turn, is a function of the sampling effort. Unfortunately, no specific sampling model has been developed along these lines for species interactions and their characteristic features. For example, a complicating factor might be that interaction abundances, $\phi_i$, in real assemblages are a function of the abundances of interacting species that determine interspecific encounter rates; yet they also depend on biological factors that ultimately determine if the interaction occurs when the partner species are present. For example, $\phi_i$ should be set to zero for all FL. In its simplest form, $\phi_i$ could be estimated from just the product of partner species abundances, an approach recently used as a null model to assess the role of biological constraints in generating forbidden links and explaining interaction patterns (Vizentin-Bugoni, Maruyama & Sazima, 2014). Yet more complex models (e.g., Wells & O’Hara, 2012) should incorporate not only interspecific encounter probabilities, but also interaction detectabilities, phenotypic matching and incidence of forbidden links. Mixture models are certainly complex, and for most situations of evaluating sampling effort better alternatives include the simpler incidence-based rarefaction and extrapolation (Colwell, Dunn & Harris, 2012; Chao et al., 2014).

The real missing links

Given that a fraction of unobserved interactions can be accounted for by forbidden links, what about the remaining missing interactions? We have already discussed that some of these could still be related to unaccounted constraints, and still others would be certainly attributable to insufficient sampling. Would this always be the case? Multispecific assemblages of distinct taxonomic relatedness, whose interactions can be represented as bipartite networks (e.g., host-parasite, plant-animal mutualisms, plant-herbivore interactions, with two distinct sets of unrelated higher taxa), are shaped by interspecific encounters among individuals of the partner species (Fig. 2).
A crucial ecological aspect limiting these interactions is the probability of interspecific encounter, i.e., the probability that two individuals of the partner species actually encounter each other in nature.

Given log-normally distributed abundances of the two species groups, the expected probabilities of interspecific encounter (PIE) would be simply the product of the two lognormal distributions. Thus, we might expect that for low PIE values, pairwise interactions would be either extremely difficult to sample, or simply not occurring in nature. Consider the Nava de las Correhuelas interaction web (NCH, Table 2), with A = 36, P = 25, I = 181, and about half of the unobserved interactions not accounted for by forbidden links, thus M = 53.1%. Given the robust sampling of this network (Jordano, Vázquez & Bascompte, 2009), a sizable fraction of these possible but missing links would simply not be occurring in nature, most likely because of an extremely low PIE, in fact asymptotically zero. Given the vectorized list of pairwise interactions for NCH, I computed the PIE values for each one by multiplying element-wise the two species abundance distributions. The resulting $PIE_{max} = 0.0597$ is a neutral estimate, based on the assumption that interactions occur in proportion to the species-specific local abundances. With $PIE_{median} < 1.4 \times 10^{-4}$ (note the quantile estimate $Q_{75\%} = 3.27 \times 10^{-4}$) we may safely expect that a sizable fraction of these missing interactions may not occur according to this neutral expectation (Jordano, 1987; Olesen et al., 2011) (neutral forbidden links, sensu Canard et al., 2012).

When we consider the vectorized interaction matrix, enumerating all pairwise interactions for the AP combinations, the expected probabilities of finding a given interaction can be estimated with a Good-Turing approximation (Good, 1953). The technique, developed by Alan Turing and I.J. Good with applications to linguistics and word analysis (Gale & Sampson, 1995), has been recently extended in novel ways for ecological analyses (Chao et al., 2015). It estimates the probability of recording an interaction of a hitherto unseen pair of partners, given a set of past records of interactions between other species pairs. Consider a sample of N interactions in which $n_r$ distinct pairwise interactions have exactly r records. All Good-Turing estimators obtain the underlying frequencies of events as:

$$P(X) = \frac{(N_X + 1)}{T} \left(1 - \frac{E(1)}{T}\right) \qquad (1)$$

where X is the pairwise interaction, $N_X$ is the number of times interaction X is recorded, T is the sample size (number of distinct interactions recorded) and E(1) is an estimate of how many different interactions were recorded exactly once. Strictly speaking, Equation (1) gives the probability that the next interaction type recorded will be X, after sampling a given assemblage of interacting species. In other words, we scale down the maximum-likelihood estimator by a factor of $1 - \frac{E(1)}{T}$. This reduces all the probabilities for interactions we have recorded, and makes room for interactions we haven’t seen. If we sum over the interactions we have seen, then the sum of P(X) is $1 - \frac{E(1)}{T}$. Because probabilities sum to one, we have the left-over probability of $P_{new} = \frac{E(1)}{T}$ of seeing something new, where new means that we sample a new pairwise interaction.
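The two neutral calculations of this section, the PIE product and the Good-Turing probability of recording something new, can be sketched in a few lines of Python. This is an illustration under stated assumptions: T is read here as the total number of interaction records (the usual Good-Turing convention), the recorded interactions are discounted from their plain frequencies $N_X/T$ rather than from the $(N_X + 1)/T$ term, so that the seen and unseen masses sum to one, and all function names, species codes and counts are hypothetical.

```python
def unseen_mass(counts):
    """Good-Turing probability that the next record is a new interaction:
    interactions recorded exactly once divided by total records."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no interaction records")
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


def discounted_probabilities(counts):
    """Maximum-likelihood frequencies scaled down by (1 - unseen mass)."""
    total = sum(counts.values())
    keep = 1.0 - unseen_mass(counts)
    return {pair: (c / total) * keep for pair, c in counts.items()}


def neutral_pie(animal_abund, plant_abund):
    """Neutral PIE: product of the partners' relative abundances."""
    a_total = sum(animal_abund.values())
    p_total = sum(plant_abund.values())
    return {
        (a, p): (na / a_total) * (npl / p_total)
        for a, na in animal_abund.items()
        for p, npl in plant_abund.items()
    }


# Toy records; species codes and counts are hypothetical.
counts = {"TmRc": 5, "TmPm": 3, "SaRc": 1, "SaPm": 1}
p_new = unseen_mass(counts)                  # 2 singletons out of 10 records
p_seen = discounted_probabilities(counts)    # masses of recorded interactions
pie = neutral_pie({"Tm": 30, "Sa": 10}, {"Rc": 5, "Pm": 15})
```

Neither function knows about forbidden links; pairs that are biologically impossible would have to be removed from the PIE table, or have their expectation set to zero, before interpreting low values as sampling gaps.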
Note, however, that Good-Turing estimators, like the traditional asymptotic estimators, do not account in our case for the forbidden interactions.

Discussion

Recent work has inferred that most data available for interaction networks are incomplete due to undersampling, resulting in a variety of biased parameters and network patterns (Chacoff et al., 2012). It is important to note, however, that in practice, many surveyed networks to date have been subnets of much larger networks. This is also true for protein interaction, gene regulation, and metabolic networks, where only a subset of the molecular entities in a cell have been sampled (Stumpf, Wiuf & May, 2005). Despite recent attempts to document whole ecosystem meta-networks (Pocock, Evans & Memmott, 2012), it is likely that most ecological interaction networks will illustrate just major ecosystem compartments. Due to their high generalization, high temporal and spatial turnover, and high complexity of association patterns, adequate sampling of ecological interaction networks is challenging and requires extremely large sampling effort. Undersampling of ecological networks may originate from the analysis of assemblage subsets (e.g., taxonomically or functionally defined), and/or from logistically-limited sampling effort. It is extremely hard to robustly sample the set of biotic interactions even for relatively simple, species-poor assemblages; thus, we need to assess how robust the characterization of the adjacency matrix is.
Concluding that an ecological network dataset is undersampled just by its sparseness would be unrealistic. The reason stems from a biological fact: a sizeable fraction of the maximum, potential links that can be recorded among two distinct sets of species is simply unobservable, irrespective of sampling effort (Jordano, 1987). In addition, sampling effort needs to be explicitly gauged because of its potential influence on parameter estimates for the network.

Missing links are a characteristic feature of all plant-animal interaction networks, and likely pervade other ecological interactions. Important natural history details explain a fraction of them, resulting in unrealizable interactions (i.e., forbidden interactions) that define structural zeroes in the interaction matrices and contribute to their extreme sparseness. Sampling interactions is a way to monitor biodiversity beyond the simple enumeration of component species and to develop efficient and robust inventories of functional interactions. Yet no sampling theory for interactions is available. Some key components of this sampling are analogous to species sampling and traditional biodiversity inventories; however, there are important differences. Focusing just on the realized interactions or treating missing interactions as the expected unique result of sampling bias would miss important components to understand how mutualisms coevolve within complex webs of interdependence among species.
Contrary to species inventories, a sizable fraction of non-observed pairwise interactions cannot be sampled, due to biological constraints that forbid their occurrence. Moreover, recent implementations of inference methods for unobserved species (Chao et al., 2015) or for individual-based data (Wells & O’Hara, 2012) can be combined with the forbidden link approach. These methods do not account for the existence of these ecological constraints either, but can help in estimating their relative importance, simply by the difference between the asymptotic estimate of interaction richness in a robustly-sampled assemblage and the maximum richness $I_{max}$ of interactions.

Ecological interactions provide the wireframe supporting the lives of species, and they also embed crucial ecosystem functions which are fundamental for supporting the Earth system. We still have a limited knowledge of the biodiversity of ecological interactions, and they are being lost (extinct) at a very fast pace, frequently preceding species extinctions (Valiente-Banuet et al., 2014). We urgently need robust techniques to assess the completeness of ecological interactions networks because this knowledge will allow the identification of the minimal components of their ecological complexity that need to be restored to rebuild functional ecosystems after perturbations.

Acknowledgements

I am indebted to Jens M. Olesen, Alfredo Valido, Jordi Bascompte, Thomas Lewinsohn, John N. Thompson, Nick Gotelli, Carsten Dormann, and Paulo R. Guimarães Jr. for useful and thoughtful discussion at different stages of this manuscript. Jeferson Vizentin-Bugoni kindly helped with the Sta Virgínia data. Jens M. Olesen kindly made available the Grundvad dataset; together with Robert K. Colwell, Néstor Pérez-Méndez, JuanPe González-Varo, and Paco Rodríguez, he provided most useful comments on a final version of the ms. Robert Colwell shared a number of crucial suggestions that clarified my vision of sampling ecological interactions. The study was supported by a Junta de Andalucía Excellence Grant (RNM–5731), as well as a Severo Ochoa Excellence Award from the Ministerio de Economía y Competitividad (SEV–2012–0262). The Agencia de Medio Ambiente, Junta de Andalucía, provided generous facilities that made possible my long-term field work in different natural parks.

Data accessibility

This review does not use new raw data, but includes some re-analyses of previously published material. All the original data supporting the paper, R code, supplementary figures, and summaries of analytical protocols are available at the author’s GitHub repository (https://github.com/pedroj/MS_Network-Sampling), with DOI: 10.5281/zenodo.29437.

References

Banasek-Richter, C., Cattin, M. & Bersier, L. (2004) Sampling effects and the robustness of quantitative and qualitative food-web descriptors. Journal of Theoretical Biology, 226, 23–32.

Bascompte, J. & Jordano, P. (2014) Mutualistic networks. Monographs in Population Biology, No. 53. Princeton University Press, Princeton, NJ.
Bersier, L., Banasek-Richter, C. & Cattin, M. (2002) Quantitative descriptors of food-web matrices. Ecology, 83, 2394–2407.

Blüthgen, N. (2010) Why network analysis is often disconnected from community ecology: A critique and an ecologist’s guide. Basic And Applied Ecology, 11, 185–195.

Bosch, J., Martín González, A.M., Rodrigo, A. & Navarro, D. (2009) Plant-pollinator networks: adding the pollinator’s perspective. Ecology Letters, 12, 409–419.

Brose, U., Martinez, N. & Williams, R. (2003) Estimating species richness: Sensitivity to sample coverage and insensitivity to spatial patterns. Ecology, 84, 2364–2377.

Bunge, J. & Fitzpatrick, M. (1993) Estimating the number of species: a review. Journal of the American Statistical Association, 88, 364–373.

Canard, E., Mouquet, N., Marescot, L., Gaston, K.J., Gravel, D. & Mouillot, D. (2012) Emergence of structural patterns in neutral trophic networks. PLoS ONE, 7, e38295.

Chacoff, N.P., Vazquez, D.P., Lomascolo, S.B., Stevani, E.L., Dorado, J. & Padrón, B. (2012) Evaluating sampling completeness in a desert plant-pollinator network. Journal of Animal Ecology, 81, 190–200.

Chao, A. (2005) Species richness estimation. Encyclopedia of Statistical Sciences, pp. 7909–7916. Oxford University Press, New York, USA.

Chao, A., Gotelli, N.J., Hsieh, T.C., Sander, E.L., Ma, K.H., Colwell, R.K. & Ellison, A.M. (2014) Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecological Monographs, 84, 45–67.
Chao, A., Hsieh, T.C., Chazdon, R.L., Colwell, R.K. & Gotelli, N.J. (2015) Unveiling the species-rank abundance distribution by generalizing the Good-Turing sample coverage theory. Ecology, 96, 1189–1201.

Coddington, J.A., Agnarsson, I., Miller, J.A., Kuntner, M. & Hormiga, G. (2009) Undersampling bias: the null hypothesis for singleton species in tropical arthropod surveys. Journal of Animal Ecology, 78, 573–584.

Cohen, J.E. (1978) Food webs and niche space. Princeton University Press, Princeton, New Jersey, US.

Cohen, J.E., Beaver, R.A., Cousins, S.H., DeAngelis, D.L., Goldwasser, L., Heong, K.L., Holt, R.D., Kohn, A.J., Lawton, J.H., Martinez, N., O’Malley, R., Page, L.M., Patten, B.C., Pimm, S.L., Polis, G., Rejmanek, M., Schoener, T.W., Schenly, K., Sprules, W.G., Teal, J.M., Ulanowicz, R., Warren, P.H., Wilbur, H.M. & Yodis, P. (1993) Improving food webs. Ecology, 74, 252–258.

Colwell, R. & Coddington, J. (1994) Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions Of The Royal Society Of London Series B-Biological Sciences, 345, 101–118.

Colwell, R.K. (2009) Biodiversity: concepts, patterns, and measurement. The Princeton Guide to Ecology (ed. S.A. Levin), pp. 257–263. Princeton University Press, Princeton.

Colwell, R.K. (2013) EstimateS: Biodiversity Estimation. -, pp. 1–33.

Colwell, R.K., Dunn, R.R. & Harris, N.C. (2012) Coextinction and persistence of dependent species in a changing world. Annual Review of Ecology Evolution and Systematics, 43, 183–203.
Colwell, R.K., Mao, C.X. & Chang, J. (2004) Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology, 85, 2717–2727.

Dorado, J., Vazquez, D.P., Stevani, E.L. & Chacoff, N.P. (2011) Rareness and specialization in plant-pollinator networks. Ecology, 92, 19–25.

Dormann, C.F., Fründ, J., Blüthgen, N. & Gruber, B. (2009) Indices, graphs and null models: Analyzing bipartite ecological networks. Open Ecology Journal, 2, 7–24.

Dupont, Y.L., Trøjelsgaard, K. & Olesen, J.M. (2011) Scaling down from species to individuals: a flower–visitation network between individual honeybees and thistle plants. Oikos, 120, 170–177.

Dupont, Y.L., Trøjelsgaard, K., Hagen, M., Henriksen, M.V., Olesen, J.M., Pedersen, N.M.E. & Kissling, W.D. (2014) Spatial structure of an individual-based plant-pollinator network. Oikos, 123, 1301–1310.

Eklöf, A., Jacob, U., Kopp, J., Bosch, J., Castro-Urgal, R., Chacoff, N.P., Dalsgaard, B., de Sassi, C., Galetti, M., Guimaraes, P.R., Lomáscolo, S.B., Martín González, A.M., Pizo, M.A., Rader, R., Rodrigo, A., Tylianakis, J.M., Vazquez, D.P. & Allesina, S. (2013) The dimensionality of ecological networks. Ecology Letters, 16, 577–583.

Elberling, H. & Olesen, J.M. (1999) The structure of a high latitude plant-flower visitor system: the dominance of flies. Ecography, 22, 314–323.

Fründ, J., McCann, K.S. & Williams, N.M. (2015) Sampling bias is a challenge for quantifying specialization and network structure: lessons from a quantitative niche model. Oikos, pp. n/a–n/a.
Gale, W.A. & Sampson, G. (1995) Good-Turing frequency estimation without tears. Journal of Quantitative Linguistics, 2, 217–237.

Gibson, R.H., Knott, B., Eberlein, T. & Memmott, J. (2011) Sampling method influences the structure of plant–pollinator networks. Oikos, 120, 822–831.

González-Varo, J.P., Arroyo, J.M. & Jordano, P. (2014) Who dispersed the seeds? The use of DNA barcoding in frugivory and seed dispersal studies. Methods in Ecology and Evolution, 5, 806–814.

Good, I.J. (1953) The population frequencies of species and the estimation of population parameters. Biometrika, 40, 237–264.

Gotelli, N.J. & Colwell, R.K. (2011) Estimating species richness. Biological Diversity Frontiers in Measurement and Assessment (eds. A.E. Magurran & B.J. McGill), pp. 39–54. Oxford University Press, Oxford, UK.

Gotelli, N. & Colwell, R. (2001) Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecology Letters, 4, 379–391.

Hortal, J., Borges, P. & Gaspar, C. (2006) Evaluating the performance of species richness estimators: sensitivity to sample grain size. Journal of Animal Ecology, 75, 274–287.

Ibanez, S. (2012) Optimizing size thresholds in a plant–pollinator interaction web: towards a mechanistic understanding of ecological networks. Oecologia, 170, 233–242.

Jordano, P. (1987) Patterns of mutualistic interactions in pollination and seed dispersal: connectance, dependence asymmetries, and coevolution. The American Naturalist, 129, 657–677.

Jordano, P., Bascompte, J. & Olesen, J. (2003) Invariant properties in coevolutionary networks of plant-animal interactions. Ecology Letters, 6, 69–81.

Jordano, P., Vázquez, D. & Bascompte, J. (2009) Redes complejas de interacciones planta-animal. Ecología y evolución de interacciones planta-animal (eds. R. Medel, R. Dirzo & R. Zamora), pp. 17–41. Editorial Universitaria, Santiago, Chile.

Jurado-Rivera, J.A., Vogler, A.P., Reid, C.A.M., Petitpierre, E. & Gomez-Zurita, J. (2009) DNA barcoding insect-host plant associations. Proceedings Of The Royal Society B-Biological Sciences, 276, 639–648.

Magurran, A. (1988) Ecological diversity and its measurement. Princeton University Press, Princeton, US.

Mao, C. & Colwell, R.K. (2005) Estimation of species richness: mixture models, the role of rare species, and inferential challenges. Ecology, 86, 1143–1153.

Martinez, N.D. (1993) Effects of resolution on food web structure. Oikos, 66, 403–412.

Martinez, N. (1991) Artifacts or attributes? Effects of resolution on food-web patterns in Little Rock Lake food web. Ecological Monographs, 61, 367–392.

Maruyama, P.K., Vizentin-Bugoni, J., Oliveira, G.M., Oliveira, P.E. & Dalsgaard, B. (2014) Morphological and spatio-temporal mismatches shape a neotropical savanna plant-hummingbird network. Biotropica, 46, 740–747.

Moré, M., Amorim, F.W., Benitez-Vieyra, S., Medina, A.M., Sazima, M. & Cocucci, A.A. (2012) Armament Imbalances: Match and Mismatch in Plant-Pollinator Traits of Highly Specialized Long-Spurred Orchids. PLoS ONE, 7, e41878.

Morris, R.J., Gripenberg, S., Lewis, O.T. & Roslin, T. (2013) Antagonistic interaction networks are structured independently of latitude and host guild. Ecology Letters, 17, 340–349.

Nielsen, A. & Bascompte, J. (2007) Ecological networks, nestedness and sampling effort. Journal of Ecology, 95, 1134–1141.

Olesen, J.M., Bascompte, J., Dupont, Y.L., Elberling, H. & Jordano, P. (2011) Missing and forbidden links in mutualistic networks. Proceedings Of The Royal Society B-Biological Sciences, 278, 725–732.

Olesen, J.M., Dupont, Y.L., O’Gorman, E., Ings, T.C., Layer, K., Melin, C.J., Trøjelsgaard, K., Pichler, D.E., Rasmussen, C. & Woodward, G. (2010) From Broadstone to Zackenberg. Advances in Ecological Research, 42, 1–69.

Olesen, J. & Jordano, P. (2002) Geographic patterns in plant-pollinator mutualistic networks. Ecology, 83, 2416–2424.

Olito, C. & Fox, J.W. (2014) Species traits and abundances predict metrics of plant-pollinator network structure, but not pairwise interactions. Oikos, 124, 428–436.

Ollerton, J. & Cranmer, L. (2002) Latitudinal trends in plant-pollinator interactions: are tropical plants more specialised? Oikos, 98, 340–350.

Pereira, H.M., Ferrier, S., Walters, M., Geller, G.N., Jongman, R.H.G., Scholes, R.J., Bruford, M.W., Brummitt, N., Butchart, S.H.M., Cardoso, A.C., Coops, N., Dulloo, E., Faith, D., Freyhof, J., Gregory, R.D., Heip, C., Hoft, R., Hurtt, G., Jetz, W., Karp, D.S., Mcgeoch, M., Obura, D., Onoda, Y., Pettorelli, N., Reyers, B., Sayre, R., Scharlemann, J.P.W., Stuart, S., Turak, E., Walpole, M. & Wegmann, M. (2013) Essential biodiversity variables. Science, 339, 277–278.

Pocock, M.J.O., Evans, D.M. & Memmott, J. (2012) The Robustness and Restoration of a Network of Ecological Networks. Science, 335, 973–977.

Preston, F. (1948) The commonness, and rarity, of species. Ecology, 29, 254–283.

R Development Core Team (2010) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org.

Rivera-Hutinel, A., Bustamante, R.O., Marín, V.H. & Medel, R. (2012) Effects of sampling completeness on the structure of plant-pollinator networks. Ecology, 93, 1593–1603.

Schleuning, M., Fründ, J., Klein, A.M., Abrahamczyk, S., Alarcón, R., Albrecht, M., Andersson, G.K.S., Bazarian, S., Böhning-Gaese, K., Bommarco, R., Dalsgaard, B., Dehling, D.M., Gotlieb, A., Hagen, M., Hickler, T., Holzschuh, A., Kaiser-Bunbury, C.N., Kreft, H., Morris, R.J., Sandel, B., Sutherland, W.J., Svenning, J.C., Tscharntke, T., Watts, S., Weiner, C.N., Werner, M., Williams,
Jordano - Sampling networks 769 N.M., Winqvist, C., Dormann, C.F. & Blüthgen, N. (2012) Specialization of 770 mutualistic interaction networks decreases toward tropical latitudes. Current 771 Biology, 22, 1925–1931. 772 Snow, B. & Snow, D. (1972) Feeding niches of hummingbirds in a Trinidad valley. 773 Journal of Animal Ecology, 41, 471–485. 774 Snow, B. & Snow, D. (1988) Birds and berries. Poyser, Calton, UK. 775 Stang, M., Klinkhamer, P., Waser, N.M., Stang, I. & van der Meijden, E. (2009) 776 Size-specific interaction patterns and size matching in a plant-pollinator inter- 777 action web. Annals Of Botany, 103, 1459–1469. 778 Strogatz, S. (2001) Exploring complex networks. Nature, 410, 268–276. 779 Stumpf, M.P.H., Wiuf, C. & May, R.M. (2005) Subnets of scale-free networks are 780 not scale-free: Sampling properties of networks. Proceedings of the National 781 Academy of Sciences USA, 102, 4221–4224. 782 Thébault, E. & Fontaine, C. (2010) Stability of ecological communities and the 783 architecture of mutualistic and trophic networks. Science, 329, 853–856. 784 Valiente-Banuet, A., Aizen, M.A., Alcántara, J.M., Arroyo, J., Cocucci, A., 785 Galetti, M., García, M.B., García, D., Gomez, J.M., Jordano, P., Medel, R., 786 Navarro, L., Obeso, J.R., Oviedo, R., Ramírez, N., Rey, P.J., Traveset, A., 787 Verdú, M. & Zamora, R. (2014) Beyond species loss: the extinction of ecological 788 interactions in a changing world. Functional Ecology, 29, 299–307. 789 Vázquez, D.P., Chacoff, N.P. & Cagnolo, L. (2009) Evaluating multiple deter- 37 bioRxiv preprint first posted online Sep. 1, 2015; doi: http://dx.doi.org/10.1101/025734. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. Jordano - Sampling networks 790 minants of the structure of plant-animal mutualistic networks. Ecology, 90, 791 2039–2046. 
Vázquez, D.P., Ramos-Jiliberto, R., Urbani, P. & Valdovinos, F.S. (2015) A conceptual framework for studying the strength of plant-animal mutualistic interactions. Ecology Letters, 18, 385–400.
Vázquez, D., Morris, W. & Jordano, P. (2005) Interaction frequency as a surrogate for the total effect of animal mutualists on plants. Ecology Letters, 8, 1088–1094.
Vizentin-Bugoni, J., Maruyama, P.K. & Sazima, M. (2014) Processes entangling interactions in communities: forbidden links are more important than abundance in a hummingbird-plant network. Proceedings of the Royal Society B: Biological Sciences, 281, 20132397.
Wells, K. & O’Hara, R.B. (2012) Species interactions: estimating per-individual interaction strength and covariates before simplifying data into per-species ecological networks. Methods in Ecology and Evolution, 4, 1–8.
Wirta, H.K., Hebert, P.D.N., Kaartinen, R., Prosser, S.W., Várkonyi, G. & Roslin, T. (2014) Complementary molecular information changes our perception of food web structure. Proceedings of the National Academy of Sciences USA, 111, 1885–1890.

Figure captions

Figure 1. Sampling of ecological interaction networks (e.g., plant-animal interactions) usually focuses on different types of subsampling of the full network, yielding submatrices [m, n] of the full interaction matrix, which has A animal and P plant species: a) all the potential plants interacting with a subset of the animals (e.g., studying just the hummingbird-pollinated flower species in a community); b) all the potential animal species interacting with a subset of the plant species (e.g., studying the frugivore species feeding on figs Ficus in a community); and c) sampling a subset of all the potential animal species interacting with a subset of all the plant species (e.g., studying the plant-frugivore interactions of the rainforest understory).

Figure 2. Sampling species interactions in natural communities. Suppose an assemblage with A = 3 animal species (red, species 1–3 with three, two, and one individuals, respectively) and P = 3 plant species (green, species a–c with three individuals each), shown as colored balls and sampled with increasing effort in steps 1 to 6 (panels). In Step 1 we record animal species 1 and plant species a and b, with a total of three interactions (black lines) that represent two distinct interactions: 1–a and 1–b. As we advance our sampling (panels 1 to 6, illustrating, e.g., additional sampling days) we record new distinct interactions. Note that we actually sample and record interactions among individuals, yet we pool the data across species to get a species-by-species interaction matrix. Few network analyses have been carried out on individual data (Dupont et al., 2014).

Figures

Figure 1: [image not reproduced; schematic interaction matrix with plants and animals on the two axes, panels a, b, and c]

Figure 2: [image not reproduced; six panels showing sampling steps 1 to 6]

Table captions

Table 1. A taxonomy of link types for ecological interactions (Olesen et al. 2011). A, number of animal species; P, number of plant species; I, number of observed links; $C = 100\,I/(AP)$, connectance; FL, number of forbidden links; and ML, number of missing links. As natural scientists, our ultimate goal is to eliminate ML from the equation $FL = AP - I - ML$, which probably is not feasible given logistical sampling limitations. When we, during our study, estimate ML to be negligible, we cease observing and estimate I and FL.

Table 2. Frequencies of different types of forbidden links in natural plant-animal interaction assemblages. AP, maximum potential links, $I_{max}$; I, number of observed links; UL, number of unobserved links; FL, number of forbidden links; $FL_P$, phenology; $FL_S$, size restrictions; $FL_A$, accessibility; $FL_O$, other types of restrictions; ML, unknown causes (missing links). Relative frequencies (in parentheses) calculated over $I_{max} = AP$ for I, ML, and FL; for all forbidden link types, calculated over FL.
References, from left to right: Olesen et al. 2008; Olesen & Myrthue unpubl.; Snow & Snow 1972 and Jordano et al. 2006; Vizentin-Bugoni et al. 2014; Jordano et al. 2009; Olesen et al. 2011.

Table 3. A vectorized interaction matrix.

Table 4. Sampling statistics for three plant-animal interaction networks (Olesen et al. 2011). Symbols as in Table 1; N, number of records; Chao1 and ACE are asymptotic estimators for the number of distinct pairwise interactions I (Hortal et al. 2006), and their standard errors; C, sample coverage for rare interactions (Chao & Jost 2012). Scaled asymptotic estimators and their confidence intervals (CI) were calculated by weighting Chao1 and ACE with the observed frequencies of forbidden links.

Tables

Table 1:

| Link type | Formulation | Definition |
|---|---|---|
| Potential links | $I_{max} = A_{obs} P_{obs}$ | Size of observed network matrix, i.e. maximum number of potentially observable interactions; $A_{obs}$ and $P_{obs}$, numbers of interacting animal and plant species, respectively. These might be below the real numbers of animal and plant species, $A_{est}$ and $P_{est}$. |
| Observed links | $I_{obs}$ | Total number of observed links in the network given a sufficient sampling effort. Number of ones in the adjacency matrix. |
| True links | $I_{est}$ | Total number of links in the network given a sufficient sampling effort; expected for the augmented $A_{est} P_{est}$ matrix. |
| Unobserved links | $UL = I_{max} - I_{obs}$ | Number of zeroes in the adjacency matrix. |
| True unobserved links | $UL = I_{max} - I_{obs}$ | Number of zeroes in the augmented adjacency matrix that, eventually, includes unobserved species. |
| Forbidden links | $FL$ | Number of links that remain unobserved because of linkage constraints, irrespective of sufficient sampling effort. |
| Observed missing links | $ML = A_{obs} P_{obs} - I_{obs} - FL$ | Number of links that may exist in nature but need more sampling effort and/or additional sampling methods to be observed. |
| True missing links | $ML = A_{est} P_{est} - I_{est} - FL$ | Number of links that may exist in nature but need more sampling effort and/or additional sampling methods to be observed. Augments ML for the $A_{est} P_{est}$ matrix. |

Table 2:

Pollination networks: Zackenberg, Grundvad, Arima Valley, Sta. Virginia. Seed dispersal networks: Hato Ratón, Nava Correhuelas.

| Link type | Zackenberg | Grundvad | Arima Valley | Sta. Virginia | Hato Ratón | Nava Correhuelas |
|---|---|---|---|---|---|---|
| $I_{max}$ | 1891 | 646 | 522 | 423 | 272 | 825 |
| $I$ | 268 (0.1417) | 212 (0.3282) | 185 (0.3544) | 86 (0.1042) | 151 (0.4719) | 181 (0.2194) |
| $UL$ | 1507 (0.7969) | 434 (0.6718) | 337 (0.6456) | 337 (0.4085) | 169 (0.5281) | 644 (0.7806) |
| $FL$ | 530 (0.3517) | 107 (0.2465) | 218 (0.6469) | 260 (0.7715) | 118 (0.6982) | 302 (0.4689) |
| $FL_P$ | 530 (1.0000) | 94 (0.2166) | 0 (0.0000) | 120 (0.1624) | 67 (0.3964) | 195 (0.3028) |
| $FL_S$ | ... (...) | 8 (0.0184) | 30 (0.0890) | 140 (0.1894) | 31 (0.1834) | 46 (0.0714) |
| $FL_A$ | ... (...) | 5 (0.0115) | 150 (0.445) | ... (...) | 20 (0.1183) | 61 (0.0947) |
| $FL_O$ | ... (...) | ... (...) | 38 (0.1128) | ... (...) | ... (...) | 363 (0.5637) |
| $ML$ | 977 (0.6483) | 327 (0.7535) | 119 (0.3531) | 77 (0.1042) | 51 (0.3018) | 342 (0.5311) |

Footnotes to Table 2 (the markers were lost in extraction):
Lack of accessibility due to habitat uncoupling, i.e., canopy-foraging species vs. understory species.
Colour restrictions, and reward per flower too small relative to the size of the bird.

Table 3:

| Interaction | Sample 1 | Sample 2 | Sample 3 | ... | Sample i |
|---|---|---|---|---|---|
| A1 - P2 | 12 | 2 | 0 | ... | 6 |
| A1 - P2 | 0 | 0 | 0 | ... | 1 |
| ... | ... | ... | ... | ... | ... |
| A5 - P3 | 5 | 0 | 1 | ... | 18 |
| A5 - P4 | 1 | 0 | 1 | ... | 3 |
| ... | ... | ... | ... | ... | ... |
| $A_i$ - $P_i$ | 1 | 0 | 1 | ... | 2 |

Table 4:

| | Hato Ratón | Nava Correhuelas | Zackenberg |
|---|---|---|---|
| $A$ | 17 | 33 | 65 |
| $P$ | 16 | 25 | 31 |
| $I_{max}$ | 272 | 825 | 1891 |
| $N$ | 3340 | 8378 | 1245 |
| $I$ | 151 | 181 | 268 |
| $C$ | 0.917 | 0.886 | 0.707 |
| Chao1 | 263.1 ± 70.9 | 231.4 ± 14.2 | 509.6 ± 54.7 |
| ACE | 240.3 ± 8.9 | 241.3 ± 7.9 | 566.1 ± 14.8 |
| % unobserved | 8.33 | 15.38 | 47.80 |

Footnote to Table 4 (the marker was lost in extraction): estimated with library Jade (R Core Development Team 2010, Chao et al. 2015).
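
The formulations in Table 1 reduce to simple arithmetic on four counts. The following is a minimal Python sketch, not part of the original analysis (which reports using R); the class name `LinkCounts` and its attribute names are hypothetical. It applies the Table 1 formulations to the Nava Correhuelas values reported in Tables 2 and 4 (A = 33, P = 25, I = 181, FL = 302).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkCounts:
    """Observed counts for one plant-animal network (hypothetical container)."""
    n_animals: int   # A_obs
    n_plants: int    # P_obs
    observed: int    # I_obs, number of ones in the adjacency matrix
    forbidden: int   # FL, links ruled out by linkage constraints

    @property
    def potential(self) -> int:
        # I_max = A_obs * P_obs
        return self.n_animals * self.n_plants

    @property
    def connectance(self) -> float:
        # C = 100 * I / (A * P), as a percentage
        return 100.0 * self.observed / self.potential

    @property
    def unobserved(self) -> int:
        # UL = I_max - I_obs, number of zeroes in the adjacency matrix
        return self.potential - self.observed

    @property
    def missing(self) -> int:
        # ML = A_obs * P_obs - I_obs - FL
        return self.potential - self.observed - self.forbidden


# Values for Nava Correhuelas as printed in Tables 2 and 4.
nava = LinkCounts(n_animals=33, n_plants=25, observed=181, forbidden=302)

print(nava.potential)              # 825
print(round(nava.connectance, 2))  # 21.94
print(nava.unobserved)             # 644
print(nava.missing)                # 342
```

For this network the derived values agree with the I_max, UL, and ML entries printed in Table 2, and the relative frequencies printed there for FL and ML (0.4689 and 0.5311) coincide with FL/UL and ML/UL.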

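A vectorized interaction matrix such as Table 3 is the input needed for the Chao1 and sample-coverage statistics listed in Table 4. The sketch below is illustrative only: it uses the classic Chao1 form and the coverage estimator of Chao & Jost (2012) as commonly stated, and it is an assumption that these match the exact variants behind Table 4. The toy counts and all function names are hypothetical, and the fallback to the Good (1953) Turing estimate when there are no doubletons is a simplification.

```python
from collections import Counter


def abundance_summary(vectorized):
    """Return (S_obs, n, f1, f2) from {interaction: counts per sample}."""
    totals = [sum(counts) for counts in vectorized.values()]
    observed = [t for t in totals if t > 0]   # drop never-recorded pairs
    freq = Counter(observed)
    # f1 = interactions recorded exactly once, f2 = exactly twice
    return len(observed), sum(observed), freq[1], freq[2]


def chao1(s_obs, f1, f2):
    """Classic Chao1 lower bound for the number of distinct interactions."""
    if f2 > 0:
        return s_obs + f1 ** 2 / (2 * f2)
    # Bias-corrected form, used when there are no doubletons
    return s_obs + f1 * (f1 - 1) / 2


def sample_coverage(n, f1, f2):
    """Sample coverage: share of all interaction events that belong to
    interactions already detected in the sample."""
    if n == 0:
        return 0.0
    if f2 == 0:
        return 1 - f1 / n                     # Turing estimate (simplification)
    correction = ((n - 1) * f1) / ((n - 1) * f1 + 2 * f2)
    return 1 - (f1 / n) * correction


# Hypothetical toy data in the layout of Table 3 (rows: animal-plant pairs).
toy = {
    "A1-P1": [3, 2, 0], "A1-P2": [1, 1, 1], "A2-P1": [1, 0, 0],
    "A2-P3": [0, 1, 0], "A3-P2": [0, 0, 1], "A3-P3": [2, 0, 0],
    "A4-P1": [0, 1, 1], "A5-P3": [4, 3, 3], "A5-P4": [0, 0, 0],
}

s_obs, n, f1, f2 = abundance_summary(toy)
print(s_obs, n, f1, f2)                       # 8 25 3 2
print(chao1(s_obs, f1, f2))                   # 10.25
print(round(sample_coverage(n, f1, f2), 4))   # 0.8863
```

With these toy counts, 8 distinct interactions are observed in 25 records, Chao1 suggests roughly 10 exist, and about 89% of interaction events are estimated to belong to interactions already recorded.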
Journal

bioRxiv

Published: Sep 6, 2015
