www.nature.com/npjmgrav
ARTICLE OPEN
Meta-analysis of the space flight and microgravity response of the Arabidopsis plant transcriptome
Richard Barker¹, Colin P. S. Kruse², Christina Johnson³, Amanda Saravia-Butler⁴,⁵, Homer Fogle⁴,⁶, Hyun-Seok Chang¹, Ralph Møller Trane⁷, Noah Kinscherf¹, Alicia Villacampa⁸, Aránzazu Manzano⁸, Raúl Herranz⁸, Laurence B. Davin⁹, Norman G. Lewis⁹, Imara Perera¹⁰, Chris Wolverton¹¹, Parul Gupta¹², Pankaj Jaiswal¹², Sigrid S. Reinsch⁴, Sarah Wyatt¹³ and Simon Gilroy¹
Spaceflight presents a multifaceted environment for plants, combining the effects on growth of many stressors and factors including altered gravity, the influence of experiment hardware, and increased radiation exposure. To help understand the plant response to this complex suite of factors this study compared transcriptomic analysis of 15 Arabidopsis thaliana spaceflight experiments deposited in the National Aeronautics and Space Administration’s GeneLab data repository. These data were reanalyzed for genes showing significant differential expression in spaceflight versus ground controls using a single common computational pipeline for either the microarray or the RNA-seq datasets. Such a standardized approach to analysis should greatly increase the robustness of comparisons made between datasets. This analysis was coupled with extensive cross-referencing to a curated matrix of metadata associated with these experiments. Our study reveals that factors such as analysis type (i.e., microarray versus RNA-seq) or environmental and hardware conditions have important confounding effects on comparisons seeking to define plant reactions to spaceflight. The metadata matrix allows selection of studies with high similarity scores, i.e., that share multiple elements of experimental design, such as plant age or flight hardware.
Comparisons between these studies then help reduce the complexity in drawing conclusions arising from comparisons made between experiments with very different designs.
npj Microgravity (2023) 9:21; https://doi.org/10.1038/s41526-023-00247-6
¹Department of Botany, University of Wisconsin, Madison, WI 53706, USA. ²Los Alamos National Laboratory, Bioscience Division, Los Alamos, NM 87545, USA. ³NASA John F. Kennedy Space Center, Titusville, FL 32899, USA. ⁴Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA. ⁵Logyx, LLC, Mountain View, CA 94043, USA. ⁶Bionetics, Yorktown, VA 23693, USA. ⁷Department of Statistics, University of Wisconsin, Madison, WI 53706, USA. ⁸Centro de Investigaciones Biológicas Margarita Salas (CSIC), 28040 Madrid, Spain. ⁹Institute of Biological Chemistry, Washington State University, Pullman, WA 99164-741, USA. ¹⁰Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC 27695, USA. ¹¹Department of Botany and Microbiology, Ohio Wesleyan University, Delaware, OH 43015, USA. ¹²Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA. ¹³Department of Environmental and Plant Biology, Ohio University, Athens, OH 45701, USA. email: sgilroy@wisc.edu
Published in cooperation with the Biodesign Institute at Arizona State University, with the support of NASA
INTRODUCTION
Spaceflight imposes a unique suite of environmental effects on biology. For example, microgravity severely curtails the signals normally generated on Earth from the intrinsic weight of a plant’s organs [1] and by its gravity perceptive cells [2–4]. By contrast, in the terrestrial environment, these are key factors driving normal growth and development. In addition, gravitational forces on Earth govern a host of physical processes including gas and liquid flow that are important for normal plant function. Thus, the microgravity environment can lead to the development of anoxic regions around metabolically active plant tissues and altered patterns of evaporative and convective cooling that can impact leaf function and physiology [5–8]. Additionally, the increased radiation exposure inherent in spaceflight is likely to trigger its own array of responses within the plant. The combination of these spaceflight-linked effects is outside the evolutionary history of terrestrial biology and so it remains complicated to predict the effects of spaceflight on organisms. Yet, understanding the molecular and physiological responses of plants to these conditions remains an important goal for space biologists, not the least because plants are integral to many plans for life support on long-duration crewed missions and for colonization.
One way to probe the responses of organisms to spaceflight is by analysis of changes in their transcriptomes, proteomes, metabolomes, genomes and epigenomes induced by exposure to this environment. In the field of plant biology the National Aeronautics and Space Administration’s (NASA’s) GeneLab data repository [10,11] has aggregated many such omics datasets. Critically, the deposited data are associated with extensive metadata covering elements of each experiment’s design ranging from features of the hardware, radiation exposure and lighting regime to treatment duration, genotype and organism age. Such extensive and accurate metadata are critical to understanding the breadth of differences in experimental designs when making comparisons between studies. This insight is important as the flight hardware used, the analysis methodology employed (e.g., microarray versus RNA-seq for transcriptome studies) and other experimental parameters likely superimpose their own, often poorly defined, influences on the results (so-called batch effects). Indeed, recent analysis of rodent spaceflight data suggests differences in sample preservation eclipsed spaceflight-driven differences in mouse transcriptional profiling. However, given the relatively few opportunities to conduct experiments in space, making comparisons between existing studies represents a potentially powerful approach to identify common responses in the often-limited available spaceflight data.
We have therefore imported 15 spaceflight-related plant transcriptome datasets from the GeneLab data repository and manually curated the associated metadata to develop a metadata matrix (hereafter, the Matrix). This approach allows the more robust design of comparisons between studies that share commonalities in experimental design. Our meta-analyses broadly confirmed the spaceflight-related changes in cell wall processes and oxidative stress that were highlighted in many of the original publications associated with each individual study. Additionally, Matrix-driven analyses helped reveal new response elements, such as conserved spaceflight-triggered changes in expression of the cold response gene COLD RESPONSIVE 78 (COR78), and likely shifts in ion transport processes. We also identified factors within the experimental design such as choice of flight hardware and especially assay technique (i.e., microarray versus RNA-seq) that can impose greater differences between datasets than the spaceflight treatment. Thus, the Matrix allows researchers to explore the wealth of plant biology transcriptomic data generated during spaceflight-related studies and provides an approach to better understand underlying factors impacting the robustness of comparisons made between the different datasets.
RESULTS AND DISCUSSION
Comparative transcriptomics of plant spaceflight-response data
One method to assess the similarities and differences in transcriptome-level responses between different plant spaceflight experiments is to make comparisons using the results of the analyses already presented in the primary literature on each study. This approach can be further expedited using tools, such as the Test of Arabidopsis Space Transcriptome (TOAST) database that aggregates these analyses into an interactive data exploration environment. Such comparative studies capitalize upon the unique insights of the researchers who performed each experiment and the tailored analytical tools and approaches they then employed to define differentially expressed genes (DEGs) in their original publications. We will refer to these studies as in-house analyses. However, the wide range of analytical pipelines used in such a primary literature-based approach inevitably imposes some limitations on the robustness of any conclusions that can be drawn between studies. This problem arises because differences in gene expression patterns between datasets likely involve both the effects of experimental treatments, such as growing plants in spaceflight versus a ground control, and of elements specific to the different analytical programs and statistical approaches used to analyze the data. Indeed, differences between results from different software packages analyzing the same raw transcriptomics datasets are well-documented in the literature. Therefore, a complementary methodology was also applied by reanalyzing the plant studies used in our analysis via the common computational pipelines summarized in Fig. 1. A similar strategy of reanalyzing published datasets using a common computational approach has been used in the EMBL-EBI gene Expression Atlas. For example, when these researchers import RNA-seq data, a standardized analysis is performed using the integrated RNA-seq Analysis Pipeline, or iRAP, approach. Although this analysis pipeline is different from the one we have adopted, the standardizing of analysis across all datasets for the EMBL-EBI gene Expression Atlas is designed with the same goal in mind: to reduce the potential for generating artifacts that are caused by making comparisons between datasets that have been the subject of different initial data analysis methodologies.
Fig. 1 Uniform analysis pipeline applied to Arabidopsis GLDS datasets used in this study. Normalized expression arrays are imported from NASA’s GeneLab repository (https://genelab-data.ndc.nasa.gov/genelab/projects) and then parsed by the TOAST X-Species Transcriptional Explorer (https://astrobiology.botany.wisc.edu/x-species-astrobiology-genelab) for analysis of common features between experiments (cross experiment intersect analysis). The iDEP.92 R-shiny app is then used to generate expression heatmaps for clustering, and to perform Principal Component Analysis (PCA), Multidimensional Scaling analysis (MDS), t-distributed Stochastic Neighbor Embedding (t-SNE), Weighted Gene Correlation Network Analysis (WGCNA) and K-means statistical analyses. Functional analyses are then performed using the online tools at Ensembl GO [53], KEGG (Kyoto Encyclopedia of Genes and Genomes) [56], AraCyc [57] and Reactome [58]. These data are then visualized as tables and dendrograms of the enriched functional groups that are altered by spaceflight and/or related stimuli.
Fig. 2 Principal component analysis (PCA) of the 15 plant datasets in the Matrix reveals clustering based on analytical approach (microarray versus RNA-seq) and by lighting conditions. Principal components separate datasets by a microarray versus RNA-seq-based analyses and b by growth in the light versus the dark environment of the growth hardware. PC1 principal component 1, PC2 principal component 2, PC3 principal component 3.
Percentage reflects the degree of variance accounted for by each principal component. c Euclidean hierarchical clustering confirms grouping by assay type (microarray versus RNA-seq) as major factor within the data. Ecotypes: Col Columbia, Cvi Cape Verde Island, Ws Wassilewskija, Col-0 + Ws, mixed sample 80% Ws, and 20% Col ecotypes. Genotypes: WT wild-type, act2 actin 2, arg1 altered response to gravity 1, atm1 ataxia-telangiectasia mutated 1, hsfa2 heat shock transcription factor A2, phyD phytochrome D.
Using our common analysis pipeline approach to comparing DEGs across all the Arabidopsis studies, batch effects (i.e., confounding variables imposing effects on patterns of gene expression over and above those of the spaceflight treatment) became readily evident. Thus, Principal Component Analysis (PCA) and Euclidean hierarchical clustering revealed that rather than the comparison between spaceflight and ground control, whether RNA-seq or microarray was used to detect patterns of gene expression is the factor with the largest effect on separating studies (PC1, explaining 83% of the variance between experiments; Fig. 2a, c). Similar analysis showed the important but lesser impact of lighting environment (Fig. 2b). It is important to note here that we have used a statistical threshold of p < 0.01 to define a DEG. Our analysis pipeline also generates the more stringent adjusted p-value (or q-value) that corrects the p-value for the false discovery rate associated with multiple testing. Although we have analyzed the p-value filtered results to encompass as broad a set of DEGs as possible, q-values are presented in the tables of Supplementary Data, to allow the reader to define DEG lists using this parameter. Similarly, a cut off related to fold-change in expression (such as only evaluating genes showing ≥2-fold change in e.g., spaceflight versus their paired ground control) is often used in the literature to limit the extent of the gene lists being analyzed. Again, we have opted not to apply such a fold-change cut off to maintain the most inclusive list of DEGs for analysis. However, fold-change in expression level is also presented in the Supplementary Data, allowing the reader to filter the gene lists using a fold-change cut off as appropriate for their analyses.
We next created a connectivity network visualization system using all the pairwise comparisons that can be made between the GLDS used in our analysis (Fig. 3a–g; Supplementary Data 2; an interactive version of this connectivity analysis is available at: https://gilroy-qlik.botany.wisc.edu/a/sense/app/20aa802b-6915-4b1a-87bd-c029a1812e2b/sheet/6241e71a-a3c5-4c63-9210-e05c743699d7/state/analysis). Pairwise factor correlation analysis was performed by inspection of the Matrix in Supplementary Data 1 and manually scoring factors that are similar between different pairs of studies, assigning a value of 1 for each factor shared between a pair and a value of zero if that factor was different. Thus, the more factors in common between a pair of studies, the greater their similarity scores. The full pairwise similarity matrix can be found in Supplementary Data 2. This approach linked the studies using similarity scores reflecting commonalities in the different experimental designs and metadata factors within the datasets in our study. Such network analysis allowed us to further visualize and dissect links between the factors that potentially cause the clustering of studies identified in the PCA (Fig. 2).
Fig. 3 Pairwise factor correlation analysis creates a weighted network linking studies based on metadata similarity score. a Whole connectivity network.
Numbers and thickness of connection (network edge) reflect degree of connectivity through shared metadata factors. b–f 5 sub-networks based on common BRIC hardware experiment design: b sub-network of experiments performed using the BRIC hardware (mean connectivity score: 6.3), c BRIC experiments involving seedlings (mean connectivity score: 6.0). Seedling experiments analyzed using d RNA-seq (mean connectivity score: 8) or e microarray (mean connectivity score: 7.6) and f BRIC experiments that have used cell cultures, all analyzed by microarray (mean connectivity score: 7.4). For a–g size of circle for each study reflects the number of connected factors available for pairwise comparison. g Examples of connectivity of GLDS-7, GLDS-37, GLDS-38, and GLDS-120 by tissues sampled and ecotypes analyzed. Colored lines reflect factor connecting studies. Ecotypes: Col, Columbia; Ws, Wassilewskija; Ler, Landsberg. See Supplementary Data 2 for full connectivity matrix.
When represented graphically as links between studies and metadata factors, this analysis demonstrated that hardware and its associated lighting regimes were indeed likely key components that influence clustering of responses in the data (Fig. 4). It is important to remember here that the lighting environment for an experiment is often dictated by the hardware that was used; for example, most plant experiments performed to date using the Biological Research in Canister (BRIC) hardware are conducted in the dark. Therefore, lighting and hardware are inevitably closely linked in our network analyses of current datasets. This observation also highlights the insight that could be gained by performing more studies that use the same hardware but with a range of lighting environments. Such analyses could help separate the hardware-related effects on the plant during growth in space from those specifically triggered by the lighting environment under those conditions.
Fig. 4 Graphical representation of metadata related to tissues, assay type and flight vehicle. The specific assay and tissue types for each dataset are indicated with network clustering based on hardware. See Supplementary Data 1 and 2 for the Matrix driving this visualization. Note the hardware used to analyze plant response to spaceflight often defines the types of tissue that are available and so these two variables are often linked. Purple color circles represent RNA-seq analysis of wild-type Col-0 plants, shades of blue represent other WT ecotypes, the pink circle represents RNA-seq analysis performed on mutants. The size of circles is a qualitative representation of the amount of differentially expressed loci relative to other genetic varieties used during that study. Ecotypes: Col Columbia, Cvi Cape Verde Island, Ws Wassilewskija, Ler Landsberg, Col-0 + Ws mixed sample 80% Ws and 20% Col ecotypes. Genotypes: WT wild-type, arg1 altered response to gravity 1, hsfa2 heat shock transcription factor A2, atm1 ataxia-telangiectasia mutated 1, phyD phytochrome D. Hardware: BRIC Biological Research in Canister, EMCS European Modular Cultivation System, VEGGIE Vegetable production system, SIMBOX SIMBOX incubator system, ABRS Advanced Biological Research System. An interactive version of this visualization is available at: https://gilroy-qlik.botany.wisc.edu/a/sense/app/20aa802b-6915-4b1a-87bd-c029a1812e2b/sheet/6241e71a-a3c5-4c63-9210-e05c743699d7/state/analysis.
Such network analyses distinguish those studies sharing a high degree of network linkages within the Matrix, i.e., studies with a larger number of common features in their experimental design. Results of comparisons between such highly connected studies are candidates for more robust analyses due to these shared factors. For example, although the overall experimental designs behind GLDS-7, GLDS-37, GLDS-38, and GLDS-120 differ from each other by ecotype, hardware or experiment duration, each links to multiple other spaceflight experiments within the Matrix and forms hubs in networks related to hardware and/or tissue sample type (Fig. 3g). Thus, GLDS-7 was performed in the Advanced Biological Research System (ABRS) and is of interest due to the high number of tissues and ecotypes in its experimental design that link to many other studies in the Matrix. GLDS-37 was conducted in the BRIC hardware and is extensively linked to other studies due to the large number of Arabidopsis ecotypes analyzed, as well as the many other Arabidopsis BRIC experiments available for comparison. GLDS-38 (BRIC) provides RNA-seq and paired proteomics data that likewise connect to many other BRIC datasets. GLDS-120 took place with a unique hardware setup (square Petri plates that were attached to the inside wall of the International Space Station), but contains multiple ecotypes, genotypes and light treatments that link it to many other studies in the Matrix. These connections to other studies suggest that comparisons within local networks where each study acts as a hub are likely to be fruitful targets to extract common spaceflight-related responses.
Conversely, such network analyses also revealed studies that are the most distinct (i.e., least shared metadata factors with other Matrix studies). One clear set of such studies are those designed around terrestrial spaceflight analogs such as GLDS-46, GLDS-136, and GLDS-144. These experiments use elements such as hyperbaric chambers, space radiation analog exposures, and microgravity simulation on clinostats and random positioning machines to mimic specific aspects of the spaceflight environment and so are more distant in design to the other spaceflight experiments. Thus, as shown in Supplementary Fig. 1, pairwise similarity matrix comparisons show spaceflight studies are most similar to other spaceflight studies (average pairwise similarity score of 5.88 ± 1.93) and significantly less similar (p < 0.01) when compared to ground analog studies (average similarity 4.21 ± 1.36), which are most similar to other ground analog studies. This Matrix-driven network visualization then highlights the opportunity to design follow-up experiments that use these analogs of putative spaceflight stressors but where the design of the study is more interconnected to the factors seen in their closest spaceflight studies within the Matrix. Such aligned experimental designs could help increase the robustness of subsequent comparisons to the existing spaceflight data.
Fig. 5 Unguided Weighted Gene Correlation Network Analysis (WGCNA) clustering of the Arabidopsis datasets used in this study. This analysis was performed on the DEGs identified in the RNA-seq (a) and microarray (c) datasets from the spaceflight experiments imported into the Matrix (see Table 2 for specific datasets used). This analysis identified 4 clusters of DEGs within the RNA-seq (a) and 3 clusters within the microarray analyses (c). b Overlap in the DEGs within each cluster between the WGCNA RNA-seq and microarray analyses. Purple curves link identical genes and light blue curves link genes that, although not identical, belong to the same enriched Gene Ontology term found in each clade. The inner circle represents gene lists, where hits are arranged along the arc. Genes that hit multiple clusters are colored in dark orange, and genes unique to a single cluster are shown in light orange. d List of top 20 significantly enriched Gene Ontologies drawn from the clusters of DEGs depicted in a–c that are shared by 2 or more clusters. The full list of enriched Gene Ontology terms is reproduced in Supplementary Data Fig. 1. Multiple colors under the PATTERN column indicate a pathway or process that is shared across multiple microarray or RNA-seq clades as denoted by their color coding in a and c. Count number of loci included in enrichment analysis, % proportion of all query genes that are found in the given Gene Ontology term, P p-value, q p-value adjusted for multiple testing. Analysis made using Metascape.
We next asked if we could define factors within the metadata other than spaceflight treatment that help define clustering within the studies. We therefore took the expression level data for each individual sample replicate (normalized probe fluorescence intensity for microarray and FPKM for RNA-seq) from all the studies in the Matrix and calculated the Pearson’s correlation coefficient for all possible pairwise combinations (Supplementary Data 3). We next calculated the average Pearson’s correlation coefficient from this analysis for each set of replicates within an experiment, providing a measure of correlation for each treatment
within a dataset to all other treatments in all datasets in the Matrix. We then sorted these data by each metadata factor within the Matrix to ask if a particular metadata factor stood out as explaining the patterns of correlation within the transcriptomics data. Of these, radiation treatment was the most highly correlated factor (Supplementary Data 3), followed by genotype, tissue/developmental stage, flight hardware and then altered gravity (i.e., spaceflight). This analysis again highlights the likelihood that many experimental factors are imposing patterns on spaceflight transcriptional profiles and reinforces the effects of radiation exposure as a key area for future spaceflight-related experimentation.
Mining the Matrix for common patterns of spaceflight-responsive gene expression
Insights from the network of connections between the spaceflight-related datasets in the Matrix were then used to make comparisons between gene expression patterns seen in spaceflight treatments and ground control samples (e.g., excluding the ground-based spaceflight analog studies). Having defined the assay type (microarray versus RNA-seq) as one of the most important confounding factors when comparing spaceflight responsive transcripts across multiple datasets (Fig. 2), the microarray and RNA-seq datasets were separated into two parallel analysis pipelines. The DEGs within the two series of datasets were then analyzed using Weighted Gene Co-expression Network Analysis (WGCNA). Unguided WGCNA clustering identified 3 groupings within the microarray datasets and 4 within the RNA-seq data (Fig. 5a–c). Krishnamurthy et al. have compared microarray and RNA-seq analyses of identical samples from Arabidopsis roots, concluding that although the two approaches broadly agreed (on ~66% of ~6400 DEGs in their study), RNA-seq analysis revealed significantly more DEGs. Thus, in our study the RNA-seq is likely providing a broader dataset within which to find enriched Gene Ontologies, likely leading to the increased number of groupings found by our analysis.
The top 20 enriched ontology groupings shared between the RNA-seq and microarray analyses are summarized in Fig. 5d and the full set of significantly enriched Gene Ontologies is shown in Supplementary Fig. 2. Clade I in both the microarray and RNA-seq stands out as sharing the most common significant DEGs. When expanding this analysis to include shared significantly enriched Gene Ontology terms (Fig. 5c), terms that broadly cover response to environmental stresses (such as to light, cold and bacteria) are seen (Supplementary Fig. 2). This observation supports the conclusions from numerous previous spaceflight analyses that plants exhibit a suite of stress-related responses when encountering the spaceflight environment. However, a further prominent and novel shared element seen across these analyses is changes in the expression of genes related to photosynthesis, other aspects of primary metabolism and also changes to secondary metabolism (Fig. 5d). This observation suggests that spaceflight and spaceflight-related treatments are likely impacting these fundamental aspects of plant function. However, as a note of caution, this analysis combines the responses across all the diverse plant datasets within the Matrix. Even though the analysis in Fig. 5d is filtered to exclude the broadest Gene Ontology terms (i.e., those terms encompassing more than 100 genes), such wide-ranging analysis might be expected to reveal only the most general common responses, whilst being relatively insensitive to more subtle or specific spaceflight responses. This is because of the variation likely imposed by the wide range of experimental designs encompassed by these datasets, i.e., in addition to spaceflight-related effects a host of other responses are likely superimposed on the data, diluting the signal from some spaceflight responses. It seems likely a similar reason explains the observation that, although there are shared spaceflight enriched gene ontologies between experiments, there is no individual DEG common to all these experiments. The Matrix facilitates a more targeted subset of comparisons between datasets (e.g., chosen based on commonalities in the hardware or plant samples used within each experiment) that might be expected to reduce this experimental design-driven noise to reveal these more specific shared gene groupings. An example of such an analysis is described in the following section.
Analysis of studies using common hardware: BRIC datasets provide 2 tissue types and 2 transcriptome assay types for meta-analysis
The analyses in Figs. 2 and 3 suggest that both the specific flight hardware used and its associated lighting regime significantly impact the patterns of gene expression noted in plants in spaceflight. Further, our network analyses (Fig. 3b) show that GLDS-17, GLDS-37, GLDS-38, GLDS-44, and GLDS-121 are all highly connected, especially for these factors. Thus, these studies all used etiolated seedlings grown in the dark in the Petri Dish Fixation Unit (PDFU) cassettes of the BRIC hardware. Additionally, samples were harvested at the young seedling stage of development (up to 12 days old) and all included a paired on-orbit and ground control design to allow for exploration of spaceflight-related patterns of DEGs [18–22]. Differences between the studies include ecotype and analysis type (microarray versus RNA-seq). Nevertheless, their high levels of similarity, especially at the level of the hardware and lighting used, suggested to us that they could provide an important set of similarly designed experimentation to help more robustly reveal common spaceflight responses. Additionally, all these studies have published in-house analyses from their respective research groups, allowing us to further test the relative merits of comparisons drawn between the in-house results from each original publication versus the common analytical pipeline that we have used in this study. Table 1 presents a summary of the total numbers of DEGs detected in these analyses at p ≤ 0.01 alongside the overlap in these gene lists between the in-house and common pipeline analyses (the full lists of DEGs are shown in Supplementary Data 4).
The five datasets were separated into microarray (three studies) and RNA-seq (two studies) groups and analyzed using the GeneLab pipelines outlined above before making an overall comparison
Table 1. Comparison of the differentially expressed gene counts from in-house and common pipeline analyses.
GeneLab Accession   Assay        GeneLab count   Original count   Loci in both   Difference   In-house reference
GLDS-17             Microarray   2459            499              34             1960
GLDS-44             Microarray   4031            3826             2597           205
GLDS-121            Microarray   2122            2177             2121           –55
GLDS-37             RNA-seq      2785            2084             927            701
GLDS-38             RNA-seq      3870            2919             2404           951
Data is taken from the original spaceflight research publications (in-house, i.e., using the original authors’ analyses with p ≤ 0.01) and the GeneLab analysis (p ≤ 0.01, adjusted for multiple hypothesis testing using the Benjamini and Hochberg method).
across all the BRIC experiments. The significance threshold to identify DEGs was set at p < 0.01. In a previous comparative analysis, Johnson et al. reported no common genes amongst the BRIC-16 mission microarray studies (GLDS-17, -44 and -121) when using a cutoff of p < 0.01 but also applying a threshold of 5-fold or greater for induction or repression in transcript level as measured on their microarray (to define the most strongly regulated genes). We therefore reanalyzed these microarray results using a pipeline similar to the original authors’ analyses (Affymetrix Express pipeline and the Probe Logarithmic Intensity Error (PLIER) approach for normalization, with a significance setting p < 0.01) but now using no fold-change filtering (Supplementary Data 4) to be more analogous to the GeneLab analytical pipeline we have also applied. Using the in-house analysis by the researchers (GLDS-17, GLDS-44) and the reanalysis of GLDS-121 using PLIER [21,22], the results of our comparisons across all the microarray studies conducted with wild-type seedlings identified 86 spaceflight-related DEGs found across all studies (Supplementary Data 5). The GeneLab reanalysis identified 114 loci in common between the 3 studies, including 85% of the genes from the in-house analysis. Analysis of the 75 DEGs identified in all 3 studies by both analytical techniques using Metascape (Fig. 6) revealed enrichment in Gene Ontology terms including: regionalization, response to Karrikin (a plant stress response pathway triggered by volatiles originally found in smoke), regulation of stomatal movement and tropism. These latter two terms are particularly interesting as disruption of gravitropic growth is one of the predicted responses of plants growing in the microgravity environment of spaceflight. Although patterns of development seen in plants growing in space often show more randomized directional growth than on the ground, molecular evidence for altered tropic response reflected in the patterns of transcriptional changes observed in spaceflight has been less clear. Thus, the highlighting of tropisms in Fig. 6c suggests the use of the Matrix approach for analyzing the available data may help reveal these molecular signatures. Further, this analysis revealed that stomatal behavior may be affected by spaceflight. Factors such as reduced buoyancy-driven convection in microgravity would be predicted to alter gas exchange at the stomatal pore [5,8], likely playing out as altered stomatal function. Again, this effect has been [...]
[...] common analytical pipeline. Comparison of these approaches in the WT Col-0 samples in both datasets shows that at a threshold of p < 0.01, the common pipeline identified 701 and 951 new loci as showing altered expression in spaceflight from GLDS-37 and GLDS-38 respectively, or about 25% more loci than found in the original authors’ analysis (Table 1 and Supplementary Data 4). Comparing the GeneLab pipeline-based analysis with that of the original peer-reviewed publications indicates agreement on 927 (GLDS-37) and 2404 (GLDS-38) DEGs. Further, within this analysis, 164 loci were significantly differentially expressed in both GLDS-37 and GLDS-38 (Supplementary Data 6). Gene Ontology enrichment analysis of these spaceflight-responsive DEGs across both studies and in both the in-house and GeneLab analytical pipelines (Fig. 6d) revealed enrichment in responses such as to oxidative stress, heat shock and changes in cell wall dynamics that have been highlighted in multiple previous plant spaceflight transcriptome studies (e.g., refs. 14, 19, 21, 22, 26, 27). Reanalysis with the Matrix approach was also able to reveal a fingerprint of hypoxia, which has been predicted as an important factor impacting biology operating with the reduced convective gas movements inherent in a microgravity environment [6,7] but which has previously proven difficult to observe in analyses of transcriptional responses of individual flight experiments using the BRIC. In addition, Gene Ontologies related to various aspects of ion transport are prominent in our analysis, targeting future investigations focused on both anion and cation transport as likely to be fruitful targets for further understanding the effects of spaceflight on plants. Lastly, since processes associated with responses to spaceflight are still largely unknown, Supplementary Data 7 provides a list of spaceflight-responsive DEGs from this analysis that currently have no GO or KEGG annotation. These genes provide potential targets for study for novel processes triggered by plant growth in space.
We next used Metascape analysis on these lists of DEGs from the analysis of GLDS-37 and GLDS-38 to explore potential protein:protein networks, applying Metascape’s protein-protein interaction enrichment analysis and Molecular Complex Detection (MCODE).
These analyses take the lists of differentially expressed difficult to reliably detect in the transcriptional fingerprints of genes and mine an array of protein interaction databases 29 30 31 32 spaceflight responses, but the targeted analyses driven by insights (STRING , BioGrid , OmniPath , InWeb_IM ) for enriched net- from the Matrix appear able to reveal evidence for these previously works of physical interactions. MCode then allows a focus on cryptic patterns of molecular changes. Repeating this analysis made highly connected hubs when the numbers of proteins in the using Metascape but with DAVID (the Database for Annotation, network become very high. These analyses again revealed Visualization and Integrated Discovery ) asanalternative,widely enrichment for a response network associated with ion transport used tool to assess gene ontology enrichments agreed with the and chaperone activity (Fig. 7). In addition, multiple network analysis outlined above and did not reveal any new significantly clusters related to protein ubiquitinylation were identified. This enriched ontology terms at p < 0.01. This observation suggests the observation suggests spaceflight may have triggered alterations in Metascape analysis outlined in Fig. 6 is likely capturing most of the proteasome activity, possibly related to stress-induced protein patterns of ontology enrichment in the data. turnover. Such a response to stress-related protein dysfunction GLDS-37 and GLDS-38 represent the BRIC samples studied with would be consistent with the elevated chaperone activities RNA-seq, necessitating independent analysis from the microarray suggested by the heat shock protein (HSP)-related protein:protein datasets discussed above. Again, we used data from the published 19,20 interaction cluster identified in this same analysis. 
in-house bioinformatics approaches and the GeneLab npj Microgravity (2023) 21 Published in cooperation with the Biodesign Institute at Arizona State University, with the support of NASA R. Barker et al. Fig. 6 Analysis of shared DEGs between the in-house and GeneLab pipeline analyses of plant experiments performed in spaceflight using the BRIC hardware. Overlap between gene lists for microarray studies (a) or RNA-seq (b) where purple curves link identical genes and light blue includes the shared Gene Ontology term level. Curves link genes that belong to the same enriched Gene Ontology term. The inner circle represents gene lists, where hits are arranged along the arc. Genes that hit multiple lists are colored in dark orange, and genes unique to a list are shown in light orange. Sectors denoted by GeneLab ## show the analysis using GeneLab common pipeline; sectors denoted by a citation show the original authors’ in-house analysis. c, d Significantly enriched GO terms from analysis of common genes found in the microarray (c) and RNA-seq (d) analyses identified in both the in-house and GeneLab pipelines. Analysis in c and d performed using Metascape. Intersection between RNA and microarray analyses Ler-0 (also used in GLDS-121), two additional ecotypes were investigated (Cvi-0, Ws-2). These same six genes were also By combining the differentially expressed gene lists from both differentially expressed across all the ecotypes in this study, microarray and RNA-seq analyses identified using the GeneLab reinforcing their likely common response nature. These 6 common pipeline approach, 6 common spaceflight response loci common genes were: AT1G74310 (HOT1/HSP101; HEAT SHOCK were identified but in only three of the studies (GLDS-37, GLDS- PROTEIN 101), AT1G58340 (ABS4, a plant MATE multidrug and 38,and GLDS-121). 
This observation reinforces the idea that variation in experimental design and analysis approach may be toxic compound extrusion transporter), AT5G52310 (COR78; COLD REGULATED 78), AT4G11290 (PRX39, a cell wall peroxidase), obscuring some common patterns of response (see below). Within the BRIC-19 experiment that generated the data in GLDS- AT5G09220 (AAP2, AMINO ACID PERMEASE 2), and AT1G73480 37, in addition to the Col-0 ecotype (also used in GLDS-38) and (MAGL4,an α-β hydrolase). Analysis of these loci using the graph- Published in cooperation with the Biodesign Institute at Arizona State University, with the support of NASA npj Microgravity (2023) 21 R. Barker et al. Fig. 7 Protein:protein interaction network inferred from the common DEGs identified using the GeneLab analysis pipeline of GLDS-37 and GLDS-38. Analysis using Metascape with annotation of densely connected network elements identified with the MCode algorithm. Colors represent clusters grouped by shared ontology term. Size of circle shows the number of protein:protein interactions that each node/locus is annotated as being involved with as identified by the MCode analysis. based network analysis tool KnetMiner revealed broad connec- to investigate the changes that take place in seedlings that tions to the plastid and membrane function (Fig. 8). experience low oxygen stress with or without the addition of Some of these genes have been discussed in the individual external sucrose. Their analyses revealed that exogenous sucrose 18–22 analyses originally published on each BRIC experiment(s) . significantly alters patterns of anoxia-related transcriptional However, the power of the current meta-analysis lies in high- change. 
Thus, the increased sucrose concentration in the media lighting these particular genes as possible core markers of the found in GLDS-17 should dramatically affect the plant hypoxia spaceflight response across multiple experiments within the BRIC response and so is likely to alter responses to this particular effect hardware and revealing a difference in GLDS-17 and GLDS-44. of the spaceflight environment. Interrogating the experimental design reveals that one obvious Precisely why GLDS-44 also does not show the conserved difference between GLDS-17 and the other BRIC investigations is transcriptional responses seen in GLDS-37, GLDS-38, and GLDS- that 3% (w/v) sucrose was used in the seedling media in GLDS-17 121 is less obvious as its experimental design is very similar to compared to 0.3–1% (w/v) in the other studies. Sucrose is these other BRIC-based experiments and it was flown side-by-side generally added to the media of Arabidopsis seedlings to support on the same mission as GLDS-121. However, subtle features such the heterotrophic growth of the plants in the dark conditions in as the seed planting density differed between these studies and the BRIC. However, the higher sucrose in BRIC-17 [https://genelab- so effects of plant density and competition might be super- data.ndc.nasa.gov/genelab/accession/GLDS-17] was specifically imposed on these results. This analysis then highlights how added to facilitate comparisons between the seedlings in this understanding the feature(s) in these experiments responsible for experiment and a parallel set of cell cultures that required much the differences in expression pattern offers enormous potential to higher sucrose for growth. 
The differences in gene expression define factors with wide-ranging effects on the plant spaceflight between BRIC-17 [https://genelab-data.ndc.nasa.gov/genelab/ response; i.e., the difference(s) between GLDS-44 and GLDS-121 accession/GLDS-17] seedlings and the other BRIC experiments and the other BRIC experiments clearly had dramatic effects on then implies that changes in primary metabolism experienced by the patterns of spaceflight-related gene expression and so the etiolated seedlings in the BRIC may be an important factor in exploring how these studies differ in design should help define determining spaceflight related transcriptional responses, echoing some key spaceflight-response related factors. the altered primary metabolism inferred from our meta-analysis Looking at the shared DEGs between GLDS-37, GLDS-38, and across all the spaceflight datasets analyzed as part of the Matrix in GLDS-121 identifies HSP101 as a common spaceflight response Fig. 5. Such observations are especially relevant in the context of marker. Indeed, upregulation of Heat Shock Proteins (HSPs) in the 18,19,27,34,35 possible spaceflight-related hypoxia discussed above. Indeed, spaceflight environment is well known . Heat Shock Loreti et al., used microarray analysis in ground-based research Proteins are molecular chaperones associated with protecting and npj Microgravity (2023) 21 Published in cooperation with the Biodesign Institute at Arizona State University, with the support of NASA R. Barker et al. Fig. 8 Network analysis of the 6 common spaceflight responsive genes identified from analysis of Arabidopsis seedlings flown in the BRIC hardware. Query genes are highlighted in yellow. 
AT1G74310 (HOT1/HSP101; HEAT SHOCK PROTEIN 101), AT1G58340 (ABS4, a plant MATE multidrug and toxic compound extrusion transporter), AT5G52310 (COR78; COLD REGULATED 78), AT4G11290 (PRX39, PEROXIDASE 39, a cell wall peroxidase), AT5G09220 (AAP2, AMINO ACID PERMEASE 2), and AT1G73480 (MAGL4, an α-β hydrolase family protein). Analysis performed using KnetMiner. Purple connector, link to biochemical function; cyan connector, link to physical location in cell; green connector, link to associated phenotype; black connector, direct physical or genetic linkage. Note links to plastid (green oval) for MAGL4, HSP101 and COR78. An interactive version of this analysis is available at: https://knetminer.com/beta/knetspace/network/970c571c-15da-4b93-87ad-ef1418ef9d29.

refolding proteins in response to cellular damage. Consistent with the enriched clades corresponding to photosynthesis identified in Fig. 5, patterns of HSP101 upregulation and its relationship to the chloroplast (Fig. 8) suggest that this protein may play an important role in ameliorating chloroplastic proteotoxic stress possibly resulting from spaceflight-induced production of reactive oxygen species (ROS) in the plastid. Indeed, the HSP100 family are known to be induced by abiotic stressors such as oxidative stress and have even been linked to tolerance to the proteotoxic damage caused by hypoxia. Previous work [14,19] has demonstrated a significant correspondence between patterns of gene expression altered by oxidative damage from the high light stress response on Earth (which is strongly linked to damaging levels of plastid ROS production) and the spaceflight-associated DEGs identified in the seedlings from BRIC experiments. Similarly, the large number of plastid genes responding to spaceflight identified in seedling samples from BRIC-16/GLDS-44 [22] reinforces the idea that this organelle may be an important site of spaceflight-induced responses. However, it is important to note that the BRIC experiments we have analyzed were all conducted under dark growth conditions. Therefore, the light-driven reactions of photosynthetic electron transport that are a major source of plastid ROS production on Earth are not responsible for these spaceflight-related effects and so the source of any spaceflight-triggered ROS production within the plastid remains to be defined.

Our meta-analysis using the common GeneLab analysis pipelines also highlights COR78 as likely a part of a conserved transcriptional response of Arabidopsis on orbit in the BRIC hardware. Although originally identified as a cold induced transcript, COR78 is now known to be highly inducible in response to a range of abiotic factors ranging from wounding and salt exposure to osmotic stress, drought and even the hypobaric (low pressure) environments predicted for future large scale, space-based plant growth facilities [39–41]. A common feature of all these stressors is that they trigger signaling through ROS and induce oxidative stress. Indeed, COR78 expression is regulated through the same ROS-responsive transcriptional cascades (i.e., H2O2-responsive modulation through the DREB2A transcription factor) that modulate heat shock response elements such as HsfA3 [42], providing a possible link to the heat shock factor component of the spaceflight response. In a further tantalizing link between the chloroplast and COR78 response, COR78 expression is co-regulated with elements of the plastid antioxidant system and indeed, its expression is thought to be tightly linked to the levels of H2O2 processing by the plant [43].

In summary, the GeneLab database is accumulating an ever-increasing number of datasets that investigate the transcriptional effects of spaceflight on early plant development. This aggregation of information, along with careful curation, represents a powerful resource to begin to understand spaceflight responses in these organisms. Spaceflight imposes some commonly encountered and some unique challenges when comparing datasets. Thus, as with all large omics-level analyses, differences in protocols and analysis pipelines can impact the robustness of comparisons. However, spaceflight also leads to further challenges related to an often-restricted capacity for biological replication and with limitations on experimental design dictated by available spaceflight hardware. We have begun to address some of these issues by applying a common analytical pipeline for datasets and then constructing a matrix of metadata to allow for sorting and comparison across studies driven by their known similarities and differences. In this work we focused on two elements to highlight the potential of this approach: (1) making broad comparisons across the entire sets of data to draw conclusions about confounding variables that likely superimpose differences on spaceflight datasets, and (2) making analyses focusing on the commonly used BRIC hardware to help researchers understand the possibilities offered by designing comparative analyses in the context of the Matrix metadata. However, the possible comparisons guided by this Matrix are vast and so there remain many more opportunities for the research community to draw new insights from Matrix-focused analyses.

From the Matrix-driven exploration presented here, we found that: (1) comparisons across different transcriptome monitoring technologies (RNA-seq versus microarray) should be performed with great care as differences in the technology used can impose greater variation on results than the biological treatment (spaceflight versus ground control); (2) environmental conditions and hardware-related constraints in the experimental design produce smaller but also important differences that can confound interpretation of the spaceflight versus ground control comparisons; (3) when these factors are controlled for, comparisons across the breadth of spaceflight-related Arabidopsis experiments reveal alterations in general responses to environmental stresses, photosynthesis, and other elements of primary and secondary metabolism (Fig. 5). These broad areas provide targets for the generation of future models of how spaceflight may affect plant physiology and development. Our analysis of the BRIC hardware shows how with a more targeted approach, common response genes can be identified that then point to potentially core spaceflight responses. For example, the BRIC analysis strongly points to the plastid as a likely shared response site across many spaceflight experiments. The observations of conserved roles for HSP101 and COR78 suggest that a fundamental disruption of the ROS and/or antioxidant systems related to the plastid may be accompanying plant growth in space.

It is important to note limitations inherent in the approach we have applied to meta-analysis of these spaceflight datasets. Due to flight and hardware constraints, the spaceflight experiments collected for this meta-study were limited to young seedlings of Arabidopsis. Although some experiments have tested the viability of plants to reach mature and reproductive developmental stages, these have generally not involved omics research. We must await the data from more studies throughout the phases of the plant life cycle to understand how well our developing insights from seedlings and young plants will apply to individuals at maturity. Similarly, we must await further studies on a wider array of plant species to extend these approaches beyond the plant most commonly grown in spaceflight, Arabidopsis.

One further limitation on our approach is that at present we manually curate the import of each experiment's metadata into the Matrix. However, the GeneLab data repository has standardized its metadata formats for both current and future datasets

data landscape will provide a powerful approach to supplement the insights drawn from analyses focused on each individual study in isolation. The Matrix analysis presented herein provides a toolset to help expedite the development of such new investigations. Additionally, while the scope of potential hypotheses generated by these analyses is extensive, the current Matrix meta-analysis highlights three specific focus areas for future research that may prove particularly fruitful. These include: (1) studies examining the effect of variable light regimes on space grown plant productivity and physiology, (2) analyses aimed at determining the potential causes of altered redox activities in the plastids of space flown plants, and (3) experiments examining the function of HSP101, ABS4, COR78, PRX39, AAP2, and MAGL4 in response to spaceflight stressors.

METHODS

Assay pipelines and datasets

The GeneLab data repository currently holds the largest number of publicly accessible datasets of omics- (transcriptomics-, proteomics-, epigenomics- and genomics-) based studies of biological, spaceflight-related studies. For our analyses of plant responses using this resource we focused on the results assessing changes in the transcriptome as the most numerous kind of dataset available. We included such studies based on the minimal criteria that they: (1) were performed on the most widely used plant model species, Arabidopsis thaliana (which represents nearly all of the plant data currently deposited in GeneLab) and (2) had at least 3 biological replicates per treatment (to provide statistical rigor on subsequent analyses). A summary of the 15 studies (encompassing 10 microarray and 6 RNA-seq GeneLab Data Sets, or GLDS) that fulfill these requirements is presented in Table 2. These experiments were performed on missions run by NASA, the European Space Agency and the Chinese Space Agency. To ensure the greatest degree of comparability between results, all of the primary data was reanalyzed through common computational approaches developed by GeneLab and implemented in the Galaxy computing environment.

Briefly, the microarray analysis pipeline used the R/Bioconductor software package limma to perform differential gene expression analysis. Background correction by the Robust Multichip Average (RMA) method and between array normalization by the quantile method were performed through the Bioconductor Oligo package. Gene level estimation was generated using the Maximum Interquartile Range method and annotations were added using the Annotation-Db class gene annotations specific to Arabidopsis thaliana from the Bioconductor repository (www.bioconductor.org). In cases where multiple probes mapped to the same gene ID, representative probes were selected with the highest mean normalized intensity across all samples. Differential gene expression analysis used the linear model fit from the limma R package to perform pairwise comparisons for all groups. For each probe set, the variance of mean signal intensities was estimated, improved by an empirical Bayes method for combining variances of probes showing similar variability, and the significance of the difference between the means was evaluated with a t-test to obtain p-values.
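The pipelines described here report p-values adjusted by the Benjamini and Hochberg method (see also the footnote to Table 1). As an illustration only, the adjustment can be sketched in plain Python as below. This is not the GeneLab code, which runs in R through limma; the function names `benjamini_hochberg` and `significant` and the example values are hypothetical, and the 0.01 default simply mirrors the threshold quoted in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum (capped at 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(p_values, alpha=0.01):
    """Flag tests whose adjusted p-value is at or below alpha (hypothetical helper)."""
    return [q <= alpha for q in benjamini_hochberg(p_values)]


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.005]  # made-up p-values for four probes
    print(benjamini_hochberg(example))
    print(significant(example, alpha=0.03))
```

Because each p-value is multiplied by the number of tests divided by its rank, a gene passing p < 0.01 unadjusted may no longer pass after adjustment, which is one reason DEG counts from different pipelines can differ.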
p-values were also adjusted to q-values to account for possible errors introduced through multiple hypothesis testing using the Benjamini and Hochberg method and so control for the false discovery rate. Details of the code used to process each dataset are available at https://github.com/nasa/GeneLab_Data_Processing/tree/master/Microarray/1-channel_arrays/GLDS_Processing_Scripts. Both the raw and processed data can be downloaded at https://genelab-data.ndc.nasa.gov/genelab/projects.

Table 2. Studies used in developing the plant transcriptional Matrix.

Accession | Study title | Assay type | Refs.
GLDS-7 | The Arabidopsis spaceflight transcriptome: a comparison of whole plants to discrete root hypocotyl and shoot responses to the orbital environment | Microarray |
GLDS-17 | Transcription profiling by array of the response of Arabidopsis cultivar Columbia etiolated seedlings and undifferentiated tissue culture cells to the spaceflight environment | Microarray |
GLDS-37 | Comparison of the spaceflight transcriptome of four commonly used Arabidopsis thaliana ecotypes (Col, Ws, Ler and Cvi) | RNA-seq |
GLDS-38 | Proteomics and transcriptomics analysis of Arabidopsis seedlings in microgravity | RNA-seq |
GLDS-44 | Transcriptomics analysis of etiolated Arabidopsis thaliana seedlings in response to microgravity | Microarray |
GLDS-46 | Gamma radiation and HZE treatment of seedlings in Arabidopsis | Microarray |
GLDS-120 | Genetic dissection of the spaceflight transcriptome responses in plants: are some responses unnecessary? | RNA-seq |
GLDS-121 | Biological Research in Canisters-16 (BRIC-16): investigations of the plant cytoskeleton in microgravity with gene profiling and cytochemistry | Microarray |
GLDS-136 | Dissecting low atmospheric pressure stress: transcriptome responses to the components of hypobaria in Arabidopsis | Microarray |
GLDS-147 | Arg1 functions in the physiological adaptation of undifferentiated plant cells to spaceflight | Microarray |
GLDS-205 | HSFA2 functions in the physiological adaptation of undifferentiated plant cells to spaceflight microgravity environment | Microarray |
GLDS-208 | Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles | Microarray and RNA-seq |
GLDS-213 | A whole-genome microarray study of Arabidopsis cell cultures exposed to microgravity for 5 days on board of Shenzhou 8 | Microarray |
GLDS-218 | Spaceflight-induced alternative splicing during seedling development in Arabidopsis thaliana | RNA-seq |
GLDS-251 | RNA-seq analysis of the response of Arabidopsis thaliana to fractional gravity under blue-light stimulation during spaceflight | RNA-seq |

In the table, the reference column denotes the initial publication on the data with the authors' in-house analyses, when available. Datasets are publicly available at the GeneLab data repository using the url: https://genelab-data.ndc.nasa.gov/genelab/accession/GLDS-#/, where # represents the GLDS accession number for each study.

offering us the opportunity to automate both import and curation. This automated approach will be facilitated through GeneLab's application programming interface (API) which offers a program-accessible link to the metadata files. Continual updates to the Matrix will allow the power of inferences drawn to grow as quickly as the new plant spaceflight datasets are deposited.

Our analysis of the BRIC datasets suggests that focusing on a few hardware options that can then be the subject of multiple flight studies would greatly add to the power of such comparative omics-level analyses. Nevertheless, the results presented here offer the promise that as these experimental data become available, meta-analyses across the broad plant biology omics

Ensembl GO, the Kyoto Encyclopedia of Genes and Genomes [56], AraCyc [57] and Reactome [58] are noted in the text and figure legends. Principal Component Analysis (PCA), Multidimensional Scaling analysis (MDS), t-distributed Stochastic Neighbor Embedding (t-SNE), Weighted Gene Correlation Network Analysis (WGCNA) and K-means statistical analyses were performed using the iDEP.94 R-package. For these analyses, the normalized counts were imported from the GeneLab data repository and processed using R-studio. The R programming language provides for the statistical analysis of data (https://www.r-project.org/about.html) within a commercial development environment called R-studio (R-Studio inc. Boston, MA, USA).

The RNA-seq analysis pipeline used the universal RNA-seq aligner STAR v2.7.1a and the RNA-Seq by Expectation Maximization approach (RSEM v1.3.1) along with the TAIR10 genome assembly [51] accessed through Ensembl Plants [52,53]. Raw sequence data were trimmed and filtered with Trim Galore! (v0.6.2). The Arabidopsis thaliana Ensembl reference genome TAIR10, release 44, and respective GTF file were used to align trimmed reads with STAR (v2.7.1a), then the aligned reads were quantified using RSEM (v1.3.1). Quantification data was imported to R (v3.6.0) using the tximport package (v1.14.0) and normalized using the DESeq2 (v1.26.0) median of ratios method. Differential expression analysis was performed with DESeq2 (v1.26.0) and pairwise comparisons of all groups were performed using the Wald test to generate p- and adjusted p-values, and the likelihood ratio test was used to generate the F statistic p-value. Gene annotations were assigned using the Bioconductor org.At.tair.db (v3.8.2), STRINGdb (v1.24.0) [29], and PANTHER.db (v1.0.4) [55] packages.
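The median of ratios normalization named in the RNA-seq pipeline can be illustrated with a small, self-contained sketch. It follows the general published idea (each sample's size factor is the median, over genes, of its count relative to that gene's geometric mean across samples) and is not the DESeq2 implementation itself; the function names `size_factors` and `normalize` and the toy counts are assumptions made for this example.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample.

    counts: list of gene rows, each row a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Genes with a zero count in any sample have no finite log geometric
        # mean, so they are left out of the size factor estimate.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has a non-zero count in every sample")
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide every raw count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


if __name__ == "__main__":
    # Toy data: sample 2 was sequenced twice as deeply as sample 1.
    toy = [[10, 20], [100, 200], [5, 10], [0, 7]]
    print(size_factors(toy))
    print(normalize(toy))
```

Using the median rather than the total count keeps a handful of very highly expressed genes from dominating the scaling, which matters when comparing samples such as spaceflight and ground controls that may differ strongly in a few transcripts.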
Processing code for each RNA-seq dataset is available at https://github.com/nasa/GeneLab_Data_Processing/tree/master/RNA-seq/GLDS_Processing_Scripts and both the raw and processed data are deposited at https://genelab-data.ndc.nasa.gov/genelab/projects.

The associated metadata for each dataset was aggregated using a combination of the information provided alongside each GeneLab data submission, parallel manual curation from the literature and through interviews with the primary researchers. The Matrix of this data is available as both Supplementary Data 1 and as an interactive exploration environment developed in the Qlik database management software environment (Qlik Technologies Inc., King of Prussia, PA, USA) at https://gilroy-qlik.botany.wisc.edu/a/sense/app/20aa802b-6915-4b1a-87bd-c029a1812e2b.

When they have been employed in the data analyses, online tools such as the TOAST X-Species Transcriptional Explorer (https://gilroy-qlik.botany.wisc.edu/a/sense/app/ab2250b5-ee3a-4da8-b5da-fe87d5f2dbe6/overview), KnetMiner [23], Metascape [24],

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY

Source data for this study are publicly available in the GeneLab data repository (https://genelab-data.ndc.nasa.gov/genelab/projects/) under the Accession codes GLDS-7; GLDS-17; GLDS-37; GLDS-38; GLDS-44; GLDS-46; GLDS-120; GLDS-121; GLDS-136; GLDS-147; GLDS-205; GLDS-208; GLDS-213; GLDS-218; GLDS-251.

CODE AVAILABILITY

Details of the code used to process each dataset are available at https://github.com/nasa/GeneLab_Data_Processing/tree/master/Microarray/1-channel_arrays/GLDS_Processing_Scripts. Both the raw and processed data can be downloaded at https://genelab-data.ndc.nasa.gov/genelab/projects. R scripts used for raw data processing, iDEP.92 analysis and visualization are available at https://github.com/dr-richard-barker/The-Matrix-2022. The Matrix of this data is available as both Supplementary Data 1 and as an interactive exploration environment developed in the Qlik database management software environment (Qlik Technologies Inc., King of Prussia, PA, USA) at https://gilroy-qlik.botany.wisc.edu/a/sense/app/20aa802b-6915-4b1a-87bd-c029a1812e2b.

Received: 22 July 2021; Accepted: 10 January 2023;

REFERENCES

1. Hoson, T. & Soga, K. New aspects of gravity responses in plant cells. Int. Rev. Cytol. 229, 209–244 (2003).
2. Morita, M. T. Directional gravity sensing in gravitropism. Annu. Rev. Plant Biol. 61, 705–720 (2010).
3. Su, S.-H., Gibbs, N. M., Jancewicz, A. L. & Masson, P. H. Molecular mechanisms of root gravitropism. Curr. Biol. 27, R964–R972 (2017).
4. Nakamura, M., Nishimura, T. & Morita, M. T. Bridging the gap between amyloplasts and directional auxin transport in plant gravitropism. Curr. Opin. Plant Biol. 52, 54–60 (2019).
5. Kitaya, Y. et al. The effect of gravity on surface temperature and net photosynthetic rate of plant leaves. Adv. Sp. Res. 28, 659–664 (2001).
6. Stout, S. C., Porterfield, D. M., Briarty, L. G., Kuang, A. & Musgrave, M. E. Evidence of root zone hypoxia in Brassica rapa L. grown in microgravity. Int. J. Plant Sci. 162, 249–255 (2001).
7. Porterfield, D. M. The biophysical limitations in physiological transport and exchange in plants grown in microgravity. J. Plant Growth Regul. 21, 177–190 (2002).
8. Hirai, H. & Kitaya, Y. Effects of gravity on transpiration of plant leaves. Ann. N. Y. Acad. Sci. 1161, 166–172 (2009).
9. Wheeler, R. M. Agriculture for space: People and places paving the way. Open Agric. 2, 14–32 (2017).
10. Ray, S. et al. GeneLab: Omics database for spaceflight experiments. Bioinformatics 35, 1753–1759 (2019).
11. Berrios, D. C., Galazka, J., Grigorev, K., Gebre, S. & Costes, S. V. NASA GeneLab: interfaces for the exploration of space omics data. Nucleic Acids Res. 49, D1515–D1522 (2021).
12. Fei, T. & Yu, T. ScBatch: Batch-effect correction of RNA-seq data through sample distance matrix adjustment. Bioinformatics 36, 3115–3123 (2020).
13. Lai Polo, S.-H. et al. RNAseq analysis of rodent spaceflight experiments is confounded by sample collection techniques. iScience https://doi.org/10.1016/j.isci.2020.101733 (2020).
14. Barker, R. J., Lombardino, J., Rasmussen, K. & Gilroy, S. TOAST: a discovery environment to explore multiple plant biology spaceflight experiments. Front. Plant Sci. 11, 147 (2020).
26. Sugimoto, M. et al. Genome-wide expression analysis of reactive oxygen species gene network in Mizuna plants grown in long-term spaceflight. BMC Plant Biol. 14, 4 (2014).
27. Zupanska, A. K., Denison, F. C., Ferl, R. J. & Paul, A. L. Spaceflight engages heat shock protein and other molecular chaperone genes in tissue culture cells of Arabidopsis thaliana. Am. J. Bot. 100, 235–248 (2013).
28. Bader, G. D. & Hogue, C. W. V. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 4, 2 (2003).
29. Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
30. Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34, D535–D539 (2006).
31. Türei, D., Korcsmáros, T. & Saez-Rodriguez, J. OmniPath: guidelines and gateway for literature-curated signaling pathway resources. Nat. Methods 13, 966–967 (2016).
32. Li, T. et al. A scored human protein-protein interaction network to catalyze genomic interpretation. Nat. Methods 14, 61–64 (2017).
33. Loreti, E., Poggi, A., Novi, G., Alpi, A. & Perata, P. A genome-wide analysis of the effects of sucrose on gene expression in arabidopsis seedlings under anoxia. Plant Physiol. 137, 1130–1138 (2005).
34. Paul, A. L. et al. Genetic dissection of the Arabidopsis spaceflight transcriptome: are some responses dispensable for the physiological adaptation of plants to spaceflight? PLoS ONE 12, e0180186 (2017).
35. Zupanska, A. K. et al. ARG1 functions in the physiological adaptation of undifferentiated plant cells to spaceflight. Astrobiology 17, 1077–1111 (2017).
36. Wang, W., Vinocur, B., Shoseyov, O. & Altman, A. Role of plant heat-shock proteins and molecular chaperones in the abiotic stress response. Trends Plant Sci. 9, 244–252 (2004).
37. Swindell, W. R., Huebner, M. & Weber, A. P. Transcriptional profiling of Arabidopsis heat shock proteins and transcription factors reveals extensive overlap between heat and non-heat stress response pathways. BMC Genomics 8, 125 (2007).
38. Banti, V., Mafessoni, F., Loreti, E., Alpi, A. & Perata, P. The heat-inducible transcription factor HsfA2 enhances anoxia tolerance in Arabidopsis. Plant Physiol. 152, 1471–1483 (2010).
39. Yamaguchi-Shinozaki, K. & Shinozaki, K. Characterization of the expression of a desiccation-responsive rd29 gene of Arabidopsis thaliana and analysis of its promoter in transgenic plants. MGG Mol. Gen. Genet. 236, 331–340 (1993).
40. Msanne, J., Lin, J., Stone, J. M. & Awada, T. Characterization of abiotic stress-responsive Arabidopsis thaliana RD29A and RD29B genes and evaluation of transgenes. Planta 234, 97–107 (2011).
41. Paul, A.-L. et al. Patterns of Arabidopsis gene expression in the face of hypobaric stress. AoB Plants 9, plx030 (2017).
42. Wu, A. et al. JUNGBRUNNEN1, a reactive oxygen species-responsive NAC transcription factor, regulates longevity in Arabidopsis.
Plant Cell 24, 482–506 (2012). 15. Seyednasrollah, F., Laiho, A. & Elo, L. L. Comparison of software packages for 43. Juszczak, I., Cvetkovic, J., Zuther, E., Hincha, D. K. & Baier, M. Natural variation of detecting differential expression in RNA-seq studies. Brief. Bioinform. 16,59–70 cold deacclimation correlates with variation of cold-acclimation of the plastid (2015). antioxidant system in Arabidopsis thaliana accessions. Front. Plant Sci. 7, 305 16. Papatheodorou, I. et al. Expression Atlas: gene and protein expression across (2016). multiple studies and organisms. Nucleic Acids Res. 46, D246–D251 (2018). 44. Jalili, V. et al. The Galaxy platform for accessible, reproducible and collaborative 17. Krishnamurthy, A., Ferl, R. J. & Paul, A. L. Comparing RNA-Seq and microarray biomedical analyses: 2020 update. Nucleic Acids Res. 48, W395–W402 (2020). gene expression data in two zones of the Arabidopsis root apex relevant to 45. Ritchie, M. E. et al. Limma powers differential expression analyses for RNA- spaceflight. Appl. Plant Sci. 6, e01197 (2018). sequencing and microarray studies. Nucleic Acids Res. 43, e47–e47 (2015). 18. Paul, A. L. et al. Spaceflight transcriptomes: unique responses to a novel envir- 46. Irizarry, R. A. et al. Exploration, normalization, and summaries of high density onment. Astrobiology 12,40–56 (2012). oligonucleotide array probe level data. Biostatistics 4, 249–264 (2003). 19. Choi, W. G., Barker, R. J., Kim, S. H., Swanson, S. J. & Gilroy, S. Variation in the 47. Carvalho, B. S. & Irizarry, R. A. A framework for oligonucleotide microarray pre- transcriptome of different ecotypes of Arabidopsis thaliana reveals signatures of processing. Bioinformatics 26, 2363–2367 (2010). oxidative stress in plant responses to spaceflight. Am. J. Bot. 106, 123–136 (2019). 48. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and 20. Kruse, C. P. S. et al. 
Spaceflight induces novel regulatory responses in Arabidopsis powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289–300 (1995). seedling as revealed by combined proteomic and transcriptomic analyses. BMC 49. Dobin, A. et al. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29,15–21 Plant Biol. 20, 237 (2020). (2013). 21. Johnson, C. M., Subramanian, A., Pattathil, S., Correll, M. J. & Kiss, J. Z. Comparative 50. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data transcriptomics indicate changes in cell wall organization and stress response in with or without a reference genome. BMC Bioinform. 12, 323 (2011). seedlings during spaceflight. Am. J. Bot. 104, 1219–1231 (2017). 51. Berardini, T. Z. et al. The arabidopsis information resource: making and mining 22. Kwon, T. et al. Transcriptional response of Arabidopsis seedlings during space- the ‘gold standard’ annotated reference plant genome. Genesis 53, 474–485 flight reveals peroxidase and cell wall remodeling genes associated with root hair (2015). development. Am. J. Bot. 102,21–35 (2015). 52. Kersey, P. J. et al. Ensembl Genomes 2018: an integrated omics infrastructure for 23. Hassani-Pak, K. et al. KnetMiner: a comprehensive approach for supporting non-vertebrate species. Nucleic Acids Res. 46, D802–D808 (2018). evidence-based gene discovery and complex trait analysis across species. Plant 53. Howe, K. L. et al. Ensembl Genomes 2020—enabling non-vertebrate genomic Biotechnol. J. https://doi.org/10.1111/pbi.13583 (2021). research. Nucleic Acids Res. 48, D689–D695 (2020). 24. Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis 54. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and of systems-level datasets. Nat. Commun. 10, 1523 (2019). dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014). 25. Sherman, B. T. et al. DAVID: a web server for functional enrichment analysis and 55. 
Mi, H., Muruganujan, A., Ebert, D., Huang, X. & Thomas, P. D. PANTHER version 14: functional annotation of gene lists (2021 update). Nucleic Acids Res. https:// More genomes, a new PANTHER GO-slim and improvements in enrichment doi.org/10.1093/nar/gkac194. (2022) analysis tools. Nucleic Acids Res. 47, D419–D426 (2019). npj Microgravity (2023) 21 Published in cooperation with the Biodesign Institute at Arizona State University, with the support of NASA R. Barker et al. 56. Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new COMPETING INTERESTS perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45, The authors declare no competing interests. D353–D361 (2017). 57. Mueller, L. A., Zhang, P. & Rhee, S. Y. AraCyc: a biochemical pathway database for Arabidopsis. Plant Physiol. 132, 453–460 (2003). ADDITIONAL INFORMATION 58. Fabregat, A. et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 46, Supplementary information The online version contains supplementary material D649–D655 (2018). available at https://doi.org/10.1038/s41526-023-00247-6. 59. Ge, S. X., Son, E. W. & Yao, R. iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data. BMC Bioinform. 19,534 Correspondence and requests for materials should be addressed to Simon Gilroy. (2018). Reprints and permission information is available at http://www.nature.com/ reprints ACKNOWLEDGEMENTS This work was coordinated through the GeneLab Plant Analysis Working Group and Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims was supported by NASA grants 80NSSC19K0126, 80NSSC18K0132 and in published maps and institutional affiliations. 80NSSC21K0577 to S.G. and R.B., through NASA 80NSSC19K1481 to S.W., NNX15AG55G to C.W., and NNX15AG56G to L.D. and N.L., from the Spanish Agencia Estatal de Investigación grant RTI2018-099309-B-I00 and ESA 1340112 4000131202/ 20/NL/PG/pt to R.H. 
Contributions from P.J. and P.G. were partially supported by Open Access This article is licensed under a Creative Commons funds from the Oregon State University, NSF awards 1127112 and 1340112 and the Attribution 4.0 International License, which permits use, sharing, United States Department of Agriculture, Agriculture Research Service. The Qlik adaptation, distribution and reproduction in any medium or format, as long as you give software used in this work is provided under a free-to-use educational license from appropriate credit to the original author(s) and the source, provide a link to the Creative Qlik Technologies Inc. GeneLab datasets were obtained from https://genelab- Commons license, and indicate if changes were made. The images or other third party data.ndc.nasa.gov/genelab/projects/, maintained by NASA GeneLab, NASA Ames material in this article are included in the article’s Creative Commons license, unless Research Center, Moffett Field, CA 94035. indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly AUTHOR CONTRIBUTIONS from the copyright holder. To view a copy of this license, visit http:// creativecommons.org/licenses/by/4.0/. R.B., C.P.S.K., C.J., A.S.-B., H.F., H.C., R.M.T., N.K., A.V., A.M., R.H., L.D.B., N.G.L., I.P., C.W., P.G., P.J., S.S.R., S.W., and S.G. contributed to data analysis. D.B., K.C., and S.G. wrote the manuscript which R.B., C.P.S.K., C.J., A.S.-B., H.F., H.C., R.M.T., N.K., A.V., A.M., R.H., © The Author(s) 2023 L.D., N.G.L., I.P., C.W., P.G., P.J., S.S.R., S.W., and S.G. edited. Published in cooperation with the Biodesign Institute at Arizona State University, with the support of NASA npj Microgravity (2023) 21
npj Microgravity – Springer Journals
Published: Mar 20, 2023