This article provides a comprehensive guide for researchers and drug development professionals on the critical task of defining the taxonomic domain of applicability (tDOA) for Key Event Relationships (KERs) within Adverse Outcome Pathways (AOPs). The tDOA determines across which species a mechanistic toxicity pathway is biologically plausible, a cornerstone for reliable cross-species extrapolation in regulatory safety assessment. The content spans from foundational AOP and KER concepts to advanced methodologies employing bioinformatics tools like SeqAPASS and G2P-SCAN for tDOA expansion [citation:1][citation:2][citation:5]. It addresses common troubleshooting challenges in establishing taxonomic conservation and details systematic, evidence-based approaches for validation. By synthesizing current best practices and case studies, this article aims to equip scientists with the knowledge to enhance the confidence and regulatory utility of AOPs for protecting both human and ecological health under a One Health framework [citation:1][citation:10].
This whitepaper establishes Key Event Relationships (KERs) as the fundamental causal and predictive linkages that define the mechanistic structure of Adverse Outcome Pathways (AOPs). Within the AOP framework, KERs are the connections between measurable, sequential biological steps, leading from an initial molecular interaction to an adverse outcome relevant to risk assessment [1]. Their rigorous definition and quantitative characterization are paramount for transforming AOPs from qualitative narratives into predictive tools for toxicology and drug development. This document provides an in-depth technical guide on the anatomy, evidence assessment, and quantitative modeling of KERs, framed within the critical research imperative of understanding their taxonomic conservation—the extent to which these causal biological relationships are consistent across species. Mastery of KERs enables researchers to extrapolate data across levels of biological organization, enhance the weight-of-evidence for AOPs, and support the development of targeted testing strategies that reduce reliance on conventional animal studies [2] [3].
An Adverse Outcome Pathway (AOP) is a structured, linear representation of existing knowledge that describes a logical chain of causally linked biological events. This chain begins with a Molecular Initiating Event (MIE), where a chemical or stressor interacts with a specific biological target, and concludes at the level of an Adverse Outcome (AO) that is of direct relevance to risk assessment for human health or ecological systems [1]. The primary utility of the AOP framework lies in its ability to organize mechanistic information, facilitating the extrapolation of data measured at lower levels of biological organization (e.g., molecular, cellular) to predict outcomes at higher levels (e.g., organ, organism, population) [2].
The AOP framework has been formally adopted by the Organisation for Economic Co-operation and Development (OECD), which maintains a collaborative AOP Knowledge Base (AOP-KB). This platform allows the scientific community to develop, share, and review AOPs, ensuring that knowledge about key events and their relationships can be reused and built upon across multiple pathways [1].
Key Event Relationships (KERs) are the explanatory links that form the causal spine of an AOP. Each KER explicitly describes the directional and causal relationship between a pair of sequential Key Events (KEs)—an upstream KE (the cause) and a downstream KE (the effect) [2]. The relationship articulated in a KER provides the biological rationale for why a perturbation in the upstream event is expected to lead to a change in the downstream event.
The formal elements of a KER, as structured in the AOP-KB, include [2]:
By deconstructing a complex adverse outcome into a series of linked KERs, the framework provides a transparent, evidence-based map of toxicity pathways. This structure is essential for identifying knowledge gaps, designing relevant in vitro or in chemico tests, and supporting integrated approaches to testing and assessment (IATA) [3].
A robust KER description moves beyond simple assertion to a comprehensive evidence package. The core components, as defined by the AOP-Wiki, are summarized below [2].
Table 1: Core Descriptive Components of a Key Event Relationship
| Component | Description | Purpose |
|---|---|---|
| Biological Plausibility | The biological, biochemical, or mechanistic rationale for the connection. | Establishes theoretical credibility based on established scientific knowledge. |
| Empirical Support | Direct, citable experimental evidence showing that a change in the upstream KE leads to a change in the downstream KE. | Provides observational or experimental proof of the linkage. |
| Uncertainties & Inconsistencies | Acknowledgment of conflicting data, knowledge gaps, or contextual factors that weaken the relationship. | Ensures transparency and identifies areas for further research. |
| Applicability Domain | Definition of the taxonomic, life stage, and sex contexts for which the KER is believed to hold true. | Critical for defining the boundaries and confidence in extrapolation. |
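The four components in Table 1 can be treated as fields of a simple record, which makes it easy to check whether a KER description is complete. The sketch below is illustrative only: the class name, field names, and the `missing_components` helper are assumptions made for this example and do not reflect the AOP-Wiki data model.

```python
from dataclasses import dataclass, field


@dataclass
class KeyEventRelationship:
    """Hypothetical record mirroring the components listed in Table 1."""
    upstream_ke: str
    downstream_ke: str
    biological_plausibility: str = ""
    empirical_support: list = field(default_factory=list)   # citations or study IDs
    uncertainties: list = field(default_factory=list)       # documented gaps or conflicts
    taxa: set = field(default_factory=set)                  # applicability domain: taxa
    life_stages: set = field(default_factory=set)           # applicability domain: life stage
    sexes: set = field(default_factory=set)                 # applicability domain: sex

    def missing_components(self):
        """Return the Table 1 components that have not been documented yet."""
        missing = []
        if not self.biological_plausibility:
            missing.append("Biological Plausibility")
        if not self.empirical_support:
            missing.append("Empirical Support")
        if not self.uncertainties:
            missing.append("Uncertainties & Inconsistencies")
        if not (self.taxa or self.life_stages or self.sexes):
            missing.append("Applicability Domain")
        return missing


# Example entry; the text values are placeholders, not curated evidence.
ker = KeyEventRelationship(
    upstream_ke="Decreased atRA levels in developing ovary",
    downstream_ke="Disrupted meiotic entry of oogonia",
    biological_plausibility="atRA signalling initiates meiosis (placeholder summary)",
    taxa={"mammals"},
)
print(ker.missing_components())
```

A record like this only flags which sections are empty; judging the strength of the evidence in each section remains an expert task.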
The transition from qualitative to quantitative AOPs (qAOPs) hinges on the quantitative understanding of individual KERs [1]. This involves defining the functional relationship between the measurable changes in linked Key Events.
Table 2: Elements for the Quantitative Understanding of a KER
| Element | Description | Example Data/Analysis |
|---|---|---|
| Response-Response Relationship | The mathematical function describing how the magnitude/timing of the downstream KE change depends on the upstream KE change. | Dose-response curves, kinetic models, linear/non-linear regression outputs (e.g., EC50, slope). |
| Time-Scale | The temporal dynamics (lag/lead time) between the perturbation of the upstream KE and the observable change in the downstream KE. | Time-course study data, kinetic rate constants. |
| Known Modulating Factors | Factors (e.g., age, sex, genotype, diet, co-exposure) that alter the strength, sensitivity, or dynamics of the KER. | Data showing different dose-response curves in different sub-populations or conditions. |
| Known Feedback Loops | Descriptions of positive or negative feedback mechanisms that may amplify or dampen the relationship, including their homeostatic limits. | Evidence of compensatory mechanisms or feed-forward signaling. |
Quantitative analysis methods are crucial for deriving these relationships. Descriptive statistics (mean, variance) summarize experimental data, while inferential statistics are used to establish and model the linkage. Key techniques include regression analysis (to define response-response functions), correlation analysis (to measure association strength), and comparative tests like ANOVA or t-tests (to evaluate the impact of modulating factors across groups) [4].
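As a minimal sketch of the regression step, the example below fits a straight line to paired upstream and downstream KE measurements using ordinary least squares and reports the Pearson correlation. The function name `fit_response_response` and the data values are hypothetical; real response-response relationships are often non-linear and would need a model chosen to suit the biology.

```python
import math


def fit_response_response(upstream, downstream):
    """Ordinary least-squares line: downstream = slope * upstream + intercept.

    Returns (slope, intercept, pearson_r).
    """
    n = len(upstream)
    if n != len(downstream) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(upstream) / n
    mean_y = sum(downstream) / n
    sxx = sum((x - mean_x) ** 2 for x in upstream)
    syy = sum((y - mean_y) ** 2 for y in downstream)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(upstream, downstream))
    if sxx == 0:
        raise ValueError("upstream KE values show no variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r


# Hypothetical data: % change in upstream KE vs. % change in downstream KE.
ke_up = [0, 10, 20, 30, 40]
ke_down = [1, 6, 11, 16, 21]
slope, intercept, r = fit_response_response(ke_up, ke_down)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

The slope summarizes how strongly the downstream KE responds per unit change in the upstream KE, while `r` describes only the strength of association, not causality.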
Diagram: Quantitative Linkage Between Key Events. This model depicts a KER where an upstream KE perturbation drives a downstream KE response via a quantifiable relationship, which is subject to modulation by intrinsic/extrinsic factors and potential feedback mechanisms.
The following protocol outlines a standardized approach for generating empirical evidence to support a hypothetical KER.
1. Objective: To experimentally test whether a defined perturbation in an upstream Key Event (KEup) causes a predictable and measurable change in a downstream Key Event (KEdown).
2. Experimental Design:
This protocol is designed to test the conservation of a KER across species, a core aspect of KER taxonomic conservation research.
1. Objective: To evaluate whether a well-supported KER in a reference species (e.g., human, rat) is conserved in one or more alternative species (e.g., zebrafish, nematode).
2. Experimental Design (Comparative Approach):
Diagram: Workflow for Assessing KER Taxonomic Conservation. This protocol outlines a systematic approach to test the universality of a KER by comparing quantitative response relationships across different species.
The assessment of a KER's taxonomic applicability is not a secondary consideration but a foundational research question with significant implications for the utility of an AOP. The core thesis of KER taxonomic conservation research posits that the fidelity and quantitative parameters of a KER may be conserved, modified, or absent across different species, depending on the evolutionary conservation of the underlying biological pathway [2].
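The three outcomes named above (conserved, modified, absent) can be made concrete by comparing a quantitative KER parameter, such as a response-response slope, between a reference and a test species. The sketch below is one possible decision rule under stated assumptions: the function name, the two-fold tolerance, and the 5% minimum-effect threshold are arbitrary illustrative choices, not established criteria.

```python
def classify_conservation(ref_slope, test_slope, fold_tolerance=2.0, min_effect=0.05):
    """Label a KER in a test species relative to a reference species.

    ref_slope / test_slope: response-response slopes for the same KE pair.
    fold_tolerance: assumed fold-range within which slopes count as similar.
    min_effect: assumed fraction of the reference slope below which the
                relationship is treated as absent.
    """
    if ref_slope == 0:
        raise ValueError("reference slope must be non-zero")
    if abs(test_slope) < min_effect * abs(ref_slope):
        return "absent"
    ratio = test_slope / ref_slope
    if ratio < 0:
        return "modified"  # relationship exists but runs in the opposite direction
    if 1.0 / fold_tolerance <= ratio <= fold_tolerance:
        return "conserved"
    return "modified"


# Hypothetical slopes; species names follow the examples used in the text.
reference = 0.5  # e.g., rat
for species, slope in {"zebrafish": 0.4, "bird": 1.5, "nematode": 0.01}.items():
    print(species, classify_conservation(reference, slope))
```

In practice the thresholds would need to reflect assay variability and confidence intervals on each slope rather than fixed cut-offs.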
Research in this domain systematically investigates whether a KER established in a model organism (e.g., rat) reliably predicts the same causal relationship in other species of interest (e.g., human, fish, or bird). This involves comparative studies, as outlined in Section 3.2, which mirror the methodologies used in large-scale ecological research to identify biases and gaps in evidence. For instance, a systematic map of meta-analyses in agricultural biodiversity revealed significant geographical and taxonomic biases, with certain groups (arthropods) over-studied and others (annelids, vertebrates) under-represented [5]. Similar systematic mapping of KER evidence across the taxonomic spectrum is essential to identify:
Addressing these gaps is critical for building AOP networks that are robust for both human health and ecological risk assessment. It ensures that predictions are not erroneously extrapolated beyond their valid biological domain and guides the targeted generation of new data where it is most needed [5].
This table details critical reagents, tools, and methodological approaches essential for investigating and characterizing Key Event Relationships.
Table 3: Research Toolkit for KER Investigation
| Tool/Reagent Category | Specific Examples | Function in KER Research |
|---|---|---|
| Perturbation Agents | Selective chemical agonists/antagonists, siRNA pools, CRISPR-Cas9 gene editing kits, neutralizing antibodies. | To selectively modulate the upstream Key Event in a controlled manner to test its causal effect on the downstream event. |
| Activity/Quantification Assays | ELISA kits, phospho-specific antibodies, enzymatic activity assays (e.g., luminescence-based), qRT-PCR probes, reporter gene assays. | To quantitatively measure the changes in molecular or cellular key events (both upstream and downstream) with high specificity and sensitivity. |
| High-Content Screening (HCS) | Automated fluorescence microscopy, image analysis software (e.g., CellProfiler). | To capture complex phenotypic downstream KEs (e.g., cytotoxicity, morphological changes) in a quantitative, high-throughput manner. |
| Omics Technologies | RNA-Seq, targeted mass spectrometry proteomics, metabolomics platforms. | To explore unknown intermediates in a KER, identify novel modulating factors, or provide comprehensive evidence for pathway perturbations. |
| Data Analysis & Modeling Software | R/Bioconductor packages, Python (SciPy, Pandas), GraphPad Prism, specialized qAOP modeling platforms. | To perform statistical analysis, derive response-response models (regression), and visualize quantitative KER data [4]. |
| AOP-KB Management Tools | OECD AOP-KB Wiki, AOP modeling software (e.g., AOPXplorer). | To formally document KERs according to OECD guidelines, link them to AOPs, and explore network relationships [2] [1]. |
Diagram: KERs as the Supported Causal Links in an AOP Network. An AOP is a chain of Key Events, but its predictive power resides in the well-supported KERs (evidence blocks) that causally link them together.
In predictive toxicology, the Taxonomic Domain of Applicability (tDOA) defines the biological taxa to which a given toxicity pathway or prediction is confidently applicable [6]. Its formalization is critical for moving beyond assumptions and providing evidence-based boundaries for extrapolating toxicological findings, particularly within frameworks like the Adverse Outcome Pathway (AOP) and its core unit, the Key Event Relationship (KER) [7]. A KER describes a causal, mechanistic link between two measurable Key Events (KEs) within an AOP. The central thesis of modern KER research asserts that the confidence in extrapolating a KER across species is predicated on the evolutionary conservation of the underlying biological mechanism [6]. Consequently, tDOA is not a peripheral descriptor but a foundational element that determines the utility of AOPs and KERs in regulatory decision-making and safety assessment for untested species [6].
The traditional development of AOPs often relies on empirical data from a single or a handful of model species. While biological plausibility may suggest broader relevance, the tDOA remains narrowly defined without explicit evidence [6]. This limitation is a significant hurdle in ecological risk assessment, where protecting diverse species is paramount, and in drug development, where translation from preclinical models to humans is critical [8]. Defining the tDOA systematically transforms an AOP from a species-specific narrative into a generalized, portable template for prediction. This guide details the mechanistic basis, computational assessment methodologies, and practical integration of tDOA to enhance the reliability and scope of predictive toxicology.
The tDOA of a KER or an AOP is supported by two pillars of evidence: structural conservation and functional conservation [6]. Structural conservation evaluates whether the essential biological components (e.g., genes, proteins, receptors, tissues) are present and measurably similar in the taxa of interest. Functional conservation assesses whether those components perform the same role within the proposed pathway in different species.
A seminal tool for evaluating structural conservation is the U.S. EPA's Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool [6]. It operates via a hierarchical, three-level analysis:
The output provides a line of evidence for structural conservation, which, when combined with empirical data for functional conservation, forms a weight-of-evidence basis for defining the tDOA [9].
The following protocol, derived from published case studies [6] [9], outlines the steps for using SeqAPASS to inform the tDOA of an AOP or KER.
1. Identify Query Proteins:
2. Perform SeqAPASS Analysis:
3. Data Integration & tDOA Assignment:
Diagram 1: Workflow for Evidence-Based tDOA Assessment
Beyond single-protein analysis, modern approaches integrate pathway-level conservation to strengthen tDOA predictions. Tools like Unilever's Genes to Pathways – Species Conservation Analysis (G2P-SCAN) map human gene sets to biological pathways and evaluate their conservation across common model species [9]. When combined with SeqAPASS, this provides a multi-layered, consensus view of biological conservation.
Integrated Protocol: SeqAPASS & G2P-SCAN for Enhanced tDOA [9]
Diagram 2: Three-Level SeqAPASS Analysis
The following tables summarize quantitative outcomes from key studies applying these methodologies.
Table 1: SeqAPASS Conservation Levels for Case Study Proteins (AOP 89: nAChR Activation to Colony Death) [6]
| Protein Target | Role in AOP | Conservation in Apis mellifera (Honey Bee) | Conservation in Bombus spp. (Bumble Bee) | Conservation in Drosophila melanogaster (Fruit Fly) |
|---|---|---|---|---|
| nAChR alpha1 | Molecular Initiating Event | Reference Species (100%) | High (Levels 1, 2, 3 Pass) | High (Levels 1, 2, 3 Pass) |
| nAChR beta1 | Molecular Initiating Event | Reference Species (100%) | High (Levels 1, 2, 3 Pass) | Moderate (Levels 1, 2 Pass) |
| PLCgamma | Intracellular Signaling Key Event | Present | High Conservation Predicted | High Conservation Predicted |
| PKC | Intracellular Signaling Key Event | Present | High Conservation Predicted | High Conservation Predicted |
Table 2: Summary of Integrated Tool Performance in Cross-Species Predictions [9]
| Case Study Target | Tool Used | Primary Prediction | Key Outcome for tDOA |
|---|---|---|---|
| PPARα | SeqAPASS | High conservation across vertebrates, low in invertebrates. | Supported vertebrate-specific tDOA for PPARα-mediated AOPs. |
| PPARα | G2P-SCAN | Fatty acid metabolism pathway highly conserved in human, rat, mouse, zebrafish. | Corroborated pathway-level relevance in standard test species. |
| ESR1 (Estrogen Receptor) | SeqAPASS | High conservation in jawed vertebrates; absent in arthropods, mollusks. | Clearly defined tDOA boundary between vertebrates and invertebrates. |
| GABRA1 | SeqAPASS & G2P-SCAN | High protein & pathway conservation across vertebrates and some invertebrates. | Provided evidence to expand tDOA for GABA-gated chloride channel AOPs. |
AOP: Activation of the nicotinic acetylcholine receptor (nAChR) leading to colony death/failure in honey bees (Apis mellifera) [6].

Challenge: The AOP was developed for honey bees, but regulatory protection is needed for thousands of other bee species.

tDOA Assessment: SeqAPASS analysis was performed on nine proteins in the pathway, from nAChR subunits to neuronal proteins. Results demonstrated high structural conservation of the MIE (nAChR) across Apis and non-Apis bees, providing strong evidence to expand the plausible tDOA to other bee genera like Bombus (bumble bees) [6]. This computational evidence can guide targeted empirical testing on key species.
KER: Decreased all-trans retinoic acid (atRA) levels in developing ovaries leads to disrupted meiotic entry of oogonia [7].

Context: This KER is part of a potential AOP for reduced female fertility. Its utility depends on the conservation of atRA's role in meiosis across mammals.

tDOA Rationale: The KER description explicitly reviews comparative biological evidence, showing the role of atRA in initiating meiosis is conserved across studied mammalian species [7]. This functional conservation, rooted in developmental biology, defines the KER's (and future AOP's) tDOA as "mammals," providing clear boundaries for extrapolation from rodent models to human health risk assessment.
Diagram 3: Logic Flow for KER Taxonomic Conservation
Table 3: Research Reagent Solutions for tDOA and KER Conservation Studies
| Tool / Resource | Type | Primary Function in tDOA Research | Access / Example |
|---|---|---|---|
| SeqAPASS | Bioinformatics Tool | Evaluates structural conservation of protein targets across species via three-tiered sequence analysis. Provides direct line of evidence for tDOA. | https://seqapass.epa.gov/seqapass/ [6] |
| G2P-SCAN | Bioinformatics Tool | Maps gene sets to biological pathways and evaluates pathway conservation across a defined set of model species. | Available from Unilever; complementary to SeqAPASS [9]. |
| AOP-Wiki | Knowledge Base | Central repository for AOPs and KERs. Platform for publishing and viewing defined tDOA based on assembled evidence. | https://aopwiki.org/ [6] |
| Comparative Toxicology Databases | Data Resource | Provide empirical toxicity data across species (e.g., ECOTOX, PubChem). Essential for anchoring/validating computational tDOA predictions. | US EPA ECOTOXicology Knowledgebase |
| Ortholog Prediction Databases | Data Resource | Provide pre-computed ortholog groups (e.g., OrthoDB, Ensembl Compara). Useful for rapid initial assessment of gene conservation. | https://www.orthodb.org/ |
| Reactome / KEGG | Pathway Database | Provide curated biological pathways. Used to identify the broader context of a molecular target for pathway-level conservation analysis. | https://reactome.org/ [9] |
The taxonomic domain of applicability is a critical, evidence-driven component that determines the real-world utility of predictive toxicology frameworks. By anchoring tDOA definitions in the systematic assessment of KER conservation—through integrated bioinformatics tools like SeqAPASS and G2P-SCAN, and empirical data—scientists can transform AOPs from descriptive models into reliable extrapolation tools. This rigorous approach directly addresses core challenges in ecological risk assessment and translational drug safety, ensuring protective measures and predictions are grounded in evolutionary biology. Future advancements in comparative 'omics and systems biology will further refine tDOA precision, solidifying its role as the bedrock of credible cross-species prediction.
Abstract: Defining the taxonomic domain of applicability (tDOA) is critical for the confident use of Adverse Outcome Pathways (AOPs) in regulatory decision-making, particularly for the protection of untested species. This technical guide establishes structural and functional conservation as the two foundational pillars for extrapolating Key Event Relationships (KERs) across species. We present integrated bioinformatics and empirical methodologies to evaluate these pillars, supported by case studies on neurotoxicants in pollinators and disrupted retinoid signaling in mammalian fertility. The proposed framework enables the systematic expansion of AOP applicability, transforming tDOA from a static assumption into a dynamic, evidence-driven construct essential for predictive toxicology and chemical safety assessment.
The Adverse Outcome Pathway (AOP) framework organizes mechanistic knowledge into a causal sequence linking a Molecular Initiating Event (MIE) to an Adverse Outcome (AO) via intermediate Key Events (KEs) and Key Event Relationships (KERs) [10]. While AOPs are developed based on data from specific test species, their ultimate utility in ecological and human health risk assessment hinges on reliable extrapolation to broader taxa. The taxonomic domain of applicability (tDOA) defines the species for which a given AOP is considered valid [6]. Historically, tDOA has often been narrowly or ambiguously defined, limiting confidence in cross-species predictions [6].
This gap underscores the need for a rigorous, evidence-based approach. As articulated in OECD guidance, evaluating structural conservation (the presence and similarity of a biological entity) and functional conservation (the preservation of its biological role) forms the scientific basis for extrapolating KEs and KERs [6]. This guide posits that a plausible tDOA is built upon these dual pillars. By leveraging publicly accessible bioinformatics tools to assess structural conservation and designing targeted in vitro and in vivo assays to confirm functional conservation, researchers can systematically expand and defend the tDOA of critical AOPs [6] [11]. This approach is fundamental to advancing the AOP framework from a descriptive exercise to a predictive, regulatory-ready tool.
Within an AOP, a Key Event Relationship is a scientifically supported, causal link between an upstream and a downstream Key Event [10]. It is the KER that enables predictive inference: the state of a downstream KE can be inferred from the measured state of an upstream KE. Recent proposals argue that KERs, which encapsulate these causal hypotheses, should be recognized as the core modular building blocks of the AOP knowledge base [11]. This modularity is essential for the taxonomic extrapolation of AOPs. Establishing the tDOA for an entire AOP first requires establishing the tDOA for its constituent KERs, which in turn depends on the conservation of the KEs they connect.
The OECD identifies two primary considerations for defining the tDOA of a KE: structural conservation and functional conservation [6].
These pillars are hierarchical and interdependent. Structural conservation is a prerequisite for—but does not guarantee—functional conservation. Functional conservation provides the definitive evidence for plausibility but is more resource-intensive to establish. Therefore, a robust assessment begins with broad screening for structural conservation to prioritize candidate taxa for focused empirical testing of functional conservation.
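The screen-then-test logic described above can be sketched as a small triage function. This is an illustration under assumptions: the function name, the category labels, and the example evidence values are hypothetical and are not results from any cited study.

```python
def triage_taxa(structural, functional):
    """Assign each taxon a status from the two pillars of evidence.

    structural: dict taxon -> bool (structural conservation screen passed?)
    functional: dict taxon -> bool (functional conservation confirmed?);
                taxa without empirical data are simply absent from this dict.
    """
    status = {}
    for taxon, structure_ok in structural.items():
        if not structure_ok:
            # Structural conservation is a prerequisite, so stop here.
            status[taxon] = "not supported: structure not conserved"
        elif taxon not in functional:
            status[taxon] = "candidate: prioritize for functional testing"
        elif functional[taxon]:
            status[taxon] = "supported: structure and function conserved"
        else:
            # Structure alone does not guarantee function.
            status[taxon] = "not supported: function not conserved"
    return status


# Hypothetical screening results for illustration only.
structural_screen = {"Bombus": True, "Drosophila": True, "Daphnia": False}
functional_tests = {"Bombus": True}
for taxon, label in triage_taxa(structural_screen, functional_tests).items():
    print(f"{taxon}: {label}")
```

The design mirrors the hierarchy in the text: a failed structural screen ends the assessment for that taxon, while a passed screen without empirical data marks the taxon as a priority for functional testing rather than as part of the tDOA.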
The AOP-Wiki serves as the central repository for AOP knowledge [10]. The OECD AOP Developer's Handbook provides a standardized template for AOP development, emphasizing the need for clear documentation of the evidence supporting KERs and their taxonomic applicability [10]. Incorporating lines of evidence for structural and functional conservation into the AOP-Wiki is essential for transparently defining and expanding the tDOA [6].
A tiered strategy that integrates computational bioinformatics with empirical validation provides the most efficient and defensible pathway to establish plausible tDOA.
Primary Tool: Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS)

The SeqAPASS tool is a publicly accessible web-based platform designed to evaluate cross-species protein conservation through a hierarchical, three-level analysis [6].
Table 1: Hierarchical Analysis Levels of the SeqAPASS Tool [6]
| Level | Analysis Focus | Interpretation & Output |
|---|---|---|
| Level 1 | Primary amino acid sequence similarity. | Identifies putative orthologs across species. Provides a percent identity score and generates a taxonomic tree visualizing similarity. |
| Level 2 | Conservation of known functional domains (e.g., ligand-binding domains, catalytic sites). | Determines if the core functional regions of the protein are present. Output indicates domain preservation or loss. |
| Level 3 | Conservation of specific amino acid residues critical for function (e.g., chemical binding, protein-protein interaction). | Assesses if residues known to be essential for the MIE or KE are identical. Highest specificity for predicting susceptibility. |
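To give a feel for the kind of metric behind a Level 1 comparison, the sketch below computes percent identity between two sequences that have already been aligned. It is a simplified illustration, not a description of how SeqAPASS calculates its scores; the function name, the choice of denominator (all columns where at least one sequence has a residue), and the short example sequences are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored; a residue paired
    with a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Made-up peptide fragments, aligned by hand for illustration.
query = "MKTAYIAK"
hit = "MKTA-IAR"
print(percent_identity(query, hit))  # 6 matches over 8 columns -> 75.0
```

Percent identity has several definitions that differ in how gaps and alignment length are counted, so values are only comparable when the same convention is used throughout an analysis.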
Protocol 1: Conducting a SeqAPASS Analysis for tDOA
Bioinformatics provides a plausible hypothesis of conservation, which must be tested empirically. Functional assays confirm that the conserved structure leads to conserved function within the biological pathway.
Protocol 2: In Vitro Assay for Functional Conservation of an MIE (e.g., Receptor Activation)
Protocol 3: In Vivo or Ex Vivo Assay for Functional Conservation of a KER
Background: An AOP network linking the activation of the nicotinic acetylcholine receptor (nAChR) to colony death/failure was developed for the honey bee (Apis mellifera) [6]. The tDOA for non-Apis bees (e.g., bumblebees, solitary bees) was uncertain.
Application of the Dual-Pillar Framework:
Background: A KER linking decreased all-trans retinoic acid (atRA) levels in the fetal ovary to disrupted meiotic entry of oogonia is a component of a proposed AOP for reduced fertility [11]. The initial evidence was primarily from mouse models.
Application of the Dual-Pillar Framework:
Table 2: Summary of Case Study Evidence for tDOA
| Case Study | AOP/KER Focus | Source Species | Structural Evidence | Functional Evidence | Plausible tDOA Conclusion |
|---|---|---|---|---|---|
| Bee Neurotoxicity [6] | AOP 89: nAChR activation -> colony failure | Apis mellifera (Honey bee) | High SeqAPASS scores for nAChR & other proteins across bee genera. | Similar neonicotinoid toxicity profiles in multiple bee species. | Expanded to include other bees (e.g., Bombus). |
| Mammalian Fertility [11] | KER 2477: ↓ atRA -> disrupted meiosis | Mouse (Mus musculus) | High sequence conservation of ALDH1A & STRA8 across mammals. | Conserved dose-response & essentiality in rats & human tissue data. | Mammals. |
Table 3: Key Research Reagents and Tools for tDOA Assessment
| Item / Solution | Function in tDOA Assessment | Example/Provider |
|---|---|---|
| SeqAPASS Tool | Publicly accessible bioinformatics platform for Tier 1 structural conservation analysis across three hierarchical levels. | U.S. EPA SeqAPASS (https://seqapass.epa.gov/seqapass/) |
| UniProt / NCBI Protein Databases | Source of reliable reference protein sequences and functional annotations for query setup in SeqAPASS. | UniProtKB (https://www.uniprot.org/), NCBI Protein (https://www.ncbi.nlm.nih.gov/protein) |
| Recombinant Protein Expression Systems | Enables production of target proteins from source and test species for in vitro functional assays (Tier 2). | Baculovirus (insect cells), HEK293 (mammalian cells), or cell-free systems. |
| Functional Assay Kits | Provide standardized, optimized methods to measure protein activity (e.g., receptor activation, enzyme inhibition). | Calcium flux assays (FLIPR), cAMP detection kits, luciferase reporter assays. |
| Target-Specific Reference Chemicals | Well-characterized agonists/antagonists used as positive controls to validate functional assays across species. | e.g., Acetylcholine for nAChR, all-trans Retinoic Acid for retinoid receptors. |
| Custom RNA/DNA Probes/Primers | For measuring species-specific gene expression changes as molecular KEs in in vivo or ex vivo studies. | Designed from target species' sequenced genomes. |
| AOP-Wiki (aopwiki.org) | The central knowledge base for publishing AOPs, KEs, and KERs, including documented evidence for tDOA. | Managed by the OECD. |
The establishment of a plausible taxonomic domain of applicability is a non-negotiable requirement for the credible use of AOPs in protecting human health and the environment. This guide demonstrates that a rigorous, tiered framework—grounded in the dual assessment of structural conservation via bioinformatics and functional conservation via empirical testing—provides a systematic and defensible pathway to achieve this goal.
Future advancements will depend on tighter integration between computational predictions and high-throughput empirical screening. The expansion of high-quality genomic and proteomic databases will enhance the resolution of tools like SeqAPASS. Concurrently, the development of standardized, cross-species in vitro assays (e.g., using conserved cell lines or tissue models) will improve the efficiency of functional conservation testing. By embracing this integrated approach, the toxicology community can transform tDOA from a statement of assumption into a dynamic, evidence-based conclusion, significantly strengthening the predictive power and regulatory applicability of the AOP framework.
The paradigm of chemical risk assessment is undergoing a fundamental transformation, driven by the regulatory and ethical imperative to reduce reliance on animal testing and to develop faster, more mechanistic safety evaluations. This shift is encapsulated in the development of Next Generation Risk Assessment (NGRA), defined as an exposure-led, hypothesis-driven approach that integrates in silico, in chemico, and in vitro New Approach Methodologies (NAMs) [12]. A critical challenge within this framework is ensuring the human and ecological relevance of NAM-based predictions. This necessitates a rigorous understanding of the Taxonomic Domain of Applicability (tDOA) for the Adverse Outcome Pathways (AOPs) and their constituent Key Event Relationships (KERs) that form the backbone of mechanistic risk assessment. A defined tDOA specifies the range of species for which a KER is biologically plausible, based on the conservation of molecular targets and pathways. The demand for its explicit definition is a direct regulatory driver, essential for justifying the use of NAM data in safety decisions, enabling credible cross-species extrapolation, and fulfilling the core NGRA principles of being relevant to humans and preventing harm [12]. This whitepaper provides a technical guide to defining tDOA through the lens of KER taxonomic conservation research, detailing the experimental and computational protocols that underpin this emerging standard in regulatory science.
The traditional risk assessment paradigm, heavily dependent on apical endpoint data from animal studies, is increasingly viewed as resource-intensive, low-throughput, and limited in its mechanistic insight. In response, regulatory bodies worldwide are promoting NGRA. The U.S. EPA's NexGen program, initiated over a decade ago, exemplifies a long-standing effort to incorporate advances in molecular and systems biology into risk assessment [13]. The modern consensus defines NGRA as a tailored, iterative, and tiered process that moves away from prescribed animal tests toward a hypothesis-driven integration of diverse data sources [14] [12].
Central to the NGRA paradigm is the Adverse Outcome Pathway (AOP) framework. An AOP is a structured, linear representation of a toxicological mechanism, linking a Molecular Initiating Event (MIE) through a series of measurable Key Events (KEs) to an Adverse Outcome (AO) at the organism or population level. The causal linkages between two adjacent KEs are termed Key Event Relationships (KERs). KERs are recognized as the fundamental building blocks of toxicological knowledge within the AOP knowledge base [7]. Their quality, supported by empirical evidence and mechanistic understanding, determines the predictive utility and regulatory acceptance of an AOP.
The modularity of the AOP framework is its strength, allowing for the assembly of pathways and networks based on conserved biological processes. However, a pathway described in a model organism (e.g., Caenorhabditis elegans) is only relevant to human or ecological risk assessment if the underlying KERs are operative across the species of concern. This is where tDOA becomes critical. For a given KER, the tDOA defines the set of taxa for which there is established or inferred biological plausibility that the relationship holds, based on the conservation of the proteins, signaling pathways, and cellular functions involved.
The tDOA is not a single data point but a conclusion derived from a weight-of-evidence analysis. Its definition rests on two pillars: 1) empirical evidence from experiments in multiple species, and 2) in silico inference based on evolutionary conservation. Regulatory demand for a "defined" tDOA means moving from vague statements (e.g., "likely applicable to vertebrates") to a well-justified, evidence-based taxonomic scope.
Table 1: Core Concepts in tDOA Definition for KERs
| Concept | Definition | Role in NGRA | Source of Evidence |
|---|---|---|---|
| Molecular Initiating Event (MIE) | The initial interaction between a stressor and a biomolecule within an organism. | Starting point for mechanistic prediction; high conservation increases tDOA breadth. | In vitro binding/activity assays, structural biology. |
| Key Event Relationship (KER) | A scientifically supported causal or associative link between two Key Events. | Core unit of predictive knowledge; the primary entity for tDOA assessment. | Empirical dose-response/temporal data, mechanistic studies. |
| Taxonomic Domain of Applicability (tDOA) | The range of taxa for which a KER is considered biologically plausible. | Justifies extrapolation of NAM data from test systems to target species (human/wildlife). | Cross-species empirical data, in silico sequence/pathway conservation analysis. |
| Empirical tDOA | tDOA based on direct experimental observation of the KER in listed species. | Provides highest confidence but is limited by the scope of tested species. | Published in vivo or in vitro studies across multiple taxa. |
| Inferred tDOA | tDOA extrapolated using computational tools analyzing evolutionary conservation. | Enables expansion of tDOA beyond empirically tested species; essential for broad screening. | SeqAPASS, G2P-SCAN, phylogenetic analysis. |
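The distinction between empirical and inferred tDOA in Table 1 can be sketched as a small data structure. This is a minimal illustration, not part of any cited tool: the class name `KerTdoa`, the example KER, and the species assignments are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KerTdoa:
    """Taxonomic domain of applicability for one KER (illustrative structure)."""
    ker_name: str
    empirical: set = field(default_factory=set)  # species with direct experimental support
    inferred: set = field(default_factory=set)   # species supported only by in silico conservation

    def species(self):
        """All species in the tDOA, regardless of evidence type."""
        return self.empirical | self.inferred

    def evidence_for(self, species):
        """Return the strongest evidence type available for a species."""
        if species in self.empirical:
            return "empirical"
        if species in self.inferred:
            return "inferred"
        return "none"


# Hypothetical KER and species assignments, for illustration only.
tdoa = KerTdoa(
    ker_name="hypothetical KER: oxidative stress -> reduced reproduction",
    empirical={"Caenorhabditis elegans", "Daphnia magna"},
    inferred={"Danio rerio", "Daphnia magna"},
)
```

Keeping the two evidence types in separate sets means a species observed experimentally is always reported as "empirical", even when a computational tool also flags it, which mirrors the higher confidence Table 1 assigns to empirical tDOA.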
The regulatory driver for defined tDOA is clear: without it, the use of an AOP for decision-making lacks a defined boundary of relevance, introducing unacceptable uncertainty. For instance, an AOP for reproductive toxicity developed in nematodes must have its tDOA rigorously defined to assess its utility for predicting risk to mammals or fish [15].
Defining tDOA is a multi-step process that integrates data curation, empirical analysis, and computational prediction. The following protocols, drawn from a seminal case study on extending the tDOA for an AOP involving silver nanoparticle (AgNP)-induced reproductive toxicity, provide a replicable blueprint [15].
Objective: To assemble empirical evidence from multiple species into a unified AOP network and quantitatively assess the confidence in each KER.
Workflow:
Table 2: Example Dataset for Cross-Species AOP Network Construction [15]
| Ecological Compartment | Initial Empirically Tested Species | Number of Studies Integrated | Key MIE/KEs Mapped |
|---|---|---|---|
| Terrestrial | Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens (in vitro) | 17 | ROS generation, MAPK activation, oxidative damage, reproductive output. |
| Aquatic | Chironomus riparius, Daphnia magna, Oryzias latipes | 8 | Oxidative stress, genotoxicity, growth inhibition, mortality. |
Objective: To extrapolate the tDOA of KERs beyond empirically tested species using computational tools that assess the conservation of molecular targets and pathways.
Workflow:
Table 3: In Silico Tools for tDOA Extension
| Tool | Primary Function | Input | Output for tDOA | Regulatory Relevance |
|---|---|---|---|---|
| SeqAPASS | Protein sequence/functional domain conservation analysis. | Amino acid sequence or PFAM domain. | List of species with conserved molecular target; informs on MIE/KE applicability. | Justifies extrapolation of molecular interactions. Supported by US EPA. |
| G2P-SCAN | Biological pathway and gene set conservation analysis. | Set of human genes (Entrez IDs) representing a pathway. | Assessment of pathway completeness across >400 species. | Confirms functional biological context for KERs, strengthening tDOA. |
| Bayesian Network Modeling | Probabilistic quantification of KER strength and uncertainty. | Empirical dose-response and temporal data for KEs. | Conditional probability tables for KERs; validates network logic. | Provides transparent, quantitative confidence metrics for use in weight-of-evidence assessments. |
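To make the "conditional probability tables for KERs" entry in Table 3 concrete, the sketch below estimates P(downstream KE active | upstream KE state) from paired binary observations. It is a simplified stand-in for a full Bayesian network package: the function name `ker_cpt`, the Laplace pseudocount, and the observation counts are assumptions made for illustration, not values from the cited case study.

```python
from collections import Counter


def ker_cpt(observations, pseudocount=1.0):
    """Estimate P(downstream KE active | upstream KE state) from paired observations.

    observations: iterable of (upstream_active, downstream_active) booleans.
    pseudocount: Laplace smoothing added to each cell (an assumption of this sketch).
    """
    counts = Counter(observations)
    cpt = {}
    for up in (False, True):
        active = counts[(up, True)] + pseudocount
        inactive = counts[(up, False)] + pseudocount
        cpt[up] = active / (active + inactive)
    return cpt


# Hypothetical observations: (upstream KE active, downstream KE active)
data = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 1
    + [(False, False)] * 9
)
cpt = ker_cpt(data)
```

With these made-up counts, `cpt[True]` is 0.75 and `cpt[False]` is about 0.17; a large gap between the two entries is the kind of quantitative signal that would support a KER in a weight-of-evidence assessment.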
The following diagrams, created using Graphviz DOT language, illustrate the core workflows and conceptual relationships described in this guide.
Diagram 1: Workflow for Defining KER tDOA in NGRA
Diagram 2: Structure of a Quantitative AOP Network with KERs
Diagram 3: Extending tDOA via In Silico Conservation Analysis
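As a rough illustration of how such diagrams can be produced, the sketch below emits Graphviz DOT text for a generic AOP chain. The helper `aop_to_dot` and the generic node names are hypothetical and do not reproduce the diagrams listed above.

```python
def aop_to_dot(name, events, kers):
    """Render an AOP as Graphviz DOT text.

    events: dict mapping node id -> display label.
    kers: list of (upstream_id, downstream_id) pairs, one per KER.
    """
    lines = [f'digraph "{name}" {{', "  rankdir=LR;", "  node [shape=box];"]
    for node_id, label in events.items():
        lines.append(f'  {node_id} [label="{label}"];')
    for upstream, downstream in kers:
        lines.append(f"  {upstream} -> {downstream};")
    lines.append("}")
    return "\n".join(lines)


# Generic placeholder pathway; real AOPs would use curated KE names.
events = {
    "MIE": "Molecular Initiating Event",
    "KE1": "Key Event 1",
    "AO": "Adverse Outcome",
}
kers = [("MIE", "KE1"), ("KE1", "AO")]
dot_text = aop_to_dot("generic_aop", events, kers)
```

The resulting text can be saved to a `.dot` file and rendered with the Graphviz `dot` command-line tool.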
Table 4: Research Toolkit for KER and tDOA Investigation
| Category | Item/Resource | Function in tDOA Research | Example/Supplier |
|---|---|---|---|
| Data & Knowledge Bases | AOP-Wiki (aopwiki.org) | Central repository for curated AOPs, KEs, and KERs; provides standardized ontology for mapping data. | OECD-hosted database. |
| Data & Knowledge Bases | US EPA SeqAPASS Tool | Web-based platform for performing protein sequence conservation analysis to predict molecular target applicability. | https://seqapass.epa.gov/seqapass/ |
| Data & Knowledge Bases | G2P-SCAN R Package | Tool for assessing conservation of human biological pathways across a wide range of species. | Available on GitHub. |
| Software & Modeling | Bayesian Network Software (e.g., Netica, AgenaRisk, R packages bnlearn, gRain) | Platform for constructing, parameterizing, and performing probabilistic inference on quantitative KER/AOP network models. | Commercial and open-source options. |
| Software & Modeling | Molecular Visualization & Alignment Software (e.g., PyMOL, Clustal Omega) | Visualizes protein structures and performs multiple sequence alignments to support SeqAPASS analysis and threshold determination. | |
| Experimental Models | Phylogenetically Diverse Model Organisms | Provide empirical data for KER validation across taxa (e.g., C. elegans, D. rerio, X. laevis). | Strain centers and commercial suppliers. |
| Experimental Models | Recombinant Proteins & Cell Lines | Express conserved molecular targets from human and non-human species for in vitro comparative assays to validate conservation predictions. | Commercial cDNA clones, ATCC cell lines. |
| Reference Materials | Chemical Stressors with Known MIEs (e.g., reference agonists/antagonists) | Positive controls for establishing KERs in novel test systems (e.g., a specific kinase inhibitor). | Sigma-Aldrich, Tocris. |
| Reference Materials | Conserved Pathway Antibodies | Immunodetection tools that cross-react with orthologous proteins in multiple species, enabling comparative KE measurement. | Commercial antibody suppliers with cross-reactivity data. |
The regulatory demand for defined tDOA is not an abstract scientific ideal but a practical necessity for the implementation of NGRA. It provides the scientific boundary conditions for applying mechanistic, NAM-derived data to protect human health and the environment. As demonstrated, defining tDOA is a rigorous process that moves from curated empirical evidence to quantitative KER modeling and finally to computational inference of conservation. The integration of tools like SeqAPASS and G2P-SCAN with probabilistic AOP networks represents the cutting edge of this field, enabling scientists to confidently extrapolate mechanistic toxicological knowledge across the tree of life [15]. For researchers and drug development professionals, mastering these protocols is essential for building NGRA packages that meet the evolving standards of regulatory agencies, ultimately supporting the transition to a more predictive, efficient, and animal-free safety assessment paradigm.
The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool is a computational framework developed by the U.S. Environmental Protection Agency (EPA) to address a central challenge in toxicology and chemical safety: predicting chemical susceptibility across thousands of species for which empirical toxicity data are absent [16] [17]. In the context of research on Key Event Relationships (KER) within the Adverse Outcome Pathway (AOP) framework, SeqAPASS provides a critical methodology for assessing the taxonomic conservation of Molecular Initiating Events (MIEs). The tool operates on the principle that a species' intrinsic susceptibility to a chemical is largely determined by the conservation of the protein targets with which that chemical interacts [16] [18].
By leveraging publicly available protein databases, SeqAPASS allows researchers to extrapolate known chemical-protein interactions from well-studied model organisms (e.g., humans, rats, zebrafish) to non-target species, including plants, wildlife, and endangered species [16]. This capability is increasingly vital in a regulatory landscape moving towards New Approach Methodologies (NAMs) that reduce reliance on whole-animal testing, while simultaneously demanding broader ecological risk assessments [17] [19]. The tool's hierarchical design, which progresses from primary sequence to three-dimensional structural analysis, offers a tiered weight-of-evidence approach for evaluating protein conservation, making it a powerful asset for defining the domain of applicability for KERs across taxa and strengthening the scientific basis for cross-species extrapolation in modern toxicology [18] [19].
SeqAPASS is structured around a multi-level analytical hierarchy, each level providing an increasingly refined line of evidence regarding protein conservation and predicted chemical susceptibility [20]. This design allows users to tailor the analysis based on the depth of available knowledge about the chemical-protein interaction of interest.
Level 1, the foundational level, performs a whole-protein sequence alignment using the protein Basic Local Alignment Search Tool (BLASTp) algorithm. It compares the primary amino acid sequence of a query protein from a known sensitive species against sequences from all species within the National Center for Biotechnology Information (NCBI) protein database [17]. The result is a broad prediction of potential susceptibility across species, based on overall sequence similarity. This level is most useful for initial screening when detailed knowledge of the protein's functional domains or chemical interaction residues is limited [20].
Level 2 refines the analysis by focusing on conserved functional domains. Using tools like COBALT for multiple sequence alignment, this level evaluates whether the specific domains responsible for a protein's function (e.g., a ligand-binding domain, an active site) are preserved in other species [17]. Conservation of these domains suggests the protein's core function is retained, providing stronger evidence for a conserved chemical interaction than whole-sequence similarity alone.
Level 3, the most sequence-specific level, evaluates the conservation of individual amino acid residues known to be essential for the chemical-protein interaction [17] [20]. Users input the specific residue positions (e.g., from a solved crystal structure or site-directed mutagenesis studies), and SeqAPASS checks whether those residues are preserved across species. A match at these precise locations offers high-confidence evidence that the specific molecular interaction is conserved, even if other parts of the protein sequence vary.
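The residue-level comparison can be sketched in a few lines. This is not SeqAPASS code: the function `critical_residues_conserved`, the aligned fragments, and the chosen positions are hypothetical, and requiring exact residue identity is a simplification assumed here.

```python
def critical_residues_conserved(reference, target, positions):
    """Check whether critical residues match between two aligned sequences.

    reference, target: aligned sequences of equal length ('-' marks a gap).
    positions: 1-based residue positions in the ungapped reference sequence.
    Returns a dict mapping position -> (reference residue, target residue, match flag).
    """
    if len(reference) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    wanted = set(positions)
    result = {}
    ref_pos = 0
    for ref_aa, tgt_aa in zip(reference, target):
        if ref_aa == "-":
            continue  # a gap in the reference does not advance its numbering
        ref_pos += 1
        if ref_pos in wanted:
            result[ref_pos] = (ref_aa, tgt_aa, ref_aa == tgt_aa)
    return result


# Hypothetical aligned fragments; positions 2 and 4 are the assumed critical residues.
ref = "MK-LTSA"
tgt = "MKGLASA"
report = critical_residues_conserved(ref, tgt, [2, 4])
all_conserved = all(match for _, _, match in report.values())
```

In this toy case position 2 is conserved (K in both) while position 4 is not (T versus A), so `all_conserved` is `False` and the target species would not be flagged as sharing the interaction at this level.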
Introduced in SeqAPASS v7.0, Level 4 is an advanced feature that generates protein structural models for cross-species comparison [21]. It employs the Iterative Threading ASSEmbly Refinement (I-TASSER) tool to create 3D structural predictions from amino acid sequences [19]. Users can then align these predicted structures to a reference structure to assess structural conservation. This level provides a direct, biophysical line of evidence and yields models suitable for downstream applications like molecular docking or dynamics simulations [22] [19].
The following diagram illustrates this hierarchical workflow and its role in supporting KER-based research.
The standard workflow for a SeqAPASS analysis, as detailed in peer-reviewed protocols, involves the following steps [17]:
For advanced users, the Level 4 workflow extends SeqAPASS into structural bioinformatics [22] [19]:
The diagram below integrates this structural workflow with the core SeqAPASS hierarchy.
SeqAPASS is explicitly designed to evaluate the conservation of Molecular Initiating Events (MIEs), the first Key Event of an AOP and the upstream anchor of its first KER, across species, thereby informing the taxonomic applicability of the AOP [18]. For instance, if an MIE is defined as "Chemical X binding to Androgen Receptor (AR) leading to antagonism," SeqAPASS can predict which vertebrate species possess a conserved AR ligand-binding domain, suggesting they are potentially susceptible to this MIE.
This application is demonstrated in several published case studies:
Since its initial release in 2016, SeqAPASS has undergone significant feature enhancements, driven by user feedback and advances in bioinformatics [17]. The following table summarizes its version history and key developments.
Table 1: Evolution of the SeqAPASS Tool and Its Capabilities [17]
| Version | Release Date | Key Features and Updates |
|---|---|---|
| 1.0 | Jan 2016 | Initial public release with Levels 1 & 2 (primary sequence and domain comparison). |
| 2.0 | May 2017 | Introduction of Level 3 for critical amino acid residue comparison. |
| 3.0 | Mar 2018 | Added interactive data visualization capabilities for Levels 1 & 2. |
| 4.0 | Oct 2019 | Enhanced interoperability: links to ECOTOX Knowledgebase, AOP-Wiki, and summary reports. |
| 5.0 | Dec 2020 | Introduced customizable heatmaps for Level 3 and a downloadable Decision Summary Report. |
| 6.0 | Sep 2021 | Added widget to directly query the ECOTOX Knowledgebase from SeqAPASS results. |
| 7.0 | Sep 2023 | Introduced Level 4 for protein structural evaluation using I-TASSER [21]. |
| 8.0 | 2025 | Current version; allows submission of sequences to generate protein structures across species [16]. |
The tool's performance is benchmarked by its ability to efficiently process massive datasets. A single Level 1 query compares the query sequence against the entire NCBI protein database, facilitating predictions for thousands of species within a short timeframe [16]. Quantitative outputs from case studies demonstrate its predictive scale. For example, in the PFOA-TTR study, SeqAPASS predicted 952, 976, and 750 species as susceptible at Levels 1, 2, and 3, respectively [22]. The integration of structural modeling (Level 4) and MD simulations provided quantitative binding metrics (e.g., binding free energy, root-mean-square deviation) that confirmed the interaction's conservation across a subset of these species, validating the sequence-based predictions with biophysical data [22].
Table 2: Example Quantitative Output from a SeqAPASS Case Study (PFOA-TTR Interaction) [22]
| Analysis Level | Number of Species Predicted as Susceptible | Key Output Metrics |
|---|---|---|
| Level 1 | 952 | Primary sequence similarity threshold met. |
| Level 2 | 976 | Functional TTR domains (binding pockets) conserved. |
| Level 3 | 750 | Critical lysine residue (K15) for PFOA binding conserved. |
| Level 4 (MD Simulation) | Subset of above | Quantitative confirmation: No significant difference in predicted binding affinity (ΔG) or interaction stability across tested vertebrate species. |
Successful application of the SeqAPASS framework, especially in advanced workflows, relies on a suite of integrated databases and computational tools.
Table 3: Key Research Reagent Solutions for SeqAPASS-Based Analysis
| Item / Resource | Primary Function in SeqAPASS Workflow | Source / Availability |
|---|---|---|
| NCBI Protein Database | The primary source repository for protein sequence data against which all SeqAPASS queries are compared. Contains over 153 million sequences [16]. | Publicly available from the National Library of Medicine. |
| BLASTp Algorithm | Executes the primary amino acid sequence alignments for Level 1 analysis, identifying homologous sequences across species [17]. | Integrated into the SeqAPASS backend. |
| Conserved Domain Database (CDD) | Provides curated information on protein functional domains used for Level 2 comparative analysis [17]. | Integrated into SeqAPASS from NCBI. |
| I-TASSER (Iterative Threading ASSEmbly Refinement) | The primary engine for threading-based protein structure prediction from amino acid sequences in Level 4 analysis [21] [19]. | Publicly available tool. |
| AlphaFold Protein Structure Database | A source of highly accurate, pre-computed protein structure predictions that can be imported into SeqAPASS for Level 4 structural comparisons [19]. | Publicly available from DeepMind/EMBL-EBI. |
| Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) | The global archive for experimentally determined 3D structures of proteins. Provides reference structures for alignment, docking, and simulation studies [21]. | Publicly available. |
| Molecular Docking Software (e.g., AutoDock, GOLD) | Used downstream of SeqAPASS to predict the binding orientation and affinity of a chemical to protein models generated from diverse species [22] [19]. | Various commercial and open-source packages. |
| Molecular Dynamics (MD) Simulation Software (e.g., GROMACS, AMBER) | Used to run physics-based simulations that quantify the stability and dynamics of chemical-protein complexes across species, providing free energy calculations and residue interaction profiles [22]. | Various commercial and open-source packages. |
The SeqAPASS tool represents a mature and critically important hierarchical framework for bridging comparative genomics and predictive toxicology. By providing a systematic, publicly accessible method to evaluate protein conservation from sequence to structure, it directly addresses the core challenge of taxonomic applicability in KER and AOP research. Its integration with molecular modeling and simulation techniques marks the frontier of next-generation risk assessment, moving beyond qualitative predictions to generate quantitative, biophysical lines of evidence for cross-species extrapolation [22] [19].
Future development of SeqAPASS will likely focus on deeper automation of the advanced workflow, potentially embedding simplified docking or simulation modules within the web interface. Furthermore, as databases like AlphaFold continue to expand the universe of available protein structures, the accuracy and ease of Level 4 structural comparisons will improve dramatically. For researchers and drug development professionals, SeqAPASS offers a powerful, validated platform to efficiently assess potential chemical risks across the tree of life, prioritize testing, and ultimately build more credible and defensible safety assessments for both human health and ecological systems.
G2P-SCAN (Genes-to-Pathways Species Conservation Analysis) is a computational pipeline designed to assess the conservation of human biological pathways across multiple species by integrating data on gene orthologs, protein families, and pathway entities [23]. This tool addresses a critical need in modern toxicology and safety assessment: the ability to extrapolate biological effects and chemical susceptibility across species with confidence [23] [24]. Its development is a direct response to the global regulatory shift towards New Approach Methodologies (NAMs) that reduce reliance on animal testing [23].
The operational context for G2P-SCAN is firmly embedded within the Adverse Outcome Pathway (AOP) framework, which structures toxicological knowledge into causal sequences from a Molecular Initiating Event (MIE) to an Adverse Outcome (AO) [10]. The core, inferential units of an AOP are Key Event Relationships (KERs), which describe the biologically plausible and empirically supported causal link between an upstream and a downstream Key Event (KE) [11] [10]. A fundamental challenge in AOP development and application is defining its Taxonomic Domain of Applicability (tDOA)—the range of species for which the described causal pathway is biologically plausible [10] [24].
G2P-SCAN directly informs KER taxonomic conservation research by providing a systematic, evidence-based method to evaluate whether the molecular and cellular processes underpinning a KER are conserved in non-human species. For example, a KER linking decreased retinoic acid levels to disrupted meiosis in oocytes relies on a conserved signaling pathway [11]. G2P-SCAN can analyze the core genes in this pathway (e.g., ALDH1A1, STRA8) to objectively assess its conservation in model organisms like mouse or zebrafish, thereby strengthening or limiting the tDOA claim for that KER [11] [24]. This moves beyond simple sequence similarity of individual proteins to a functional, pathway-level assessment of conservation, which is more relevant for predicting the propagation of a toxicological perturbation along an AOP [23] [25].
The G2P-SCAN pipeline is implemented as an R package and functions as a structured workflow that queries, synthesizes, and analyzes data from multiple established biological databases [23] [26].
The primary analytical workflow of G2P-SCAN consists of four main stages: Pathway Mapping, Orthology Identification, Functional Analysis, and Data Synthesis [26].
The pipeline begins by mapping the user-provided human gene symbols (e.g., ESR1, PPARA) to biological pathways using the Reactome knowledgebase via the InterMineR API [26]. It retrieves all human pathways containing the input genes. Users can specify the desired level of the Reactome hierarchy for analysis: "terminal" (most specific pathways), "parental" (broad parent pathways), or "intermediate" [26].
For every human gene contained within the mapped pathways, G2P-SCAN identifies orthologous genes in the selected target species. This step also uses the InterMineR API to access orthology data [26]. The pipeline supports analysis for six key model organisms: mouse (Mus musculus), rat (Rattus norvegicus), zebrafish (Danio rerio), fruit fly (Drosophila melanogaster), roundworm (Caenorhabditis elegans), and yeast (Saccharomyces cerevisiae) [23] [26]. An orthology filter, either "ALL" or "LDO" (Least Divergent Ortholog), can be applied to refine the results [26].
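The effect of the two filter modes can be illustrated with a toy example. This sketch does not call G2P-SCAN or InterMine: the record layout, the numeric `divergence` score, and the gene names are hypothetical stand-ins used only to show the difference between keeping all orthologs and keeping the least divergent one.

```python
def filter_orthologs(records, mode="ALL"):
    """Filter ortholog records per (human gene, species) pair.

    records: list of dicts with keys 'human_gene', 'species', 'ortholog', 'divergence'.
    mode: "ALL" keeps every ortholog; "LDO" keeps only the least divergent one.
    The 'divergence' field is a hypothetical numeric score (lower = less diverged).
    """
    if mode == "ALL":
        return list(records)
    if mode != "LDO":
        raise ValueError("mode must be 'ALL' or 'LDO'")
    best = {}
    for rec in records:
        key = (rec["human_gene"], rec["species"])
        if key not in best or rec["divergence"] < best[key]["divergence"]:
            best[key] = rec
    return list(best.values())


# Hypothetical records: one human gene with two zebrafish co-orthologs and one mouse ortholog.
records = [
    {"human_gene": "GENE_A", "species": "Danio rerio", "ortholog": "gene_a1", "divergence": 0.20},
    {"human_gene": "GENE_A", "species": "Danio rerio", "ortholog": "gene_a2", "divergence": 0.35},
    {"human_gene": "GENE_A", "species": "Mus musculus", "ortholog": "Gene_a", "divergence": 0.05},
]
ldo = filter_orthologs(records, mode="LDO")
```

Under "LDO" the second zebrafish co-ortholog is dropped, which lowers ortholog counts in species with gene duplications but gives a one-to-one view of the most conserved relationships.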
To move beyond gene-level lists and assess functional conservation, the pipeline queries the UniProt API to obtain protein identifiers for each human gene and its orthologs [26]. These protein identifiers are then submitted to the InterPro API to map each protein to one or more protein families or domains (e.g., "Nuclear hormone receptor," "Zinc finger") [23] [26]. Protein families serve as a proxy for conserved functional units, offering a more biologically meaningful metric than gene counts alone [23].
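One simple way to compare protein family assignments between a human protein and its ortholog is a set overlap score. The sketch below uses the Jaccard index, which is a choice made here for illustration and not necessarily the metric G2P-SCAN reports; the family sets are hypothetical.

```python
def family_overlap(human_families, ortholog_families):
    """Jaccard overlap between two sets of protein family labels (0.0 to 1.0)."""
    human, other = set(human_families), set(ortholog_families)
    if not human and not other:
        return 0.0  # no annotations on either side: treat as no evidence of overlap
    return len(human & other) / len(human | other)


# Hypothetical family assignments for one human protein and two orthologs.
human = {"Nuclear hormone receptor", "Zinc finger"}
fish = {"Nuclear hormone receptor", "Zinc finger"}
fly = {"Zinc finger"}

fish_score = family_overlap(human, fish)
fly_score = family_overlap(human, fly)
```

Here the fish ortholog shares both families with the human protein (score 1.0), while the fly protein retains only one (score 0.5), suggesting partial functional conservation.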
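In the final stage, G2P-SCAN compiles multiple quantitative metrics for each pathway and species, including ortholog counts, mapped proteins, assigned protein families, and pathway entities and reactions.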
All results are organized into two primary outputs: a "counts" summary (tabular quantitative data) and a "data" file (the underlying gene, protein, and family lists) [26].
Application of G2P-SCAN provides multi-dimensional metrics for evaluating conservation. The following table summarizes hypothetical output for two pathways central to case studies in the literature [24].
Table 1: Example G2P-SCAN Conservation Metrics for Selected Pathways and Species
| Pathway (Human Input Gene) | Species | Human Genes in Pathway | Ortholog Count | Proteins Mapped | Protein Families Assigned | Pathway Entities (Species) | Pathway Reactions (Species) |
|---|---|---|---|---|---|---|---|
| Estrogen Signaling (ESR1) | Homo sapiens (Ref) | 15 | 15 (Self) | 15 | 22 | 150 | 95 |
| Estrogen Signaling (ESR1) | Mus musculus | 15 | 15 | 15 | 22 | 148 | 94 |
| Estrogen Signaling (ESR1) | Danio rerio | 15 | 14 | 13 | 18 | 132 | 82 |
| Estrogen Signaling (ESR1) | Drosophila melanogaster | 15 | 1 | 1 | 1 | 25 | 10 |
| PPARα Signaling (PPARA) | Homo sapiens (Ref) | 12 | 12 (Self) | 12 | 18 | 110 | 70 |
| PPARα Signaling (PPARA) | Rattus norvegicus | 12 | 12 | 12 | 18 | 109 | 69 |
| PPARα Signaling (PPARA) | Danio rerio | 12 | 11 | 10 | 16 | 105 | 65 |
| PPARα Signaling (PPARA) | Caenorhabditis elegans | 12 | 0 | 0 | 0 | 15 | 5 |
Note: Data is illustrative, based on case study descriptions [24]. Actual counts vary by database version and analysis parameters.
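A quick way to read Table 1 is to convert the raw ortholog counts into a per-species coverage fraction. The sketch below does this for the illustrative values in the table; the helper name `ortholog_coverage` is hypothetical, and the numbers carry the same caveat as the table itself.

```python
# (pathway, species): (human genes in pathway, ortholog count),
# copied from the illustrative Table 1 ("PPARalpha" spelled out in ASCII).
TABLE_1 = {
    ("Estrogen Signaling", "Mus musculus"): (15, 15),
    ("Estrogen Signaling", "Danio rerio"): (15, 14),
    ("Estrogen Signaling", "Drosophila melanogaster"): (15, 1),
    ("PPARalpha Signaling", "Rattus norvegicus"): (12, 12),
    ("PPARalpha Signaling", "Danio rerio"): (12, 11),
    ("PPARalpha Signaling", "Caenorhabditis elegans"): (12, 0),
}


def ortholog_coverage(table):
    """Fraction of human pathway genes with at least one ortholog, per pathway and species."""
    return {
        key: orthologs / human_genes
        for key, (human_genes, orthologs) in table.items()
    }


coverage = ortholog_coverage(TABLE_1)
```

Coverage drops from 1.0 in rodents to roughly 0.9 in zebrafish and to near zero in the invertebrates, which is the pattern the table is meant to convey.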
The power of G2P-SCAN is amplified when integrated with complementary tools. A pivotal study combined G2P-SCAN with the SeqAPASS (Sequence Alignment to Predict Across Species Susceptibility) tool [24] [25]. SeqAPASS performs deep sequence and structural analysis of specific molecular targets (like a receptor) to predict cross-species chemical susceptibility [24] [25]. The integration provides a weight-of-evidence approach: SeqAPASS assesses the conservation of the Molecular Initiating Event (MIE) target, while G2P-SCAN evaluates the conservation of the downstream cellular pathway context [24] [25].
Table 2: Combined G2P-SCAN and SeqAPASS Analysis for Chemical Targets (a)
| Chemical Class | Primary Molecular Target (MIE) | SeqAPASS Prediction (Target Conservation) | G2P-SCAN Analysis (Pathway Conservation) | Enhanced Inference for AOP tDOA |
|---|---|---|---|---|
| Fibrate drugs | PPARα | High confidence in mammals; moderate in zebrafish; low in invertebrates. | PPAR signaling pathway largely conserved in mammals & zebrafish; fragmented in Drosophila; absent in C. elegans. | Strong support for mammal & fish tDOA; suggests limited applicability to invertebrates. |
| Environmental Estrogens | Estrogen Receptor (ESR1) | High confidence in vertebrates; no orthologs identified in insects or nematodes. | Core estrogen signaling pathway conserved in vertebrates; highly divergent in Drosophila. | Corroborates vertebrate-specific tDOA for ESR1-mediated AOPs. |
| Pyrethroid insecticides | GABA-A Receptor (GABRA1) | Subunit orthologs present in insects, fish, and mammals with varying sequence similarity. | GABA receptor signaling & neurotoxicity pathways show modular conservation across taxa. | Supports broad tDOA but highlights potential for species-specific differences in sensitivity. |
(a) Based on combined methodology described in [24] [25].
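The logic of combining the two lines of evidence can be expressed as a small decision rule. This is a hedged sketch only: the function `tdoa_support`, the three labels, and the 0.8 coverage cutoff are assumptions introduced for illustration and are not defined by SeqAPASS, G2P-SCAN, or the cited studies.

```python
def tdoa_support(target_conserved, pathway_coverage, coverage_cutoff=0.8):
    """Combine two lines of evidence into a qualitative tDOA support label.

    target_conserved: bool, e.g. a SeqAPASS-style call on the MIE target.
    pathway_coverage: fraction (0-1) of pathway genes with orthologs,
        e.g. derived from G2P-SCAN counts.
    coverage_cutoff: hypothetical threshold; not taken from either tool.
    """
    pathway_conserved = pathway_coverage >= coverage_cutoff
    if target_conserved and pathway_conserved:
        return "strong"
    if target_conserved or pathway_conserved:
        return "partial"
    return "weak"


# Hypothetical inputs for three species.
zebrafish = tdoa_support(True, 0.93)
fruit_fly = tdoa_support(True, 0.07)
nematode = tdoa_support(False, 0.0)
```

A species with a conserved target but a fragmented pathway lands in the "partial" category, which corresponds to the cases in Table 2 where the combined analysis flags limited applicability.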
This section provides a detailed protocol for executing a G2P-SCAN analysis using the R package, based on the official documentation and case studies [23] [26].
Prerequisites: The devtools package is required for installation.
Install G2P-SCAN: Install the package directly from GitHub using devtools.
Load Libraries: Load the G2P-SCAN package and the parallel package to enable faster processing.
Run the pipeline: The primary wrapper function runGenes2Pathways() executes the entire pipeline. A representative analysis uses the cholinesterase genes ACHE and BCHE as input.
Outputs: The following files are written to the directory given by outputDir:
[prefix]_counts.xlsx: Contains quantitative summary tables.
[prefix]_data.xlsx: Contains the underlying lists of genes, orthologs, proteins, and families.
The call also returns an R object (e.g., assigned to results) containing all structured data for programmatic access in R (e.g., results$all_counts).
Interpretation: Start with the countSummary tab. High conservation for a pathway in a species is indicated by a high proportion of human genes with orthologs, high protein family assignment overlap, and significant counts of conserved pathway entities and reactions (see Table 1 for an example).
Table 3: Key Computational Tools and Databases for Pathway Conservation Research
| Tool/Resource Name | Type | Primary Function in Conservation Analysis | Relevance to G2P-SCAN/KER Research |
|---|---|---|---|
| G2P-SCAN R Package [23] [26] | Computational Pipeline | Core tool for integrated pathway-to-orthology analysis from human gene sets. | Directly executes the analysis framework described in this guide. |
| SeqAPASS [24] [25] | Computational Tool (Web-based) | Predicts chemical susceptibility across species via protein sequence/structure analysis of specific molecular targets. | Provides complementary, target-specific evidence to combine with G2P-SCAN's pathway evidence for robust tDOA assessment. |
| Reactome [23] [26] | Biological Pathway Knowledgebase | Provides curated human pathway data and orthology-projected pathway data for other species. | Primary source for pathway mapping and entity/reaction counts in G2P-SCAN. |
| InterPro [23] [26] | Protein Family/ Domain Database | Classifies proteins into families and domains based on sequence signatures. | Source for functional protein family assignments, a key conservation metric in G2P-SCAN. |
| UniProt [26] | Protein Sequence/Annotation Database | Provides authoritative protein identifiers and functional annotations. | Critical for accurately linking genes to protein sequences for subsequent family analysis. |
| AOP-Wiki [11] [10] | Knowledgebase | Central repository for published Adverse Outcome Pathways, Key Events, and Key Event Relationships. | Source of biological hypotheses (KERs) to be tested for taxonomic applicability using G2P-SCAN. |
| Orthology Data (via InterMine) [26] | Data Resource | Provides pre-computed orthology relationships across multiple species. | Foundational data source for the orthology identification step in the G2P-SCAN pipeline. |
The assessment of chemical safety for both human and ecological health fundamentally depends on the accurate extrapolation of toxicological effects across diverse species. This process is central to Key Event Relationship (KER) taxonomic conservation research within the Adverse Outcome Pathway (AOP) framework, which seeks to define the taxonomic domain of applicability (tDOA) for mechanistic toxicity pathways [9]. Historical reliance on in vivo vertebrate testing imposes significant ethical, resource, and time costs, creating an urgent need for efficient, non-animal New Approach Methodologies (NAMs) [9].
This guide details the synergistic integration of two pivotal computational NAMs: the U.S. EPA's Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool and Unilever's Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool [9] [25]. While SeqAPASS evaluates the conservation of primary protein targets (often Molecular Initiating Events) across species using sequence and structural similarity, G2P-SCAN analyzes the broader conservation of entire biological pathways triggered by chemical interaction [9] [23]. When used in combination, these tools generate complementary lines of evidence that significantly strengthen the weight of evidence (WoE) for cross-species extrapolation, thereby refining the tDOA of AOPs and supporting more confident safety decisions in chemical and pharmaceutical development [9].
Table 1: Core Tool Comparison for KER Conservation Analysis
| Feature | SeqAPASS | G2P-SCAN |
|---|---|---|
| Primary Purpose | Predict intrinsic susceptibility by assessing conservation of specific protein targets/MIEs [18]. | Infer biological pathway conservation from human gene sets across model species [23]. |
| Core Methodology | Tiered sequence and structural alignment (Levels 1-4) [18]. | Orthology mapping and pathway enrichment analysis using Reactome [9] [23]. |
| Taxonomic Scope | Broad; any species with available protein sequence data [9]. | Focused on 7 key model species: Human, Mouse, Rat, Zebrafish, Fruit Fly, Roundworm, Yeast [9]. |
| Output for WoE | Binary susceptibility calls & structural models for molecular docking [9] [27]. | Pathway conservation scores and lists of conserved/non-conserved reactions [9] [23]. |
| Role in AOP/tDOA | Defines conservation of the MIE (AOP Key Event 1) [9]. | Informs conservation of downstream key events and the overall pathway [9]. |
The integrated workflow is built upon the distinct yet complementary functions of its two core tools.
SeqAPASS operates on the principle that susceptibility to a chemical is conferred by the presence and structural similarity of a specific molecular target. Its analysis progresses through four tiers (Levels 1-4), moving from sequence alignment to structural comparison [18].
G2P-SCAN functions as an R package that translates a set of human genes (e.g., a ToxCast assay target or an AOP key event) into pathway-level information. It maps human genes to their orthologs in six other model species, retrieves associated biological pathways from the Reactome database, and performs a species conservation analysis for each pathway [9] [23]. Its output indicates whether the entire pathway, specific reactions within it, or the involved protein families are conserved, offering a higher-order biological context beyond the single protein target.
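The orthology-then-pathway logic described above can be illustrated with a minimal Python sketch. It is not the G2P-SCAN implementation (which is an R package); the gene names, species labels, pathway name, and the simple "fraction of genes with an ortholog" score are all placeholders assumed here for illustration.

```python
from typing import Dict, List, Optional

# Toy ortholog map (hypothetical data): human gene -> {species: ortholog or None}.
ORTHOLOGS: Dict[str, Dict[str, Optional[str]]] = {
    "GENE1": {"species_x": "gene1_x", "species_y": None},
    "GENE2": {"species_x": "gene2_x", "species_y": None},
    "GENE3": {"species_x": "gene3_x", "species_y": "gene3_y"},
}

# Toy pathway membership (hypothetical): pathway name -> human genes.
PATHWAYS: Dict[str, List[str]] = {"toy pathway": ["GENE1", "GENE2", "GENE3"]}


def pathway_conservation(
    pathway_genes: List[str],
    orthologs: Dict[str, Dict[str, Optional[str]]],
    species: str,
) -> float:
    """Fraction of pathway genes that have an ortholog in `species`."""
    if not pathway_genes:
        return 0.0
    hits = sum(1 for g in pathway_genes if orthologs.get(g, {}).get(species))
    return hits / len(pathway_genes)


def conservation_table(
    pathways: Dict[str, List[str]],
    orthologs: Dict[str, Dict[str, Optional[str]]],
    species_list: List[str],
) -> Dict[str, Dict[str, float]]:
    """Per-pathway, per-species conservation fractions, rounded to 2 decimals."""
    return {
        name: {
            sp: round(pathway_conservation(genes, orthologs, sp), 2)
            for sp in species_list
        }
        for name, genes in pathways.items()
    }


if __name__ == "__main__":
    print(conservation_table(PATHWAYS, ORTHOLOGS, ["species_x", "species_y"]))
```

A gene-coverage fraction is only one possible summary; reaction-level or protein-family-level conservation, as reported by the actual tool, would need richer inputs than this sketch assumes.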
Diagram 1: Integrated SeqAPASS & G2P-SCAN Workflow for KER Conservation
The following protocol describes the stepwise integration of SeqAPASS and G2P-SCAN to build a WoE for KER taxonomic conservation.
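Step 1: Molecular Target Identification. For the chemical of interest, identify the primary protein target whose perturbation constitutes the molecular initiating event (MIE), along with the associated human gene(s). Sources include chemical bioactivity databases such as RefChemDB and ToxCast/Tox21, and structural resources such as the RCSB Protein Data Bank (see Table 3).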
Step 2: SeqAPASS Analysis for Target Conservation.
Step 3: G2P-SCAN Analysis for Pathway Conservation.
Step 4: WoE Integration & tDOA Refinement.
Table 2: Summary of Quantitative Outcomes from Case Studies [9]
| Case Study Target | Example Chemical | SeqAPASS Prediction | G2P-SCAN Pathway Conservation Insight | Integrated WoE Conclusion |
|---|---|---|---|---|
| PPARα (Peroxisome Proliferator-Activated Receptor Alpha) | Various fibrates | High predicted susceptibility across mammals; variable in fish. | PPARα activation pathway reactions highly conserved in mammals, partially conserved in zebrafish. | Strong WoE for AOP applicability in mammals. Limited, plausible WoE for zebrafish requiring further investigation. |
| ESR1 (Estrogen Receptor 1) | Oxybenzone, Butylparaben | High conservation of ligand-binding domain across vertebrates. | Estrogen signaling pathway highly conserved in vertebrates; metabolism reactions show species-specific differences. | Supports tDOA for MIEs and early KERs across vertebrates. Downstream outcomes may vary due to metabolic differences. |
| GABRA1 (GABA-A Receptor) | Muscimol, Fipronil | Critical neurotransmitter-binding residues conserved from humans to insects. | GABA receptor activation pathway and neural signal transmission broadly conserved across bilaterians. | Provides strong mechanistic WoE for neurotoxic AOPs across a very broad taxonomic range. |
Table 3: Key Reagents, Databases, and Tools for Integrated Analysis
| Item Name | Type | Primary Function in Workflow | Access/Source |
|---|---|---|---|
| RefChemDB | Chemical Database | Provides curated in vitro bioactivity data for target identification [9]. | US EPA |
| ToxCast/Tox21 Data | Bioactivity Database | Supplies high-throughput screening data for chemical-protein interactions [9]. | US EPA CompTox Dashboard |
| RCSB Protein Data Bank (PDB) | Structural Database | Source of experimental protein-chemical co-crystal structures for critical residue identification and docking reference [9] [27]. | www.rcsb.org |
| SeqAPASS v8.0 | Computational Tool | Performs cross-species sequence/structure alignment to predict protein target conservation and susceptibility [18]. | EPA SeqAPASS Web Tool |
| G2P-SCAN R Package | Computational Tool | Analyzes conservation of biological pathways from human gene sets across model species [23]. | R Package (Publication [23]) |
| Reactome Database | Pathway Database | Provides curated biological pathways used by G2P-SCAN for conservation analysis [9] [23]. | reactome.org |
| AOP-Wiki | Knowledgebase | Framework for organizing KERs and defining the initial tDOA for assessment [9]. | aopwiki.org |
| I-TASSER/AlphaFold | Modeling Tool | Used within or alongside SeqAPASS for predicting 3D protein structures in Level 4 analysis [27]. | Standalone Servers |
The combined application of SeqAPASS and G2P-SCAN directly addresses critical challenges in KER taxonomic conservation research. By simultaneously evaluating the conservation of the initial molecular target and the broader biological pathway, this approach moves beyond assumptions based solely on phylogenetic relatedness. It enables the generation of mechanistically grounded hypotheses about which species are likely to experience adverse outcomes along a defined AOP, thereby making the tDOA more biologically plausible and defensible [9].
This methodology is a cornerstone for the emerging Next Generation Risk Assessment (NGRA) paradigm. It exemplifies how multiple computational NAMs can be integrated in a WoE framework to reduce uncertainty and potentially replace animal testing for certain extrapolation questions [9]. Future developments, such as the direct integration of cross-species molecular docking outputs from SeqAPASS Level 4 models [27] with pathway conservation scores from G2P-SCAN, promise to add even deeper layers of functional understanding, further solidifying the role of integrated computational approaches in modern toxicology and drug development.
This technical guide details a formalized methodology for conducting systematic reviews and evidence mapping to establish transparent support for Key Event Relationships (KERs) within the Adverse Outcome Pathway (AOP) framework. Framed within the critical research objective of defining and expanding the taxonomic domain of applicability (tDOA), this guide provides a step-by-step protocol for integrating diverse data streams—from in vivo and in vitro studies to in silico computational predictions [15]. The core workflow enables researchers to synthesize fragmented toxicological evidence, quantitatively assess KER confidence, and systematically extrapolate pathways across species. This process is foundational for robust, predictive toxicology and chemical safety assessment that aligns with the One Health perspective, bridging human and ecological risk assessment [15].
The Adverse Outcome Pathway framework provides a structured model for describing causal linkages between a Molecular Initiating Event (MIE) and an Adverse Outcome (AO) via intermediate Key Events (KEs) [15]. The scientific confidence in an AOP hinges on the empirical support for each causative Key Event Relationship (KER). However, evidence for KERs is often fragmented across studies employing different model species, experimental designs, and levels of biological organization. This creates significant challenges in assessing the taxonomic domain of applicability (tDOA)—the range of species for which the AOP is biologically plausible [9].
Conducting a systematic review and evidence map is no longer optional but essential for transparent KER support. This process addresses critical gaps and biases analogous to those identified in broader ecological research, such as the over-representation of certain taxa (e.g., vertebrates over invertebrates) and geographic regions [5] [28]. A systematic methodology ensures objectivity, reproducibility, and the identification of true knowledge gaps, thereby preventing skewed or incomplete AOP development that could misinform regulatory decisions [5]. This guide outlines a standardized protocol to meet this need.
The following integrated workflow (Figure 1) provides a comprehensive protocol for systematic evidence gathering, KER evaluation, and taxonomic domain expansion. This multi-step process transitions from qualitative data assembly to quantitative network analysis and finally to computational extrapolation.
Figure 1: Integrated Workflow for Systematic KER Review and tDOA Expansion
This protocol adapts established systematic review standards for the specific context of KER development [5].
This protocol details the quantitative assessment of KER confidence, moving beyond qualitative linkage [15].
This integrated computational protocol expands the tDOA [9] [15].
Systematic reviews generate critical quantitative data that must be presented clearly. The following tables exemplify structured summaries for key outputs.
Table 1: Summary of Evidence Base for AOP Network Development
This table catalogues the foundational studies, demonstrating the integration of data across testing modalities and species, a hallmark of a robust systematic review [15].
| Ecological Compartment | Initial Taxonomic Domain (tDOA) | Number of Primary Studies | Key References (Examples) | Extended tDOA (Post-Analysis) |
|---|---|---|---|---|
| Terrestrial | Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens (in vitro) | 17 | Ahn et al. (2014); Eom & Choi (2010); Kim et al. (2009) [15] | Fungi (98 species), Birds (28), Rodents (1), Reptiles (1) [15] |
| Aquatic | Danio rerio (zebrafish), Daphnia magna | 8 | Choi et al. (2010); Kim et al. (2016) [15] | Fish (157 species), Amphibians (5), Aquatic invertebrates (11) [15] |
| Integrated | Multi-compartment synthesis | 25 | Collection of studies from 2009-2019 [15] | Total Extended: Over 100 taxonomic groups [15] |
Table 2: Analysis of Evidence Gaps and Biases in KER Support
Inspired by systematic maps in related fields, this table diagnoses the distribution of evidence, crucial for prioritizing future research and qualifying confidence in the AOP's tDOA [5] [28].
| Analysis Dimension | Evidence Clusters (Well-Represented) | Evidence Gaps (Under-Represented) | Implication for KER Confidence |
|---|---|---|---|
| Taxonomic Focus | Arthropods (esp. insects), Microorganisms, Vertebrates (fish, rodents) [5] | Annelids, Amphibians, Reptiles, Plants [5] | KERs may be less certain for gap taxa; extrapolation required. |
| Biological Metric | Abundance, Species Richness, Mortality [5] [28] | Functional Diversity, Phylogenetic Diversity, Behavioral Endpoints [5] [28] | AOP supports population-level AOs but may miss ecosystem function impacts. |
| Geographic Origin | Studies from USA, China, Brazil, European countries [5] | Studies from tropical and Global South regions [5] [28] | tDOA may be biased towards species and conditions in well-studied regions. |
| Practice/Intervention | Fertilizer use, pesticide application, crop diversification [5] | Combined practice effects, landscape-level management [5] | AOPs for single stressors are stronger than for complex mixture or multi-stressor scenarios. |
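A gap analysis of this kind can be approximated by tallying screened studies per KER and taxonomic group. The sketch below is illustrative only: the record format, the KER and taxon labels, and the `min_studies` threshold are assumptions made here, not part of the cited protocols.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Record = Tuple[str, str]  # (KER identifier, taxonomic group); hypothetical format


def evidence_map(
    records: Iterable[Record],
    kers: List[str],
    taxa: List[str],
    min_studies: int = 2,  # assumed threshold for "well represented"
) -> Tuple[Counter, List[Record]]:
    """Count studies per (KER, taxon) cell and list the under-represented cells."""
    counts = Counter(records)
    gaps = [
        (ker, taxon)
        for ker in kers
        for taxon in taxa
        if counts[(ker, taxon)] < min_studies
    ]
    return counts, gaps


if __name__ == "__main__":
    toy_records = [
        ("KER1", "fish"),
        ("KER1", "fish"),
        ("KER1", "annelids"),
        ("KER2", "fish"),
    ]
    counts, gaps = evidence_map(toy_records, ["KER1", "KER2"], ["fish", "annelids"])
    for cell in gaps:
        print("evidence gap:", cell, "studies:", counts[cell])
```

Cells flagged as gaps indicate where confidence in a KER rests on extrapolation rather than direct evidence; the threshold should be justified for each review.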
Clear visualization of the AOP structure and the evidence supporting each KER is fundamental. The following diagram depicts a generalized AOP network for a chemical stressor, highlighting the strength of KERs based on systematic review output.
Figure 2: AOP Network with Mapped KER Evidence and tDOA Support
Successful execution of this systematic methodology relies on a suite of specific tools and databases. The following table details these essential "research reagents."
Table 3: Key Digital Tools and Databases for Systematic KER Review
| Tool/Resource Name | Type | Primary Function in KER/tDOA Research | Access Link/Reference |
|---|---|---|---|
| Abstrackr | Screening Software | A semi-automated tool for accelerating the title/abstract screening phase of systematic reviews [5]. | https://abstrackr.cebm.brown.edu/ |
| SeqAPASS | Computational NAM | Predicts chemical susceptibility across species by analyzing protein sequence and structural conservation of molecular targets [9] [15]. | https://seqapass.epa.gov/seqapass/ |
| G2P-SCAN | Computational NAM | Estimates conservation of biological pathways across species from human gene inputs, providing pathway-level context for tDOA [9] [15]. | Rivetti et al. (2023) [9] |
| AOP-Wiki | Knowledgebase | Central repository for collaborative AOP development. Essential for KE ontology and hosting published AOPs [15]. | https://aopwiki.org/ |
| CompTox Chemicals Dashboard | Chemistry Database | Provides curated chemical information, identifiers, and linked bioactivity data (e.g., ToxCast) for stressor characterization [9]. | https://comptox.epa.gov/dashboard/ |
| Reactome | Pathway Database | A curated, peer-reviewed knowledgebase of biological pathways. Used by tools like G2P-SCAN for pathway mapping [9]. | https://reactome.org/ |
| RCSB Protein Data Bank | Structural Database | Provides 3D structural data for proteins, useful for understanding MIEs and informing SeqAPASS analysis [9]. | https://www.rcsb.org/ |
| Bayesian Network Software (e.g., Netica, AgenaRisk) | Modeling Software | Enables the construction, training, and inference of Bayesian Network models for quantitative KER analysis [15]. | Commercial & Open-Source Options |
This case study is situated within a broader thesis investigating the conservation of Key Event Relationships (KERs) across taxonomic groups. The central premise is that the mechanistic toxicity pathways of chemicals, when structured as Adverse Outcome Pathways (AOPs), often rely on biological processes conserved through evolution. Therefore, validating an AOP in one model organism provides a powerful, hypothesis-driven framework for predicting toxicity in other species, bridging human and ecological toxicology under a One Health perspective [15] [29]. Silver nanoparticles (AgNPs) serve as an exemplary stressor for this research due to their widespread use, documented toxicity, and a well-characterized AOP (AOP 207) initiating from oxidative stress [30] [15]. This study demonstrates a methodological workflow for extending the taxonomic Domain of Applicability (tDOA) of an existing AOP by integrating ecotoxicological data, human toxicology data, and in silico cross-species extrapolation tools [15].
Silver nanoparticles (AgNPs) are defined as particles with at least one dimension between 1 and 100 nm [30]. Their extensive commercial use in textiles, cosmetics, food packaging, and medical products for their antimicrobial properties leads to potential exposure via ingestion, inhalation, and dermal contact [30] [31]. A primary mechanism of AgNP toxicity is the induction of oxidative stress. This can occur via the "Trojan-horse" mechanism, where intracellular dissolution of AgNPs releases Ag⁺ ions that impair thiol (SH)-containing antioxidants like glutathione, or through surface reactions generating reactive oxygen species (ROS) [30] [32].
The Adverse Outcome Pathway (AOP) framework organizes this knowledge into a causal sequence: a Molecular Initiating Event (MIE) leads to a series of measurable Key Events (KEs), culminating in an Adverse Outcome (AO) [30] [2]. AOP 207: "NADPH oxidase and P38 MAPK activation leading to reproductive failure in Caenorhabditis elegans" provides the foundation for this case study [30] [15]. Its KEs include oxidative stress (MIE), PMK-1/p38 MAPK activation, HIF-1 activation, mitochondrial damage, DNA damage, apoptosis, and finally, reproductive failure (AO) [30].
Extending the tDOA of an AOP requires a multi-step, integrative approach that moves from data collection to computational validation and prediction.
The first phase involves a systematic gathering of existing evidence from diverse studies to construct a putative cross-species AOP network.
To evaluate the strength and confidence in the proposed AOP network, a probabilistic modeling approach is employed.
Use a Bayesian Network package (e.g., bnlearn) to infer the probabilistic dependency structure between KEs from the data, or impose the structure based on the hypothesized AOP [33].
Computational tools are used to predict the biological plausibility of the AOP across a wide range of species, formally extending its tDOA [15].
The distribution of AgNPs, a modulating factor for toxicity, varies by particle size and organ.
Table 1: Size-Dependent Accumulation of Intravenously Administered AgNPs in Rat Organs [30]
| Organ | 20 nm AgNP Concentration (ng/g tissue) | 80 nm AgNP Concentration (ng/g tissue) | 110 nm AgNP Concentration (ng/g tissue) |
|---|---|---|---|
| Spleen | ~80 | ~1,600 | ~1,600 |
| Liver | 169 | 539 | 1,077 |
| Testes | Low and comparable for all sizes | Low and comparable for all sizes | Low and comparable for all sizes |
Applying the integrated in silico workflow to the AgNP reproductive toxicity AOP network successfully extended its biologically plausible tDOA far beyond the initial model organism.
Table 2: Extended Taxonomic Domain of Applicability for the AgNP Reproductive Toxicity AOP Network [15]
| Ecological Compartment | Initial tDOA (Model Species) | Extended tDOA (Number of Species/Groups) |
|---|---|---|
| Terrestrial | Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens (in vitro) | Fungi (98), Birds (28), Rodents (1), Reptiles (1), Nematodes (1) |
| Aquatic | Danio rerio (zebrafish), Daphnia magna | Fish (154), Arthropods (43), Amphibians (17), Mollusks (8), Annelids (3) |
Putative AOP for AgNP-Induced Male Reproductive Toxicity [30]
Workflow for Cross-Species AOP Development and tDOA Extension [15]
Table 3: Essential Reagents and Materials for AgNP AOP Research
| Item | Function/Description | Relevance to AOP Research |
|---|---|---|
| Characterized AgNPs | Nanoparticles of defined size (e.g., 20, 50, 100 nm), coating (e.g., PVP, citrate), and charge. | The foundational stressor. Physicochemical properties dictate dissolution, uptake, and MIE potency [30] [31]. |
| Thiol-containing Biomolecules | e.g., Glutathione (GSH), N-acetylcysteine (NAC), Cysteine. | Used to test the "thiol-scavenging" mechanism of Ag⁺ ions. Supplementation can rescue oxidative stress, confirming the MIE [30] [32]. |
| ROS Detection Assays | e.g., DCFH-DA, DHE, MitoSOX. | Fluorescent probes to quantitatively measure intracellular or mitochondrial ROS generation, a core MIE/early KE [30] [32]. |
| Mitochondrial Function Assays | e.g., JC-1 (membrane potential), MTT/XTT (metabolic activity), ATP luminescence kits. | Measure mitochondrial damage (a KE) resulting from oxidative stress [30] [31]. |
| Apoptosis Detection Kits | e.g., Annexin V/PI staining, caspase-3/7 activity assays. | Quantify apoptotic cell death, a critical KE preceding tissue/organ dysfunction [30] [31]. |
| AR Antagonism Assay | e.g., AR-CALUX, MDA-kb2 cell line. | Validated OECD test (No. 458) for detecting androgen receptor antagonism, a potential alternative MIE for reproductive toxicity AOPs [34]. |
| BN Analysis Software | e.g., R packages (bnlearn, gRain), commercial BN software. | Essential for implementing the probabilistic quantitative assessment of KERs and building predictive qAOP models [33] [15]. |
Within the broader thesis on Key Event Relationship (KER) taxonomic conservation research, a central challenge is the development of robust, predictive models when empirical data linking molecular events to adverse outcomes is sparse or confined to a narrow range of species. The adverse outcome pathway (AOP) framework has been widely adopted to structure this causal knowledge, but its predictive power for ecological and human health risk assessment depends on a clear understanding of a pathway's taxonomic domain of applicability (tDOA) [15]. Traditionally, establishing tDOA required extensive, costly, and ethically challenging in vivo testing across multiple species. Today, the convergence of New Approach Methodologies (NAMs)—spanning in vitro, in chemico, and in silico tools—provides a revolutionary strategy for addressing these data gaps [9]. This whitepaper details a core, integrative methodology that combines computational toxicology, bioinformatics, and probabilistic modeling to extrapolate KER confidence across the tree of life, thereby reducing reliance on novel animal testing and accelerating the application of mechanistic data in safety decision-making [15].
The proposed framework is an iterative, weight-of-evidence process that transforms limited empirical KER data into a predictive, cross-species AOP network (AOPN). It progresses from data collation to quantitative assessment and finally to taxonomic extrapolation.
The initial phase involves the systematic aggregation of all available evidence related to a molecular initiating event (MIE) and its downstream key events (KEs). Data sources must include:
Each study is deconstructed, and its endpoints are mapped to standardized KE terms within the AOP framework. This integrated evidence forms a putative qualitative AOPN, representing hypothesized causal linkages across biological scales [15].
To move from qualitative linkage to quantitative prediction, the strength and uncertainty of each KER must be evaluated. A Bayesian Network (BN) modeling approach is recommended over deterministic regression for this purpose [15].
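To make the idea of a probabilistic KER concrete, the following sketch estimates a conditional probability table for one KER from binary observations and propagates an activation probability along a linear chain of KEs. It is a simplified stand-in, assuming binary KE states, a strictly linear AOP, and additive (Beta-prior) smoothing; the function names are hypothetical, and dedicated BN software would be used for real networks.

```python
from typing import Dict, List, Sequence, Tuple


def estimate_cpt(
    pairs: Sequence[Tuple[int, int]], alpha: float = 1.0
) -> Dict[int, float]:
    """Estimate P(downstream = 1 | upstream = u) for u in {0, 1}.

    `pairs` holds (upstream, downstream) binary observations.
    `alpha` is an assumed symmetric pseudo-count that keeps estimates
    away from 0 and 1 when data are sparse.
    """
    cpt: Dict[int, float] = {}
    for u in (0, 1):
        n = sum(1 for up, _ in pairs if up == u)
        k = sum(1 for up, down in pairs if up == u and down == 1)
        cpt[u] = (k + alpha) / (n + 2 * alpha)
    return cpt


def propagate(p_mie: float, cpts: List[Dict[int, float]]) -> float:
    """Probability that the final event is active, given P(MIE active).

    Applies the law of total probability along each KER in order,
    which is valid only for a linear chain without branching.
    """
    p = p_mie
    for cpt in cpts:
        p = cpt[1] * p + cpt[0] * (1.0 - p)
    return p


if __name__ == "__main__":
    # Toy observations: 10 with the upstream KE active, 10 with it inactive.
    toy = [(1, 1)] * 8 + [(1, 0)] * 2 + [(0, 0)] * 9 + [(0, 1)] * 1
    ker = estimate_cpt(toy)
    print("P(down | up active) =", ker[1])
    print("P(AO | MIE active) over two identical KERs =", propagate(1.0, [ker, ker]))
```

The smoothing term matters most where evidence is thin: a KER supported by few observations stays closer to 0.5, which makes the data gap visible in the downstream prediction.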
This phase extends the biologically plausible tDOA of the qAOP using two complementary in silico tools.
Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS): This tool predicts potential chemical susceptibility by comparing the primary through quaternary structures of a protein target (the MIE) across species [15] [9].
Genes-to-Pathways Species Conservation Analysis (G2P-SCAN): This tool evaluates the conservation of entire biological pathways (the chain of KEs) beyond the initial molecular target [15] [9].
The convergence of evidence from SeqAPASS (target-level susceptibility) and G2P-SCAN (pathway-level conservation) provides a robust, multi-layered rationale for extending the tDOA [15] [9].
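One way to picture how the two lines of evidence converge is a simple decision rule per species. The sketch below is an assumption-laden illustration: the `Evidence` fields, the 0.7 pathway cutoff, and the three category labels are invented here, and neither tool emits these exact values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Evidence:
    species: str
    target_conserved: bool  # hypothetical target-level call (SeqAPASS-style)
    pathway_score: float    # hypothetical pathway coverage in [0, 1] (G2P-SCAN-style)


def woe_category(ev: Evidence, pathway_cutoff: float = 0.7) -> str:
    """Combine target-level and pathway-level evidence into a coarse category."""
    pathway_ok = ev.pathway_score >= pathway_cutoff
    if ev.target_conserved and pathway_ok:
        return "strong"   # both lines of evidence agree
    if ev.target_conserved or pathway_ok:
        return "partial"  # only one line supports inclusion
    return "weak"         # neither line supports inclusion


def summarise(evidence: List[Evidence]) -> Dict[str, str]:
    """Map each species to its weight-of-evidence category."""
    return {ev.species: woe_category(ev) for ev in evidence}


if __name__ == "__main__":
    toy = [
        Evidence("species_a", True, 0.9),
        Evidence("species_b", True, 0.4),
        Evidence("species_c", False, 0.2),
    ]
    print(summarise(toy))
```

Any cutoff of this kind is context-dependent and would need explicit justification before being used to include or exclude a species from the tDOA.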
Applying this integrated methodology yields two primary quantitative outputs: a vastly expanded tDOA and a probabilistic qAOP model. The power of the approach is demonstrated through case studies, such as the extension of an AOP for silver nanoparticle (AgNP)-induced reproductive toxicity.
Table 1: Cross-Species Extrapolation Results for AgNP Reproductive Toxicity AOP (AOP 207) [15]
| Ecological Compartment | Initial tDOA (Empirical Data) | Extended tDOA (In Silico Prediction) | Number of Taxonomic Groups |
|---|---|---|---|
| Terrestrial | C. elegans, D. melanogaster, H. sapiens (in vitro) | Fungi, Birds, Rodents, Reptiles, Nematodes | 128+ |
| Aquatic | C. riparius, D. rerio, H. sapiens (in vitro) | Fish, Amphibians, Crustaceans, Mollusks | 100+ |
| Combined Cross-Species AOPN | 3 Model Species | >228 Taxonomic Groups | >228 |
The quantitative confidence in the KERs underpinning this network is derived from the BN model. The performance of the computational prediction tools themselves can be benchmarked.
Table 2: Performance Metrics of Computational KER Extrapolation Methods
| Method | Primary Data Input | Output/Prediction | Key Strength for KER Research | Typical Application Context |
|---|---|---|---|---|
| SeqAPASS [15] [9] | Protein sequence/structure | Taxonomic susceptibility for molecular target | High-throughput, broad taxonomic coverage | Establishing plausibility of MIE across species |
| G2P-SCAN [15] [9] | List of human genes | Pathway conservation across model species | Contextualizes MIEs within conserved biological processes | Supporting conservation of downstream KEs and pathways |
| Bayesian Network Modeling [15] | Empirical dose-response data | Probabilistic KERs with uncertainty quantification | Integrates disparate data types; handles variability and gaps | Building quantitative, predictive qAOPs for risk assessment |
| Agnolog Identification [36] | Transcriptomic & network data | Functionally equivalent genes/gene sets | Identifies conserved function beyond strict sequence homology | Mapping KERs in non-traditional or distantly related species |
Implementing this framework requires a suite of specific computational and data resources.
Table 3: Essential Research Toolkit for Cross-Species KER Analysis
| Item | Function in Research | Key Resource / Example |
|---|---|---|
| AOP-KB (AOP-Wiki) | Central repository for accessing, developing, and sharing structured AOP knowledge, including KEs and KERs [35]. | https://aopwiki.org/ |
| SeqAPASS Tool | Web-based tool for predicting protein target conservation and chemical susceptibility across species via sequence alignment [9]. | https://seqapass.epa.gov/seqapass/ |
| G2P-SCAN R Package | Tool for evaluating conservation of biological pathways (Reactome) from human gene lists across model species [15]. | R package G2P-SCAN |
| CompTox Chemicals Dashboard | Provides access to HTS data (ToxCast/Tox21), chemical properties, and bioactivity data to inform MIE and KE identification [9]. | https://comptox.epa.gov/dashboard |
| Bayesian Network Software | Platform for constructing, parameterizing, and running probabilistic models to quantify KERs under uncertainty [15]. | Netica, AgenaRisk, or R packages (bnlearn, gRain) |
| Protein Data Bank (PDB) | Repository for 3D structural data of proteins and complexes, critical for understanding MIEs at the atomic level [9]. | https://www.rcsb.org/ |
Diagram 1: Integrated workflow for cross-species KER analysis.
Diagram 2: Example AOP network for SSRI-induced feeding inhibition.
Diagram 3: Decision logic for extending the taxonomic domain of applicability.
Successfully implementing this strategy requires navigating several technical and philosophical challenges. A primary consideration is defining confidence thresholds for the in silico predictions; the "degree of similarity" in SeqAPASS or pathway coverage in G2P-SCAN that constitutes sufficient evidence for inclusion in the tDOA is context-dependent and must be justified [15] [9]. Furthermore, the field must grapple with the identification of "agnologs"—functionally equivalent genes or pathways that are not orthologous—which may be critical for transferring knowledge between evolutionarily distant species [36]. Finally, the validation of extrapolated KERs remains an iterative process. While this methodology minimizes animal testing, targeted in vitro assays in cells from predicted susceptible species or limited in vivo studies in non-traditional model organisms can provide crucial confirmatory evidence, strengthening the overall weight of evidence for regulatory application [15].
In the context of Key Event Relationship (KER) taxonomic conservation research, accurately distinguishing between true absence and lack of data is a fundamental challenge with significant implications for predictive model reliability and cross-species extrapolation. This distinction is critical within the Adverse Outcome Pathway (AOP) framework, where defining the taxonomic domain of applicability (tDOA) relies on precise understanding of whether a key event is genuinely not conserved in a taxonomic group or simply unmeasured [15]. The problem parallels challenges in ecological species distribution modeling, where models require information about where species are not found to accurately predict where they could exist [37] [38].
True absence in taxonomic analysis refers to confirmed non-conservation of a molecular initiating event, key event, or adverse outcome pathway across species, supported by empirical evidence or robust phylogenetic inference. In contrast, lack of data represents uncertainty stemming from insufficient investigation, where the taxonomic conservation status remains unknown due to limited research scope, methodological constraints, or inadequate detection methods [37]. This ambiguity directly impacts the confidence with which AOPs can be extrapolated for chemical safety assessment and drug development applications.
The consequences of misclassification are substantial. Underestimation of taxonomic applicability occurs when lack of data is misinterpreted as true absence, so that species in which the pathway may in fact operate are excluded from the tDOA. Conversely, overestimation of conservation results when true absence is treated as a mere lack of data, causing researchers to overlook legitimate taxonomic boundaries in pathway functionality and potentially leading to inappropriate cross-species predictions in toxicology or pharmacology. Within the broader thesis on KER taxonomic conservation, resolving this ambiguity enables more precise definition of AOP boundaries, enhances confidence in New Approach Methodologies (NAMs), and supports the development of reliable, taxonomically-aware predictive models for chemical and drug safety assessment [15] [9].
The terminology surrounding absence in taxonomic analysis requires precise differentiation to avoid conceptual confusion in KER research. These definitions establish the foundation for methodological approaches to resolving ambiguity.
True Absence represents confirmed non-existence of a biological element within a defined taxonomic context. In KER research, this indicates that a molecular initiating event, key event, or entire pathway is genuinely not conserved in certain taxonomic groups, supported by either direct empirical evidence or robust phylogenetic inference. True absence data implies that "environmental conditions are unsuitable for a species to survive" in ecological terms [37], which translates to biological contexts where phylogenetic distance or evolutionary divergence has resulted in non-conservation of specific molecular pathways. The confidence in true absence designation increases with repeated, methodologically appropriate surveys that would detect the element if present [38].
Pseudo-Absence (in ecological modeling) or Inferred Non-Conservation (in taxonomic analysis) represents inferred rather than confirmed absence. This concept is derived from species distribution modeling where pseudo-absence points are generated in locations where a species has not been recorded but might potentially exist [37]. In taxonomic conservation research, this parallels situations where preliminary evidence suggests non-conservation, but confirmatory studies are lacking. Pseudo-absence serves as a practical substitute when true absence data is unavailable but comparative analysis requires contrast between presence and absence conditions [38].
Lack of Data constitutes genuine uncertainty stemming from insufficient investigation rather than biological reality. This occurs when taxonomic groups have been inadequately studied, detection methods are insufficiently sensitive, or research scope has been limited. Lack of data represents the "unknown" category that must be systematically addressed rather than assumed to represent either presence or absence. In ecological contexts, this would equate to unsurveyed areas where no information exists about species distribution [37].
Table 1: Comparative Analysis of Absence Classifications in Taxonomic Research
| Classification | Definition | Confidence Level | Primary Source | Implications for KER Extrapolation |
|---|---|---|---|---|
| True Absence | Confirmed non-conservation supported by empirical evidence | High | Direct experimental evidence or robust phylogenetic analysis | Defines boundaries of tDOA; prevents over-extrapolation |
| Pseudo-Absence/Inferred Non-Conservation | Inferred absence based on available evidence but lacking confirmation | Medium to Low | Indirect evidence, preliminary data, or predictive modeling | Requires verification; useful for preliminary hypothesis generation |
| Lack of Data | Insufficient investigation to determine conservation status | Very Low | Gaps in research coverage or methodological limitations | Identifies research priorities; prevents erroneous conclusions |
The classification of absence types directly influences methodological choices in taxonomic conservation research. True absence data, when available, enables the most reliable definition of taxonomic domains of applicability for AOPs. However, comprehensive surveys to establish true absence are "time-consuming" and therefore rarely available across broad taxonomic ranges [37] [38]. This scarcity necessitates the development of alternative approaches that can differentiate between true absence and lack of data with reasonable confidence.
In ecological modeling, the prevalence ratio (the proportion of occupied locations relative to absence points) significantly influences model accuracy [37]. This suggests an analogous consideration in taxonomic analysis, where the ratio of confirmed conservation to confirmed non-conservation across taxonomic groups affects how reliably patterns can be identified. The method used to generate pseudo-absence data (random, environmentally contrasted, or geographically constrained) also affects outcomes in distribution modeling [37], which indicates that the approach chosen for handling uncertain taxonomic conservation will similarly shape conclusions in KER research.
A systematic, multi-tiered approach is required to differentiate true absence from lack of data in KER taxonomic conservation research. The following workflow integrates established ecological methods with novel computational toxicology approaches to address this challenge comprehensively.
Decision Framework for Differentiating Absence Types in KER Taxonomic Research
Protocol 1: Comprehensive Taxonomic Survey Assessment
Objective: Determine whether sufficient investigation has been conducted to support true absence designation.
Materials: Taxonomic database access (NCBI, Ensembl), systematic review tools, phylogenetic analysis software.
Procedure:
Interpretation: Taxonomic groups with high confidence scores and consistent negative results across multiple studies may be classified as true absence. Groups with limited or methodologically inadequate investigation remain as lack of data.
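The interpretation rule for Protocol 1 can be sketched as a small decision function. This is a minimal illustration only: the `TaxonEvidence` record, the function name, and both thresholds (`min_studies`, `min_negative_fraction`) are assumptions introduced here, not values given in the source.

```python
from dataclasses import dataclass


@dataclass
class TaxonEvidence:
    taxon: str
    n_studies: int           # studies that tested the KER in this taxon
    n_negative: int          # studies that found the KER not operative
    adequate_methods: bool   # detection methods judged sensitive enough


def classify_absence(ev, min_studies=3, min_negative_fraction=0.9):
    """Apply the Protocol 1 interpretation with illustrative thresholds."""
    # Limited or methodologically inadequate investigation stays "lack of data".
    if ev.n_studies == 0 or ev.n_studies < min_studies or not ev.adequate_methods:
        return "lack of data"
    # Consistent negative results across multiple adequate studies.
    if ev.n_negative / ev.n_studies >= min_negative_fraction:
        return "true absence"
    # Enough adequate studies, but results are positive or mixed.
    return "not absent or mixed evidence"


# Hypothetical records for illustration.
well_studied = TaxonEvidence("taxon_A", n_studies=6, n_negative=6, adequate_methods=True)
barely_studied = TaxonEvidence("taxon_B", n_studies=1, n_negative=1, adequate_methods=True)
print(classify_absence(well_studied), "|", classify_absence(barely_studied))
```

Keeping the thresholds as explicit parameters makes the judgment calls visible and easy to document alongside the resulting tDOA statement.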
Protocol 2: Phylogenetic Signal Analysis
Objective: Utilize evolutionary relationships to distinguish true absence from lack of data.
Materials: Phylogenetic trees, sequence alignment tools, ancestral state reconstruction software.
Procedure:
Interpretation: Strong phylogenetic signal with distinct clades showing conserved and non-conserved patterns supports true absence designation for non-conserved clades. Weak or absent phylogenetic signal suggests insufficient data or convergent evolution.
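One simple way to test for phylogenetic signal in a binary trait (KER conserved = 1, not conserved = 0) is a permutation test on Fitch parsimony steps: if the trait clusters by clade, the observed tree needs fewer state changes than most random reshufflings of tip states. The sketch below is an assumption-laden illustration, not the method used in the cited studies. It assumes a strictly binary tree written as nested tuples, and the taxon labels are placeholders.

```python
import random


def tip_names(tree):
    """Return tip labels of a nested-tuple binary tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])


def fitch_steps(tree, states):
    """Minimum number of state changes on the tree (Fitch parsimony)."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        common = left_set & right_set
        if common:
            return common, left_steps + right_steps
        # Disjoint child sets imply one extra change at this node.
        return left_set | right_set, left_steps + right_steps + 1
    return visit(tree)[1]


def permutation_signal_test(tree, states, n_perm=999, seed=0):
    """Return observed steps and the fraction of shuffles at least as clustered."""
    rng = random.Random(seed)
    tips = tip_names(tree)
    observed = fitch_steps(tree, states)
    values = [states[t] for t in tips]
    at_least_as_clustered = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        if fitch_steps(tree, dict(zip(tips, values))) <= observed:
            at_least_as_clustered += 1
    return observed, (at_least_as_clustered + 1) / (n_perm + 1)


# Placeholder tree: two clades of four taxa; the KER is conserved in one clade only.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
clustered = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 0, "f": 0, "g": 0, "h": 0}
steps, p_value = permutation_signal_test(tree, clustered)
print(steps, p_value)
```

A small p-value indicates that conserved and non-conserved states are more clade-structured than expected by chance, which is the pattern the interpretation above treats as support for true absence in the non-conserved clade.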
Protocol 3: Integrated SeqAPASS and G2P-SCAN Analysis
Objective: Leverage computational NAMs to extend taxonomic domain assessments and resolve ambiguity [15] [9].
Materials: SeqAPASS tool access, G2P-SCAN software, reference proteomes, pathway databases.
Procedure:
Interpretation: Computational evidence can upgrade classification from lack of data to inferred non-conservation (pseudo-absence) or, when combined with other evidence, support true absence designation.
Protocol 4: Bayesian Network Modeling of KERs
Objective: Quantitatively assess confidence in absence designations through probabilistic modeling of key event relationships [15].
Materials: Bayesian network software, empirical data on KERs, prior probability estimates.
Procedure:
Interpretation: Bayesian networks provide quantitative confidence estimates for absence designations and can identify which uncertain classifications most significantly impact model predictions.
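Before building a full Bayesian network, the core idea of a quantitative confidence estimate for an absence designation can be shown with a single Bayes-rule update. The sketch assumes that studies are independent, that each study detects a truly present KER with probability `detection_prob`, and that there are no false positives. These assumptions and all parameter names are illustrative and are not taken from the source.

```python
def posterior_true_absence(prior_absent, detection_prob, n_negative_studies):
    """P(KER truly absent | n independent negative studies), via Bayes' rule."""
    if not (0.0 <= prior_absent <= 1.0 and 0.0 <= detection_prob <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if n_negative_studies < 0:
        raise ValueError("number of studies must be non-negative")
    # If the KER is absent, every study is negative (no false positives assumed).
    likelihood_absent = 1.0
    # If the KER is present, each study misses it with probability 1 - detection_prob.
    likelihood_present = (1.0 - detection_prob) ** n_negative_studies
    numerator = prior_absent * likelihood_absent
    denominator = numerator + (1.0 - prior_absent) * likelihood_present
    if denominator == 0.0:
        raise ValueError("observed data are impossible under the stated inputs")
    return numerator / denominator


# Hypothetical inputs: neutral prior, moderately sensitive assays.
for k in (0, 1, 3, 5):
    print(k, round(posterior_true_absence(0.5, 0.5, k), 3))
```

With no studies the posterior equals the prior, which corresponds to the "lack of data" category; confidence in true absence rises only as adequately sensitive negative studies accumulate.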
Table 2: Summary of Experimental Protocols for Resolving Absence Ambiguity
| Protocol | Primary Objective | Key Methodologies | Output Metrics | Strength for Absence Determination |
|---|---|---|---|---|
| Comprehensive Taxonomic Survey Assessment | Evaluate sufficiency of empirical investigation | Systematic review, detection probability assessment | Confidence scores, coverage metrics | Direct assessment of research gaps; identifies true lack of data |
| Phylogenetic Signal Analysis | Leverage evolutionary relationships to predict conservation | Ancestral state reconstruction, phylogenetic comparative methods | Phylogenetic signal metrics, ancestral state probabilities | Evolutionary context for absence patterns; distinguishes true absence from sampling artifacts |
| Integrated SeqAPASS and G2P-SCAN Analysis | Computational assessment of molecular and pathway conservation | Sequence alignment, pathway mapping, conservation scoring | Sequence similarity scores, pathway conservation metrics | Extends assessment beyond empirically studied taxa; provides mechanistic basis for absence |
| Bayesian Network Modeling of KERs | Quantitative probabilistic assessment of AOP functionality | Bayesian inference, sensitivity analysis, predictive modeling | Conditional probabilities, confidence intervals, sensitivity indices | Quantifies uncertainty in absence designations; identifies most impactful knowledge gaps |
Table 3: Research Reagent Solutions for KER Taxonomic Conservation Studies
| Reagent/Tool Category | Specific Examples | Function in Absence Determination | Key Considerations for Use |
|---|---|---|---|
| Cross-Species Antibody Panels | Phospho-specific antibodies conserved across taxa, domain-targeted antibodies | Detection of protein expression/post-translational modifications across species | Validate cross-reactivity for each taxon; consider epitope conservation |
| Conserved Molecular Probes | Fluorescent in situ hybridization (FISH) probes for conserved gene regions, activity-based protein profiling probes | Visualization of gene expression/protein activity patterns across taxa | Design against most conserved regions; test specificity in each taxon |
| Taxonomic-Broad PCR Primers | Degenerate primers for conserved functional domains, universal primer sets for gene families | Amplification of target sequences from diverse taxa for comparative analysis | Optimize annealing temperatures for broad specificity; include positive controls |
| Pathway Activity Reporters | Conserved response element-driven luciferase constructs, pathway activation biosensors | Functional assessment of pathway activity/conservation across cell types from different species | Normalize for transfection efficiency/species-specific cellular properties |
| Reference Tissue Banks | Multi-species tissue collections, cell line repositories from diverse taxa | Provide biological materials for comparative studies across under-represented taxa | Ensure proper preservation methods; document taxonomic verification |
Table 4: Computational Tools for Taxonomic Conservation Assessment
| Tool Name | Primary Function | Application in Absence Determination | Access/Reference |
|---|---|---|---|
| SeqAPASS | Protein sequence comparison across species to predict chemical susceptibility [9] | Assess conservation of molecular targets across taxa; identify taxa likely lacking specific targets | https://seqapass.epa.gov/seqapass/ [9] |
| G2P-SCAN | Evaluate biological pathway conservation from human gene inputs across model species [15] [9] | Determine if entire pathways (not just individual components) are conserved across taxa | R package v0.0.1.0 [9] |
| AOP-Wiki | Collaborative repository of adverse outcome pathways with taxonomic applicability information [15] | Access existing knowledge on AOP conservation; identify knowledge gaps for specific taxa | https://aopwiki.org/ |
| PhyloTree | Interactive visualization and analysis of phylogenetic relationships | Provide evolutionary context for absence patterns; identify clade-specific conservation | Multiple implementations available |
| Taxonomic Domain Mapper | Custom tool for visualizing tDOA across phylogenetic trees (conceptual) | Visual representation of conservation patterns and knowledge gaps across taxonomy | Development recommended based on [15] |
The integration of absence determination methodologies into AOP development follows a systematic workflow that enhances the confidence in tDOA specification. This approach is exemplified by recent research extending the tDOA for AOP 207 involving reproductive toxicity of silver nanoparticles via oxidative stress in Caenorhabditis elegans [15].
Integrated AOP Development Workflow with Absence Ambiguity Resolution
The application of absence determination methodologies is illustrated by the extension of AOP 207 (NADPH oxidase and P38 MAPK activation leading to reproductive failure in Caenorhabditis elegans) to a broader taxonomic domain [15].
Experimental Approach:
Key Findings:
Quantitative Outcomes:
Table 5: Quantitative Results from AOP 207 Taxonomic Extension Study [15]
| Taxonomic Group | Initial Evidence Base | Absence Determination Outcome | Extended tDOA Inclusion | Confidence Level |
|---|---|---|---|---|
| Fungi | Limited direct studies | SeqAPASS indicated protein conservation; G2P-SCAN suggested pathway functionality | Included (98 species) | Medium (computational evidence) |
| Birds | Some ecotoxicology data | Phylogenetic analysis suggested conservation; limited empirical confirmation | Included (28 species) | Medium-High |
| Fish | Substantial ecotoxicology literature | Strong evidence of pathway conservation; some species-specific variations | Included (not quantified in source) | High |
| Insects (beyond Drosophila) | Limited studies | Computational prediction suggested conservation; empirical data lacking | Conditionally included | Low-Medium |
The rigorous differentiation between true absence and lack of data provides substantial value throughout the drug development pipeline, particularly in safety assessment and species selection for toxicology studies.
Early Discovery Phase Applications:
Preclinical Development Applications:
Regulatory Submission Support:
The integration of absence determination protocols enables development of quantitative decision frameworks for cross-species extrapolation in toxicological assessment.
Confidence Scoring System:
Decision Thresholds:
Implementation in Risk Assessment:
The field of absence determination in taxonomic analysis is evolving rapidly with several promising directions for methodological advancement:
High-Throughput Experimental Approaches:
Advanced Computational Integration:
Knowledge Synthesis Frameworks:
Based on the methodologies and applications presented, the following recommendations emerge for researchers addressing absence ambiguity in KER taxonomic conservation:
Methodological Recommendations:
Integration Recommendations:
Translational Recommendations:
The systematic differentiation between true absence and lack of data represents more than a technical challenge—it constitutes a fundamental requirement for robust, reliable, and responsible extrapolation of adverse outcome pathways across taxonomic boundaries. By implementing the rigorous methodologies outlined in this framework, researchers can transform absence ambiguity from a source of uncertainty to a structured component of evidence-based decision-making in toxicological science and drug development.
The taxonomic domain of applicability (tDOA) defines the biological space—the species and taxa—within which a defined Key Event Relationship (KER) is considered biologically plausible [9]. Within the Adverse Outcome Pathway (AOP) framework, which structures mechanistic toxicological knowledge from a Molecular Initiating Event (MIE) to an Adverse Outcome (AO), accurately defining the tDOA is critical for reliable cross-species extrapolation in ecological and human health risk assessment [15]. A persistent and central challenge is managing the inherent tension between claiming a broad tDOA to maximize the utility of existing data for prediction and ensuring biological realism by acknowledging evolutionary divergence and taxonomic specificity.
Overly broad tDOA claims, while useful for screening-level assessments, risk generating false positives in toxicity predictions by assuming pathway conservation where it does not exist. Conversely, an overly narrow tDOA can lead to false negatives and a failure to protect susceptible species, unnecessarily complicating risk assessment and demanding extensive new animal testing [9]. This technical guide, framed within the broader thesis on KER taxonomic conservation, addresses this challenge. It provides a methodological roadmap for researchers and drug development professionals to systematically evaluate, evidence, and bound the tDOA of their KERs. The goal is to achieve a defensible balance that supports the use of New Approach Methodologies (NAMs) while maintaining scientific credibility and regulatory acceptance [15] [9].
A KER's tDOA exists on a spectrum. At one end are highly conserved relationships, often rooted in fundamental cellular processes (e.g., oxidative phosphorylation, DNA repair) shared across vast taxonomic groups. At the other end are taxon-specific KERs, dependent on unique anatomical features, receptor subtypes, or metabolic pathways found only in certain clades. The core task is to determine where a given KER falls on this spectrum. This requires moving beyond assumptions based solely on phylogenetic relatedness and towards evidence-based assessments of the conservation of the specific molecular targets and pathways involved [9].
The consequences of insufficient specificity are not merely theoretical. In population genetics, analogous problems arise when inferring range expansions from genetic data. Boundary effects in spatially structured populations—where genetic drift is stronger at distribution edges—can create clinal patterns in genetic indices (like the directionality index, ψ) that mimic the signatures of a true range expansion, leading to high false positive rates if not properly accounted for [39]. This is a powerful analogue for tDOA assessment: a superficial pattern (e.g., a toxic response in several tested species) can create a false signal of broad conservation. Just as population geneticists must normalize ψ against overall genetic structuring (FST) to identify true expansion signals [39], toxicologists must evaluate KER conservation against the background of known molecular and pathway divergence to avoid over-extrapolation.
A tDOA can be supported by two complementary lines of evidence:
A robust tDOA description requires both. The biologically plausible tDOA, often wider than the empirically demonstrated one, must be carefully constructed and transparently documented to prevent overreach [15].
A systematic, multi-tool approach is essential to manage specificity. The following workflow integrates established computational NAMs to build a weight of evidence.
1. Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS):
2. Genes-to-Pathways Species Conservation Analysis (G2P-SCAN):
The outputs from SeqAPASS and G2P-SCAN must be integrated with existing empirical data and AOP knowledge.
A 2024 study provides a paradigm for this balanced approach. Researchers aimed to extend the tDOA of AOP 207, which describes silver nanoparticle (AgNP)-induced reproductive toxicity via oxidative stress in C. elegans [15].
1. Initial Position: The empirical tDOA was narrow: primarily C. elegans and some limited in vitro human data.
2. Methodology Application:
3. Outcome: The integrated computational analysis allowed the authors to propose a biologically plausible tDOA extending to over 100 taxonomic groups, including fungi, birds, and rodents, far beyond the empirically tested ones. This extension was not a blanket claim but was supported by specific evidence on pathway and target conservation [15].
Table 1: Key Quantitative Outcomes from the AgNP AOP Case Study [15]
| Analysis Component | Tool/Method Used | Key Quantitative Output | Interpretation for tDOA |
|---|---|---|---|
| Molecular Target Conservation | SeqAPASS | High sequence identity (>80%) and functional domain conservation for oxidative stress targets across diverse taxa. | Supported inclusion of vertebrates and invertebrates in the plausible tDOA. |
| Pathway Conservation | G2P-SCAN | High Pathway Conservation Score (PCS > 0.7) for "Cellular response to stress" in core model species (zebrafish, fruit fly). | Indicated that the overarching pathway is functionally conserved, strengthening cross-species plausibility. |
| KER Confidence | Bayesian Network | Probabilistic strength for key KERs (e.g., "Oxidative stress leads to apoptosis") was robust when integrating human in vitro and C. elegans in vivo data. | Provided quantitative confidence for extrapolating the KER structure across species within the proposed tDOA. |
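The way the lines of evidence in the table combine can be sketched as a simple tiering function. The cut-offs reuse the figures quoted in the table (sequence identity above 80%, pathway score above 0.7) purely as illustrative defaults. The record fields, the tier names, and the example values are assumptions made here, and real tool exports will be structured differently.

```python
def tier_taxa(records, min_identity=80.0, min_pathway_score=0.7):
    """Sort taxa into evidence tiers for a proposed tDOA."""
    tiers = {"empirical": [], "plausible": [], "unsupported": []}
    for rec in records:
        target_ok = rec["percent_identity"] > min_identity
        pathway_ok = rec["pathway_score"] > min_pathway_score
        if rec["empirical_support"]:
            # Direct experimental evidence outranks computational prediction.
            tiers["empirical"].append(rec["taxon"])
        elif target_ok and pathway_ok:
            # Both computational lines agree: biologically plausible tDOA.
            tiers["plausible"].append(rec["taxon"])
        else:
            tiers["unsupported"].append(rec["taxon"])
    return tiers


# Hypothetical values for illustration only; not results from the cited study.
records = [
    {"taxon": "taxon_A", "percent_identity": 100.0, "pathway_score": 1.0, "empirical_support": True},
    {"taxon": "taxon_B", "percent_identity": 86.0, "pathway_score": 0.8, "empirical_support": False},
    {"taxon": "taxon_C", "percent_identity": 62.0, "pathway_score": 0.9, "empirical_support": False},
]
print(tier_taxa(records))
```

Reporting the empirical and plausible tiers separately keeps the wider, computationally supported tDOA transparent about which taxa still lack direct confirmation.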
Table 2: Research Reagent Solutions for tDOA Investigations
| Item / Resource | Category | Function in tDOA Management |
|---|---|---|
| SeqAPASS (Web Tool) | Computational NAM | Predicts protein target conservation and potential chemical susceptibility across species using sequence and structural data [9]. |
| G2P-SCAN (R Package) | Computational NAM | Evaluates conservation of biological pathways (Reactome) from human gene sets across model species, providing a systems-level view [15] [9]. |
| AOP Wiki | Knowledge Repository | Central database for published AOPs and KERs; provides the structured framework to which tDOA evidence must be anchored [15]. |
| Comparative Tissue Biobanks | Biological Material | Provide preserved tissues from multiple species for in vitro or ex vivo assays (e.g., receptor binding, gene expression) to generate empirical conservation data. |
| Phylogenetic Analysis Software | Computational Tool | Allows construction of phylogenetic trees based on target gene sequences, visually contextualizing conservation data within evolutionary relationships. |
| Defined Reference Chemicals | Chemical Reagent | Chemicals with well-characterized, specific modes of action (agonists, antagonists) are essential for testing KER performance across different species' models. |
Purpose: To empirically test a specific Molecular Initiating Event (e.g., receptor activation) across species to bound the tDOA.
Materials: Cell lines or primary cells from multiple species (human, rat, zebrafish, etc.), reference agonist/antagonist, reporter assay kit (e.g., luciferase), cell culture reagents.
Procedure:
Purpose: To computationally propose a biologically plausible tDOA using SeqAPASS and G2P-SCAN.
Materials: Protein sequence of the key molecular target (FASTA format), list of human genes comprising the KER pathway, access to SeqAPASS web tool and G2P-SCAN R package.
Procedure:
Balancing broad tDOA claims with biological realism is not a one-time exercise but a dynamic, evidence-driven process. By adopting the integrated methodological framework presented here—leveraging computational NAMs like SeqAPASS and G2P-SCAN within the AOP paradigm—researchers can replace assumption-based extrapolation with evidence-bounded extrapolation. The resulting tDOA statements are both more scientifically defensible and more useful for regulatory application. They enable confident use of data across species where justified, flag potential vulnerabilities in untested taxa, and strategically focus precious resources for empirical testing on true taxonomic boundaries. In doing so, they advance the core mission of KER taxonomic conservation research: to build a predictive toxicology capable of protecting biological diversity based on a deep understanding of biological unity and difference.
The expansion of the taxonomic domain of applicability (tDOA) for Adverse Outcome Pathways (AOPs) is a cornerstone for advancing next-generation, animal-sparing risk assessment. This requires the strategic integration of disparate evidence streams. This technical guide details a cohesive methodology for harmonizing in vitro bioactivity, in vivo phenotypic anchoring, and in silico cross-species extrapolation data. Framed within research on Key Event Relationship (KER) taxonomic conservation, we present a stepwise workflow from target identification to tDOA validation. The protocol leverages computational New Approach Methodologies (NAMs), including the SeqAPASS and G2P-SCAN tools, to build a weight-of-evidence for pathway conservation [9] [15]. A case study on antidiabetic phytochemicals demonstrates the quantitative correlation of in vitro enzyme inhibition (IC₅₀: 55.08–246.5 μg/mL) with in silico binding affinity and positive in vivo outcomes [40]. This integrative framework provides researchers and drug development professionals with a standardized, predictive approach for establishing ecologically and toxicologically relevant tDOAs.
The Adverse Outcome Pathway (AOP) framework provides a mechanistic bridge between a Molecular Initiating Event (MIE) and an Adverse Outcome (AO) through a series of Key Events (KEs). A critical, yet often poorly defined, element of an AOP is its Taxonomic Domain of Applicability (tDOA)—the range of species for which the described KERs are biologically plausible [15]. Explicitly defining the tDOA is essential for the reliable application of AOPs in chemical safety assessment across ecological and human health contexts under a One Health perspective [15].
Traditional tDOA definition relies on limited in vivo toxicity data from standard model organisms, creating significant uncertainty for extrapolation. Research into KER taxonomic conservation seeks to solve this by determining which key relationships in a toxicological pathway are conserved across phylogeny. This demands the integration of diverse data types:
Harmonizing these disparate data streams into a cohesive evidentiary package is the key to robustly and confidently expanding tDOAs, thereby reducing dependency on animal testing and improving risk predictions for untested species [9] [15].
A robust tDOA assessment is built on three pillars of evidence, each with standardized protocols.
In vitro systems provide the foundational data on chemical-target interactions.
Computational tools predict the conservation of molecular targets and pathways.
In vivo studies confirm the pathway leading from the MIE to the AO.
Table 1: Quantitative Data from an Integrated Antidiabetic Study [40]
| Assay Type | Target/Endpoint | Test Compound | Result (Mean ± SD or IC₅₀) | Control (Acarbose) |
|---|---|---|---|---|
| In Vitro | α-Amylase Inhibition | Cicer arietinum extract | 55.08 μg/mL | 196.3 ± 10 μg/mL |
| In Vitro | α-Amylase Inhibition | Hordeum vulgare extract | 115.8 ± 5 μg/mL | 196.3 ± 10 μg/mL |
| In Vitro | α-Glucosidase Inhibition | Cicer arietinum extract | 100.2 ± 5 μg/mL | 246.5 ± 10 μg/mL |
| In Vitro | α-Glucosidase Inhibition | Hordeum vulgare extract | 216.2 ± 5 μg/mL | 246.5 ± 10 μg/mL |
| In Silico | Molecular Docking (α-Amylase) | Medicagol | Strong binding affinity (specific score not provided) | N/A |
| In Vivo | Blood Glucose Reduction | C. arietinum extract | Significant reduction in STZ-mice | N/A |
| In Vivo | Antioxidant Activity (Liver) | C. arietinum extract | Increased SOD, CAT, GSH; decreased MDA | N/A |
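The in vitro rows of the table can be compared on a common scale by expressing each extract's IC₅₀ relative to the acarbose control in the same assay (a lower IC₅₀ means higher potency, so a ratio above 1 means the extract was more potent than the control). The short calculation below uses only the mean values from the table. The ratio itself is a convenience introduced here, not a metric reported in the source, and it ignores the reported standard deviations.

```python
# Mean IC50 values (ug/mL) copied from the table: (assay, extract) -> (extract, acarbose).
ic50 = {
    ("alpha-amylase", "Cicer arietinum"): (55.08, 196.3),
    ("alpha-amylase", "Hordeum vulgare"): (115.8, 196.3),
    ("alpha-glucosidase", "Cicer arietinum"): (100.2, 246.5),
    ("alpha-glucosidase", "Hordeum vulgare"): (216.2, 246.5),
}


def relative_potency(extract_ic50, control_ic50):
    """Control IC50 divided by extract IC50; values above 1 favour the extract."""
    if extract_ic50 <= 0 or control_ic50 <= 0:
        raise ValueError("IC50 values must be positive")
    return control_ic50 / extract_ic50


potency = {key: relative_potency(*values) for key, values in ic50.items()}
for (assay, extract), ratio in potency.items():
    print(f"{assay:18s} {extract:16s} {ratio:.2f}x acarbose")
```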
The individual data streams must be logically synthesized. A Bayesian network (BN) modeling approach is particularly effective for integrating heterogeneous data and managing uncertainty in KERs [15].
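To make the idea concrete, the following sketch treats a linear chain of two key events and an adverse outcome (KE1 → KE2 → AO) as a tiny Bayesian network and evaluates it by brute-force enumeration. All probabilities are hypothetical placeholders, and dedicated software would normally be used for networks of realistic size.

```python
from itertools import product


def chain_inference(p_ke1, p_ke2_given_ke1, p_ao_given_ke2):
    """Return P(AO=1) and P(KE1=1 | AO=1) for the chain KE1 -> KE2 -> AO."""
    p_ao = 0.0
    p_ao_and_ke1 = 0.0
    for ke1, ke2 in product((0, 1), repeat=2):
        p1 = p_ke1 if ke1 else 1.0 - p_ke1
        p2 = p_ke2_given_ke1[ke1] if ke2 else 1.0 - p_ke2_given_ke1[ke1]
        # Probability of this (KE1, KE2) path ending in the adverse outcome.
        p_path = p1 * p2 * p_ao_given_ke2[ke2]
        p_ao += p_path
        if ke1:
            p_ao_and_ke1 += p_path
    if p_ao == 0.0:
        raise ValueError("adverse outcome has zero probability under these inputs")
    return p_ao, p_ao_and_ke1 / p_ao


# Hypothetical conditional probability tables, keyed by parent state (0 or 1).
p_ao, p_ke1_given_ao = chain_inference(
    p_ke1=0.9,
    p_ke2_given_ke1={1: 0.8, 0: 0.1},
    p_ao_given_ke2={1: 0.7, 0: 0.05},
)
print(round(p_ao, 4), round(p_ke1_given_ao, 4))
```

In a cross-species setting, the conditional probability tables are where species-specific evidence enters: a weakly supported KER in a given taxon translates into a flatter table and therefore a less confident prediction of the adverse outcome.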
Table 2: tDOA Extension for a Hypothetical AOP [15]
| Ecological Compartment | Initial tDOA (Empirical Data) | Extended tDOA (In Silico Prediction) |
|---|---|---|
| Terrestrial | Caenorhabditis elegans, Drosophila melanogaster | Fungi (98 species), Birds (28 species), Rodents, Reptiles |
| Aquatic | Danio rerio (zebrafish) | Bony fishes (multiple orders), Amphibians |
Integrated Data Workflow for tDOA
Cross-Species AOP Extrapolation via tDOA
Table 3: Key Research Reagent Solutions for Integrated tDOA Studies
| Item | Category | Function in tDOA Research | Example/Supplier |
|---|---|---|---|
| α-Amylase/α-Glucosidase Assay Kits | In Vitro Reagent | Measures inhibitory potential of chemicals against carbohydrate-digesting enzymes, defining potency for an MIE [40]. | Sigma-Aldrich, Global Scientific [40] |
| Streptozotocin (STZ) | In Vivo Reagent | Chemical inducer of diabetes in rodent models, used for phenotypic anchoring of metabolic disruptors [40]. | Sigma-Aldrich [40] |
| Acarbose | In Vitro/In Vivo Control | Standard inhibitor drug used as a positive control in enzyme inhibition assays and in vivo studies [40]. | Pharmaceutical grade |
| AutoDock Vina, GOLD | In Silico Software | Performs molecular docking to predict binding affinity and mode of a ligand to a protein target, informing the MIE [40]. | Open Source / Commercial |
| SeqAPASS Web Tool | In Silico Tool | Predicts protein susceptibility and conservation across species via sequence alignment, core to tDOA expansion [9] [15]. | US EPA (Publicly available) |
| G2P-SCAN R Package | In Silico Tool | Evaluates the conservation of entire biological pathways across model species, providing systems-level evidence [9] [15]. | Publicly available |
| Bayesian Network Software | Data Analysis Tool | Integrates probabilistic data from different streams to model KERs and quantify uncertainty [15]. | e.g., Netica, GeNIe (Commercial & Open Source) |
| Reference Protein Structures | In Silico Data | High-resolution 3D structures (e.g., from PDB ID 1B2Y for α-amylase) are essential for molecular docking studies [40]. | RCSB Protein Data Bank |
The harmonization of in vitro, in vivo, and in silico evidence is not sequential but iterative. In silico predictions can prioritize in vitro testing on non-standard species cell lines, the results of which can refine computational models. The ultimate goal is a predictive, pathway-based framework where a well-defined MIE and its associated KERs, supported by strong conservation evidence, can be used to anticipate AOs in a wide range of species within the tDOA with high confidence.
Future advancements will depend on:
By adopting the integrative framework outlined here, researchers can systematically build and expand the tDOA of AOPs, transforming them from descriptive models into powerful, predictive tools for chemical safety evaluation in the 21st century.
Evaluating the taxonomic domain of applicability (tDOA) of an Adverse Outcome Pathway (AOP) is a fundamental challenge in modern regulatory toxicology and chemical safety assessment. The tDOA defines the range of species for which the causal relationships described within an AOP—a structured sequence of events linking a molecular perturbation to an adverse outcome—are considered biologically plausible and operative [41]. Establishing a robust tDOA is critical for cross-species extrapolation, a core component of the One Health approach that seeks to protect human, animal, and environmental health in an integrated manner [41].
The central building block of an AOP is the Key Event Relationship (KER), which describes a scientifically supported, causal link between an upstream and a downstream Key Event (KE) [11] [10]. The confidence in any AOP, and by extension its tDOA, hinges entirely on the collective weight of evidence for its constituent KERs [11] [10]. This technical guide proposes the application of a modified set of Bradford-Hill (BH) "viewpoints"—originally formulated for epidemiological causation—as a rigorous, structured framework to assess the evidence supporting the taxonomic conservation of KERs [42] [43]. By adapting this framework to the context of comparative biology and pathway conservation, researchers can systematically evaluate tDOA evidence, moving beyond assumptions based solely on phylogenetic proximity to a more mechanistic, evidence-based determination of applicable taxa.
Sir Austin Bradford Hill proposed nine "viewpoints" (often termed "criteria") to guide the assessment of whether an observed association might reflect a causal relationship [44]. He emphasized they were not a checklist but considerations to weigh [42] [45]. Modern causal thinking, built on the potential outcomes framework, has refined the application and interpretation of these viewpoints [42] [45]. For the specialized task of evaluating tDOA, a subset of these viewpoints is particularly relevant, and their interpretation requires modification to address questions of biological conservation across species.
The following table outlines the traditional BH viewpoints, their modern reinterpretation in light of contemporary causal inference frameworks like Directed Acyclic Graphs (DAGs) and Sufficient-Component Cause (SCC) models, and their proposed modification for application to tDOA assessment for KERs [42] [43] [45].
Table 1: Modification of Bradford-Hill Viewpoints for tDOA Assessment of Key Event Relationships
| Bradford-Hill Viewpoint | Modern Interpretation & Role in Causal Inference | Modified Application to KER Taxonomic Conservation (tDOA) |
|---|---|---|
| Strength of Association | A strong association is less likely to be fully explained by unmeasured confounding. Statistical significance and effect size are considered [43]. | The degree of evolutionary conservation of the molecular sequence (e.g., protein target) and the functional response of the intervening biological pathway across taxa. Strong, conserved sequence-structure-function relationships support a broader tDOA. |
| Consistency | Reproducible findings across different studies, locations, and populations. In modern practice, consistency is also sought across different types of evidence (e.g., epidemiological, in vitro, in vivo) [43]. | Observation of the KER (upstream KE leads to downstream KE) across multiple, taxonomically diverse species. Consistency in the direction and essential nature of the relationship strengthens tDOA evidence. |
| Specificity | Considered rare in multifactorial disease etiology. A more useful modern concept is the use of "negative controls" or falsification analyses [42] [45]. | Demonstration that the downstream KE does not occur in taxonomic groups where the upstream molecular target or essential pathway component is legitimately absent or non-functional. This helps define the boundaries of the tDOA. |
| Plausibility | Biological plausibility is informed by current knowledge. DAGs and SCC models help articulate plausible mediating pathways and component interactions [42]. | Biological plausibility for conservation is based on established principles of evolutionary biology, comparative genomics, and the essentiality of the pathway for conserved physiological functions. |
| Coherence | The causal interpretation should not conflict with generally known facts of the natural history of the disease [44]. | The hypothesized tDOA should be coherent with known phylogenetic relationships, life histories, and ecological/physiological adaptations of the species in question. |
| Experiment | Evidence from experimental interventions (e.g., randomized trials) provides the strongest support for causality [42]. | Experimental evidence demonstrating that modulation of the upstream KE (e.g., via chemical inhibition, genetic knockout) prevents or alters the downstream KE in multiple species. This is a powerful line of evidence for KER essentiality and conservation. |
| Analogy | Reasoning based on similar, established cause-effect relationships [44]. | Inference of KER conservation in a new taxon based on its established operation in a well-studied surrogate species, considering analogous anatomical structures, physiological processes, and molecular pathways. |
Applying the modified BH viewpoints to tDOA evaluation involves a sequential, evidence-weighted process. The workflow begins with the definition of the KER of interest and proceeds through the assembly and assessment of evidence for the conservation of its biological underpinnings [10] [9].
Diagram 1: Workflow for Applying Modified BH Viewpoints to tDOA Assessment
Step 1: Identify Essential Molecular Target/Pathway
The evaluation begins by deconstructing the KER to identify the essential molecular target(s) (e.g., a specific enzyme, receptor, or ion channel) and the biological pathway that mechanistically links the upstream and downstream Key Events. This is the foundational unit for conservation analysis [9].
Step 2: Assess Taxonomic Conservation of Target
This step investigates the strength, plausibility, and coherence of target conservation. Computational New Approach Methodologies (NAMs) are critical here. The US EPA's SeqAPASS tool analyzes protein sequence and structural similarity across species to predict potential chemical susceptibility, providing a line of evidence for the conservation of the molecular initiating event [9]. Complementary tools like G2P-SCAN map human gene targets to biological pathways (e.g., Reactome) and assess the conservation of those entire pathways across a core set of model species [9]. High sequence similarity in critical functional domains and conservation of core pathway architecture support a broader tDOA.
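As a rough illustration of what a sequence-similarity metric measures, the sketch below computes percent identity between two protein sequences that have already been aligned. It is not the SeqAPASS algorithm, which is accessed through its web interface; the function name, the gap convention, and the toy sequences are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap and res_b == gap:
            continue  # column carries no information for this pair
        compared += 1
        # Equal characters here are always residues, since double gaps were skipped.
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * identical / compared


# Toy pre-aligned fragments (placeholders, not real protein targets).
reference = "ACDE-"
query = "ACDFG"
print(percent_identity(reference, query))
```

Restricting the same calculation to the columns of a functional domain or to known ligand-contacting residues gives a view closer to the domain-level and residue-level comparisons described for tiered assessments.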
Step 3: Evaluate Functional Conservation of KER This step gathers evidence for consistency and experiment. It involves reviewing empirical data demonstrating that the causal relationship described in the KER holds in multiple species. This includes in vivo or in vitro studies showing that perturbation of the upstream KE leads to the downstream KE in taxonomically diverse organisms [11] [43]. Dose-response data (biological gradient) within a species further strengthens the causal claim for that species, while consistent directional effects across species bolster the case for conservation.
Steps 4 and 5: Integrate Evidence and Test Boundaries All evidence is integrated to propose a preliminary tDOA. The final, critical step is to apply the principle of specificity by actively seeking falsifying evidence. Are there taxonomic groups related to those within the proposed tDOA that genuinely lack the molecular target or pathway? If the KER is claimed to be broadly conserved, evidence of its absence in a well-studied species (where confounding factors are ruled out) would sharply delineate the tDOA boundary [42].
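The five steps can be read as a bookkeeping exercise over evidence records. The sketch below is one hypothetical way to record evidence per taxon and modified BH viewpoint and then propose a preliminary tDOA. The `Evidence` record, the `propose_tdoa` function, the two-viewpoint threshold, and the example taxa are all illustrative assumptions; they are not part of AOP-Wiki, SeqAPASS, or any published scoring scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    taxon: str       # taxon the evidence refers to
    viewpoint: str   # modified BH viewpoint, e.g. "plausibility"
    supports: bool   # True if the evidence supports KER conservation


def propose_tdoa(records, min_viewpoints=2):
    """Return (included, excluded) taxa for a preliminary tDOA.

    A taxon is included when at least `min_viewpoints` distinct viewpoints
    support conservation (an assumed, illustrative threshold). Any negative
    "specificity" finding excludes the taxon and marks a tDOA boundary.
    """
    support = {}
    excluded = set()
    for rec in records:
        if rec.supports:
            support.setdefault(rec.taxon, set()).add(rec.viewpoint)
        elif rec.viewpoint == "specificity":
            excluded.add(rec.taxon)
    included = sorted(
        taxon
        for taxon, viewpoints in support.items()
        if len(viewpoints) >= min_viewpoints and taxon not in excluded
    )
    return included, sorted(excluded)


# Placeholder records for illustration only; not results from any study.
records = [
    Evidence("zebrafish", "plausibility", True),
    Evidence("zebrafish", "experiment", True),
    Evidence("mouse", "plausibility", True),
    Evidence("fruit fly", "specificity", False),
]
included, excluded = propose_tdoa(records)
print(included, excluded)
```

Keeping the falsification result separate from the supporting evidence mirrors the workflow: a single well-founded negative finding sets a boundary regardless of how much computational support exists.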
Generating and compiling evidence for tDOA requires both empirical biology and bioinformatics. The following table details key experimental and computational protocols relevant to assessing different BH viewpoints.
Table 2: Protocols for Generating tDOA Evidence Aligned with BH Viewpoints
| BH Viewpoint | Experimental/Computational Protocol | Objective & Relevance to tDOA |
|---|---|---|
| Strength & Plausibility | SeqAPASS Analysis: Input the amino acid sequence of the protein target from a reference species (e.g., human). The tool performs a three-tiered assessment (Levels 1 to 3) comparing primary sequence, functional domain, and critical residue (e.g., active site) conservation across species in its database [9]. | Provides quantitative data on protein conservation. High percent identity/alignment scores in functional domains provide strength for the hypothesis of conserved molecular interaction. The biological plausibility of extrapolation is grounded in evolutionary biology. |
| Plausibility & Coherence | G2P-SCAN Pathway Analysis: Input human gene symbols for the molecular target and associated pathway components. The tool maps genes to Reactome pathways and evaluates the conservation of the pathway architecture and gene-content across seven core model organisms [9]. | Moves beyond single-protein conservation to assess the plausibility of the entire intervening pathway being conserved. Results should cohere with known phylogenetic relationships and physiological adaptations. |
| Consistency & Experiment | Multi-Species In Vitro Assay: Employ standardized cell-based assays (e.g., reporter gene assays, high-content imaging) using primary cells or cell lines from multiple species to measure the downstream KE response to modulation of the upstream KE [43] [41]. | Provides direct experimental evidence for the functional operability of the KER across species. Consistency in the response direction and potency across taxa is powerful supporting evidence. |
| Experiment & Analogy | Essentiality Testing (e.g., CRISPR/Cas9): Use genetic knockout or knockdown of the upstream KE target in embryo or adult models of multiple species (e.g., zebrafish, mouse) and assess the impact on the downstream KE and adverse outcome [11]. | Provides the strongest possible experimental evidence for the KER's essential role. Successful analogy from one model organism to another is supported if the same intervention produces comparable phenotypic results. |
| Specificity | Negative Control / Falsification Analysis: Intentionally investigate species groups (e.g., insects, mollusks for a vertebrate-specific hormone receptor) where the target pathway is known to be absent or fundamentally different. Confirm the absence of the KER response [42]. | Actively tests the boundaries of the tDOA. The absence of effect where the mechanism is absent provides high confidence in the specificity of the KER to taxa possessing the conserved mechanism. |
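
To make the "percent identity" metric in the table concrete, the sketch below computes it for two sequences that have already been aligned. This is a simplified illustration, not the SeqAPASS algorithm: the function name `percent_identity`, the toy sequences, and the choice to skip gapped columns in the denominator are assumptions (other conventions divide by alignment length or by the shorter sequence).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must be pre-aligned strings of equal length that use "-"
    for gaps. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped column: not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy sequences for illustration only.
print(percent_identity("MKT-LV", "MKSALV"))
```

Restricting the same calculation to the columns of a functional domain or binding site gives a rough analogue of the domain-level and residue-level comparisons described above.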
The integration of computational and empirical data is best visualized in a converging lines-of-evidence model.
Diagram 2: Integration of Evidence Streams for tDOA Confidence
Successfully applying this framework requires a suite of specialized databases, software tools, and experimental resources.
Table 3: Research Toolkit for tDOA Evidence Evaluation
| Tool/Resource Name | Type | Primary Function in tDOA Evaluation | Relevant BH Viewpoints |
|---|---|---|---|
| SeqAPASS (Sequence Alignment to Predict Across Species Susceptibility) | Computational Tool / NAM | Predicts potential chemical susceptibility across species by analyzing conservation of protein sequences, functional domains, and active sites [9]. | Strength, Plausibility |
| G2P-SCAN (Genes to Pathways - Species Conservation Analysis) | Computational Tool / NAM | Maps human gene sets to biological pathways and evaluates the conservation of those pathways across seven model species, providing pathway-level context [9]. | Plausibility, Coherence, Analogy |
| AOP-Wiki (aopwiki.org) | Knowledgebase | The central repository for published AOPs, KEs, and KERs. Provides the structured descriptions and existing evidence that form the starting point for tDOA analysis [11] [10]. | All (Foundation) |
| Reactome (reactome.org) | Pathway Database | A curated, peer-reviewed database of human biological pathways. Serves as a reference for pathway architecture used by tools like G2P-SCAN to assess conservation [9]. | Plausibility |
| OECD AOP Developers' Handbook | Guidance Document | Provides formal guidance on AOP development, including weight-of-evidence assessment and considerations for taxonomic applicability [10]. | All (Framework) |
| In vitro Bioactivity Data (e.g., ToxCast/Tox21) | Empirical Data | High-throughput screening data showing chemical effects on molecular targets. Can be used to identify potential molecular initiating events and assess conservation of target response [9]. | Experiment, Consistency |
| Ortholog Databases (e.g., Ensembl Compara, OrthoDB) | Bioinformatics Database | Provide predictions of orthologous genes (genes diverged after a speciation event) across species, which are crucial for correct cross-species comparisons [9]. | Strength, Plausibility, Coherence |
The determination of a toxicological pathway's taxonomic domain of applicability is a critical inference with major implications for ecological risk assessment, chemical regulation, and the reduction of animal testing through cross-species extrapolation. By adapting the time-tested Bradford-Hill viewpoints to the specific question of KER conservation, researchers gain a structured, transparent, and scientifically defensible framework for tDOA evaluation. This modified approach moves beyond qualitative guesswork, demanding convergent evidence from computational predictions of conservation (strength, plausibility), empirical demonstrations of functional operability across species (consistency, experiment), and deliberate testing of proposed boundaries (specificity). As the AOP knowledgebase expands and computational NAMs become more sophisticated, the systematic application of this framework will be essential for building confidence in pathway-based safety assessments and realizing the promise of 21st-century toxicology.
In scientific assessments for environmental conservation, human health, and drug development, decision-making is rarely supported by a single, definitive study. Instead, it relies on synthesizing multiple lines of evidence of varying types and quality [46]. The Weight of Evidence (WoE) approach is a structured process for integrating this diverse evidence to determine the relative support for possible answers to a scientific or risk assessment question [47]. Critically, a robust WoE argument does not choose between quantitative (quant) and qualitative (qual) data but strategically integrates both to leverage their complementary strengths [48] [49].
Quantitative data provides objective, numerical measurements that answer "how many," "how much," or "how often," enabling statistical analysis and generalization [50] [51]. Qualitative data provides descriptive, contextual information that explores "why" and "how," uncovering meanings, mechanisms, and subjective experiences [48] [49]. In the context of Key Event Relationship (KER) taxonomic conservation research—which investigates the relationships between stressors, biological key events, and adverse outcomes in species and ecosystems—this integration is paramount. Conservation decisions must consider not only population statistics (quantitative) but also behavioral observations, genetic purity, and ecological context (qualitative) [52] [53].
This guide outlines a framework for building a defensible WoE argument by systematically assembling, weighing, and integrating quantitative and qualitative evidence, with a focus on applications in taxonomic conservation and biomedical research.
Understanding the inherent characteristics and appropriate applications of each data type is the first step in their integration.
Table 1: Core Characteristics of Quantitative and Qualitative Data and Research [48] [49] [50].
| Characteristic | Quantitative Data & Research | Qualitative Data & Research |
|---|---|---|
| Nature of Data | Numerical, measurable, countable [50]. | Descriptive, involving words, images, or observations [51]. |
| Core Question | What? How many? How much? How often? [48]. | Why? How? What is the experience? [48]. |
| Research Goal | To test hypotheses, measure variables, establish patterns, and generalize [51]. | To explore ideas, understand concepts, experiences, and generate deep insights [51]. |
| Sample & Design | Large samples for statistical power; structured and predetermined design [51]. | Small, focused samples for depth; flexible and iterative design [51]. |
| Collection Methods | Surveys, experiments, structured observations, analysis of existing metrics [50]. | In-depth interviews, focus groups, participant observation, open-ended surveys [51]. |
| Analysis Approach | Statistical analysis to identify relationships, differences, and trends [51]. | Thematic, content, or discourse analysis to identify patterns, themes, and narratives [51]. |
| Output | Statistical significance, effect sizes, predictive models [51]. | Detailed descriptions, conceptual frameworks, hypotheses, and illustrative quotes [51]. |
Advantages and Limitations: Quantitative data excels at providing objective, generalizable, and statistically testable evidence but may miss contextual nuance and underlying causes [50]. Qualitative data provides rich, explanatory depth and is ideal for exploring complex phenomena but is subject to researcher interpretation and is not statistically generalizable [51]. An integrated WoE approach mitigates these individual limitations by using each data type to address the gaps of the other.
The WoE process is more than a simple tally of studies. It is a transparent and structured methodology for assembling and weighing diverse evidence [46]. Best practice integrates the rigorous, bias-minimizing approach of Systematic Review (SR) with the inferential judgment characteristic of traditional WoE [46].
The European Food Safety Authority (EFSA) guidance outlines a three-step WoE assessment: (1) assembling evidence, (2) weighing evidence, and (3) integrating evidence to reach a conclusion [47]. Integrating SR principles ensures the assembly phase is comprehensive and unbiased.
Table 2: Integrated SR & WoE Framework, Adapted from Classic Approaches [46] [47].
| Assessment Phase | Integrated Activities & Considerations |
|---|---|
| 1. Problem Formulation | Define the specific KER or assessment question. Determine the required lines of evidence (e.g., exposure, toxicity, ecological effect). |
| 2. Assemble Evidence (SR-driven) | Conduct a systematic literature search and screening for all evidence types [46]. Extract data from quant studies (e.g., effect sizes) and qual studies (e.g., themes, mechanistic descriptions). Include grey literature, field data, and expert input where relevant [46]. |
| 3. Weigh Evidence | Evaluate each piece of evidence for reliability (methodological quality, risk of bias), relevance (directness to the KER), and consistency (agreement across studies) [47]. Use predefined scoring criteria or ranking (e.g., high, medium, low confidence). |
| 4. Integrate Evidence | Triangulate findings across qualitative and quantitative lines of evidence. Examine if different data types converge (strengthens conclusion), are complementary (provides complete picture), or contradict (requires resolution). Use formal methods (e.g., meta-analysis for quant data) or structured expert judgment (e.g., Hill's criteria) to draw an inference [46]. |
| 5. Document and Conclude | Clearly state the conclusion (e.g., "The evidence is sufficient/insufficient to support the KER..."). Articulate the uncertainty and the relative contribution of qualitative and quantitative evidence to the conclusion. |
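
Phases 3 and 4 of the table can be sketched as a small scoring routine. The example below is a minimal illustration under stated assumptions: the three-level ordinal scale, the "weakest rating wins" rule in `weigh`, and the signed net score in `integrate` are hypothetical choices made for this sketch and are not prescribed by the EFSA guidance or any cited framework.

```python
RANK = {"low": 1, "medium": 2, "high": 3}
LABEL = {rank: label for label, rank in RANK.items()}


def weigh(reliability, relevance):
    """Weight of one line of evidence: the weaker of its two ratings."""
    return LABEL[min(RANK[reliability], RANK[relevance])]


def integrate(lines):
    """Summarize lines given as (direction, reliability, relevance).

    direction is +1 when the line supports the KER and -1 when it
    contradicts it. Returns a signed net score and whether the lines
    all point the same way.
    """
    if not lines:
        raise ValueError("at least one line of evidence is required")
    net = sum(d * RANK[weigh(rel, rev)] for d, rel, rev in lines)
    directions = {d for d, _, _ in lines}
    pattern = "convergent" if len(directions) == 1 else "conflicting"
    return {"net_score": net, "pattern": pattern}


# Placeholder ratings for illustration only.
summary = integrate([
    (+1, "high", "high"),     # e.g. a quantitative dose-response study
    (+1, "medium", "high"),   # e.g. a qualitative mechanistic description
    (-1, "low", "medium"),    # e.g. a contradictory field observation
])
print(summary)
```

A "conflicting" pattern does not settle the question by arithmetic; in the framework above it signals that the contradiction needs to be resolved and documented in the final phase.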
For assessing causal KERs (e.g., "Does chemical X cause population decline in species Y?"), Bradford Hill's criteria provide a qualitative-quantitative framework for weighing evidence [46]. These include strength of association (quantitative), consistency across studies (quantitative), specificity, temporality, biological gradient (dose-response, quantitative), plausibility (often qualitative, mechanistic evidence), coherence, experiment, and analogy [46]. Not all criteria must be met, but a WoE judgment considers the pattern across them.
Diagram: A Weight of Evidence Integration Framework Based on Hill's Criteria
The conservation challenge of the Australian dingo (Canis familiaris dingo) threatened by hybridization with domestic dogs (C. f. familiaris) exemplifies the need for a WoE approach integrating multiple data types [52].
Assessment Question: What is the genetic purity and conservation status of a dingo population?
Quantitative - Genetic Analysis:
Quantitative - Morphometric Analysis:
Qualitative - Phenotypic (Coat Colour) Assessment:
Qualitative - Behavioral & Ecological Observation:
A robust WoE argument for a management plan (e.g., removing hybrids from a conservation area) would not rely on coat colour alone. It would prioritize high-weight genetic evidence to definitively identify hybrids, use morphometric data from culled animals to validate genetic findings, and employ field observations to understand pack dynamics and the ecological impact of removal. The qualitative evidence provides the "why" for conservation actions (ecological role, cultural value), while the quantitative evidence provides the "how much" and "which ones" for tactical decisions.
Diagram: WoE for Dingo Conservation Integrating Quantitative and Qualitative Lines
Table 3: The Scientist's Toolkit for Dingo Hybridization Assessment [52].
| Research Reagent / Tool | Primary Function | Evidence Type Generated |
|---|---|---|
| Microsatellite or SNP Panel | Genotyping to quantify proportional ancestry of dingo vs. domestic dog. | Quantitative (high reliability). |
| Digital Calipers / 3D Scanner | Precise measurement of skull and skeletal morphological traits. | Quantitative (moderate reliability). |
| Standardized Phenotype Scoring Sheet | Field guide for consistent visual assessment of coat colour, markings, and form. | Qualitative (low-moderate reliability). |
| GPS Collars & Camera Traps | Monitoring movement, pack interactions, and breeding behavior in situ. | Qualitative/Quantitative (high contextual relevance). |
| Pre-European Reference Specimens | Historical baseline (bones, skins) for genetic and morphological comparison. | Quantitative & Qualitative (high relevance, scarce). |
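
The prioritization described for the dingo case, where genetic evidence outranks morphometrics and coat colour, can be expressed as a simple rule. The sketch below is hypothetical: the numeric ranks only encode the relative reliability stated in the toolkit table, and the function `classify_animal` and its output fields are assumptions made for illustration.

```python
# Relative reliability of each evidence line, following the toolkit table
# (genetic: high, morphometric: moderate, coat colour: low-moderate).
RELIABILITY = {"genetic": 3, "morphometric": 2, "coat_colour": 1}


def classify_animal(calls):
    """Pick the call from the most reliable evidence line available.

    `calls` maps an evidence line to its call ("dingo" or "hybrid").
    Disagreement between lines is flagged for expert review.
    """
    if not calls:
        raise ValueError("no evidence lines supplied")
    best_line = max(calls, key=lambda line: RELIABILITY[line])
    return {
        "call": calls[best_line],
        "basis": best_line,
        "needs_review": len(set(calls.values())) > 1,
    }


# Illustrative input: coat colour suggests "dingo", genotyping says "hybrid".
result = classify_animal({"coat_colour": "dingo", "genetic": "hybrid"})
print(result)
```

The rule deliberately keeps the lower-weight call visible through the review flag, so qualitative field evidence still informs the decision without overriding the genetic result.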
The WoE framework is equally critical in biomedical sciences. Assessing the therapeutic potential of a drug or the hazard of a chemical requires integrating evidence across in vitro assays, animal models, and human studies [46].
This process moves beyond a single "key study" to build a convincing, holistic argument for regulatory submission or a conservation management plan, explicitly acknowledging the role and limitations of each type of scientific evidence.
The Adverse Outcome Pathway (AOP) framework is a structured representation that connects a Molecular Initiating Event (MIE), through a series of measurable Key Events (KEs), to an Adverse Outcome (AO) relevant to risk assessment [6]. A critical component for the regulatory application of an AOP is defining its Taxonomic Domain of Applicability (tDOA)—the range of species for which the described causal pathway is biologically plausible [6]. For most developed AOPs, the tDOA is narrowly defined, often limited to the single species (e.g., Apis mellifera, the European honey bee) used in the foundational empirical studies [6]. This presents a significant challenge for ecological risk assessment, which must protect a wide diversity of untested species.
Expanding the tDOA relies on evaluating the structural and functional conservation of KEs and their causal relationships (Key Event Relationships, KERs) across taxa [6]. Bioinformatics tools that leverage publicly available protein sequence data provide a powerful, efficient method to generate evidence for structural conservation. The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool, developed by the U.S. Environmental Protection Agency, is explicitly designed for this purpose [6] [17]. It enables researchers to rapidly extrapolate knowledge of chemical-protein interactions and pathway components from a model species to thousands of others by analyzing protein sequence similarity at multiple levels.
This case study analysis details the process of using SeqAPASS to define the biologically plausible tDOA for a neurotoxic AOP of critical ecological concern: the pathway linking the activation of the nicotinic acetylcholine receptor (nAChR) to colony death/failure in bees (AOP 89) [6]. The analysis demonstrates how computational evidence for protein conservation strengthens the tDOA for individual KEs and KERs, thereby supporting the broader thesis that KER taxonomic conservation is fundamental to credible, widely applicable AOPs for ecological and translational toxicology.
Neonicotinoid insecticides, which target the nAChR, have been implicated in the global decline of pollinator populations [6]. AOP 89 was developed to organize the mechanistic understanding of how the MIE (nAChR activation) leads, through intermediate KEs at cellular, organ, and organism levels, to the AO of colony death/failure in Apis mellifera [6].
The initial, empirically derived tDOA for this AOP was restricted primarily to A. mellifera. However, concerns extend to other managed bees (e.g., Apis cerana) and, importantly, to a wide array of non-Apis bees (e.g., bumble bees and solitary bees), which are also vulnerable to pesticide exposure [6]. To evaluate the potential applicability of this AOP across Hymenoptera, a bioinformatics-driven approach was employed to assess the conservation of nine proteins critical to the KEs and KERs within the pathway [6].
SeqAPASS is a freely available, web-based tool that performs a hierarchical, three-level evaluation of protein conservation to predict potential chemical susceptibility across species [17]. The following protocol, adapted from the tool's detailed methodology, was applied to the bee neurotoxic AOP case study [17].
Step 1: Protein Target Identification Nine proteins integral to the neurotoxic AOP were identified from the AOP-Wiki description (AOP 89). These included the primary molecular target (nAChR subunits) and downstream proteins involved in subsequent KEs [6].
Step 2: Sequence Acquisition and Query Submission For each protein, the primary amino acid sequence from Apis mellifera (the "sensitive" model species) was obtained using a standard NCBI protein accession number. This sequence was submitted as the query to the SeqAPASS tool [17].
Step 3: Tiered Evaluation of Conservation
Step 4: Data Synthesis and tDOA Inference Results from all three levels are synthesized. Conservation of primary sequence, functional domains, and critical residues in a non-target species provides evidence for structural conservation of that KE. When structural conservation is established for proteins across linked KEs, it supports the biological plausibility that the entire KER and AOP may be conserved, thereby expanding the proposed tDOA [6].
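The synthesis in Step 4 can be sketched as a lookup that turns per-level conservation calls into a graded statement of structural evidence. This is a minimal sketch under assumptions: the boolean inputs, the four evidence grades, and the species labels are hypothetical and do not reproduce the SeqAPASS output format or the case study results.

```python
def structural_evidence(level1, level2, level3):
    """Grade structural conservation from three hierarchical calls.

    level1: primary sequence conserved
    level2: functional domain conserved
    level3: critical residues conserved
    """
    if not level1:
        return "none"
    if not level2:
        return "weak"
    if not level3:
        return "moderate"
    return "strong"


def summarize(results):
    """Map each species to its evidence grade for one query protein."""
    return {
        species: structural_evidence(*levels)
        for species, levels in results.items()
    }


# Placeholder calls for one query protein; not data from the case study.
calls = {
    "species_A": (True, True, True),
    "species_B": (True, True, False),
    "species_C": (True, False, False),
    "species_D": (False, False, False),
}
grades = summarize(calls)
print(grades)
```

Repeating the summary for each protein in the pathway gives a per-KE picture of where structural conservation, and therefore the plausible tDOA, begins to narrow.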
The following virtual "reagents" and resources are essential for executing this bioinformatics analysis.
Table 1: Research Reagent Solutions for SeqAPASS-driven tDOA Analysis
| Item | Function/Description | Source/Example |
|---|---|---|
| SeqAPASS Web Tool | Core platform for conducting multi-level protein sequence comparisons and generating susceptibility predictions. | US EPA website (seqapass.epa.gov) [17] |
| Query Protein Sequence(s) | The reference amino acid sequence(s) from the model organism. Serves as the baseline for all cross-species comparisons. | NCBI Protein Database (e.g., Accession XP_016911190.1 for an A. mellifera nAChR subunit) [6] |
| NCBI Databases | Comprehensive, publicly archived repositories for protein sequences, conserved domains, and taxonomic information that form the backend data for SeqAPASS. | National Center for Biotechnology Information |
| AOP-Wiki | Collaborative knowledge base providing the detailed structure of the AOP (MIE, KEs, KERs) and identifying critical proteins for analysis. | aopwiki.org [6] |
| Critical Residue Data | Published empirical or structural data (e.g., from X-ray crystallography) identifying amino acids vital for chemical binding or protein function. | Scientific literature; referenced within AOP-Wiki KER descriptions [6] |
The SeqAPASS analysis of the nine AOP-relevant proteins generated quantitative data on their conservation across various bee species. The summary below illustrates the type of findings generated, which support inferences about the tDOA.
Table 2: SeqAPASS Analysis Summary for Key Proteins in the Bee Neurotoxic AOP [6]
| Protein Target | Role in AOP | Level 1 Conservation (Primary Sequence) | Level 3 Conservation (Critical Residues) | Inference for tDOA |
|---|---|---|---|---|
| nAChR α1 Subunit | MIE: Chemical binding & receptor activation. | High (≥80% identity) across Apis and many non-Apis bees. | Key binding site residues fully conserved across all major bee families. | Strongly conserved. MIE is biologically plausible for a broad bee tDOA. |
| nAChR β1 Subunit | MIE: Part of receptor complex. | High across bees; moderate in more distant Hymenoptera. | Critical residues conserved in bees but not in all insects. | Conserved within bees. Supports bee-specific tDOA for MIE. |
| Voltage-Gated Sodium Channel | KE: Neuronal hyperexcitation. | High sequence similarity across all insects analyzed. | Functional residues critical for channel gating are universally conserved. | Widely conserved. This KE likely has a very broad tDOA (Insecta). |
| Acetylcholinesterase | KE: Synaptic signaling modulation. | High among bees; variable in other taxa. | Active site residues are conserved, but peripheral sites may differ. | Functionally conserved in bees. Supports KERs involving synaptic disruption. |
| Dopamine Receptor | KE: Altered behavior & learning. | Moderate to high among bees. | Binding pocket characteristics are maintained across Apis species. | Likely conserved in Apis. tDOA for behavior-based KERs may be narrower. |
Visualization 1: SeqAPASS Tool Workflow for tDOA Analysis The following diagram illustrates the hierarchical, evidence-building workflow of the SeqAPASS tool as applied in this case study.
Visualization 2: The Neurotoxic AOP for Bees with tDOA Evidence Integration This diagram maps the essential structure of AOP 89, highlighting where SeqAPASS-derived evidence for protein conservation informs the tDOA of specific KEs and KERs.
The case study demonstrates that SeqAPASS provides objective, scalable lines of evidence for the structural conservation of molecular KEs. For AOP 89, results strongly supported the conservation of the MIE (nAChR) across a broad range of bee species, thereby expanding its tDOA beyond Apis mellifera [6]. This directly strengthens the biological plausibility of the upstream KERs within the pathway for these additional species, a core objective of KER taxonomic conservation research.
However, the analysis also revealed nuances. While primary sequence (Level 1) was often highly conserved, critical residue comparisons (Level 3) provided the most direct evidence for predicting functional interaction with neonicotinoids [6]. Furthermore, conservation varied among downstream proteins, suggesting that the tDOA might narrow for certain later-stage KERs (e.g., those involving specific behavioral receptors). This underscores that the tDOA is not necessarily uniform for an entire AOP but must be considered on a KE-by-KE and KER-by-KER basis.
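The KE-by-KE view lends itself to simple set operations. The sketch below assumes, for illustration, that a KER is plausible only in taxa where both of its KEs are plausible, so a KER's tDOA is the intersection of the two KE sets. The function names, the KE labels, and the coarse taxon groups are hypothetical; the groups loosely echo the inferences in Table 2 and are not analysis results.

```python
def ker_tdoa(ke_tdoa, upstream, downstream):
    """Taxa in which both the upstream and downstream KE are plausible."""
    return ke_tdoa[upstream] & ke_tdoa[downstream]


def aop_tdoa(ke_tdoa, pathway):
    """Taxa in which every KE along a non-empty pathway is plausible."""
    return set.intersection(*(ke_tdoa[ke] for ke in pathway))


# Coarse, illustrative taxon groups per KE (assumed for this sketch).
ke_tdoa = {
    "MIE_nAChR_activation": {"Apis", "non-Apis bees"},
    "KE_hyperexcitation": {"Apis", "non-Apis bees", "other insects"},
    "KE_altered_behavior": {"Apis"},
}

early = ker_tdoa(ke_tdoa, "MIE_nAChR_activation", "KE_hyperexcitation")
late = ker_tdoa(ke_tdoa, "KE_hyperexcitation", "KE_altered_behavior")
whole = aop_tdoa(ke_tdoa, list(ke_tdoa))
print(early, late, whole)
```

In this toy example the early KER retains a broad tDOA while the later, behavior-linked KER restricts the whole pathway, which is the pattern the paragraph above describes.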
Integrating SeqAPASS outputs with other New Approach Methodologies (NAMs), such as the G2P-SCAN tool for biological pathway analysis, can create a more robust weight-of-evidence for functional conservation [24]. This combined approach can further refine the biologically plausible tDOA, helping to fulfill the AOP framework's potential in predictive toxicology for both ecological and human health applications [24].
This analysis confirms that bioinformatics tools like SeqAPASS are indispensable for systematically defining the tDOA of AOPs. By providing evidence for the structural conservation of protein targets, the tool moves tDOA descriptions from assertions based on limited empirical data to defensible, evidence-based inferences. For the neurotoxic AOP in bees, SeqAPASS enabled the proposed expansion of the tDOA to include numerous non-Apis bees, directly informing ecological risk assessments for neonicotinoid insecticides. Ultimately, embedding such computational analyses into AOP development is crucial for building taxonomically broad, mechanistically credible pathways that can reliably support cross-species prediction in regulatory decision-making.
The taxonomic domain of applicability (tDOA) is a critical, yet often narrowly defined, component of an Adverse Outcome Pathway (AOP) that determines the species for which the described biological pathway is relevant [6]. This evaluation is foundational for reliable use in regulatory decision-making, particularly when extrapolating knowledge to protect untested species [6]. This whitepaper provides a comparative technical evaluation of tDOA assessment methodologies, framed within the broader thesis that Key Event Relationships (KERs) represent the core, conserved units of AOPs [11]. We detail protocols for evaluating tDOA through bioinformatics and empirical approaches, using case studies from ecotoxicology (neonicotinoids and pollinators) and mammalian reproductive toxicology (retinoic acid signaling). The analysis underscores that a robust, comparative understanding of tDOA enhances confidence in AOP application for chemical safety assessment across diverse taxa and stressor classes [54].
The AOP framework organizes mechanistic knowledge into a causal chain from a Molecular Initiating Event (MIE) to an Adverse Outcome (AO), linked by measurable Key Events (KEs) and causal Key Event Relationships (KERs) [10]. While AOPs are often developed with specific model species, their utility in ecological and human health risk assessment depends on accurately defining their tDOA—the range of taxa for which the pathway is biologically plausible [6].
Recent conceptual advances posit that KERs are the fundamental building blocks of AOP knowledge [11]. This perspective shifts the focus of taxonomic conservation from the entire AOP to its constituent KERs. Evaluating tDOA, therefore, involves determining the conservation of the biological plausibility and empirical support for each causal link (KER) across species [10]. This requires evidence for both structural conservation (e.g., presence and similarity of proteins, receptors) and functional conservation (e.g., similar physiological role) of the entities involved in the KEs [6]. The integration of public bioinformatics tools with traditional toxicological data is essential for expanding tDOA definitions beyond the limited species for which empirical toxicity data exist [54].
Evaluating tDOA is a multi-evidence process combining computational and empirical lines of evidence. The following structured workflow is recommended.
The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool is a publicly available web-based resource developed by the U.S. EPA to evaluate cross-species protein conservation [6]. Its hierarchical, three-level analysis provides key evidence for structural conservation.
Protocol Application: For a given AOP, identify all relevant proteins (MIE target, intermediate signaling molecules). Submit each as a query to SeqAPASS. The aggregated results across levels and proteins inform a biologically plausible tDOA for the KEs and KERs [6].
Complementing bioinformatics, the empirical assessment of a KER follows a standardized template to evaluate its strength and taxonomic anchors [11] [10].
The following case studies illustrate the application of tDOA evaluation across different stressor classes (synthetic insecticides vs. endogenous signaling disruptors) and taxonomic groups (invertebrates vs. mammals).
This AOP, developed for the honey bee (Apis mellifera), links the MIE of nicotinic acetylcholine receptor (nAChR) activation to the AO of colony death/failure [6].
Table 1: tDOA Evaluation for AOP 89 (Neonicotinoid - nAChR - Colony Death/Failure)
| Evaluation Aspect | Empirical tDOA (from Literature) | Bioinformatics (SeqAPASS) Inferred tDOA | Key Evidence & Confidence |
|---|---|---|---|
| MIE: nAChR Activation | Apis mellifera (Honey bee) | Likely all insects possessing conserved nAChR ligand-binding domain. Specificity within bees informed by residue analysis. | High confidence for insects as a group; confidence for individual insect taxa varies with Level 3 residue conservation [6]. |
| Intermediate KEs (Neuronal Hyperexcitation) | Primarily A. mellifera | Plausible for taxa with conserved neuronal physiology and target proteins. | Moderate confidence, dependent on functional conservation of downstream signaling pathways [6]. |
| AO: Colony Death/Failure | A. mellifera (some evidence for Bombus spp.) | Limited to eusocial bees. Not applicable to solitary species. | Low extrapolation confidence; AO is highly dependent on social behavior, not just molecular conservation [6]. |
This developing AOP in mammals links inhibition of ALDH1A enzymes (MIE) to reduced fertility (AO). A core KER (2477) describes the link between decreased all-trans retinoic acid (atRA) in the fetal ovary and disrupted meiotic entry of oogonia [11].
Table 2: tDOA Evaluation for KER 2477 (Reduced atRA → Disrupted Meiosis) within AOP 398
| Evaluation Aspect | Empirical tDOA (from Literature) | Inferred Biological Plausible tDOA | Key Evidence & Confidence |
|---|---|---|---|
| Upstream KE: Decreased atRA in Ovary | Mouse, Rat, Rabbit, Human | All mammalian species. | High confidence based on conserved role of atRA in gonad development [11]. |
| KER 2477: Link to Disrupted Meiosis | Strong evidence in mouse, rat, rabbit; observational in human. | Strongly plausible for therian mammals. | High biological plausibility. Essentiality shown via genetic knockout (Stra8-/-) and dietary vitamin A deficiency studies [11]. |
| Downstream KE: Disrupted Meiotic Entry | Mouse, Rat, Human | All mammalian species. | High confidence; meiotic marker STRA8 is a direct target of atRA signaling and is conserved [11]. |
Diagram 1: Integrated Workflow for Evaluating tDOA
Diagram 2: AOP 89 tDOA Analysis with Evidence
Table 3: Research Reagent Solutions for tDOA and KER Conservation Studies
| Tool/Resource Name | Type | Primary Function in tDOA Evaluation | Access/Source |
|---|---|---|---|
| SeqAPASS | Bioinformatics Tool | Evaluates protein sequence and structural conservation across species via three-tiered analysis to inform structural tDOA. | https://seqapass.epa.gov/ [6] |
| AOP-Wiki | Knowledgebase | Central repository for published AOPs, KEs, and KERs. Provides templates for development and captures tDOA information. | https://aopwiki.org/ [10] |
| EcoDrug | Database | Links human drug targets to orthologs in >600 eukaryotes, aiding in predicting pharmaceutical target conservation across species. | https://www.ecodrug.org/ [54] |
| OECD AOP Developers' Handbook | Guidance Document | Provides standardized methods and principles for AOP/KER development, including weight-of-evidence assessment for KERs and tDOA. | https://aopwiki.org/handbooks [10] |
| EcoToxChips | Experimental Tool | Cross-species qPCR arrays for measuring conserved transcriptional responses, providing functional evidence for KE activation. | Cited in literature [54] |
The comparative analysis reveals that a robust tDOA is not a binary designation but a gradient of confidence informed by multiple lines of evidence. The following strategic recommendations are proposed:
In conclusion, advancing the science of tDOA evaluation through the comparative, integrated methodologies outlined here is essential for realizing the promise of the AOP framework in predictive toxicology and fit-for-purpose chemical risk assessment for a wide range of species and stressor classes.
The Taxonomic Domain of Applicability (tDOA) is a foundational concept within the Adverse Outcome Pathway (AOP) framework, defining the range of species for which a described sequence of Key Event Relationships (KERs) is biologically plausible [9]. In the context of KER taxonomic conservation research, a predicted tDOA starts as a plausible hypothesis, often based on initial data from a single model organism. The critical scientific challenge is transitioning this plausible prediction to a confirmed and empirically validated tDOA, thereby expanding the utility of AOPs for cross-species chemical safety assessment and drug development without further animal testing [15].
This transition is not trivial. It requires a multi-faceted validation strategy that integrates quantitative causal analysis of KERs with computational cross-species extrapolation. Recent advancements in New Approach Methodologies (NAMs) have created a pathway for this empirical validation, combining probabilistic modeling of KER confidence with bioinformatic tools that assess the conservation of molecular targets and biological pathways across the tree of life [15] [9]. The validation of a tDOA thus becomes a confirmatory process, substantiating that the mechanistic toxicity described by an AOP is not an artifact of a single species but a conserved biological response with defined taxonomic boundaries.
The initial prediction and subsequent expansion of a tDOA rely on integrating data from multiple sources and lines of evidence. The process begins with a well-constructed AOP, typically developed from a model organism, and systematically seeks to extrapolate its KERs across broader taxonomic groups.
Table 1: Core Components for tDOA Prediction and Expansion
| Component | Description | Role in tDOA Validation |
|---|---|---|
| Adverse Outcome Pathway (AOP) Network | A structured framework linking a Molecular Initiating Event (MIE) to an Adverse Outcome (AO) via intermediate Key Events (KEs) [15]. | Provides the mechanistic KER sequence whose taxonomic conservation is being evaluated. Serves as the foundational hypothesis for the predicted tDOA. |
| Key Event Relationship (KER) Assessment | Quantitative evaluation of the causal, correlative, or predictive links between KEs, often using Bayesian networks [15]. | Establishes confidence in the AOP's internal logic. A robust KER network within the source species strengthens the plausibility of its conservation. |
| Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) | A bioinformatics tool that compares primary protein sequence, domain, and 3D structural similarity to extrapolate potential chemical susceptibility [15] [9]. | Tests in silico whether the MIE's molecular target (e.g., a specific receptor or enzyme) is conserved across diverse species, providing direct structural evidence for tDOA expansion. |
| Genes-to-Pathways Species Conservation Analysis (G2P-SCAN) | A computational tool that maps human genes to biological pathways and evaluates the conservation of those pathways across a defined set of species [15] [9]. | Provides evidence for the conservation of the broader biological pathway downstream of the MIE, supporting the plausibility that the entire KER sequence could be conserved. |
A pivotal case study demonstrating this framework involved extending the tDOA for AOP 207 (reproductive toxicity of silver nanoparticles via oxidative stress in C. elegans). Researchers integrated in vivo ecotoxicology data, in vitro human toxicology data, and in silico tools (SeqAPASS and G2P-SCAN) to build a cross-species AOP network. This approach extended the biologically plausible tDOA from a few model species to over 100 taxonomic groups, including fungi, birds, rodents, and reptiles [15].
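The way these lines of evidence combine into a graded tDOA call can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `SpeciesEvidence` record, the `tdoa_tier` function, the tier labels, and the placeholder species are hypothetical and do not reproduce the output format of SeqAPASS or G2P-SCAN, nor the decision rules used in the AOP 207 case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesEvidence:
    """Hypothetical per-species evidence record (not a SeqAPASS or G2P-SCAN output format)."""
    species: str
    target_conserved: bool   # assumed call derived from an MIE target comparison
    pathway_conserved: bool  # assumed call derived from a pathway conservation analysis
    empirical_support: bool  # KER observed experimentally in this species


def tdoa_tier(ev: SpeciesEvidence) -> str:
    """Assign an illustrative tDOA tier from the available lines of evidence."""
    if ev.empirical_support:
        return "empirical"
    if ev.target_conserved and ev.pathway_conserved:
        return "plausible (target and pathway)"
    if ev.target_conserved:
        return "plausible (target only)"
    return "not supported"


# Placeholder species with made-up calls, for illustration only.
records = [
    SpeciesEvidence("species_A", True, True, True),
    SpeciesEvidence("species_B", True, True, False),
    SpeciesEvidence("species_C", True, False, False),
    SpeciesEvidence("species_D", False, False, False),
]
tiers = {ev.species: tdoa_tier(ev) for ev in records}
```

Keeping the empirical tier separate from the two plausible tiers mirrors the distinction drawn in this article between an empirical tDOA and an inferred, biologically plausible one. A real assessment would weigh the evidence rather than apply fixed boolean rules.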
Diagram Title: Integrated workflow for expanding tDOA from model organism data.
Moving from a qualitative, plausible tDOA to a quantitative, confirmed one requires rigorous empirical validation methods. These methodologies assess both the strength of the underlying KERs and the performance of the tDOA prediction against independent data.
A core step in validating the AOP itself is the quantitative assessment of KERs. Bayesian Network (BN) modeling is a probabilistic approach adept at managing the inherent uncertainty and variability in biological systems [15]. It is used to analyze the causal relationships between KEs based on experimental data.
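As a minimal sketch of the idea, consider a three-node chain MIE → KE → AO with binary states. The conditional probabilities below are invented for illustration and are not taken from any cited study. In practice they would be estimated from experimental data, typically with dedicated BN software rather than hand-written code.

```python
# Hypothetical conditional probability tables for a chain MIE -> KE -> AO.
# Keys are the parent state (True = event occurred); values are P(child occurs | parent).
P_KE_GIVEN_MIE = {True: 0.85, False: 0.10}
P_AO_GIVEN_KE = {True: 0.70, False: 0.05}


def p_ao_given_mie(mie: bool) -> float:
    """P(AO | MIE), marginalizing over the unobserved intermediate KE."""
    p_ke = P_KE_GIVEN_MIE[mie]
    return P_AO_GIVEN_KE[True] * p_ke + P_AO_GIVEN_KE[False] * (1.0 - p_ke)


def p_mie_given_ao(prior_mie: float) -> float:
    """P(MIE | AO observed) by Bayes' rule, given a prior probability of the MIE."""
    numerator = p_ao_given_mie(True) * prior_mie
    denominator = numerator + p_ao_given_mie(False) * (1.0 - prior_mie)
    return numerator / denominator


forward = p_ao_given_mie(True)    # predictive query along the KER chain
diagnostic = p_mie_given_ao(0.5)  # diagnostic query from AO back to MIE
```

The forward query expresses how strongly activation of the MIE predicts the AO through the intermediate KE. The diagnostic query shows the same network being used in reverse, which is one reason BNs are convenient for quantifying KER confidence.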
Once a tDOA is predicted computationally, its validity must be tested. Empirical validation strategies from mechanistic model evaluation can be adapted for this purpose [55].
Table 2: Empirical Validation Metrics for Predicted tDOA Performance
| Validation Metric | Application to tDOA Validation | Interpretation of a Successful Result |
|---|---|---|
| Bootstrapped Log-Rank/MaxCombo Test | Statistically compares the predicted vs. observed survival (or adverse outcome) curves in a validation species. | No significant difference (p > 0.05) is consistent with the AOP-derived prediction matching the empirical data and supports tDOA inclusion, although a non-significant result alone does not demonstrate equivalence. |
| Prediction Interval Coverage | Checks if observed endpoint measurements (e.g., brood size, enzyme activity) fall within the predicted range. | High coverage (e.g., ≥95%) indicates the model accurately captures the variability of the response in the new species. |
| Juncture Metric | Evaluates the accuracy of predicting the timing of a key event (e.g., onset of pathology) in validation studies. | A low juncture error score indicates the model's temporal predictions are reliable for the new species. |
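Of the metrics in the table, prediction interval coverage is the simplest to compute. The sketch below assumes that a model has already produced a lower and upper bound for each observation. The function name `interval_coverage` and the brood-size numbers are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence, Tuple


def interval_coverage(
    observed: Sequence[float],
    intervals: Sequence[Tuple[float, float]],
) -> float:
    """Fraction of observations that fall inside their prediction interval (bounds inclusive)."""
    if len(observed) != len(intervals):
        raise ValueError("observed and intervals must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    hits = sum(1 for y, (low, high) in zip(observed, intervals) if low <= y <= high)
    return hits / len(observed)


# Made-up brood sizes in a validation species and made-up predicted intervals.
observed_brood = [112.0, 98.0, 130.0, 75.0]
predicted_intervals = [(100.0, 125.0), (90.0, 110.0), (105.0, 128.0), (70.0, 95.0)]
coverage = interval_coverage(observed_brood, predicted_intervals)
```

Here three of the four observations fall within their intervals, giving a coverage of 0.75, which would fall short of a 95% target and suggest that the model understates variability in the new species.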
The following protocols outline a step-by-step pathway for the empirical validation of a predicted tDOA, integrating the frameworks and methods described above.
This protocol uses computational NAMs to generate a testable tDOA prediction [15] [9].
This protocol validates the AOP's predictive power within the proposed tDOA [15].
Diagram Title: Two-phase protocol for tDOA prediction and empirical validation.
Table 3: Research Reagent Solutions for tDOA Validation
| Tool/Reagent Category | Specific Item | Function in tDOA Validation |
|---|---|---|
| Bioinformatics Software | SeqAPASS Tool (v6.1+) [9] | Provides in silico evidence for the conservation of the molecular initiating event (MIE) protein across species. |
| Bioinformatics Software | G2P-SCAN Tool (v0.0.1.0+) [15] [9] | Evaluates the conservation of the broader biological pathway implicated in the AOP across key model species. |
| Statistical Modeling Software | Bayesian Network Software (e.g., Netica, GeNIe, R packages bnlearn, gRbase) | Enables the construction, parameterization, and simulation of quantitative AOP networks for probabilistic prediction. |
| Reference Chemical | AgNO₃ or characterized AgNPs [15] | A positive control stimulus for validating AOPs involving oxidative stress and reproductive toxicity (e.g., AOP 207). |
| Reference Chemical | Prototypical Receptor Agonists/Antagonists (e.g., for PPARα, ESR1) [9] | Used in validation studies to directly perturb a specific MIE and test the downstream KER sequence in a new species. |
| Validated Assay Kits | ROS detection kits (e.g., DCFDA), Caspase-3 activity kits, Hormone ELISA kits | Provide standardized methods to quantitatively measure key events (KEs) such as oxidative stress, apoptosis, or endocrine disruption in validation studies. |
| Reference Genomic Material | cDNA or gDNA from species across the proposed tDOA | Essential for in vitro cloning and expression of putative orthologs to functionally test MIE-chemical interaction (e.g., in reporter gene assays). |
The empirical validation of a predicted tDOA transforms an AOP from a species-specific model into a generalized tool for predictive toxicology. For drug development professionals, this has direct applications:
The pathway from plausible to confirmed tDOA is emblematic of the evolving paradigm in toxicological sciences. It leverages computational power to generate hypotheses and empirical rigor to test them, ultimately strengthening the scientific confidence in cross-species extrapolation and enabling more efficient, ethical, and predictive safety assessments.
Defining the taxonomic domain of applicability for Key Event Relationships is not a peripheral task but a fundamental requirement for the credible application of AOPs in modern, animal-sparing toxicology. As demonstrated, the integration of computational bioinformatics tools like SeqAPASS and G2P-SCAN provides a powerful, evidence-based methodology to extrapolate mechanistic knowledge beyond the model organisms used in initial AOP development [citation:1][citation:2][citation:5]. Success hinges on a systematic, transparent approach that combines structural sequence analysis with functional pathway conservation, all framed within a rigorous weight-of-evidence assessment. The future of this field lies in the development of standardized, accepted workflows for tDOA definition and their integration into the AOP-Wiki framework, fostering consistency and collaboration. Initiatives like the International Consortium to Advance Cross-Species Extrapolation in Regulation (ICACSER) are pivotal in driving this harmonization forward [citation:10]. Ultimately, robust KER taxonomic conservation strengthens the predictive power of the AOP framework, accelerating its use in regulatory decision-making to achieve comprehensive chemical safety assessments for both human and environmental health under the unifying vision of One Health [citation:1][citation:10].