Defining the Taxonomic Domain of Applicability in Adverse Outcome Pathways: A Framework for Cross-Species Prediction in Toxicology and Drug Development

Isaac Henderson, Nov 26, 2025

Abstract

This article provides a comprehensive overview of the Taxonomic Domain of Applicability (tDOA) for Adverse Outcome Pathways (AOPs), a critical concept for enhancing the reliability of cross-species extrapolation in chemical risk assessment and drug development. Aimed at researchers, scientists, and drug development professionals, it explores the foundational principles of tDOA, detailing the importance of structural and functional conservation of key events. The article further examines advanced methodological approaches, including bioinformatics tools like SeqAPASS, for defining tDOA and demonstrates their application through case studies. It addresses common challenges and optimization strategies in tDOA determination and discusses the vital processes of validation and comparison with other New Approach Methodologies (NAMs). The synthesis offers a forward-looking perspective on integrating tDOA evaluation into regulatory science and predictive toxicology.

What is the Taxonomic Domain of Applicability? Foundational Concepts for AOP Development

In the context of the Adverse Outcome Pathway (AOP) framework, the taxonomic domain of applicability (tDOA) defines the range of species for which a given AOP is biologically plausible [1] [2]. This concept has emerged as a critical component in modern toxicology and chemical risk assessment, bridging the gap between molecular initiating events and adverse outcomes across diverse species. The tDOA concept challenges the traditional assumption that taxonomic relatedness alone confers similar chemical susceptibility, instead focusing on the conservation of specific protein targets and biological pathways [1] [3]. As regulatory science moves toward animal-free testing methodologies, accurately defining tDOA has become essential for reliable cross-species extrapolation in both human toxicology and ecotoxicology [4] [2].

The fundamental premise underlying tDOA is that shared molecular targets and pathway conservation—rather than phylogenetic proximity—determine chemical susceptibility [3]. This paradigm shift enables researchers to predict chemical effects across taxonomically diverse species using computational approaches, supporting the One Health perspective that integrates human and ecosystem health [2]. The precise definition of tDOA allows for more scientifically grounded chemical safety assessments while reducing reliance on traditional animal testing [1] [4].

Computational Methodologies for tDOA Determination

Determining tDOA relies on computational new approach methodologies (NAMs) that leverage existing biological knowledge to predict chemical susceptibility across species. Two primary tools have emerged as standards in this field, each offering complementary approaches to tDOA definition.

Table 1: Core Computational Tools for tDOA Analysis

Tool Name	Developer	Primary Function	Input Data	Output
SeqAPASS (Sequence Alignment to Predict Across Species Susceptibility)	US Environmental Protection Agency [1]	Predicts chemical susceptibility across species based on protein sequence and structural similarity [1] [3]	Protein sequence data from NCBI database [2]	Susceptibility predictions across taxonomic groups [1]
G2P-SCAN (Genes to Pathways - Species Conservation Analysis)	Unilever [1] [3]	Estimates biological pathway conservation across species [1] [3]	Human gene inputs [1]	Pathway conservation across 7 model species [1]

Tool Integration and Workflow

The power of these computational approaches lies in their strategic integration, creating a weight-of-evidence framework that enhances confidence in tDOA predictions [1] [3]. The typical workflow begins with SeqAPASS, which utilizes protein sequence information to extrapolate chemical susceptibility across the diversity of species with available protein sequence data [1]. This tool expands the biological space in which toxicity predictions are possible by identifying conserved molecular targets across taxonomic groups [3].

G2P-SCAN complements this approach by providing biological pathway-level information from human gene inputs, supporting inferences of pathway conservation across seven species commonly used in chemical safety assessment: humans (Homo sapiens), mice (Mus musculus), rats (Rattus norvegicus), zebrafish (Danio rerio), fruit flies (Drosophila melanogaster), roundworms (Caenorhabditis elegans), and yeast (Saccharomyces cerevisiae) [1]. The combination of these tools generates multiple lines of evidence associated with chemical effects on biological pathways and taxonomic relevance, significantly strengthening tDOA predictions [1] [3].

Figure 1: Integrated Workflow for tDOA Definition in AOP Development

Experimental Protocols and Case Studies

Standardized Methodological Approach

The experimental protocol for defining tDOA follows a systematic workflow that integrates multiple computational approaches with empirical data. A recent study demonstrated this methodology through a comprehensive analysis of 40 chemicals with diverse molecular targets, use categories, and mechanisms of action [1] [3]. The protocol consists of four distinct phases that progress from target identification to tDOA expansion.

The initial phase involves target identification and evaluation, where molecular targets for chemicals of interest are identified using multiple data sources, including EPA high-throughput in vitro data, ToxCast bioactivity data, structural data from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), and existing chemical activity data from literature [1]. This multipronged approach ensures comprehensive target identification, capturing both primary and secondary molecular interactions.

The subsequent computational analysis phase applies SeqAPASS and G2P-SCAN to the identified molecular targets. SeqAPASS analysis compares protein sequence similarities across species using the National Center for Biotechnology Information database to predict potential chemical susceptibility [2]. Concurrently, G2P-SCAN maps human gene inputs to biological pathways and evaluates their conservation across the seven model species [1]. This phase provides the foundational data for initial tDOA estimation.

The AOP development phase integrates the computational predictions with adverse outcome pathway construction. Researchers collect and structure various data types into AOP networks, then assess key event relationships using Bayesian network modeling approaches to quantify confidence in the proposed pathways [2]. This phase establishes the mechanistic links between molecular initiating events and adverse outcomes.
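
To make the idea of quantifying confidence along a pathway concrete, the following is a minimal sketch of forward propagation through a linear chain of binary events (MIE, then one key event, then the adverse outcome). It is a toy illustration, not the Bayesian network model used in the cited study; the function name `propagate_chain` and all probabilities are hypothetical.

```python
def propagate_chain(p_mie: float, cpts: list[tuple[float, float]]) -> list[float]:
    """Return the marginal probability of each event in a linear chain.

    p_mie is the probability that the molecular initiating event occurs.
    Each entry of cpts is a pair:
        (P(next event | previous occurs), P(next event | previous absent)).
    """
    marginals = [p_mie]
    for p_given_present, p_given_absent in cpts:
        prev = marginals[-1]
        # Law of total probability over the two states of the upstream event.
        marginals.append(prev * p_given_present + (1.0 - prev) * p_given_absent)
    return marginals


# Hypothetical numbers for a three-node chain: MIE -> KE -> AO.
marginals = propagate_chain(1.0, [(0.8, 0.1), (0.5, 0.0)])
for name, p in zip(("MIE", "KE", "AO"), marginals):
    print(f"{name}: {p:.2f}")
```

A real AOP network is usually branched rather than linear, so this chain only shows how conditional probabilities on key event relationships combine into a confidence value for the downstream outcome.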

The final tDOA expansion phase uses the integrated computational results to extrapolate the biologically plausible tDOA beyond the initially studied species. The combined evidence from SeqAPASS and G2P-SCAN supports expansion of the taxonomic domain of applicability to potentially include over 100 taxonomic groups [4] [2].

Case Study: Reproductive Toxicity of Silver Nanoparticles

A compelling case study demonstrating the practical application of tDOA definition comes from research on silver nanoparticles (AgNPs) [2]. The study began with an existing AOP for AgNP-induced reproductive toxicity in Caenorhabditis elegans (AOPwiki ID 207) and systematically expanded its tDOA through integrated computational approaches.

The research collected and analyzed 25 mechanism-based toxicity studies on AgNPs featuring different data types, including in vitro human cells, in vivo models, and molecular-to-individual level assessments [2]. The molecular initiating event was identified as NADPH oxidase and p38 MAPK activation, leading to reproductive failure [2]. The key events included oxidative stress, DNA damage, and impaired gametogenesis, culminating in reduced reproductive output as the adverse outcome.

Computational analysis using SeqAPASS and G2P-SCAN enabled the extension of the biologically plausible tDOA from the initial model organisms (C. elegans, D. melanogaster, and in vitro human cells) to over 100 taxonomic groups, including fungi (98 species), birds (28 species), rodents, reptiles, and nematodes [2]. This expansion demonstrated how integrated computational approaches can significantly broaden the taxonomic applicability of AOPs while maintaining mechanistic credibility.

Table 2: tDOA Expansion for Silver Nanoparticle Reproductive Toxicity

AOP Element	Initial Scope	Expanded Scope via Computational NAMs
Molecular Initiating Event	NADPH oxidase and p38 MAPK activation in C. elegans [2]	Conserved across 100+ taxonomic groups [2]
Biological Pathways	Oxidative stress response in nematodes [2]	Pathway conservation across fungi, birds, rodents, reptiles [2]
Adverse Outcome	Reproductive failure in C. elegans [2]	Plausible across taxonomically diverse species [2]
Taxonomic Domain	C. elegans, D. melanogaster, in vitro human cells [2]	100+ taxonomic groups including fungi (98 species), birds (28 species) [2]

Comparative Analysis of tDOA Determination Approaches

Performance Metrics and Validation

The integration of SeqAPASS and G2P-SCAN for tDOA determination represents a significant advancement over traditional approaches to cross-species extrapolation. Comparative analysis reveals distinct advantages in terms of predictive accuracy, taxonomic range, and mechanistic insight.

Traditional cross-species extrapolation in toxicology has primarily relied on the assumption that taxonomic relatedness confers similar chemical susceptibility [1]. This approach typically utilizes surrogate species to represent related taxonomic groups, with limited consideration of molecular mechanism conservation [3]. In contrast, the computational NAM approach focuses specifically on the conservation of molecular targets and biological pathways, providing a more mechanistic basis for extrapolation [1] [2].

Validation studies have demonstrated that the integrated computational approach successfully predicts known chemical susceptibilities while identifying previously unrecognized taxonomic domains. For instance, in the case of peroxisome proliferator-activated receptor alpha (PPARα) interactions, the combined use of SeqAPASS and G2P-SCAN provided enhanced weight of evidence to support cross-species susceptibility predictions beyond what either tool could accomplish independently [1]. Similarly, for estrogen receptor 1 (ESR1) and gamma-aminobutyric acid type A receptor subunit alpha 1 (GABRA1) interactions, the pathway information from G2P-SCAN complemented the sequence similarity analysis from SeqAPASS, creating a more robust basis for tDOA definition [1].

Figure 2: Comparison of Traditional vs. Computational Approaches to tDOA

Quantitative Assessment of Method Performance

tDOA determination methods can be compared across several dimensions, including the basis for extrapolation, mechanistic resolution, taxonomic coverage, and utility for AOP development. Table 3 summarizes this comparison qualitatively; on each of these dimensions, the integrated computational approach is reported to offer advantages over traditional methods.

Table 3: Performance Comparison of tDOA Determination Methods

Performance Metric	Traditional Approach	Integrated Computational NAMs
Basis for Extrapolation	Taxonomic relatedness [1]	Protein sequence similarity & pathway conservation [1] [3]
Mechanistic Insight	Limited [3]	High - identifies specific molecular targets and pathways [1] [2]
Taxonomic Coverage	Limited to phylogenetically related species [1]	Extensive - 100+ taxonomic groups possible [4] [2]
Validation Approach	Empirical testing in surrogate species [1]	Computational prediction with targeted verification [2]
AOP Utility	Limited to specific taxa [2]	Enables expansion of biologically plausible tDOA [4] [2]
Animal Use	High - requires multiple species testing [1]	Reduced - minimizes animal testing [4] [2]

Essential Research Toolkit for tDOA Studies

Implementing tDOA definition studies requires access to specific computational tools, databases, and analytical resources. These components form the essential research toolkit that enables scientists to determine the taxonomic domain of applicability for adverse outcome pathways.

Table 4: Essential Research Toolkit for tDOA Definition

Tool/Resource	Type	Function in tDOA Studies	Access Information
SeqAPASS	Computational tool	Predicts chemical susceptibility across species based on protein sequence similarity [1] [2]	Web-based: https://seqapass.epa.gov/seqapass/ [1]
G2P-SCAN	Computational tool	Estimates biological pathway conservation across species [1] [3]	R package [2]
NCBI Databases	Data resource	Provides protein sequence data for cross-species comparisons [2]	Publicly available
AOP-Wiki	Knowledge base	Structured AOP information including molecular initiating events and key events [2]	Publicly available
Reactome	Data resource	Pathway database used for conservation analysis [1] [3]	Publicly available
CompTox Chemicals Dashboard	Data resource	ToxCast bioactivity data for chemical target identification [1]	EPA resource: https://comptox.epa.gov/dashboard/ [1]
RCSB Protein Data Bank	Data resource	Protein-ligand crystallization data for molecular target characterization [1]	https://www.rcsb.org/ [1]

The strategic combination of these resources creates a powerful toolkit for defining tDOA without requiring extensive animal testing. The workflow typically begins with target identification using the CompTox Chemicals Dashboard and RCSB PDB, proceeds through sequence and pathway analysis with SeqAPASS and G2P-SCAN, and culminates in AOP development with tDOA specification using the AOP-Wiki [1] [2]. This integrated approach represents the current state-of-the-art in mechanistic toxicology for cross-species extrapolation.

The precise definition of taxonomic domain of applicability (tDOA) represents a critical advancement in mechanistic toxicology and chemical safety assessment. By integrating computational approaches like SeqAPASS and G2P-SCAN, researchers can now establish biologically plausible tDOAs based on conserved molecular targets and pathways rather than taxonomic proximity alone [1] [3] [2]. This paradigm shift enables more scientifically grounded cross-species extrapolations while supporting the reduction of animal testing through new approach methodologies [4] [2].

The case study on silver nanoparticle reproductive toxicity demonstrates how these computational tools can expand a well-characterized AOP from a few model species to over 100 taxonomic groups [2]. This expanded applicability domain significantly enhances the utility of AOPs for both ecological and human health risk assessment under the One Health framework [2]. As these computational methodologies continue to evolve and integrate with additional data sources, the precision and reliability of tDOA definition will further improve, strengthening the scientific basis for chemical safety decisions across the tree of life.

The Role of Structural and Functional Conservation in Defining Biological Plausibility

Defining the taxonomic domain of applicability (tDOA) is a critical step in Adverse Outcome Pathway (AOP) research, determining the range of species for which a documented pathway of toxicity is biologically plausible [5]. The foundational elements for establishing tDOA are structural conservation (the presence and preservation of biological entities like proteins and their functional domains) and functional conservation (the maintenance of equivalent biological roles across species) [5]. For researchers in toxicology and drug development, accurately defining the tDOA enables reliable cross-species extrapolation, which is vital for next-generation chemical safety assessments that seek to reduce reliance on whole-animal testing [3] [1].

This guide objectively compares the performance of two primary computational methodologies—the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool and the combined use of SeqAPASS and the Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool. We present supporting experimental data, detailed protocols, and essential research tools to inform their application in AOP development.

Performance Comparison of Computational Methodologies

The strategic combination of New Approach Methodologies (NAMs) can enhance the strengths and mitigate the limitations of individual tools [3] [1]. The following table summarizes the performance of SeqAPASS as a standalone tool versus its integration with G2P-SCAN.

Table 1: Performance Comparison of Standalone and Combined Computational Approaches

Feature	SeqAPASS (Standalone)	SeqAPASS + G2P-SCAN (Combined)
Primary Function	Predicts chemical susceptibility based on protein conservation [5] [1]	Enhances cross-species predictions by integrating protein conservation with pathway-level data [3] [1]
Taxonomic Scope	Broad; any species with available protein sequence data [5] [3]	Focused on 7 key model species (e.g., human, mouse, rat, zebrafish) [1]
Biological Scope	Protein-centric (MIEs & KEs) [5]	Pathway-centric (biological pathways & networks) [1]
Key Output	Evidence for structural conservation of molecular initiating events (MIEs) and key events (KEs) [5]	Consensus evidence for biological pathway conservation, expanding the biologically plausible tDOA of AOPs [1]
Reported Utility	Rapidly expands the potential tDOA for individual KEs in an AOP [5]	Provides a weight-of-evidence approach for predicting chemical susceptibility and pathway disruption [3] [1]

Experimental Protocols for Methodology Application

Protocol 1: SeqAPASS Standalone Analysis for AOP Development

This protocol details the process of using the SeqAPASS tool to evaluate the structural conservation of proteins within an AOP framework, as demonstrated in a case study on an AOP linking nicotinic acetylcholine receptor (nAChR) activation to colony death/failure in bees [5].

1. Protein Identification: Identify all proteins involved in the AOP, specifically those associated with the Molecular Initiating Event (MIE) and subsequent Key Events (KEs). In the case study, nine proteins were identified [5].
2. Data Retrieval and Input: For each query protein, obtain its primary amino acid sequence from a reference database (e.g., RefSeq, UniProt). Input the FASTA format sequence into the SeqAPASS web tool (v6.1+).
3. Level 1 Analysis (Primary Sequence): Execute a Level 1 analysis to identify potential orthologs across the tree of life based on overall protein sequence similarity [5] [1].
4. Level 2 Analysis (Functional Domains): Perform a Level 2 analysis to evaluate the conservation of known functional domains (e.g., from Pfam) within the identified orthologs [5].
5. Level 3 Analysis (Critical Residues): Conduct a Level 3 analysis focusing on the conservation of specific amino acid residues known to be critical for protein-ligand interaction, protein-protein interaction, or protein function [5] [1].
6. Data Interpretation and tDOA Definition: Synthesize the results from all three levels. A positive finding of conservation across these levels provides evidence of structural conservation, which can be used to define the biologically plausible tDOA for the MIE, KEs, and the overall AOP [5].
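
The final interpretation step can be sketched in code. The example below assumes one simple rule that is not stated in the source: a species enters the AOP-level tDOA only if every query protein was judged conserved in that species. The function name `aop_level_tdoa`, the protein labels, and the species sets are all hypothetical placeholders, not SeqAPASS output.

```python
def aop_level_tdoa(conserved_by_protein: dict[str, set[str]]) -> set[str]:
    """Intersect the per-protein sets of species judged conserved.

    Assumed rule: the AOP is treated as plausible in a species only when
    all query proteins (MIE and KEs) are conserved there.
    """
    if not conserved_by_protein:
        return set()
    return set.intersection(*conserved_by_protein.values())


# Hypothetical per-protein results after Levels 1-3 have been interpreted.
conserved_by_protein = {
    "protein_A": {"Apis mellifera", "Bombus terrestris", "Osmia bicornis"},
    "protein_B": {"Apis mellifera", "Bombus terrestris", "Danio rerio"},
    "protein_C": {"Apis mellifera", "Bombus terrestris", "Osmia bicornis"},
}

tdoa = aop_level_tdoa(conserved_by_protein)
print(sorted(tdoa))
```

A stricter or looser rule (for example, weighting the MIE protein more heavily than downstream KEs) may be more appropriate for a given AOP; the intersection is only the simplest defensible starting point.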

Protocol 2: Combined SeqAPASS and G2P-SCAN Analysis

This protocol outlines the combined use of SeqAPASS and G2P-SCAN to generate consensus evidence for pathway-level conservation, as applied in a study of 40 chemicals with diverse modes of action [1].

1. Chemical-Target Identification: Select chemicals of interest and identify their known molecular targets using a combination of high-throughput bioactivity data (e.g., ToxCast), structural data (RCSB PDB), and literature mining [1].
2. SeqAPASS Evaluation: Subject the identified protein targets to the standard SeqAPASS workflow (Levels 1-3) to predict potential chemical susceptibility across a wide range of species [1].
3. G2P-SCAN Pathway Mapping: Input the list of human genes encoding the molecular targets into the G2P-SCAN tool. The tool maps these genes to biological pathways (e.g., Reactome pathways) and estimates the conservation of these entire pathways across its seven predefined model species [1].
4. AOP Network Integration: Compare the molecular and functional data from relevant AOPs with the mapped biological pathways to establish toxicological context [1].
5. Weight-of-Evidence Synthesis: Integrate the findings from SeqAPASS (protein-level susceptibility) and G2P-SCAN (pathway-level conservation) to build a consensus on cross-species chemical susceptibility and to expand the biologically plausible tDOA for the relevant AOPs [1].
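
The weight-of-evidence synthesis can be illustrated with a small sketch that combines a per-species SeqAPASS call with a per-species G2P-SCAN call. The labels, the function name `weigh_evidence`, and the example calls are hypothetical; neither tool produces these exact outputs. A value of `None` stands for "not evaluated", which matters here because G2P-SCAN covers only seven species while SeqAPASS can cover any species with sequence data.

```python
from typing import Optional


def weigh_evidence(seq_call: Optional[bool], pathway_call: Optional[bool]) -> str:
    """Summarize up to two lines of evidence for one species."""
    calls = [c for c in (seq_call, pathway_call) if c is not None]
    if not calls:
        return "not evaluated"
    if all(calls):
        return f"supported by {len(calls)} line(s) of evidence"
    if not any(calls):
        return "not supported"
    return "conflicting evidence"


# Hypothetical calls; Apis mellifera is outside G2P-SCAN's seven species.
seqapass_calls = {
    "Homo sapiens": True,
    "Danio rerio": True,
    "Apis mellifera": True,
    "Saccharomyces cerevisiae": False,
}
g2p_scan_calls = {
    "Homo sapiens": True,
    "Danio rerio": False,
    "Saccharomyces cerevisiae": False,
}

summary = {
    species: weigh_evidence(seqapass_calls.get(species), g2p_scan_calls.get(species))
    for species in sorted(set(seqapass_calls) | set(g2p_scan_calls))
}
for species, verdict in summary.items():
    print(f"{species}: {verdict}")
```

Keeping "not evaluated" distinct from "not supported" avoids treating the absence of pathway data as evidence against conservation.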

Workflow Visualization

Integrated Workflow for Defining AOP Taxonomic Applicability

Table 2: Key Computational Tools and Databases for Conservation Analysis

Tool / Resource	Primary Function	Application in tDOA Definition
SeqAPASS Tool	A hierarchical bioinformatics tool that compares protein sequence and structural similarity across species [5] [1].	Provides lines of evidence for the structural conservation of MIEs and KEs, which is fundamental for establishing the biologically plausible tDOA of an AOP [5].
G2P-SCAN Tool	A computational tool that maps human gene inputs to biological pathways and assesses their conservation across model species [1].	Offers evidence for functional pathway conservation, supporting the extrapolation of entire AOP networks across taxa when combined with SeqAPASS [1].
RCSB Protein Data Bank (PDB)	A database providing 3D structural data of proteins and their complexes with ligands [1].	Critical for identifying amino acid residues involved in chemical binding (for SeqAPASS Level 3 analysis) and understanding molecular initiating events [1].
RefChemDB	A curated database of high-throughput in vitro screening data [1].	Used for the initial identification of molecular targets for chemicals, forming the starting point for cross-species extrapolation analyses [1].
Reactome	An open-source, open-access, manually curated pathway database [1].	Serves as a knowledgebase within G2P-SCAN for mapping gene targets to biologically relevant pathways whose conservation is then assessed [1].

In regulatory toxicology, the protection of untested species often relies on extrapolating data from a handful of tested species. The taxonomic domain of applicability (tDOA) of an Adverse Outcome Pathway (AOP) defines the range of species for which the described pathway is biologically plausible. For most developed AOPs, the tDOA is narrowly defined, creating uncertainty in environmental and chemical risk assessment for the many species that lack empirical toxicity data [5] [6]. This article explores the critical importance of defining the tDOA, the methodologies employed, and its direct implications for making informed regulatory decisions to protect biodiversity.

The Critical Role of tDOA in the AOP Framework

An Adverse Outcome Pathway (AOP) is a structured representation that links a Molecular Initiating Event (MIE), through a series of intermediate Key Events (KEs), to an Adverse Outcome (AO) relevant for risk assessment [6] [7]. The AOP framework organizes existing knowledge to understand the causal mechanisms of toxicity.

The tDOA is an integral component of an AOP that outlines the species for which the pathway is considered valid. A precisely defined tDOA is crucial because:

It supports the protection of untested species. Regulatory decisions must often consider a wide array of species for which no toxicity data exists. A well-substantiated tDOA provides a scientifically defensible basis to infer susceptibility across the taxonomic tree [5] [6].
It enhances confidence in regulatory decision-making. Moving beyond assumptions of broad taxonomic coverage to evidence-based tDOA definitions reduces uncertainty, making ecological risk assessments more robust and reliable [6].
It helps prioritize testing efforts. By identifying taxonomic groups where structural or functional conservation of a pathway is unlikely, resources can be directed towards testing the most relevant or susceptible species [5].

Defining the tDOA relies on evaluating two primary elements: structural conservation (is the biological entity, such as a protein, present and conserved?) and functional conservation (does the entity play the same role in different species?) [5] [6].

Methodologies for Defining the Taxonomic Domain of Applicability

Expanding the tDOA beyond the few species cited in an AOP's empirical studies requires a combination of bioinformatics and empirical evidence.

Bioinformatics Workhorse: The SeqAPASS Tool

A primary tool for evaluating structural conservation is the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool [5] [6]. This publicly available bioinformatics tool uses a hierarchical approach to evaluate cross-species protein conservation, providing critical lines of evidence for the tDOA.

The tool operates through three tiers of analysis:

Level 1: Compares primary amino acid sequence similarity to identify potential orthologs across species.
Level 2: Evaluates the conservation of known functional domains within the protein sequence.
Level 3: Assesses the conservation of specific amino acid residues critical for protein-ligand interactions, protein-protein interactions, or overall function [5] [6].
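
The kind of comparison made at Level 1 and Level 3 can be sketched with two toy functions. They operate on sequences that are assumed to be already aligned to equal length and are not SeqAPASS's own algorithm or metrics; the function names, the sequences, and the "critical" positions are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_target: str) -> float:
    """Percent of alignment columns with identical, non-gap residues."""
    if len(aligned_query) != len(aligned_target) or not aligned_query:
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    matches = sum(
        1 for q, t in zip(aligned_query, aligned_target) if q == t and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def conserved_residues(
    aligned_query: str, aligned_target: str, positions: list[int]
) -> tuple[int, int]:
    """Count how many critical positions (0-based) carry the same residue."""
    hits = sum(
        1
        for i in positions
        if aligned_query[i] == aligned_target[i] and aligned_query[i] != "-"
    )
    return hits, len(positions)


# Toy aligned sequences differing at one position (index 5).
query = "MKTAYIAK"
target = "MKTAYLAK"

identity = percent_identity(query, target)
critical = conserved_residues(query, target, positions=[0, 5])
print(f"identity: {identity:.1f}%")
print(f"critical residues conserved: {critical[0]}/{critical[1]}")
```

The example shows why the tiers are complementary: a target can score high on overall identity while still differing at a residue that matters for ligand binding.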

The workflow for integrating SeqAPASS into tDOA definition is systematic, as shown in the following diagram.

Case Study: nAChR Activation and Colony Death

A practical case study demonstrates this process. An AOP network links activation of the nicotinic acetylcholine receptor (nAChR), the MIE, to colony death/failure in honey bees (Apis mellifera), with neonicotinoid insecticides as prototypical stressors [5] [6]. While developed for honey bees, its relevance to over 20,000 other bee species was unknown.

Researchers used SeqAPASS to evaluate nine proteins involved in this AOP. The analysis provided evidence for the structural conservation of these proteins across various bee species, thereby expanding the biologically plausible tDOA of the AOP beyond A. mellifera to include other Apis and non-Apis bees [5]. This directly informs regulatory decisions regarding the potential risks of neonicotinoids to a broader range of pollinators.

Experimental Protocols & Data for tDOA Determination

The process of defining the tDOA combines computational and empirical approaches. The following protocol details the key steps, using the nAChR case study as a template.

Protocol 1: Defining tDOA using Bioinformatics and Empirical Integration

Step	Description	Key Action
1. AOP Selection	Select a defined AOP with a narrowly defined tDOA.	Select AOP (e.g., AOP 89: nAChR activation leading to colony death).
2. Protein Identification	Identify specific proteins critical to the MIE and KEs.	Compile a list of query proteins (e.g., nine proteins from the nAChR AOP).
3. Bioinformatics Analysis	Evaluate structural conservation of proteins across taxa.	Input each query protein into the SeqAPASS tool. Execute Levels 1, 2, and 3 analyses.
4. Ortholog List Generation	Generate a list of species with a high probability of possessing a functional ortholog.	Interpret SeqAPASS results to identify species where primary sequence, domains, and critical residues are conserved.
5. Empirical Integration	Combine computational predictions with available toxicity data.	Overlay bioinformatics results with in vitro or in vivo toxicity data from the AOP-Wiki or literature to assess functional conservation.
6. tDOA Definition	Formally define the biologically plausible tDOA.	Use the combined evidence to specify the species for which the KE, KER, and overall AOP are applicable. Document in AOP-Wiki.

The data generated from these analyses can be synthesized into clear tables to support decision-making. The following table summarizes hypothetical, representative data for one of the nine proteins analyzed in the nAChR case study.

Table 1: Representative SeqAPASS Output and Toxicity Data for a Key Protein (e.g., nAChR subunit α1) in the AOP for nAChR Activation Leading to Colony Death [5] [6].

Species	Level 1 (% Identity)	Level 2 (Domains Conserved)	Level 3 (Critical Residues)	Empirical Evidence (Ligand Binding EC50)	Plausible tDOA
Apis mellifera (Honey bee)	100%	Yes (All)	Yes (All)	1.0 µM (Reference)	Yes (Definitive)
Bombus terrestris (Bumble bee)	95%	Yes (All)	Yes (All)	1.2 µM	Yes (High Confidence)
Osmia bicornis (Red mason bee)	90%	Yes (All)	Yes (4/5)	No data	Yes (Plausible)
Drosophila melanogaster (Fruit fly)	80%	Yes (All)	Yes (3/5)	5.5 µM	Likely
Danio rerio (Zebrafish)	45%	Partial	No	No effect at 100 µM	No
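
As a small worked example on the empirical column of the table above, the sketch below expresses each EC50 as a fold difference relative to the reference species. The EC50 values are the hypothetical ones from the table; the function name `fold_difference` is illustrative, and species with no EC50 (no data, or no effect at the highest tested concentration) are carried through as `None`.

```python
from typing import Optional

REFERENCE_SPECIES = "Apis mellifera"

# Ligand-binding EC50 values in µM, copied from the hypothetical table.
ec50_um: dict[str, Optional[float]] = {
    "Apis mellifera": 1.0,
    "Bombus terrestris": 1.2,
    "Osmia bicornis": None,           # no data
    "Drosophila melanogaster": 5.5,
    "Danio rerio": None,              # no effect at 100 µM, so no EC50
}


def fold_difference(ec50: Optional[float], reference_ec50: float) -> Optional[float]:
    """Return EC50 / reference EC50, or None when no EC50 is available."""
    if ec50 is None:
        return None
    return ec50 / reference_ec50


reference_ec50 = ec50_um[REFERENCE_SPECIES]
assert reference_ec50 is not None  # the reference species must have a value

folds = {sp: fold_difference(value, reference_ec50) for sp, value in ec50_um.items()}
for species, fold in folds.items():
    label = "n/a" if fold is None else f"{fold:.1f}x"
    print(f"{species}: {label}")
```

A fold difference close to 1 is consistent with functional conservation of the target, whereas a missing value leaves the call resting on the structural (Level 1 to 3) evidence alone.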

Successfully defining the tDOA of an AOP requires a suite of bioinformatics and data resources.

Table 2: Key Research Reagent Solutions for tDOA Analysis.

Tool / Resource	Function in tDOA Analysis
SeqAPASS Tool	A publicly accessible web-based platform that performs cross-species protein sequence and structural comparisons to predict potential chemical susceptibility [5] [6].
AOP-Wiki (aopwiki.org)	The central repository for AOP knowledge, where information on the tDOA, along with supporting evidence for KEs and KERs, is documented and shared [5] [7].
National Center for Biotechnology Information (NCBI) Protein Database	Provides the extensive, publicly available protein sequence data that tools like SeqAPASS rely on for their comparative analyses [5].
Gene Ontology (GO) & DisGeNET	Bioinformatics resources used for overrepresentation analysis to map and classify AOPs based on their associated genes/proteins and diseases, helping to identify biological gaps and connections [7].

Implications for Regulatory Decision-Making

The formal definition of tDOA moves regulatory science away from assumption-based extrapolation and toward evidence-based prediction. This has profound implications:

Justification for Read-Across: tDOA analysis provides a scientifically rigorous basis for read-across, where data from a tested species (e.g., Apis mellifera) is used to predict hazard in an untested species (e.g., Bombus terrestris) [5].
Informing Testing Strategies: Regulatory testing frameworks can use tDOA information to select taxonomically appropriate surrogate species for testing when dealing with a diverse group of organisms of concern [6].
Strengthening WoE Assessments: Evidence of structural and functional conservation significantly increases the weight of evidence (WoE) for the biological plausibility of an AOP in untested species, leading to more confident regulatory decisions [5] [6] [7].

The relationship between tDOA analysis and the broader AOP-based regulatory process is illustrated below.

Current efforts are focused on improving the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of AOPs and their associated tDOA data [7]. Initiatives like the EU's Partnership for the Assessment of Risks from Chemicals (PARC) are actively using tDOA and AOP frameworks to identify and fill critical biological knowledge gaps, particularly in areas like immunotoxicity, endocrine disruption, and neurotoxicity [7].

In conclusion, defining the taxonomic domain of applicability is not a peripheral activity but a core component of modern, mechanism-based toxicology. By leveraging bioinformatics tools like SeqAPASS and integrating their outputs with empirical data, scientists can transform the tDOA from a narrow, species-limited description into a powerful, evidence-based tool. This evolution is fundamental for advancing regulatory decision-making and achieving the ultimate goal of proactively protecting all species, tested and untested, from environmental chemical stressors.

In the fields of toxicology and drug development, the concept of taxonomic coverage refers to the extent and reliability with which biological findings—particularly those related to chemical mechanisms and adverse outcomes—can be generalized across different species. The central challenge lies in the significant gap between the assumed breadth of these applications and the actual evidence supporting them. This discrepancy poses substantial risks for regulatory decision-making, drug safety profiling, and environmental risk assessment.

The advent of New Approach Methodologies (NAMs)—including in silico models, in vitro assays, and pathway-based frameworks—aims to reduce animal testing while improving human and environmental relevance [8]. However, the adoption of these approaches in regulatory frameworks has been slow, due in part to uncertainties regarding their applicability across species and contexts [8]. Similarly, the Adverse Outcome Pathway (AOP) framework provides a structured model connecting molecular initiating events to adverse outcomes, but its utility depends heavily on the accurate taxonomic characterization of each key event relationship [9] [10].

This guide examines the current limitations in evidence-based taxonomic coverage, comparing the performance of different methodological approaches and providing experimental data that highlights both the progress and persistent gaps in this critical research area.

The Theoretical Promise vs. Documented Limitations of Current Frameworks

The AOP Framework: Opportunities and Taxonomic Constraints

The Adverse Outcome Pathway framework represents a significant advancement in organizing mechanistic toxicological information. AOPs formally connect a Molecular Initiating Event (MIE) to an Adverse Outcome (AO) through a series of biologically plausible Key Events (KEs) and Key Event Relationships (KERs) [9]. By 2017, over 200 AOPs had been established, demonstrating the framework's rapid adoption [9].

However, several limitations affect the taxonomic coverage of AOPs:

Linearity Assumption: AOPs often assume a linear progression of events, while biological systems frequently exhibit pathway plasticity, compensatory mechanisms, and non-linear responses [9]
Event Modifiers: The management of event modifiers (genetic, environmental, life-stage) and their variation across species remains challenging [9]
Toxicokinetic-Toxicodynamic Separation: Separating toxicodynamics from toxicokinetics (including metabolism) is difficult within the current AOP structure, which limits accurate cross-species extrapolation [9]

New Approach Methodologies (NAMs): Confidence and Validation Gaps

NAMs encompass diverse technologies and approaches that replace, reduce, or refine animal testing, including in silico methods (e.g., QSARs), omics, read-across, in vitro assays, and organoids [8]. While they offer significant potential, their adoption faces several taxonomy-related challenges:

Scientific Confidence Frameworks: Establishing confidence in NAMs for cross-species extrapolation requires demonstrating relevance and reliability, defining applicability domains, and documenting strengths and limitations [8]
Context of Use Limitations: Many NAMs have narrowly defined contexts of use, restricting their application across taxonomic boundaries [8]
Validation Processes: Traditional validation approaches that require ring trials are time- and resource-intensive, slowing the development of taxonomically broad applications [8]

Table 1: Key Limitations Affecting Taxonomic Coverage in AOP and NAM Frameworks

Limitation Category	Impact on Taxonomic Coverage	Representative Examples
Pathway Plasticity	Undermines the assumption of linear, unidirectional progression in AOPs; complicates assessment of cross-species conservation	Multiple hit events in liver fibrosis [9]
Metabolic Variations	Restricts extrapolation of molecular initiating events across species	Species-specific metabolism of paracetamol and vinyl acetate [9]
Compensatory Mechanisms	Obscures adverse outcome pathways in resistant species	Tumor promotion mechanisms [9]
Domain Applicability	Constrains NAM use to specific taxonomic contexts	Limited wildlife species coverage for endocrine disruption assessment [8]

Comparative Analysis of Methodological Approaches to Taxonomic Classification

Database-Dependent vs. Machine Learning Methods

In bioinformatics, taxonomic classification methods fall into two primary categories: database-based methods and machine learning approaches. A 2024 comparative study evaluated these methods using simulated datasets, with significant implications for their use in AOP development and validation [11].

Table 2: Performance Comparison of Taxonomic Classification Methods [11]

Method Type	Subcategory	Strengths	Limitations	Conditions for Optimal Performance
Database-Based	Alignment-Based	High accuracy with comprehensive references	Computationally intensive for large datasets	Rich, comprehensive reference database available
	Marker-Based	Efficient for conserved genes	Limited to known marker regions	Well-characterized marker genes exist
	k-mer-Based	Fast, universal application	Sensitive to sequence errors	High-quality sequencing data
Machine Learning	Various Algorithms	Superior with sparse reference data	Performance limited by training data representativeness	Reference sequences sparse or lacking
Integrated Approaches	Multiple DB Methods	Enhanced classification accuracy	Increased computational complexity	Diverse database coverage available

Key findings from the comparison include:

Database methods excel in classification accuracy when supported by comprehensive reference databases but are constrained by database quality and scope [11]
Machine learning methods offer advantages when reference sequences are sparse, but their performance depends heavily on how representative the training data are [11]
Integration of multiple database methods enhances classification accuracy, suggesting hybrid approaches may offer the best taxonomic coverage [11]
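
To make the k-mer-based category in Table 2 concrete, the following is a minimal sketch of k-mer classification using Jaccard similarity between k-mer sets. It is not the method of any tool evaluated in [11]; the function names, the choice of k, and the toy reference sequences are all assumptions made for illustration.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two k-mer sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(read: str, references: dict[str, str], k: int = 4) -> tuple[str, float]:
    """Assign a read to the reference label with the highest k-mer Jaccard score."""
    read_kmers = kmers(read, k)
    scores = {
        label: jaccard(read_kmers, kmers(ref, k))
        for label, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy reference sequences, invented for illustration only.
REFERENCES = {
    "taxon_A": "ACGTACGTTAGCCGATAGGCTA",
    "taxon_B": "TTGACCGGTAACCTTGGAACTG",
}

if __name__ == "__main__":
    label, score = classify("ACGTTAGCCGAT", REFERENCES)
    print(label, round(score, 3))
```

Because a single substitution alters every k-mer that overlaps it, a sketch like this also shows why k-mer methods are listed as sensitive to sequence errors.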

Protein-Drug Interaction Mapping and Evolutionary Conservation

Understanding the evolutionary context of drug-target interactions provides crucial insights for taxonomic coverage. The DrugDomain database represents a significant advancement by mapping interactions between protein domains and drugs from DrugBank using the Evolutionary Classification of Protein Domains (ECOD) [12].

This resource highlights that:

72% of the drugs approved by the FDA in the last five years are small molecules, primarily targeting specific protein domains [12]
Multi-domain binding sites present particular challenges for taxonomic extrapolation, as seen in human prostaglandin D-synthase and topoisomerase II beta [12]
AlphaFold models have expanded coverage for human protein targets lacking experimental structures, improving taxonomic mapping capabilities [12] [13]

DrugDomain v2.0 now catalogs interactions with over 37,000 PDB ligands and 7,560 DrugBank molecules, providing an extensive resource for evaluating taxonomic conservation of drug-target interactions [13].

Experimental Protocols for Assessing Taxonomic Coverage

Protocol 1: Evaluating Cross-Species Applicability of AOPs

Objective: To experimentally verify the taxonomic applicability of a proposed Adverse Outcome Pathway across multiple species.

Methodology:

Molecular Initiating Event Conservation Analysis
- Identify orthologs of the target protein across species of interest
- Use structural modeling (e.g., AlphaFold) to compare binding sites
- Assess binding affinity conservation using in vitro assays
Key Event Confirmation
- Develop species-specific in vitro models (primary cells or cell lines)
- Expose to graded concentrations of stressor
- Measure key event biomarkers using standardized assays
Adverse Outcome Verification
- Conduct in vivo studies in representative species (when ethically justified)
- Apply benchmark dose modeling to establish quantitative relationships
- Compare pathway sensitivity and potency across species

Data Interpretation: Quantitative consistency in MIEs and KEs across species supports broader taxonomic applicability, while significant variation indicates the need for species-specific AOP development.
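
The comparison of potency across species in Protocol 1 can be sketched as a fold-difference calculation relative to a reference species. This is a minimal illustration, not a prescribed method: the function names are hypothetical, and the 10-fold tolerance is an assumed placeholder that a real assessment would need to justify.

```python
import math


def fold_differences(ec50_by_species: dict[str, float], reference: str) -> dict[str, float]:
    """Return each species' EC50 divided by the reference species' EC50."""
    ref_value = ec50_by_species[reference]
    if ref_value <= 0:
        raise ValueError("reference EC50 must be positive")
    return {species: value / ref_value for species, value in ec50_by_species.items()}


def within_fold(fold: float, max_fold: float = 10.0) -> bool:
    """True when potency differs from the reference by at most max_fold in either direction.

    The default max_fold of 10 is an assumption for illustration only.
    """
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return abs(math.log10(fold)) <= math.log10(max_fold)


if __name__ == "__main__":
    # Placeholder potency values (arbitrary units), not measured data.
    example = {"reference_species": 2.0, "species_x": 8.0, "species_y": 0.1}
    for species, fold in fold_differences(example, "reference_species").items():
        print(species, fold, within_fold(fold))
```

Working on the log scale treats a species that is 20-fold more sensitive the same way as one that is 20-fold less sensitive, which suits a consistency check rather than a directional hazard ranking.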

Protocol 2: Validation of NAMs for Cross-Species Extrapolation

Objective: To establish scientific confidence in New Approach Methodologies for taxonomic extrapolation, particularly for protected species.

Methodology:

Domain of Applicability Definition
- Map phylogenetic relationships of species of interest
- Identify relevant biological similarities and differences
- Establish acceptance criteria for extrapolation
Context of Use Evaluation
- Test reference chemicals with known cross-species effects
- Assess performance against existing in vivo data
- Evaluate under controlled experimental conditions
Uncertainty Characterization
- Identify knowledge gaps for specific taxonomic groups
- Quantify uncertainty using probabilistic methods
- Document limitations for regulatory consideration

Implementation: This protocol supports the use of fit-for-purpose collaborative case studies involving developers, users, and regulators, as encouraged for advancing NAM incorporation into standard practice [8].
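
As one possible reading of the "quantify uncertainty using probabilistic methods" step, the sketch below bootstraps a confidence interval for the concordance between NAM predictions and in vivo outcomes across reference chemicals. The percentile bootstrap is an assumed choice, not one named in the protocol, and all function and parameter names are hypothetical.

```python
import random


def concordance(predicted: list[bool], observed: list[bool]) -> float:
    """Fraction of chemicals for which the NAM call matches the in vivo call."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(predicted)


def bootstrap_interval(
    predicted: list[bool],
    observed: list[bool],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> tuple[float, float]:
    """Percentile bootstrap interval for concordance, resampling chemicals with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(predicted)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(concordance([predicted[i] for i in idx], [observed[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Placeholder calls for ten hypothetical reference chemicals.
    nam_calls = [True] * 8 + [False] * 2
    in_vivo_calls = [True] * 10
    print(concordance(nam_calls, in_vivo_calls), bootstrap_interval(nam_calls, in_vivo_calls))
```

A wide interval for a particular taxonomic group is itself a documentable limitation: it signals that too few reference chemicals exist to support confident extrapolation there.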

Visualization of Taxonomic Coverage Assessment in AOP Development

Figure: AOP Taxonomic Coverage Assessment. This workflow outlines the process for evaluating the taxonomic coverage of Adverse Outcome Pathways, highlighting evidence collection across multiple biological levels and the subsequent determination of coverage strength.

Table 3: Key Research Reagent Solutions for Taxonomic Coverage Studies

Resource	Primary Function	Application in Taxonomic Coverage	Access Information
DrugDomain Database	Maps drug interactions to protein domains	Identifies evolutionary conservation of drug targets	http://drugdomain.cs.ucf.edu/ [12] [13]
AOP-Wiki	Repository for adverse outcome pathways	Assesses known AOPs and their taxonomic applicability	https://aopwiki.org/ [14]
ECOD Classification	Evolutionary protein domain classification	Provides framework for domain-level taxonomic comparisons	http://prodata.swmed.edu/ecod/ [12]
FAIR AOP Resources	Implements findable, accessible, interoperable, reusable AOP data	Supports standardized taxonomic annotations	https://www.epa.gov/risk/fair-aop-roadmap-2025 [14]
IATA (Integrated Approaches to Testing and Assessment)	Combines multiple lines of evidence for testing and assessment	Framework for combining multiple NAMs for taxonomic coverage	OECD IATA Guidance [10]

The gap between assumed and evidence-based taxonomic coverage remains a significant challenge in toxicology and drug development. While frameworks like AOP and methodologies like NAMs offer promising approaches, their full potential is limited by insufficient characterization of taxonomic applicability.

Key strategies for addressing these limitations include:

Enhanced Database Integration: Combining multiple database approaches improves taxonomic classification accuracy [11]
Evolutionary Context Incorporation: Resources like DrugDomain provide crucial protein-domain interaction data for understanding taxonomic conservation [12] [13]
FAIR Data Principles: Implementing findable, accessible, interoperable, and reusable data practices for AOPs facilitates better taxonomic evaluation [14]
Confidence Framework Application: Using Scientific Confidence Frameworks for NAM validation ensures appropriate taxonomic application [8]

As these approaches mature, the scientific community must prioritize transparent reporting of taxonomic limitations and continued development of frameworks that explicitly address—rather than assume—taxonomic coverage. This evidence-based approach is essential for reliable risk assessment, drug safety evaluation, and regulatory decision-making that adequately protects human health and the environment across species boundaries.

From Theory to Practice: Methodological Approaches for Defining and Applying tDOA

The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool, developed by the U.S. Environmental Protection Agency (EPA), represents a significant advancement in computational toxicology and ecological risk assessment. It addresses a fundamental challenge in toxicology: the impracticality of performing toxicity tests on every species potentially exposed to environmental contaminants [3]. SeqAPASS is a fast, freely available, online screening tool that enables researchers and regulators to extrapolate toxicity information from data-rich model organisms (e.g., humans, mice, rats, zebrafish) to thousands of other non-target species [15]. This capability is particularly vital for protecting biodiversity, assessing risks to pollinators and endangered species, and fulfilling the needs of modern chemical safety evaluations that seek to reduce reliance on animal testing [16] [17].

The tool's core premise is that a species' relative intrinsic susceptibility to a chemical can be predicted by evaluating the conservation of that chemical's known protein targets [16]. By leveraging publicly available protein sequence and structural information, SeqAPASS provides a scientifically grounded method to extrapolate molecular toxicity knowledge across the tree of life. This function is indispensable for defining the taxonomic domain of applicability (tDOA) for Adverse Outcome Pathways (AOPs), a critical element in the AOP framework that specifies the taxonomic space where an AOP is relevant [3]. As such, SeqAPASS has become an essential component in the toolbox of researchers, scientists, and drug development professionals working within the paradigm of 21st-century toxicology.

The Hierarchical Framework of SeqAPASS: A Multi-Tiered Approach for Cross-Species Extrapolation

SeqAPASS employs a hierarchical, multi-tiered approach that allows users to conduct analyses with varying degrees of specificity, from broad sequence comparisons to detailed structural evaluations. This framework is designed to capitalize on any existing knowledge about a chemical-protein interaction, making it flexible and adaptable to diverse research scenarios [16] [15].

Level 1: Primary Amino Acid Sequence Comparison

The initial analysis involves comparing the entire primary amino acid sequence of a query protein from a sensitive species against all species with available sequence data in public databases like the National Center for Biotechnology Information (NCBI) [16]. This level provides a broad, screening-level assessment of protein conservation and potential chemical susceptibility across species.
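
The idea behind a primary sequence comparison can be illustrated with a simple percent-identity calculation over a pairwise alignment. This is only a sketch: SeqAPASS itself relies on BLASTp against NCBI data, whereas the function below assumes the two sequences have already been aligned (gaps written as "-") and uses a hypothetical name and toy sequences.

```python
def percent_identity(aligned_query: str, aligned_hit: str) -> float:
    """Percent of query residues matched by an identical residue in the hit.

    Both inputs must be rows of the same pairwise alignment, with '-' for gaps.
    Columns where the query has a gap are skipped, so the denominator is the
    ungapped query length.
    """
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("aligned sequences must have equal length")
    query_positions = 0
    matches = 0
    for q, h in zip(aligned_query, aligned_hit):
        if q == "-":
            continue  # insertion in the hit relative to the query
        query_positions += 1
        if q == h:
            matches += 1
    if query_positions == 0:
        raise ValueError("query contains no residues")
    return 100.0 * matches / query_positions


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration.
    print(percent_identity("MKTA-YIAK", "MKSAGYI-K"))
```

Normalizing to the query length keeps scores comparable across hit species, since every hit is measured against the same sensitive-species protein.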

Level 2: Functional Domain Comparison

The second level of analysis narrows the focus to the specific functional domains of the protein. This is crucial because chemicals often interact with specific protein regions rather than the entire sequence [16]. By evaluating domain conservation, SeqAPASS offers greater taxonomic resolution in its susceptibility predictions.

Level 3: Critical Amino Acid Residue Comparison

The third and most precise sequence-based analysis examines the conservation of individual amino acid residues known to be critical for chemical-protein binding or protein function [16] [17]. Even single amino acid differences can significantly alter species' susceptibility to chemicals, and this level accounts for such subtleties.
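
A critical-residue check of this kind can be sketched as a lookup of specific query positions in an alignment. The example below assumes pre-aligned sequences and treats conservation as strict identity, which is a simplification; the function names and toy sequences are hypothetical and do not reproduce the SeqAPASS implementation.

```python
def residue_at_query_position(aligned_query: str, aligned_target: str, query_pos: int) -> str:
    """Return the target residue aligned to the 1-based position of the ungapped query."""
    count = 0
    for q, t in zip(aligned_query, aligned_target):
        if q != "-":
            count += 1
            if count == query_pos:
                return t
    raise IndexError(f"query position {query_pos} is outside the alignment")


def critical_residue_report(
    aligned_query: str, aligned_target: str, critical_positions: list[int]
) -> dict[int, tuple[str, str, bool]]:
    """Map each critical position to (query residue, target residue, identical?)."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    report = {}
    for pos in critical_positions:
        q_res = residue_at_query_position(aligned_query, aligned_query, pos)
        t_res = residue_at_query_position(aligned_query, aligned_target, pos)
        report[pos] = (q_res, t_res, q_res == t_res)
    return report


if __name__ == "__main__":
    # Toy alignment; positions 3, 4 and 6 stand in for binding-site residues.
    for pos, row in critical_residue_report("MK-TAYW", "MKQTSYW", [3, 4, 6]).items():
        print(pos, row)
```

Indexing by ungapped query position matters because the literature usually reports critical residues by their numbering in the reference protein, not by alignment column.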

Level 4: Protein Structural Comparison (New in v8.0)

The latest version of SeqAPASS (v8.0) introduces a fourth level of analysis: protein structural comparison [18]. This advanced feature allows for the generation and alignment of protein structures across species, adding a powerful line of evidence for understanding conservation based on the principle that structure often determines function [17] [18]. It integrates tools like I-TASSER for protein structure prediction and iCn3D for visualization and structural superposition analyses [17] [18].

The following diagram illustrates the complete SeqAPASS workflow, from data input through the four hierarchical levels of analysis to the final output and interpretation.

Performance and Comparative Analysis of SeqAPASS

Performance Metrics and Tool Evolution

While direct, head-to-head performance comparisons between SeqAPASS and other bioinformatic tools for cross-species extrapolation are limited in the available literature, the tool's sustained development and expanding feature set indicate active maintenance and growing capability. It has evolved significantly since its initial release in 2016, with regular version updates incorporating new features based on user feedback and technological advancements [16]. The following table summarizes key performance-related features and their evolution.

Table 1: Evolution of SeqAPASS Tool Features and Capabilities

SeqAPASS Version	Release Date	Key Performance and Feature Updates
v1.0 [16]	January 2016	Initial public release with Level 1 (primary sequence) and Level 2 (functional domain) comparisons.
v3.0 [16]	March 2018	Introduction of interactive data visualization capabilities and automatic Level 3 susceptibility prediction.
v4.0 [16]	October 2019	Enhanced interoperability with ECOTOX Knowledgebase for empirical data comparison and summary reports.
v5.0 [16]	December 2020	Customizable heat map visualization for Level 3 and a downloadable Decision Summary Report (.pdf).
v6.0 [16]	September 2021	Widget for connecting SeqAPASS predictions to empirical toxicity data in the ECOTOX Knowledgebase.
v8.0 [18]	September 2024	Introduction of Level 4 for protein structure generation/alignment and integration of iCn3D visualization.

A key strength of SeqAPASS is the breadth of its underlying data. The tool draws from the NCBI protein database, which contains information on over 153 million proteins representing more than 95,000 organisms [15], so predictions can be made across a wide taxonomic space. The tool is also designed for rapid analysis: the published protocol for assessing protein conservation generates customizable, publication-quality graphics and data summaries [16] [19].

Comparison with Alternative Approaches and Complementary Tools

SeqAPASS operates within a broader ecosystem of New Approach Methodologies (NAMs). Its unique value becomes apparent when compared to, or used in combination with, other computational and empirical methods.

Comparison to Exome Capture Systems: Tools like NimbleGen's SeqCap EZ, Agilent's SureSelect, and Illumina's TruSeq and Nextera focus on enriching and sequencing the protein-coding regions of a genome for variant detection [20]. Unlike these technologies, which are wet-lab methods for data generation, SeqAPASS is a computational tool for data interpretation and extrapolation. It does not generate new sequence data but leverages existing public data to make predictions about chemical susceptibility. While exome capture is powerful for discovering genetic variation within a species, SeqAPASS specializes in comparing known protein targets across species.
Comparison to Chimeric RNA Prediction Tools: A 2021 benchmarking study evaluated 16 software tools for chimeric RNA prediction, including SOAPfuse, MapSplice, and FusionCatcher [21]. These tools are designed to identify fusion transcripts from RNA-Seq data, which is a distinct application from the cross-species extrapolation of chemical susceptibility performed by SeqAPASS. The domain of applicability for these tools is cancer genomics and transcriptome biology, whereas SeqAPASS is firmly situated in comparative toxicology and ecological risk assessment.
Synergy with G2P-SCAN and the AOP Framework: A powerful demonstration of SeqAPASS's utility is its integration with other computational NAMs. Research has shown that combining SeqAPASS with the Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool enhances predictions of chemical susceptibility across species [3]. While SeqAPASS evaluates protein conservation, G2P-SCAN infers biological pathway conservation. Used together, they provide multiple, complementary lines of evidence for the taxonomic domain of applicability of an AOP. This combination allows researchers to move from molecular target conservation (via SeqAPASS) to broader pathway conservation (via G2P-SCAN), creating a more comprehensive basis for extrapolation [3].

The following table summarizes the distinct roles and capabilities of SeqAPASS relative to other bioinformatic tools.

Table 2: Comparative Analysis of SeqAPASS and Other Bioinformatics Tools

Tool Category	Example Tools	Primary Purpose	Domain of Applicability	SeqAPASS Differentiation
Cross-Species Extrapolation	SeqAPASS, G2P-SCAN [3]	Predict chemical susceptibility across species based on protein/pathway conservation.	Ecological risk assessment, AOP tDOA.	Directly evaluates protein sequence/structure conservation for chemical targets.
Exome Capture Systems [20]	NimbleGen SeqCap EZ, Agilent SureSelect	Enrichment of exonic regions for deep sequencing.	Medical genetics, variant discovery in humans/model organisms.	A computational analysis tool, not a sequencing preparation method.
Chimeric RNA Prediction [21]	SOAPfuse, FusionCatcher, STAR-Fusion	Identify gene fusion transcripts from RNA-Seq data.	Cancer research, transcriptomics.	Focuses on protein-level conservation, not transcript-level fusion events.
Protein Structure Prediction	I-TASSER [17], AlphaFold	Predict 3D protein structures from amino acid sequences.	Structural biology, drug design.	SeqAPASS v8.0 integrates structure prediction (I-TASSER) into its Level 4 workflow.

Experimental Protocols and Methodologies

Core Protocol for Running a SeqAPASS Analysis

The standard methodology for conducting a SeqAPASS analysis is detailed in published protocols [16] [19]. The process is designed to be accessible to both expert and non-expert users.

Getting Started and Protein Identification: Access the SeqAPASS tool through its official website (https://seqapass.epa.gov/seqapass/) using a Chrome browser and log in with a user account [16]. Prior to analysis, identify a protein of interest and a sensitive species through a literature review. The tool provides links to external resources like the CompTox Chemicals Dashboard and AOP-Wiki to aid in target identification [16].
Developing and Running the Query (Level 1): Initiate a Level 1 analysis by entering the query protein, typically using its NCBI protein accession number. The tool uses the BLASTp algorithm to compare the primary amino acid sequence of the query protein against all species with available sequence data in the NCBI database [16]. The results are displayed as a taxonomic tree, with color-coding indicating the predicted susceptibility of different taxonomic groups.
Refining the Analysis (Levels 2 and 3): Based on the Level 1 results, the user can proceed to Level 2 by selecting specific functional domains to compare. For an even more refined analysis, Level 3 requires input on the specific amino acid residues critical for chemical binding. This information can be derived from crystallographic data or scientific literature, and the tool provides a "Reference Explorer" to help identify relevant sources [16].
Data Interpretation and Visualization: SeqAPASS provides multiple options for interpreting results, including interactive data visualizations, summary tables, and a comprehensive Decision Summary Report that synthesizes data across all analysis levels into a downloadable PDF [16]. The tool also allows for the creation of heat maps for Level 3 data, enabling rapid assessment of critical residue conservation [16].

Advanced Protocol: Integrating Structural Biology (Level 4)

With SeqAPASS v8.0, the experimental pipeline can be extended to include protein structural comparisons, adding a critical line of evidence for conservation [17] [18].

Sequence-Based Foundation: The process begins with a standard SeqAPASS evaluation (Levels 1-3) to identify orthologous protein sequences across species of interest [17].
Protein Structure Prediction: For species where a protein structure is not available in public databases, the pipeline uses advanced protein structure prediction tools like I-TASSER (Iterative Threading ASSEmbly Refinement). I-TASSER is a top-ranked algorithm that uses threading-based fold recognition and fragment-based assembly to generate accurate 3D protein models from amino acid sequences [17].
Structural Alignment and Comparison: The generated protein structures are then compared using structural alignment tools like TM-align. This algorithm measures structural similarity by calculating a Template Modeling Score (TM-score), which quantifies the conservation of protein folds across species, independent of sequence identity [17].
Visualization and Analysis: The final step involves visualizing the superimposed protein structures to assess conservation of the binding pocket geometry. SeqAPASS v8.0 integrates the iCn3D tool directly into its interface, allowing users to visualize and analyze the generated protein structures and their alignments [18].
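
The TM-score mentioned in the structural alignment step can be written down compactly. The sketch below evaluates the standard TM-score expression for one fixed superposition, given the distances between aligned residue pairs; a full TM-align run additionally searches for the superposition that maximizes this value, which is not attempted here. The function names and the short-protein guard are assumptions of this sketch.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length < 19:
        # Below this length the standard formula is not positive; the guard is
        # a simplification of this sketch.
        raise ValueError("d0 formula is only positive for target_length >= 19")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances: list[float], target_length: int) -> float:
    """TM-score for one superposition.

    distances: distance in angstroms between each aligned residue pair.
    target_length: length of the protein used for normalization.
    """
    if len(distances) > target_length:
        raise ValueError("cannot align more residue pairs than the target length")
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # 100 aligned pairs, each 2 angstroms apart, normalized by a 120-residue target.
    print(round(tm_score([2.0] * 100, 120), 3))
```

Scores fall between 0 and 1, and values above roughly 0.5 are commonly read as indicating the same overall fold, which is what makes the measure useful for comparing orthologs with low sequence identity.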

The workflow for this advanced, multi-modal analysis is illustrated below.

Successful application of the SeqAPASS tool and its hierarchical framework relies on a suite of essential bioinformatic reagents and databases. The following table details key resources used in typical SeqAPASS experiments.

Table 3: Essential Research Reagents and Resources for SeqAPASS Analyses

Resource Name	Type	Function in SeqAPASS Workflow	Key Features/Details
NCBI Protein Database [15]	Database	Primary data source for protein sequences.	Contains over 153 million protein sequences from >95,000 organisms.
BLAST+ Executable [16]	Software Tool	Performs primary amino acid sequence alignments in Level 1 analysis.	Standard tool for comparing primary biological sequence information.
COBALT Executable [16]	Software Tool	Used for multiple sequence alignments in Level 2 and 3 analyses.	Tool for multiple protein sequence alignment that considers conservation.
I-TASSER [17]	Software Tool	Predicts 3D protein structures from sequences for Level 4 analysis.	Top-ranked algorithm for automated protein structure prediction.
TM-align [17]	Software Tool	Measures structural similarity of proteins for Level 4 analysis.	Algorithm for comparing protein structures; outputs TM-score.
iCn3D [18]	Software Tool	Visualizes protein structures and superpositions in SeqAPASS v8.0.	Integrated into SeqAPASS for interactive 3D structure visualization.
ECOTOX Knowledgebase [16]	Database	Provides empirical toxicity data for comparison with SeqAPASS predictions.	Curated database of chemical toxicity for aquatic and terrestrial life.
CompTox Chemicals Dashboard [16]	Database	Aids in identifying molecular targets for chemicals of interest.	EPA's database for chemistry, toxicity, and exposure data.

SeqAPASS represents a paradigm shift in toxicological research, offering a robust, flexible, and hierarchical framework for addressing the complex challenge of cross-species extrapolation. Its multi-tiered approach—from primary sequence to protein structure—enables researchers to gather increasing levels of evidence to define the taxonomic domain of applicability for chemical interactions and Adverse Outcome Pathways. By leveraging vast public bioinformatic data and integrating with other NAMs like G2P-SCAN, SeqAPASS moves the field beyond reliance on traditional animal testing and toward a more predictive and efficient safety assessment paradigm.

The continuous evolution of the tool, culminating in the recent v8.0 release with its structural biology capabilities, demonstrates a commitment to incorporating scientific advances directly into the hands of researchers. For scientists and drug development professionals, mastering SeqAPASS is no longer just an advantage but a necessity for conducting state-of-the-art ecological risk assessments and for thoughtfully extending human-focused toxicological data to protect the broader environment.

The adverse outcome pathway (AOP) framework provides a structured approach to organizing biological knowledge by delineating causal linkages between a molecular initiating event (MIE) and an adverse outcome (AO) relevant to risk assessment [6]. For this case study, we focus on AOP 89: nAChR Activation Leading to Colony Death/Failure, which was initially developed with specific emphasis on the honey bee (Apis mellifera) [6]. This AOP network emerged from concerns over the role of neonicotinoid insecticides in global bee population declines [22] [6]. The Taxonomic Domain of Applicability (tDOA) of an AOP defines the range of species for which the described pathway is biologically plausible [6]. Accurately defining the tDOA is critical for regulatory decision-making, particularly when extrapolating findings from tested species to protect untested ones. The core consideration for tDOA definition rests on evaluating the structural and functional conservation of Key Events (KEs) and Key Event Relationships (KERs) across taxa [6]. This case study demonstrates how bioinformatics tools, specifically the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool, can be employed to systematically evaluate structural conservation and expand the tDOA for this critical AOP beyond A. mellifera to other bee species.

Experimental Protocols and Methodologies for tDOA Analysis

Bioinformatics Workflow for Taxonomic Extrapolation

Defining the tDOA requires a methodical approach to evaluate the conservation of the AOP's components. The US Environmental Protection Agency's SeqAPASS tool offers a hierarchical framework for this purpose [6]. In this case study, the workflow uses three progressive, sequence-based levels of evaluation, each providing a distinct line of evidence for structural conservation.

Level 1 Evaluation (Primary Sequence Comparison): This initial phase involves comparing the primary amino acid sequence of the molecular target (e.g., nAChR subunits) from a reference species (A. mellifera) against the protein sequences of other species. The analysis identifies putative orthologs—sequences that likely diverged from a common ancestor through speciation and are expected to maintain similar function. A high degree of sequence similarity at this level provides foundational evidence for the presence of the molecular target in other taxa [6].
Level 2 Evaluation (Functional Domain Conservation): This more refined analysis assesses the conservation of specific functional domains within the protein sequence. For nAChRs, this includes evaluating the preservation of agonist-binding domains critical for the interaction with neonicotinoid insecticides. Conservation of these domains across species strengthens the biological plausibility that the molecular initiating event (nAChR activation) can occur similarly [6].
Level 3 Evaluation (Critical Residue Conservation): The most granular level of analysis examines the conservation of individual amino acid residues known to be critical for protein-ligand interactions, protein-protein interactions, or overall protein function. For the nAChR, this involves assessing residues that form the orthosteric binding site where neonicotinoids act as agonists. The preservation of these specific residues across species provides strong evidence for comparable susceptibility to chemical perturbation [6].

Empirical Validation and Integration

While bioinformatics provides powerful evidence for structural conservation, defining the full tDOA also requires evidence of functional conservation. This is achieved by integrating SeqAPASS results with available empirical data from toxicological and physiological studies [6]. For bees, such data might include:

In vitro receptor binding assays to confirm neonicotinoid affinity for nAChRs of different species.
Sublethal behavioral assays measuring effects on learning, memory, or locomotion in response to exposure [23] [24].
Whole-organism toxicity tests to establish dose-response relationships for key events such as impaired foraging or reduced colony growth.

The convergence of computational predictions and empirical observations provides a robust, weight-of-evidence basis for defining the tDOA for each KE, KER, and the AOP as a whole.

Comparative Analysis of AOP Component Conservation

The following tables summarize the evidence supporting the taxonomic domain of applicability for the key events and the overall AOP, based on the bioinformatics analysis and empirical evidence.

Table 1: Taxonomic Domain of Applicability for Key Events in AOP 89

Key Event (KE)	Biological Level	Evidence for Structural Conservation (SeqAPASS)	Empirical Support for Functional Conservation
KE 1: nAChR Activation	Molecular/Cellular	High conservation of nAChR subunit sequences, functional domains, and critical ligand-binding residues across Apis and non-Apis bees [6].	In vitro binding studies and neurophysiological recordings confirm agonist action of neonicotinoids on nAChRs in multiple bee species [22] [24].
KE 2: Altered Neural Function	Cellular/Organ	Conservation of neural targets (e.g., mushroom bodies, central complex) and cholinergic signaling pathways across bee species [6].	Impaired olfactory learning and memory demonstrated in laboratory assays for Bombus and Osmia exposed to neonicotinoids [24].
KE 3: Impaired Foraging Behavior	Organism	Conservation of brain structures governing navigation and foraging behavior [6].	Reduced foraging efficiency, disorientation, and impaired homing ability observed in field and semi-field studies with bumble bees and other wild bees [22] [24].
KE 4: Reduced Colony Growth	Population	The social organization of a colony is a shared feature among eusocial bees like Apis and Bombus [22].	Documented declines in brood production, food storage, and colony weight in neonicotinoid-exposed populations of bumble bees [22].
AO: Colony Death/Collapse	Population	The adverse outcome is defined at the population level and is relevant to social bees that live in colonies [22] [6].	Links between neonicotinoid exposure and increased colony failure rates or reduced queen production in multiple eusocial species [22] [24].

Table 2: Summary of tDOA for AOP 89 Across Major Bee Groups

Bee Group	Example Genera	Confidence in AOP Applicability	Key Supporting Evidence
Eusocial Bees	Apis (honey bees), Bombus (bumble bees)	High	Strong evidence for conservation of all KEs from molecular to population level. Empirical data from multiple species confirm functional links [6].
Solitary Bees	Osmia (mason bees), Megachile (leafcutter bees)	Moderate	High confidence for early KEs (nAChR activation, neural impairment). Empirical data on foraging impairment exists. Confidence for colony-level AOs is lower due to different life history [6].
Stingless Bees	Melipona, Trigona	Moderate (Theoretical)	Strong evidence for structural conservation of early KEs via SeqAPASS. Lacks extensive empirical toxicological data to confirm functional links to colony-level outcomes [6].

Visualization of the AOP and tDOA Analysis Workflow

The following diagrams, generated using Graphviz DOT language, illustrate the core AOP and the methodological workflow for tDOA analysis.

Diagram 1: AOP 89 - nAChR Activation to Colony Death

Diagram 2: Workflow for Defining tDOA using SeqAPASS

The following table catalogues key computational, molecular, and bioinformatic resources essential for conducting the tDOA analysis described in this case study.

Table 3: Key Research Reagent Solutions for tDOA Analysis

Tool / Reagent	Type	Primary Function in tDOA Analysis	Application Example
SeqAPASS Tool	Bioinformatics Software	Evaluates protein sequence and structural similarity across species to infer potential chemical susceptibility [6].	Determining conservation of nAChR subunits and critical ligand-binding residues between A. mellifera and non-Apis bees.
AOP-Wiki	Knowledge Repository	Central repository for developed AOPs, including KEs, KERs, and supporting evidence [6].	Accessing the formal description of AOP 89 and its components as a baseline for tDOA expansion.
nAChR Subunit Proteins	Molecular Reagent	Used in in vitro competitive binding assays (e.g., radioligand binding) to confirm receptor-level interactions.	Validating the functional conservation of the Molecular Initiating Event by measuring neonicotinoid binding affinity to nAChRs from different bee species.
Curated Protein Databases	Data Resource	Provide the comprehensive, annotated protein sequence data required for cross-species comparisons in SeqAPASS [6].	Sourcing protein sequences for nAChR subunits from a wide taxonomic range of Hymenoptera and other insects.
DAGitty	Causal Diagram Tool	A browser-based environment for creating, editing, and analyzing causal diagrams/directed acyclic graphs (DAGs) [25].	Refining and visualizing the causal relationships within the AOP network and identifying potential confounding factors during empirical validation.

Integrating Empirical Data with Computational Predictions for Robust tDOA Assessment

Time Difference of Arrival (TDOA) is a pivotal technique for passive localization in fields ranging from wireless networks to underwater navigation. This guide objectively compares the performance of contemporary TDOA methods, from compressed sensing to deep learning hybrids, by synthesizing experimental data on their accuracy under challenging line-of-sight (LoS) and non-line-of-sight (NLoS) conditions. Framed within a thesis on taxonomic domains for the Adverse Outcome Pathway (AOP) research framework, this analysis provides drug development professionals and scientists with a structured comparison of methodological trade-offs in precision, data efficiency, and computational complexity.

Time Difference of Arrival (TDOA) is a passive localization technique that determines the position of a signal source by measuring the difference in the signal's arrival time at multiple, spatially separated receivers [26] [27]. Unlike Time of Arrival (ToA), which requires precise synchronization between the transmitter and all receivers, TDOA only requires synchronization among the receiving nodes, simplifying system design [28]. These time differences define hyperbolas, and the source's location is estimated at the intersection of multiple such hyperbolas, a process known as multilateration [27].
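
The hyperbola property can be checked numerically with a small forward model. The sketch below is illustrative only: the receiver layout, the source position, and the propagation speed (sound in air) are assumptions chosen for the example. It computes the TDOA for each receiver pair and shows that the true source reproduces the measured differences while another candidate point does not.

```python
import math

SPEED = 343.0  # assumed propagation speed in m/s (sound in air); use ~3e8 for radio


def tdoa(source, rx_a, rx_b, speed=SPEED):
    """Arrival-time difference (seconds) at rx_a relative to rx_b."""
    return (math.dist(source, rx_a) - math.dist(source, rx_b)) / speed


# Hypothetical 2-D geometry in metres.
receivers = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
source = (30.0, 40.0)

# With receiver 0 as the reference, three receivers give two independent TDOAs.
measured = [tdoa(source, receivers[i], receivers[0]) for i in (1, 2)]


def residual(point):
    """Largest mismatch (seconds) between a candidate point's TDOAs and the measurements."""
    return max(
        abs(tdoa(point, receivers[i], receivers[0]) - m)
        for i, m in zip((1, 2), measured)
    )


print(residual(source))        # the true source lies on both hyperbolas
print(residual((60.0, 60.0)))  # a different point does not
```

Each measured value fixes a constant range difference to one receiver pair, which is the defining property of a hyperbola; the source is a point whose residual against every pair is zero.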

The core challenge in robust TDOA assessment lies in mitigating errors introduced by noise, multipath propagation, and particularly NLoS conditions, where the direct path between the source and receiver is blocked [28] [29]. NLoS conditions can cause significant positive biases in delay estimates, severely degrading localization accuracy. The methods discussed herein aim to address these challenges through a combination of empirical data processing and computational prediction.

Comparative Performance of TDOA Methods

The following table summarizes the key performance characteristics of modern TDOA methods, highlighting their respective approaches to handling NLoS conditions and their demonstrated accuracies.

Table 1: Performance Comparison of Contemporary TDOA Methods

Method / Algorithm	Core Approach	Key Innovation	Test Environment	Reported Localization Accuracy	Key Advantage
EIRCS [26]	Compressed Sensing	Inexact signal reconstruction preserving phase data	Simulation / General	High precision, minimal error at high compression ratios	High data compression, unbiased estimation
TDoA w/ NLoS Masking [28]	Channel Charting (CC) & Sensor Fusion	Masks NLoS measurements using CIR power distributions	Real 5G O-RAN Testbed	2-4 meters (90% of cases)	Real-world robustness in mixed LoS/NLoS
Dual-Driven (AML TDoA) [29]	Data & Model-Driven Fusion	Transformer network for LoS ToA statistics	Urban Canyon Simulation	Approximates Cramér-Rao Lower Bound	Scalability, robust to varying BS combinations
CNN-BiGRU w/ Attention [30]	Deep Learning Hybrid	Attention on key TDOA/FDOA signal features	Underwater Simulation (20 dB SNR)	2.58 meters (Position), 2.88 m/s (Velocity)	Superior in complex, dynamic environments (e.g., underwater)
Generalized Cross Correlation (GCC) [31]	Classical Signal Processing	Pre-filtering signals to sharpen correlation peak	Controlled / Simple Noise	Effective in high SNR, simple scenarios	Simplicity, well-established theory

Detailed Experimental Protocols and Data Analysis

To ensure reproducibility and provide a clear basis for comparison, this section details the experimental methodologies and data analysis protocols for the key studies cited.

Enhanced Inexact Reconstruction Compressed Sensing (EIRCS)

Objective: To achieve high-precision TDOA estimation while simultaneously compressing the volume of sampled data, overcoming challenges related to data acquisition, transmission, and storage [26].
Protocol:
- Signal Model: Two sensors receive a common signal, denoted as \( x_1(n) = s(n) + n_1(n) \) and \( x_2(n) = s(n - D) + n_2(n) \), where \( D \) is the TDOA and \( n_i(n) \) is independent receiver noise [26].
- Compressed Sampling: The original signal \( \mathbf{s} \) is projected into a lower-dimensional space using a measurement matrix \( \mathbf{\Phi} \) to obtain linear measurements \( \mathbf{y} = \mathbf{\Phi}\mathbf{s} \) [26].
- Inexact Reconstruction: The Orthogonal Matching Pursuit (OMP) algorithm is used to reconstruct a signal approximation from \( \mathbf{y} \) and the sensing matrix \( \mathbf{A} \). The focus is not on perfect signal reconstruction but on retaining the phase relationships critical for TDOA [26].
- TDOA Estimation: The cross-correlation of the inexactly reconstructed signals is computed, and the TDOA is estimated by finding the lag that maximizes this cross-correlation function [26].
Data Analysis & Key Findings: The EIRCS method was validated as an unbiased estimation technique. Experimental results confirmed its ability to maintain high TDOA estimation precision even at high compression ratios, where the number of rows in the measurement matrix is significantly reduced [26].
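
The final estimation step of the protocol, finding the lag that maximizes the cross-correlation, can be sketched with NumPy. This example deliberately omits the compressed sampling and OMP reconstruction stages and works on the raw two-sensor signal model, so it illustrates plain cross-correlation delay estimation rather than EIRCS itself. The signal length, noise level, and delay are arbitrary assumptions.

```python
import numpy as np


def estimate_delay(x1, x2):
    """Return the integer lag D (samples) maximizing the cross-correlation,
    under the model x1(n) = s(n) + noise and x2(n) = s(n - D) + noise."""
    # np.correlate(a, v, "full")[k] corresponds to sum_n a[n + lag] * v[n],
    # with lags running from -(len(v) - 1) to len(a) - 1.
    corr = np.correlate(x2, x1, mode="full")
    lags = np.arange(-(len(x1) - 1), len(x2))
    return int(lags[np.argmax(corr)])


# Synthetic test signal: white noise source, delayed by 7 samples at sensor 2.
rng = np.random.default_rng(0)
true_delay = 7
s = rng.standard_normal(512)
x1 = s + 0.1 * rng.standard_normal(s.size)
x2 = np.concatenate([np.zeros(true_delay), s[:-true_delay]]) + 0.1 * rng.standard_normal(s.size)

estimated = estimate_delay(x1, x2)
print(estimated)
```

Dividing the estimated lag by the sampling rate converts it to a time difference in seconds.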

Self-Supervised Channel Charting with NLoS Mitigation

Objective: To enable global-scale, self-supervised user equipment (UE) localization in real 5G networks that is robust to NLoS conditions [28].
Protocol:
- Data Collection: In a real-world O-RAN-based 5G testbed, Uplink Sounding Reference Signal (UL-SRS) Channel Impulse Response (CIR) data is collected alongside Time Difference of Arrival (TDoA) measurements and known Transmission Reception Point (TRP) locations. Ground truth positioning is provided by a centimeter-accurate Real-Time Kinematic (RTK) system [28].
- NLoS Identification: The empirical power distribution of the CIR data is analyzed to automatically identify and "mask" (i.e., exclude) measurements likely corrupted by NLoS propagation during model training and inference [28].
- Sensor Fusion Model Training: A Channel Charting (CC) model, a form of dimensionality reduction, is trained. The model uses a loss function that incorporates:
  - CIR data to learn the local radio environment geometry.
  - TDoA measurements and TRP locations to anchor the learned chart to a global coordinate system.
  - Short-interval UE displacement measurements to improve trajectory continuity [28].
Data Analysis & Key Findings: When benchmarked against RTK positioning, the proposed model achieved a localization accuracy of 2 to 4 meters in 90% of cases across a range of NLoS ratios, outperforming existing state-of-the-art semi-supervised and self-supervised CC approaches [28].

Dual-Driven AML TDoA with LoS Inference

Objective: To combine the scalability of model-driven methods with the NLoS resilience of data-driven approaches for cooperative localization in mmWave MIMO networks [29].
Protocol:
- Offline Training (LoS Inference): A transformer-based neural network is trained at each base station (BS) using site-specific labeled data. Its task is to infer the statistics (mean and variance) of the LoS Time of Arrival (ToA) from the high-dimensional uplink Channel State Information (CSI), which contains multiple path components [29].
- Online Inference (AML Localization):
  - For a connected user, each BS's trained module processes the CSI to estimate the LoS ToA.
  - These estimates, along with their inferred variances, are sent to a Central Unit (CU).
  - The CU employs an Approximate Maximum Likelihood (AML) TDoA algorithm, which uses the variances as weights, to compute the final user coordinates [29].
Data Analysis & Key Findings: The dual-driven scheme demonstrated strong generalization and scalability, maintaining performance with varying numbers of channel paths and changing combinations of BS measurements. It significantly outperformed both purely data-driven and purely model-driven baseline methods in urban canyon simulations [29].
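
To show how inferred variances can act as weights at the central unit, the sketch below solves a variance-weighted least-squares problem on range differences with a few Gauss-Newton iterations. It is a stand-in for illustration, not the AML algorithm of [29]: the solver, the base-station layout, the variance values, and the function name `weighted_tdoa_fix` are all assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def weighted_tdoa_fix(bs_xy, toa, toa_var, guess, iters=20):
    """Gauss-Newton weighted least squares on range differences relative to BS 0."""
    bs_xy = np.asarray(bs_xy, dtype=float)
    toa = np.asarray(toa, dtype=float)
    toa_var = np.asarray(toa_var, dtype=float)

    # Differencing against BS 0 cancels the unknown transmit time.
    rd = C * (toa[1:] - toa[0])
    # Covariance of the differences: var_i + var_0 on the diagonal, var_0 elsewhere.
    cov = C**2 * (np.diag(toa_var[1:]) + toa_var[0])
    w = np.linalg.inv(cov)

    p = np.asarray(guess, dtype=float)
    for _ in range(iters):
        diff = p - bs_xy
        dist = np.linalg.norm(diff, axis=1)
        unit = diff / dist[:, None]
        pred = dist[1:] - dist[0]   # predicted range differences at p
        jac = unit[1:] - unit[0]    # Jacobian of pred with respect to p
        p = p + np.linalg.solve(jac.T @ w @ jac, jac.T @ w @ (rd - pred))
    return p


# Hypothetical layout: four base stations on a 200 m square, noise-free ToAs.
stations = [(0.0, 0.0), (200.0, 0.0), (0.0, 200.0), (200.0, 200.0)]
truth = np.array([60.0, 90.0])
t0 = 1e-3  # unknown transmit time in seconds
toas = t0 + np.linalg.norm(truth - np.asarray(stations), axis=1) / C
variances = [1e-18, 4e-18, 1e-18, 9e-18]  # assumed per-station ToA variances (s^2)

estimate = weighted_tdoa_fix(stations, toas, variances, guess=(100.0, 100.0))
print(estimate)
```

Stations reporting a larger variance contribute less to the solution, which is the role the inferred variances play in the weighting described above.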

Visualizing TDOA Workflows and System Architecture

The following diagrams, rendered from DOT scripts, illustrate the core logical relationships and experimental workflows of the discussed TDOA methods.

Generalized TDOA Localization Principle

Diagram 1: The fundamental principle of TDOA-based localization. Time difference measurements from receiver pairs define hyperbolic curves, with the source located at their intersection.

Dual-Driven AML TDoA Architecture

Diagram 2: The dual-driven AML TDoA architecture, showing the offline training of the LoS inference module and its online deployment for scalable, robust localization.

The Scientist's Toolkit: Essential Research Reagents & Materials

Table 2: Key Research Reagents and Solutions for TDOA Experimentation

Item / Solution	Function in TDOA Research	Example Context / Note
Positioning Reference Signal (PRS)	A special cell-specific signal in LTE/5G frames designed for high-precision time difference measurement (RSTD) with low interference [27].	Critical for downlink OTDOA; configured for high "hearability" from neighboring base stations.
Sounding Reference Signal (SRS)	An uplink signal transmitted by User Equipment (UE) used by the network to estimate the uplink channel (CIR/CSI) [28].	Used as the signal source for uplink TDOA and channel charting approaches.
O-RAN Compliant Network	A disaggregated, software-defined RAN architecture enabling AI/ML model integration via the RIC (RAN Intelligent Controller) for positioning [28].	Provides the testbed infrastructure for implementing and testing advanced, data-driven TDOA methods.
Real-Time Kinematic (RTK) GPS	An enhanced GPS system providing centimeter-level accuracy, used as a ground truth reference to validate and benchmark new TDOA algorithms [28].	Serves as the performance benchmark in real-world field tests.
Precision Time Protocol (PTP)	A protocol for clock synchronization across a network, used to obtain accurate TDOA measurements between distributed receivers [28].	Receiver synchronization is a prerequisite for any TDOA system; PTP is one way to provide it.
Channel State Information (CSI)	A high-dimensional representation of the channel between transmitter and receiver, capturing amplitude and phase information across subcarriers [29].	The rich data input for deep learning and channel charting methods.

This comparison guide demonstrates that the field of TDOA is evolving beyond classical cross-correlation towards methods that intelligently integrate empirical data with computational models. The EIRCS method offers a compelling solution for data-efficient scenarios, while the dual-driven and channel charting approaches provide a robust framework for dealing with the pervasive challenge of NLoS conditions in real-world networks. For the most complex and dynamic environments, such as underwater, deep learning hybrids like the CNN-BiGRU with attention mechanism show superior performance. Within the AOP research framework, this taxonomic comparison aids in selecting the appropriate TDOA assessment strategy based on the specific environmental domain and precision requirements, ultimately contributing to more reliable spatial analyses in scientific and drug development contexts.

The Adverse Outcome Pathway (AOP) framework provides a structured approach for organizing biological knowledge into a sequential chain of causally linked events, beginning with a Molecular Initiating Event (MIE) at the molecular level and culminating in an Adverse Outcome (AO) at the individual or population level [32]. This conceptual framework is chemically agnostic, enabling the description of potential actions for groups of chemicals rather than being specific to a single substance [33]. AOP development has gained significant momentum for supporting chemical risk assessment and regulatory decision-making, particularly as the field moves toward predictive approaches that utilize New Approach Methodologies (NAMs) [7] [32].

Defining the Taxonomic Domain of Applicability (tDOA) is a critical component of AOP development and application. The tDOA describes the species for which the AOP is considered valid and determines the scope for extrapolating knowledge to untested species [5]. For most developed AOPs, the tDOA is narrowly defined, based on the one species or handful of species used in the underlying empirical studies [5]. Structural and functional conservation of the key biological elements involved in an AOP are the two primary considerations when defining its tDOA [5]. Two resources, the AOP-Wiki and the AOP Database (AOP-DB), serve as central repositories for AOP knowledge and provide complementary functionalities for tDOA research and discovery.

Comparative Analysis of AOP-Wiki and AOP-DB

The AOP-Wiki and AOP-DB are web-based platforms designed to support AOP development and application. While they share the common goal of organizing AOP-related knowledge, their structures, functionalities, and primary use cases differ significantly, especially in the context of investigating tDOA.

Table 1: Core Functional Comparison of AOP-Wiki and AOP-DB

Feature	AOP-Wiki	AOP-DB
Primary Function	Collaborative, wiki-based AOP development and description [7] [34]	Integrated database supporting computational analysis of AOP components [34]
Data Structure	Modular pages for AOPs, Key Events (KEs), and Key Event Relationships (KERs) [32]	Relational tables linking AOPs, genes, stressors, diseases, and pathways [34]
tDOA Information	Text descriptions and species-specific evidence supporting KEs [5]	Computationally derived associations enabling cross-species analysis via gene orthologs [34]
Key Strengths	Captures biological plausibility and weight of evidence; community-driven [32]	Enables data mining and linkage to external biological databases (e.g., DisGeNET) [34]
Ideal Use Case	Qualitative AOP development and weight-of-evidence assessment [32]	Quantitative analysis, hypothesis generation, and cross-species computational workflows [34]

Table 2: Data Content and Applicability for tDOA Research

Data Category	AOP-Wiki	AOP-DB
AOP Information	Full AOP narratives, KEs, and KERs with supporting references [32]	AOP identifiers and names, linked to molecular targets [34]
Taxonomic Data	Empirical tDOA based on species cited in KE descriptions [5]	Gene and protein data facilitating inference of structural conservation [5] [34]
Molecular Data	Protein Ontology terms for Molecular Initiating Events [34]	Explicit gene identifiers (Entrez IDs) mapped from AOP-Wiki content [34]
Chemical Data	Stressor information, often text-based and sometimes vague [34]	Curated chemical stressors mapped to specific structures and ToxCast assay data [34]
Disease/Phenotype	Adverse Outcomes described as health effects or ecological impacts [32]	Human disease associations with confidence scores sourced from DisGeNET [34]

Research Applications for tDOA Determination

Defining Biologically Plausible tDOA with Bioinformatics

The AOP-Wiki serves as the foundational repository for the biological narrative of an AOP. However, its tDOA is often limited to species with existing empirical data. The AOP-DB and integrated bioinformatic tools like the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool can extend this domain by providing evidence for structural conservation of proteins critical to the AOP [5].

The SeqAPASS tool uses a hierarchical approach to evaluate cross-species protein conservation, which serves as a line of evidence for tDOA definition:

Level 1: Compares primary amino acid sequence similarity to identify potential orthologs across species [5].
Level 2: Evaluates conservation of specific functional domains within the protein [5].
Level 3: Assesses conservation of individual amino acid residues critical for protein-ligand interactions, protein-protein interactions, or overall function [5].

This workflow allows researchers to move from an empirically defined tDOA to a biologically plausible tDOA that includes species where the fundamental molecular components are conserved, even in the absence of toxicity data.

Practical Case Study: nAChR Activation in Bees

A case study on an AOP linking the activation of the nicotinic acetylcholine receptor (nAChR) to colony death/failure in Apis mellifera (honey bee) demonstrates this integrated approach [5]. While the AOP was developed with a focus on a single species, researchers sought to define its applicability to other Apis and non-Apis bees.

Researchers identified nine proteins critical to the AOP and used them as queries in the SeqAPASS tool [5]. The resulting data on protein conservation across a range of species provided scientific evidence to support a broader tDOA, moving beyond the limited species specifically cited in the original AOP-Wiki entry. This methodology demonstrates how bioinformatics can rapidly leverage existing protein sequence and structural knowledge to enhance the tDOA for Key Events, Key Event Relationships, and entire AOPs [5].

Experimental Protocols and Research Toolkit

Protocol for Cross-Species tDOA Analysis

Objective: To define the biologically plausible tDOA for an AOP beyond the empirically tested species.
Methodology: Integrated use of AOP-Wiki, AOP-DB, and bioinformatic tools.

AOP Selection and Characterization: Select a target AOP from the AOP-Wiki (https://aopwiki.org/). Identify all Key Events, particularly the Molecular Initiating Event, and note the listed empirical species supporting each event [5] [32].
Molecular Target Identification: For the MIE and molecular KEs, identify the specific proteins involved. The AOP-DB (https://www.epa.gov/healthresearch/adverse-outcome-pathway-database-aop-db) can be queried using AOP ID or name to retrieve associated gene identifiers (Entrez IDs), which may not be directly available in the AOP-Wiki [34].
Bioinformatic Analysis of Structural Conservation: Use the protein sequences from the primary species as queries in the SeqAPASS tool (https://seqapass.epa.gov/seqapass/). Perform a hierarchical analysis through all three levels to evaluate potential orthologs across a broad taxonomic range [5].
Data Integration and tDOA Definition: Integrate the results of the bioinformatic analysis with the existing empirical evidence from the AOP-Wiki. The biologically plausible tDOA can be expanded to include taxa where the relevant proteins show significant structural conservation at the sequence, domain, and critical residue levels [5].
Documentation: The expanded tDOA evidence, including SeqAPASS results, can be incorporated into the AOP-Wiki as lines of evidence for biological plausibility [5].
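
The data integration step can be sketched as a simple set operation that keeps empirical support and structural plausibility as separate lines of evidence. The example below is a hypothetical illustration: the `ConservationCall` record, the rule that all three levels must be conserved, and the placeholder species are assumptions, and the values are not real SeqAPASS output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConservationCall:
    species: str
    level1: bool  # primary sequence: putative ortholog found
    level2: bool  # functional domain conserved
    level3: bool  # critical residues conserved


def plausible_tdoa(empirical_species, calls):
    """Split species into empirically supported and structurally plausible sets."""
    empirical = set(empirical_species)
    plausible = {
        c.species
        for c in calls
        if c.level1 and c.level2 and c.level3 and c.species not in empirical
    }
    return {"empirical": sorted(empirical), "plausible": sorted(plausible)}


# Placeholder inputs for illustration only.
calls = [
    ConservationCall("Species A", True, True, True),   # also has empirical data
    ConservationCall("Species B", True, True, True),   # conserved at all levels
    ConservationCall("Species C", True, True, False),  # critical residue differs
]
domain = plausible_tdoa(["Species A"], calls)
print(domain)
```

Reporting the two sets separately preserves the distinction between species with toxicity data and species included on the basis of structural conservation alone.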

The Scientist's Toolkit for tDOA Research

Table 3: Essential Research Reagents and Resources for tDOA Investigation

Tool/Resource	Function in tDOA Research	Access Point
AOP-Wiki	Primary source for AOP structure, Key Events, and empirically documented species [7] [32].	https://aopwiki.org/
AOP-DB	Links AOP components to genes, chemicals, and diseases; enables computational queries and cross-species analysis via gene orthologs [34].	https://www.epa.gov/healthresearch/adverse-outcome-pathway-database-aop-db
SeqAPASS Tool	Bioinformatics tool that evaluates protein sequence and structural conservation across species to predict susceptibility and inform tDOA [5].	https://seqapass.epa.gov/seqapass/
AOP-KB	The overarching Adverse Outcome Pathway Knowledge Base, serving as a portal for various AOP tools and resources [34].	https://aopkb.oecd.org
DisGeNET	Database of gene-disease associations integrated into the AOP-DB, providing evidence for human health relevance of AOPs [34].	Via AOP-DB

The AOP-Wiki and AOP-DB are powerful, complementary resources for advancing tDOA knowledge and discovery. The AOP-Wiki excels as a collaborative platform for qualitative AOP development and weight-of-evidence assessment, capturing the biological narrative and empirical foundation of an AOP's taxonomic domain. The AOP-DB provides a robust computational infrastructure that transforms this narrative knowledge into structured, analyzable data, enabling sophisticated queries and cross-species extrapolation. For researchers aiming to define and expand the tDOA of an AOP, an integrated workflow that leverages the strengths of both platforms—combined with bioinformatic tools like SeqAPASS—represents the most effective strategy. This approach moves beyond a simple list of tested species to a mechanistically informed, biologically plausible taxonomic domain, thereby enhancing the utility and confidence in AOPs for chemical risk assessment across diverse species.

Overcoming Challenges: Strategies for Troubleshooting and Optimizing tDOA Determinations

Identifying and Addressing Knowledge Gaps in AOP Networks

Adverse Outcome Pathway (AOP) networks represent functional units of prediction in toxicology, providing a framework to organize mechanistic knowledge about how stressors cause adverse effects in biological systems [35] [36]. Unlike individual AOPs, which represent simplified linear sequences, AOP networks capture the complexity of real biological systems where multiple pathways interact through shared key events (KEs) [35] [32]. However, a significant challenge in AOP network application lies in identifying and addressing knowledge gaps, particularly when considering their applicability across different taxonomic domains. As the AOP framework gains traction for regulatory decision-making and chemical safety assessment, ensuring the completeness and taxonomic relevance of these networks becomes paramount [37] [36]. This guide compares current methodologies for knowledge gap identification and provides experimental approaches for addressing critical gaps in AOP network development.

Methodological Approaches for Knowledge Gap Identification

Comparative Analysis of Identification Methods

Table 1: Methodologies for Identifying Knowledge Gaps in AOP Networks

Method Type	Core Approach	Key Applications	Taxonomic Strengths	Primary Limitations
Structured Search & Data-Driven Workflows [37]	Automated extraction from AOP-KB using predefined search terms and computational filtering	Systematic mapping of existing knowledge; Identifying disconnected network components	Efficient screening of conserved pathways (e.g., EATS modalities across vertebrates)	Highly dependent on consistent KE nomenclature; May miss emerging pathways
Network Topology Analysis [35]	Application of graph theory to analyze KE connectivity and pathway redundancy	Identifying critical paths, bottlenecks, and isolated KEs	Reveals evolutionarily conserved versus taxon-specific network modules	Requires substantial existing network structure; Limited for nascent AOP networks
Weight of Evidence Evaluation [32]	Modified Bradford-Hill considerations assessing biological plausibility, empirical support, and essentiality	Prioritizing gaps based on confidence in existing KERs; Identifying weakly supported taxonomic extrapolations	Framework for evaluating taxonomic domain applicability of individual KERs	Labor-intensive; Subjective elements require expert judgment
Case Study-Driven Development [32]	Building networks from specific toxicological examples with known modes of action	Filling context-specific gaps for regulatory priorities; Testing taxonomic applicability in focused domains	Ground-truthing network predictions in specific model organisms	May not reveal broader taxonomic limitations; Potentially narrow focus

Experimental Protocols for Gap Identification

Protocol 1: Structured AOP-Wiki Interrogation for Taxonomic Gap Analysis [37]

Define Problem Formulation: Clearly articulate the taxonomic scope and adverse outcomes of interest (e.g., "thyroid disruption in aquatic vertebrates").
Develop Search Strategy: Formulate comprehensive search terms based on established taxonomic and biological parameters. Simplify complex syntax from regulatory documents for effective database queries.
Execute Automated Extraction: Utilize computational workflows (e.g., R scripts) to extract relevant AOPs, KEs, and KERs from the AOP-Wiki.
Apply Taxonomic Filtering: Manually curate results to exclude AOPs without relevance to target taxonomic groups, noting where taxonomic domain applicability is unspecified.
Visualize Network Structure: Generate network maps highlighting KEs with limited taxonomic support and disconnected network components.
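
Once KERs have been extracted, the last two steps of this protocol reduce to basic graph operations. The sketch below uses Python rather than the R scripts mentioned in the protocol, and it assumes the extraction has already produced a list of (upstream KE, downstream KE) pairs and a mapping from each KE to its annotated taxa. The KE identifiers and taxa are hypothetical.

```python
from collections import defaultdict


def find_gaps(kers, taxa_by_ke):
    """Return (connected components, KEs lacking taxonomic annotation)."""
    neighbours = defaultdict(set)
    for upstream, downstream in kers:
        neighbours[upstream].add(downstream)
        neighbours[downstream].add(upstream)
    nodes = set(neighbours) | set(taxa_by_ke)

    components, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first walk collecting every KE reachable from this start node.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(sorted(component))

    unannotated = sorted(ke for ke in nodes if not taxa_by_ke.get(ke))
    return components, unannotated


# Hypothetical extract: two sub-networks plus one isolated KE.
kers = [("KE1", "KE2"), ("KE2", "KE3"), ("KE4", "KE5")]
taxa_by_ke = {
    "KE1": {"fish", "amphibian"},
    "KE2": {"fish"},
    "KE3": set(),
    "KE4": {"mammal"},
    "KE6": {"fish"},
}
components, unannotated = find_gaps(kers, taxa_by_ke)
print(components)
print(unannotated)
```

More than one component indicates disconnected parts of the network, and the unannotated list flags KEs whose taxonomic applicability is unspecified.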

Protocol 2: Weight of Evidence Assessment for Taxonomic Extrapolation [32]

Categorize Empirical Support: For each Key Event Relationship (KER), classify evidence as strong (multiple taxonomic groups), moderate (limited taxonomic groups), or weak (single species).
Evaluate Essentiality: Determine if experimental evidence demonstrates that preventing an upstream KE blocks downstream events across multiple species.
Assess Biological Plausibility: Evaluate conservation of biological pathways across taxonomic groups using genomic and functional data.
Quantify Confidence: Score confidence in each KER's taxonomic applicability as high, moderate, or low based on cumulative evidence.
Identify Critical Gaps: Prioritize KERs with low confidence scores for further experimental validation across taxonomic domains.
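
The scoring logic of this protocol can be made explicit in a few lines. The thresholds and the combination rule below are assumptions introduced for illustration (for example, treating three or more taxonomic groups as "multiple"); the protocol itself leaves these judgments to expert evaluation, and the KER names are hypothetical.

```python
def empirical_support(n_groups: int, n_species: int) -> str:
    """Classify empirical support; thresholds are illustrative assumptions."""
    if n_species <= 1:
        return "weak"
    return "strong" if n_groups >= 3 else "moderate"


def ker_confidence(n_groups, n_species, essentiality_shown, pathway_conserved):
    """Combine the three evidence lines into a high/moderate/low score."""
    support = empirical_support(n_groups, n_species)
    if support == "strong" and essentiality_shown and pathway_conserved:
        return "high"
    if support == "weak" or not (essentiality_shown or pathway_conserved):
        return "low"
    return "moderate"


# Hypothetical KERs: (taxonomic groups, species, essentiality shown, pathway conserved)
evidence = {
    "KER-A": (3, 5, True, True),
    "KER-B": (2, 3, True, False),
    "KER-C": (1, 1, True, True),
}
scores = {ker: ker_confidence(*args) for ker, args in evidence.items()}
critical_gaps = sorted(ker for ker, score in scores.items() if score == "low")
print(scores)
print(critical_gaps)
```

Encoding the rule this way does not remove the subjectivity, but it makes the criteria behind each score visible and repeatable.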

Visualizing Knowledge Gaps and Taxonomic Applicability in AOP Networks

Workflow for Taxonomic Gap Analysis in AOP Networks

AOP Network Structure with Knowledge Gap Highlighting

Experimental Approaches for Addressing Knowledge Gaps

Comparative Experimental Strategies

Table 2: Experimental Approaches for Filling AOP Network Knowledge Gaps

Experimental Approach	Primary Application	Key Outputs	Taxonomic Domain Utility	Implementation Complexity
High-Throughput In Vitro Screening [36]	Identifying novel MIEs and early KEs across chemical classes	Potential MIEs for untested pathways; Quantitative response data	Efficient for conserved molecular targets; Limited for taxon-specific physiology	Moderate (requires specialized screening facilities)
Cross-Species Comparative Toxicology [32]	Testing KER conservation across evolutionary lineages	Taxonomic domain applicability boundaries; Species-sensitive KEs	Directly addresses taxonomic gaps; Identifies appropriate model organisms	High (multiple species husbandry and testing)
'Omics Profiling [37]	Uncovering novel pathway connections and intermediate KEs	Candidate KEs for hypothesis generation; Network refinement	Reveals evolutionary conservation of pathway components	Moderate to High (bioinformatics expertise required)
Computational Sequence-Based Conservation Analysis [36]	Predicting taxonomic applicability of MIEs	Taxonomic applicability domains; Testable conservation hypotheses	Efficient preliminary assessment; Guides targeted experimental validation	Low to Moderate (leverages existing genomic databases)

Detailed Experimental Protocol: Cross-Taxa KER Validation

Protocol 3: Empirical Testing of Key Event Relationships Across Taxonomic Groups [32]

Select Focal KER: Choose a KER with limited taxonomic support from the gap analysis.
Design Cross-Taxa Test System: Identify representative species from at least three evolutionary lineages (e.g., fish, amphibian, mammalian models).
Establish Dosimetry: Determine exposure concentrations that produce comparable internal doses relative to the KE of interest across test species.
Implement KE Measurement: Apply consistent methodological approaches for quantifying upstream and downstream KEs across all test systems.
Analyze Response Concordance: Evaluate whether the KER demonstrates consistent response patterns across taxonomic groups.
Refine AOP Network: Incorporate findings into AOP network, documenting taxonomic domains where KER is operative versus inoperative.
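The concordance step of Protocol 3 can be illustrated with a minimal Python sketch. It assumes each taxon has paired upstream and downstream KE measurements across the same exposure levels, and it uses a Pearson correlation with an arbitrary cut-off as a stand-in for whatever concordance analysis a given study adopts. The function names, the 0.8 threshold, and the data are hypothetical and do not come from the cited protocol.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences.

    Assumes both sequences vary (non-zero variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ker_concordance(measurements, min_r=0.8):
    """Label the KER per taxon from paired (upstream, downstream) KE data.

    min_r is an illustrative threshold, not a recommended value.
    """
    labels = {}
    for taxon, (upstream, downstream) in measurements.items():
        r = pearson_r(upstream, downstream)
        labels[taxon] = "operative" if r >= min_r else "inoperative"
    return labels


# Hypothetical KE measurements at four matched internal doses per taxon.
measurements = {
    "fish": ([0.0, 1.0, 2.0, 3.0], [0.1, 1.1, 1.9, 3.2]),
    "amphibian": ([0.0, 1.0, 2.0, 3.0], [0.0, 0.9, 2.1, 2.8]),
    "mammal": ([0.0, 1.0, 2.0, 3.0], [1.0, 0.2, 1.1, 0.3]),
}
labels = ker_concordance(measurements)
print(labels)
```

In practice, a label of "inoperative" from so few data points would prompt further testing rather than a firm conclusion about the taxonomic boundary.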

The Scientist's Toolkit: Essential Research Reagents and Platforms

Table 3: Research Reagent Solutions for AOP Network Development

Tool/Reagent Category	Specific Examples	Primary Function in AOP Research	Taxonomic Applicability Considerations
AOP Knowledgebase Platforms [35] [37]	AOP-Wiki (aopwiki.org), AOP-KB	Centralized repository for AOP components; Facilitates network assembly and gap identification	Contains limited taxonomic annotation; Variable coverage across species
Computational Conservation Analysis Tools [36]	SeqAPASS, Ortholog Databases	Predict taxonomic applicability of MIEs based on sequence and structural conservation	Essential for extrapolating molecular initiating events across species
High-Throughput Screening Platforms [36]	ToxCast/Tox21 Assay Battery, High-Content Screening	Efficient identification of potential MIEs and chemical bioactivity	Often limited to human/mammalian molecular targets; Taxonomic coverage expanding
Cross-Species Biomarker Panels [32]	Conserved Pathway PCR Arrays, Cross-Reactive Antibodies	Consistent KE measurement across taxonomic groups in comparative studies	Requires validation for each species; Conservation of epitopes variable
AOP Network Visualization Software [37]	Cytoscape, R-based Network Packages	Analysis of network topology and identification of structural gaps	Platform-independent; Customizable for taxonomic annotation layers

Identifying and addressing knowledge gaps in AOP networks requires a systematic approach combining computational analysis, structured evidence evaluation, and targeted experimental validation. The methodologies compared in this guide demonstrate that effective gap analysis must explicitly consider taxonomic domain applicability to enhance the predictive power of AOP networks in ecological and human health risk assessment. As the AOP framework continues to evolve, the integration of data-driven network generation [37] with rigorous weight-of-evidence assessment [32] will be essential for creating taxonomically robust networks that can reliably support regulatory decision-making across species boundaries.

Best Practices for Integrating Diverse Lines of Evidence (in silico, in vitro, in vivo)

In modern toxicology and drug development, the integration of diverse lines of evidence—in silico (computational), in vitro (cell-based), and in vivo (whole organism)—has become critical for comprehensive risk assessment and chemical safety evaluation. This approach is particularly vital for defining the taxonomic domain of applicability (tDOA) in Adverse Outcome Pathway (AOP) research, which determines the biological plausibility of pathways across different species. The need for robust integration frameworks is amplified by the recognition that humans and environmental species are exposed to complex chemical mixtures rather than single substances, requiring sophisticated methods to understand combined effects [38]. This guide examines best practices for combining these evidence streams, providing researchers with methodologies to enhance predictive accuracy and regulatory relevance.

Fundamental Concepts and Definitions

Evidence Streams in Toxicology

In silico: Computational approaches that utilize bioinformatics tools, structure-activity relationships, and mathematical modeling to predict chemical behavior and biological activity. These methods leverage existing protein sequence and structural knowledge to extrapolate findings across species [5].
In vitro: Laboratory-based experiments using cell cultures, tissue samples, or isolated biological components to study chemical effects under controlled conditions. Recent advances include organs-on-a-chip and 3D cell culture systems that better mimic in vivo conditions [38].
In vivo: Whole organism studies that provide information on complex biological responses, system-level interactions, and apical endpoints relevant to risk assessment.

The Adverse Outcome Pathway (AOP) Framework

The AOP framework organizes existing knowledge into a structured paradigm that describes causal linkages between a Molecular Initiating Event (MIE) and an Adverse Outcome (AO) through measurable Key Events (KEs) at different biological levels [5]. Defining the tDOA—the range of species for which an AOP is applicable—requires demonstrating conservation of both structure and function across taxonomic groups [5].

Experimental Protocols and Methodologies

Bioinformatics Approaches for Taxonomic Domain Applicability

Protocol for Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) [5]:

Step 1: Identify query proteins involved in the AOP (e.g., nicotinic acetylcholine receptors for neurotoxicity AOPs)
Step 2: Perform Level 1 analysis comparing primary amino acid sequences to identify orthologs across species
Step 3: Conduct Level 2 evaluation assessing conservation of functional domains
Step 4: Execute Level 3 analysis comparing critical amino acid residues important for protein-ligand interactions
Step 5: Integrate results with empirical toxicity data to define the biologically plausible tDOA

This hierarchical approach provides evidence for structural conservation of KEs and KE relationships across species, enhancing confidence in AOP applicability beyond tested organisms [5].
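The idea behind the Level 3 step can be sketched in a few lines of Python. This is not SeqAPASS code: it assumes the query and target sequences have already been aligned, it treats only exact identity as conservation (a simplification of whatever similarity criteria the tool applies), and the sequence fragments and positions are invented for illustration.

```python
def compare_critical_residues(query_aligned, target_aligned, critical_positions):
    """Compare residues at critical alignment columns (0-based indices).

    Returns {position: (query_residue, target_residue, is_identical)}.
    """
    report = {}
    for pos in critical_positions:
        q = query_aligned[pos]
        t = target_aligned[pos]
        report[pos] = (q, t, q == t)
    return report


def all_conserved(report):
    """True only if every critical residue is identical in the target."""
    return all(identical for _, _, identical in report.values())


# Hypothetical pre-aligned fragments; '-' marks an alignment gap.
query = "MKTWYLCA-GE"
target_a = "MKSWYLCAAGE"
target_b = "MKTFYLCA-GE"
critical = [3, 4, 6]  # hypothetical ligand-contact columns

report_a = compare_critical_residues(query, target_a, critical)
report_b = compare_critical_residues(query, target_b, critical)
print(all_conserved(report_a), all_conserved(report_b))
```

Note that target_a differs from the query outside the critical columns yet still passes, which is the point of evaluating individual residues separately from overall sequence similarity.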

Integrated Drug Discovery Platform

Protocol for Acute Myeloid Leukemia (AML) Drug Discovery [39]:

In silico phase:
- Process drug treatment profiles (DTPs) from databases like Connectivity Map (CMap)
- Calculate Drug Regulatory Scores (DRS) measuring similarity between drug-induced cell line and patient tumor gene expression profiles
- Correlate DRS with in vitro pharmacological activity metrics
In vitro phase:
- Validate predictions using blood-derived cell lines
- Measure IC50 values across multiple cell lines
- Correlate computational predictions with molecular features
In vivo phase:
- Administer candidate drugs to AML mouse models
- Measure tumor volume using the formula length × width × thickness × 0.5, and compare volumes to quantify reduction
- Evaluate pharmacological activity through tumor growth metrics

This integrated platform demonstrated that DRS scores highly correlated with in vitro metrics of pharmacological activity, and subsequent in vivo validation showed significant tumor growth inhibition for predicted candidates [39].
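As a small worked example of the in vivo measurement step, the sketch below applies the tumor volume formula from the protocol. The percent-change helper is an added convenience rather than a metric taken from the cited study, and the caliper readings are hypothetical.

```python
def tumor_volume(length_mm, width_mm, thickness_mm):
    """Tumor volume (mm^3) as length x width x thickness x 0.5."""
    return length_mm * width_mm * thickness_mm * 0.5


def percent_volume_change(baseline_volume, final_volume):
    """Percent change relative to baseline; negative values mean shrinkage."""
    return (final_volume - baseline_volume) / baseline_volume * 100.0


# Hypothetical caliper measurements before and after treatment.
baseline = tumor_volume(10.0, 8.0, 6.0)
final = tumor_volume(8.0, 6.0, 5.0)
change = percent_volume_change(baseline, final)
print(baseline, final, change)
```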

Statistical Validation Methods

Protocol for Comparative Data Analysis [40]:

Hypothesis formulation: Establish null hypothesis (H0: no difference between means) and alternative hypothesis (H1: means are significantly different)
F-test implementation: Compare variances between datasets before conducting t-tests, since the pooled-variance t-test below assumes the two variances are comparable
- Calculate F value using formula: F = s₁²/s₂² (where s₁² ≥ s₂²)
- Use α = 0.05 as significance level
t-test execution:
- Apply formula: t = (x̄₁ - x̄₂)/(s√((1/n₁)+(1/n₂))) where s = √(((n₁-1)s₁² + (n₂-1)s₂²)/(n₁+n₂-2))
- Compare calculated t-value to critical value from t-distribution tables
- Consider P-value < 0.05 as statistically significant
Interpretation: Reject null hypothesis when |t-statistic| > t-critical value, indicating significant differences between experimental results
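The two formulas in this protocol can be sketched with the Python standard library alone. The sketch computes the test statistics and leaves the critical value as an input, to be looked up in a table as the protocol describes. The function names and the two small datasets are hypothetical, and the pooled formula is appropriate only when the F-test gives no reason to doubt equal variances.

```python
from math import sqrt
from statistics import mean, variance


def f_statistic(sample_1, sample_2):
    """F = s1^2 / s2^2 with the larger sample variance in the numerator."""
    v1, v2 = variance(sample_1), variance(sample_2)
    return max(v1, v2) / min(v1, v2)


def pooled_t_statistic(sample_1, sample_2):
    """Two-sample t statistic using the pooled standard deviation s."""
    n1, n2 = len(sample_1), len(sample_2)
    s_pooled = sqrt(
        ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2))
        / (n1 + n2 - 2)
    )
    return (mean(sample_1) - mean(sample_2)) / (s_pooled * sqrt(1 / n1 + 1 / n2))


def reject_null(t_value, t_critical):
    """Reject H0 when |t| exceeds the tabulated critical value."""
    return abs(t_value) > t_critical


# Hypothetical replicate measurements from two experiments.
sample_1 = [10.1, 10.3, 9.9, 10.2, 10.0]
sample_2 = [10.6, 10.8, 10.5, 10.9, 10.7]

f_value = f_statistic(sample_1, sample_2)
t_value = pooled_t_statistic(sample_1, sample_2)
# Two-tailed critical t for alpha = 0.05 and 8 degrees of freedom.
decision = reject_null(t_value, t_critical=2.306)
print(f_value, t_value, decision)
```

For these data the variances are equal, so pooling is reasonable, and the t statistic of about -6 exceeds the critical value in magnitude.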

Comparative Analysis of Evidence Streams

Table 1: Strengths and Limitations of Evidence Streams in Toxicological Research

Evidence Stream	Key Strengths	Major Limitations	Primary Applications
In silico	Rapid screening capability; Cost-effective; Species extrapolation; Ethical advantages [5] [39]	Limited biological complexity; Dependent on quality of input data; May lack physiological context [38]	Priority setting; Initial hazard assessment; Taxonomic domain applicability analysis [5]
In vitro	Controlled environment; Mechanistic insights; High-throughput capability; Reduced animal use [38]	May not reflect whole-organism responses; Limited metabolic competence; Absence of integrated physiology [38]	Mechanism of action studies; High-throughput screening; Pathway-based assessment [39]
In vivo	Whole-organism relevance; Complex system responses; Apical endpoint assessment; Regulatory acceptance [38]	Ethical concerns; High cost and time requirements; Species extrapolation uncertainties [5] [38]	Hazard identification; Risk assessment; Regulatory decision-making [38]

Table 2: Quantitative Comparison of Methodological Attributes

Methodological Attribute	In silico Approaches	In vitro Systems	In vivo Models
Throughput	High (1000s compounds/week) [39]	Medium-High (100s compounds/week) [38]	Low (weeks-months per study) [38]
Cost per Compound	Low (~$100-500)	Medium (~$1,000-5,000)	High (>$50,000)
Species Applicability	Broad (multiple species via bioinformatics) [5]	Limited (specific cell types)	Restricted (model organisms)
Regulatory Acceptance	Growing (weight-of-evidence) [5]	Increasing (for specific endpoints)	Established (gold standard) [38]
Metabolic Competence	Simulated (computational metabolism)	Limited (may require S9 fraction)	Complete (intact systems)

Visualization of Workflows and Pathways

Integrated Evidence Workflow

Figure: Integrated Evidence Workflow for AOP Development

AOP-Based Taxonomic Domain Assessment

Figure: AOP Framework with Taxonomic Domain Assessment

Research Reagent Solutions Toolkit

Table 3: Essential Research Tools for Integrated Evidence Generation

Tool/Reagent	Function	Application Context	Specific Example
SeqAPASS Tool	Bioinformatics platform for cross-species protein sequence and structural comparison [5]	Defining taxonomic domain of applicability for AOPs	Assessing conservation of nicotinic acetylcholine receptors across bee species [5]
Connectivity Map (CMap)	Database of drug-induced gene expression profiles [39]	In silico drug discovery and repurposing	Identifying potential AML therapeutics based on gene expression similarity [39]
3D Cell Culture Systems	Advanced in vitro models that better mimic tissue architecture [38]	Mechanistic studies and toxicity screening	Improved prediction of in vivo responses for chemical mixtures
Pasco Spectrometer	Instrument for measuring absorbance in chemical solutions [40]	Quantitative analysis in experimental chemistry	Determining concentration of Brilliant Blue FCF solutions [40]
Organs-on-a-Chip	Microfluidic devices simulating human organ functions [38]	Intermediate between in vitro and in vivo testing	Assessing compound effects on tissue-level functions without animal use
XLMiner ToolPak	Statistical analysis add-on for Google Sheets [40]	Data analysis and hypothesis testing	Performing t-tests and F-tests for experimental data comparison [40]

Application to Chemical Mixtures and AOP Development

The integration of diverse evidence streams becomes particularly crucial when assessing chemical mixtures, which represent real-world exposure scenarios but present significant methodological challenges [38]. Two primary approaches have emerged:

Whole-mixture approach: Assesses toxicity of mixtures as complete entities, advantageous for environmental samples of unknown composition but limited in identifying specific drivers of toxicity [38].
Component-based approach: Predicts joint effects based on individual chemical information, relying on concepts of additivity (dose addition for similar mode of action; response addition for dissimilar modes) [38].
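The two additivity concepts named in the component-based approach have standard textbook forms, sketched below in Python. Dose (concentration) addition predicts a mixture EC50 from the component EC50s weighted by their fractions in the mixture, while response addition (independent action) combines the fractional effects of the components as independent probabilities. The function names and the numbers are illustrative assumptions and are not drawn from the cited source.

```python
from math import prod


def ec50_mixture_dose_addition(fractions, ec50s):
    """Dose addition: EC50_mix = 1 / sum(p_i / EC50_i).

    fractions are the relative proportions p_i of each component
    in the mixture and must sum to 1.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("mixture fractions must sum to 1")
    return 1.0 / sum(p / ec50 for p, ec50 in zip(fractions, ec50s))


def effect_response_addition(effects):
    """Response addition: E_mix = 1 - product(1 - E_i).

    effects are fractional effects (0 to 1) of each component alone.
    """
    return 1.0 - prod(1.0 - e for e in effects)


# Hypothetical two-component mixture.
ec50_mix = ec50_mixture_dose_addition([0.5, 0.5], [10.0, 40.0])
effect_mix = effect_response_addition([0.2, 0.5])
print(ec50_mix, effect_mix)
```

With equal fractions and component EC50s of 10 and 40, the predicted mixture EC50 is 16, and two components causing 20% and 50% effect alone are predicted to cause 60% together under response addition.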

For AOP development, bioinformatics tools like SeqAPASS provide critical evidence for taxonomic domain applicability by demonstrating structural conservation of key events and their relationships across species [5]. This approach was successfully applied to an AOP linking activation of nicotinic acetylcholine receptors to colony death in honey bees, demonstrating potential applicability to non-Apis bees through computational analysis of protein conservation [5].

The integration of separate lines of evidence follows a weight-of-evidence framework that considers both empirical support and biological plausibility, particularly for key event relationships in AOP development [5]. Modern methodologies including omics technologies, advanced in vitro systems, and computational models collectively improve understanding of toxicity pathways and enable better prediction of risks from chemical exposures [38].

Optimizing the Use of Computational Tools to Overcome Limited Empirical Data

The Adverse Outcome Pathway (AOP) framework organizes existing biological knowledge to illustrate causal linkages from a Molecular Initiating Event (MIE) to an Adverse Outcome (AO) at a level relevant for risk assessment [6]. A critical yet often underdefined component of any AOP is its Taxonomic Domain of Applicability (tDOA)—the range of species for which the AOP is biologically plausible [6]. Traditionally, the tDOA is narrowly defined, limited to the specific species used in the empirical studies that informed the AOP's development. This poses a significant challenge for regulatory decision-making, particularly when considering the protection of untested species [6]. Defining the tDOA with greater confidence is essential for the reliable application of AOPs beyond their initial context.

The primary barriers to defining the tDOA are the scarcity of empirical data across species and the high cost of generating them. Relying solely on traditional toxicological testing for thousands of potential species is impractical. Consequently, there is a pressing need for robust computational approaches that can extrapolate existing knowledge to untested species. As highlighted in a case study on an AOP linking nicotinic acetylcholine receptor (nAChR) activation to colony death in honey bees, bioinformatics tools can provide lines of evidence for structural and functional conservation of Key Events (KEs) across species, thereby expanding the plausible tDOA [6]. This article explores how such computational tools help overcome the limitations of sparse empirical data.

Comparative Analysis of Computational Tools for tDOA Definition

Several computational tools and methods are available to aid in the definition of a tDOA. These tools leverage different types of data and algorithms, each with distinct strengths and applications in AOP development. The following table provides a structured comparison of these key approaches.

Table 1: Comparison of Computational Tools for Defining Taxonomic Domain of Applicability

Tool/Method	Primary Function	Data Inputs	Key Outputs	Advantages	Limitations
SeqAPASS [6]	Evaluates cross-species protein sequence and structural similarity.	Protein sequences, functional domains, critical amino acid residues.	Evidence for structural conservation of molecular initiating events and key events.	Publicly available, hierarchical analysis (sequence -> domain -> residue).	Provides evidence for structural, but not necessarily functional, conservation.
Statistical & Machine Learning Models [41]	Identifies patterns and predicts outcomes from complex datasets.	Toxicity data, chemical properties, biological traits.	Predictive models of susceptibility, clustering of species sensitivity.	Can handle large, complex data; provides probabilistic forecasts.	Reliant on availability of high-quality training data; may lack biological mechanistic insight.
AOP-Knowledgebase (AOP-Wiki)	Central repository for collaborative AOP development.	Published AOPs, key event relationships, empirical support.	Structured AOP information, including proposed tDOA.	Framework for organizing and sharing evidence; supports weight-of-evidence assessment.	Dependent on manual curation; tDOA is often not comprehensively defined.

The integration of these tools is often more powerful than any single approach. For instance, SeqAPASS can provide evidence for the structural conservation of a protein target across a wide range of species. This evidence can then be combined with empirical data from a few representative species and statistical models to build a case for functional conservation, thereby creating a more confident and expanded tDOA [6]. This integrated methodology is crucial for making plausible inferences about species for which no empirical data exists.

Experimental Protocol: Using SeqAPASS to Define tDOA

The following section details the methodology for applying the SeqAPASS tool, a publicly available bioinformatics resource, to investigate the tDOA of an AOP. The protocol is based on the case study defining the tDOA for the nAChR activation AOP [6].

Detailed Step-by-Step Methodology

AOP Selection and KE Identification: Select a well-defined AOP from the AOP-Wiki. For the case study, AOP 89 (nAChR Activation Leading to Colony Death/Failure) was selected. The specific KEs requiring evaluation were identified, starting with the MIE (nAChR activation) [6].
Primary Protein Sequence Acquisition: Obtain the full-length amino acid sequence of the protein(s) involved in the MIE or other molecular-level KEs from a trusted database such as UniProt. For the nAChR case study, this involved retrieving sequences for nAChR subunits from the primary species of interest, Apis mellifera (honey bee) [6].
SeqAPASS Level 1 Analysis (Sequence Similarity):
- Input: The primary amino acid sequence from the previous step.
- Process: The tool performs a BLAST analysis against the National Center for Biotechnology Information (NCBI) non-redundant protein sequence database.
- Output: A list of potential orthologs across other species, based on sequence similarity. This provides an initial, broad estimate of the tDOA for the molecular entity [6].
SeqAPASS Level 2 Analysis (Functional Domain Conservation):
- Input: The primary sequence and identification of known functional domains (e.g., from Pfam).
- Process: SeqAPASS evaluates the conservation of these specific functional domains across the orthologs identified in Level 1.
- Output: Evidence indicating whether the critical functional regions of the protein are preserved in other species, strengthening the case for conserved function [6].
SeqAPASS Level 3 Analysis (Critical Residue Conservation):
- Input: Information on specific amino acid residues known to be critical for protein-ligand interaction, protein-protein interaction, or overall function. For nAChR, this included residues critical for neonicotinoid binding.
- Process: The tool assesses the conservation of these specific residues across the orthologs.
- Output: High-confidence evidence for whether the molecular interaction described in the MIE is likely to be conserved in a given species. A species possessing the critical residues is considered structurally susceptible [6].
Data Integration and tDOA Postulation: The results from all three levels of SeqAPASS analysis are synthesized. This bioinformatics evidence, when combined with any available empirical toxicity data, is used to define the biologically plausible tDOA for the KE, Key Event Relationships (KERs), and the overall AOP [6].
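The synthesis step can be pictured as a simple tiering of the three levels of evidence. The Python sketch below is one hypothetical way to record per-species results and turn them into a qualitative label. The class, the tier wording, and the example calls are assumptions made for illustration and do not reproduce SeqAPASS output or the case study's conclusions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeqapassSummary:
    """Hypothetical per-species record of the three levels of evidence."""
    species: str
    level1_ortholog: bool
    level2_domain_conserved: bool
    level3_residues_conserved: Optional[bool] = None  # None = not evaluated


def structural_evidence_tier(summary):
    """Map the hierarchical results onto a qualitative evidence label."""
    if not summary.level1_ortholog:
        return "no ortholog identified"
    if not summary.level2_domain_conserved:
        return "ortholog only"
    if summary.level3_residues_conserved is None:
        return "domain conserved, residues not evaluated"
    if summary.level3_residues_conserved:
        return "critical residues conserved"
    return "domain conserved, critical residues differ"


# Hypothetical results for illustration only.
records = [
    SeqapassSummary("Species A", True, True, True),
    SeqapassSummary("Species B", True, True, None),
    SeqapassSummary("Species C", True, False),
    SeqapassSummary("Species D", False, False),
]
tiers = {r.species: structural_evidence_tier(r) for r in records}
print(tiers)
```

Labels of this kind describe structural evidence only. They would still need to be weighed against any available empirical toxicity data before a tDOA is proposed.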

Workflow Visualization

The experimental protocol for using the SeqAPASS tool, from AOP selection to tDOA definition, is summarized in the following workflow diagram.

To conduct a computational analysis of an AOP's tDOA, researchers require access to specific digital resources and tools. The following table details these essential "research reagents" and their functions in the context of AOP development.

Table 2: Key Research Reagent Solutions for Computational tDOA Analysis

Tool / Resource	Category	Primary Function in tDOA Analysis
SeqAPASS Tool [6]	Bioinformatics Tool	Provides a hierarchical framework (sequence, domain, residue) to evaluate structural conservation of molecular targets across species.
AOP-Wiki	Knowledge Repository	Serves as the central repository for AOP information, providing the structured description of KEs and KERs to be evaluated.
UniProt Database	Protein Database	Provides curated, high-confidence protein sequences necessary for the initial SeqAPASS analysis.
NCBI Non-Redundant Database	Sequence Database	Serves as the comprehensive source of protein sequences from diverse taxa for cross-species comparison in SeqAPASS.
Pfam / InterPro	Protein Family Database	Provides annotations for functional domains, which are critical for the SeqAPASS Level 2 analysis of domain conservation.

The effective use of these resources requires a multidisciplinary skillset, combining knowledge in toxicology, molecular biology, and bioinformatics. The SeqAPASS tool itself is designed to be accessible to scientists without deep computational expertise, bridging the gap between traditional toxicology and modern data science [6].

Visualizing AOP Structure and Cross-Species Conservation

A core component of AOP development is mapping the causal pathway from the MIE to the AO. The following diagram illustrates the structure of AOP 89, which was the subject of the nAChR tDOA case study. This visualization helps in understanding the biological scope that computational tools aim to extrapolate.

The application of a tool like SeqAPASS focuses primarily on establishing the conservation of the MIE. The hierarchical logic used to extrapolate the tDOA for this MIE across different taxonomic groups is summarized in the following diagram.

The integration of computational tools like SeqAPASS into the AOP framework represents a paradigm shift in toxicology and risk assessment. By leveraging publicly available bioinformatics data, these tools provide a systematic and scientifically rigorous means to overcome the critical limitation of sparse empirical data [6]. They enable researchers to move beyond a narrow tDOA limited to empirically tested species toward a broader, biologically plausible tDOA, thereby increasing the confidence and utility of AOPs for protecting a wider range of species. As the field of data science continues to evolve, combining the statistical guarantees of traditional statistics with the computational power of modern machine learning will further enhance our ability to predict and define taxonomic applicability, making AOPs an even more powerful tool in regulatory science [41].

Navigating the Complexities of Cross-Vertebrate and Cross-Species Extrapolation

Cross-species extrapolation of biological data serves as a critical cornerstone in both biomedical research and environmental safety assessment. In drug development, this approach helps translate findings from preclinical models to human patients, while in environmental toxicology, it enables prediction of pharmaceutical risks to wildlife species based on known mammalian data [42]. The fundamental challenge lies in accurately predicting biological effects across the vast taxonomic diversity of species potentially exposed to chemicals, particularly when traditional toxicity testing on every species is impossible, ethically questionable, and resource-prohibitive [5] [3].

The concept of Taxonomic Domain of Applicability (tDOA) within the Adverse Outcome Pathway (AOP) framework has emerged as a pivotal construct in this field. The tDOA defines the taxonomic boundaries within which a defined pathway of toxicity (from molecular initiating event to adverse outcome) is biologically plausible [5]. Accurately defining the tDOA is therefore essential for regulatory decision-making, especially when considering protection of untested species [5]. This guide systematically compares the experimental and computational methodologies advancing this complex scientific frontier.

Foundational Concepts: AOPs and the Taxonomic Domain of Applicability

An Adverse Outcome Pathway (AOP) is a structured framework that organizes existing knowledge to describe causal linkages between a Molecular Initiating Event (MIE; e.g., a drug binding to its protein target) and an Adverse Outcome (AO; e.g., population-level effect) through a series of intermediate Key Events (KEs) [5]. The Taxonomic Domain of Applicability (tDOA) is a formal element of an AOP that defines the species for which the described causal pathway is valid [5].

The biological justification for extrapolation across species rests on the principle of conserved biology. Two primary elements are considered when defining the tDOA:

Structural Conservation: The presence and similarity of biological structures (e.g., genes, proteins, tissues) across species [5].
Functional Conservation: The preservation of biological function (e.g., protein activity, pathway response) across species [5].

The strength of evidence supporting the tDOA determines confidence in using the AOP for predictions in untested species, moving beyond assumptions based solely on taxonomic relatedness [3].

Comparative Analysis of Cross-Species Extrapolation Methodologies

A range of complementary methodologies has been developed to support cross-species extrapolation. The table below summarizes their core applications, advantages, and limitations.

Table 1: Comparison of Cross-Species Extrapolation Methodologies

Methodology	Primary Application	Key Advantages	Inherent Limitations
Biological Read-Across [42]	Using mammalian pharmacological/toxicological data to inform wildlife toxicity predictions.	Leverages existing rich datasets from drug development; practical for prioritization.	Requires understanding of functional target conservation; potential oversimplification.
SeqAPASS Tool [5] [3]	Bioinformatics tool predicting protein structural conservation and potential chemical susceptibility across species.	Publicly accessible; uses available protein sequence databases; hierarchical evaluation (sequence, domain, residue).	Primarily provides evidence for structural conservation; functional data needed for full confidence.
G2P-SCAN Tool [3]	Computational tool inferring biological pathway conservation across 7 model species.	Provides pathway-level context; helps infer functional implications.	Limited to a predefined set of species; dependent on quality of pathway annotations.
Empirical Toxicity Testing [42]	Direct measurement of apical effects (growth, reproduction, survival) in standard model species.	Provides direct empirical evidence; regulatory acceptance.	Resource-intensive, time-consuming, raises ethical concerns; impossible for all species.
Combined NAMs Approach [3]	Integrated use of SeqAPASS, G2P-SCAN, and other NAMs for WoE assessment.	Synergistic strengths; enhances confidence; supports NGRA.	Requires expert interpretation; still developing regulatory acceptance.

Performance Benchmarking of Computational Workflows

Rigorous benchmarking is essential for evaluating computational methods. The guiding principles for such benchmarking include defining a clear purpose and scope, comprehensive selection of methods and datasets, use of appropriate performance metrics, and ensuring reproducible research practices [43]. The selection of evaluation metrics is particularly critical, as different metrics (e.g., Accuracy, F-measure, Area Under the ROC Curve) capture distinct aspects of performance and can lead to different conclusions about method efficacy [44].
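A short Python sketch shows why the choice of metric matters. When most species in a benchmark are truly non-susceptible, a method can score high accuracy while its F-measure stays low. The confusion-matrix counts below are invented for illustration and do not come from the cited benchmarking studies.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical benchmark: 100 species, only 10 truly susceptible.
tp, fp, tn, fn = 5, 5, 85, 5
acc = accuracy(tp, fp, tn, fn)
f1 = f_measure(tp, fp, fn)
print(acc, f1)
```

Here the accuracy is 0.9 but the F-measure is only 0.5, because half of the susceptible species are missed and half of the positive calls are wrong.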

Table 2: Experimental Evidence Supporting Cross-Species Predictions for Pharmaceuticals

Pharmaceutical Class	Biological Target	Evidence of Cross-Species Effect	Key Supporting Experimental Data
Antidepressants [42]	Central Nervous System Targets (e.g., serotonin transporter)	Behavioral and neurochemical changes in fish analogous to mammalian effects.	In vitro binding assays showing conserved target affinity; measured behavioral changes in fish exposed to environmentally relevant concentrations.
5α-Reductase Inhibitors [42]	Androgen pathway enzymes	Disruption of sexual development in fish.	In vitro assays showing inhibition of fish 5α-reductase; vitellogenin induction and histopathological changes in fish gonads.
Nicotinic Acetylcholine Receptor Agonists [5]	nAChR	Neurotoxicity in Apis mellifera and other insect species.	Radioligand binding assays confirming receptor binding; sub-lethal effects on foraging and colony performance in bee studies.

Detailed Experimental Protocols for Cross-Species Investigation

Protocol: Defining tDOA Using Bioinformatics (SeqAPASS)

This protocol outlines the process for using the SeqAPASS tool to provide evidence for the structural conservation of a Key Event (e.g., a protein target) across species, thereby informing the tDOA of an AOP [5].

Protein Identification: Identify the specific protein(s) involved in the Molecular Initiating Event or other Key Events of the AOP.
Sequence Acquisition: Obtain the primary amino acid sequence(s) for the query protein(s) from a trusted database (e.g., UniProt).
SeqAPASS Analysis:
- Level 1 (Sequence Comparison): Input the query sequence into SeqAPASS to identify potential orthologs across species based on overall sequence similarity.
- Level 2 (Domain Conservation): Evaluate the conservation of specific functional domains within the protein.
- Level 3 (Residue Conservation): Assess conservation of individual amino acid residues known to be critical for protein-ligand interaction or protein function.
Data Interpretation: Interpret the results across all three levels to generate a prediction of protein structural conservation and potential chemical susceptibility for the species analyzed.

Protocol: Combined NAMs for Pathway Conservation Assessment

This methodology describes an integrated approach using both SeqAPASS and G2P-SCAN to provide multiple lines of evidence for biological pathway conservation [3].

Target Identification: Compile a list of molecular targets for the chemical of interest using pharmacological databases and literature.
Structural Conservation (SeqAPASS): For each identified target, perform a SeqAPASS analysis as described in the preceding SeqAPASS protocol to predict structural conservation across a broad range of species.
Pathway Mapping (G2P-SCAN):
- Input the list of human genes encoding the molecular targets into G2P-SCAN.
- The tool maps these genes to their associated biological pathways (e.g., using the Reactome database).
- G2P-SCAN then outputs an inference of pathway conservation across its predefined set of species (Human, Mouse, Rat, Zebrafish, Fruit fly, Roundworm, Yeast).
Evidence Integration: Synthesize the results from SeqAPASS and G2P-SCAN. A consensus, where both tools indicate conservation, provides stronger evidence for the plausibility of the AOP in a given species.
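The evidence integration step can be sketched as a per-species comparison of the two tools' calls. The Python below assumes each tool's result has already been reduced to a conserved or not-conserved call per species, which simplifies their actual outputs considerably. The function name, the labels, and the example calls are hypothetical.

```python
def integrate_evidence(seqapass_calls, g2p_scan_calls):
    """Combine per-species conservation calls from two lines of evidence.

    Each argument maps a species name to True (conserved) or False.
    A species missing from one mapping was not covered by that tool.
    """
    summary = {}
    for species in sorted(set(seqapass_calls) | set(g2p_scan_calls)):
        structural = seqapass_calls.get(species)
        pathway = g2p_scan_calls.get(species)
        if structural is None or pathway is None:
            summary[species] = "single line of evidence only"
        elif structural and pathway:
            summary[species] = "consensus: conserved"
        elif not structural and not pathway:
            summary[species] = "consensus: not conserved"
        else:
            summary[species] = "conflicting: needs expert review"
    return summary


# Hypothetical calls; G2P-SCAN covers only its predefined species set.
seqapass_calls = {"Zebrafish": True, "Fruit fly": True, "Yeast": False, "Honey bee": True}
g2p_scan_calls = {"Zebrafish": True, "Fruit fly": False, "Yeast": False}

summary = integrate_evidence(seqapass_calls, g2p_scan_calls)
print(summary)
```

Keeping "single line of evidence only" as its own outcome makes the limited species coverage of G2P-SCAN visible instead of silently treating uncovered species as non-conserved.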

The following diagram visualizes this integrated workflow.

Figure: Workflow for Combined NAMs

Successful cross-species extrapolation relies on a suite of bioinformatic tools, databases, and experimental models. The table below details key resources.

Table 3: Essential Research Reagents and Resources for Cross-Species Extrapolation

Tool/Resource Name	Type	Primary Function in Research	Key Application in Field
SeqAPASS Tool [5] [3]	Bioinformatics Tool	Evaluates protein sequence/structural similarity to infer potential for chemical susceptibility across species.	Provides lines of evidence for structural conservation of MIEs and KEs, helping define the tDOA of an AOP.
G2P-SCAN Tool [3]	Bioinformatics Tool	Maps human genes to biological pathways and infers pathway conservation across 7 core species.	Supports inference of functional pathway-level conservation, complementing protein-level data.
ECOdrug [42]	Database/Informatic Tool	User-friendly database exploring evolutionary conservation of drug targets in ecologically relevant species.	Aids in predicting hazard of pharmaceuticals in the environment based on target conservation.
AOP-Wiki [5]	Knowledge Base	Central repository for developed AOPs, including described KEs and KERs.	Foundational resource for accessing existing AOP knowledge and proposed tDOAs to inform new research.
Reactome [3]	Pathway Database	Curated database of biological pathways and processes.	Used by tools like G2P-SCAN as a source of pathway information for conservation analysis.
UniProt [5]	Protein Database	Repository of comprehensive protein sequence and functional data.	Primary source for obtaining reliable amino acid sequences for SeqAPASS analysis.

Visualization of Taxonomic Domain of Applicability (tDOA) Assessment

Defining the tDOA is a multi-step process that combines empirical evidence with computational predictions. The following diagram outlines the logical workflow for establishing the tDOA for an Adverse Outcome Pathway, integrating the methodologies discussed in this guide.

Process for Defining tDOA

The field of cross-species extrapolation is rapidly evolving from a reliance on surrogate species testing toward a more predictive paradigm grounded in comparative biology and computational New Approach Methodologies. The integration of tools like SeqAPASS and G2P-SCAN exemplifies this shift, providing a structured, evidence-based approach to defining the Taxonomic Domain of Applicability for adverse outcome pathways [5] [3].

Future progress hinges on several key priorities: a deeper understanding of the quantitative relationship between target modulation and adverse outcomes across species, the generation of higher-throughput data on internal exposure dynamics, and the development of more sophisticated integrated testing strategies [42]. Furthermore, the translation of these advanced comparative toxicology approaches into regulatory applications depends on cultivating expertise and fostering ongoing collaboration among industry, academic, and regulatory scientists [42]. As these methodologies mature, they promise to enhance the efficiency, precision, and ethical standing of both environmental safety and drug development assessments.

Ensuring Reliability: Validation Frameworks and Comparative Analysis of tDOA Approaches

Within the Adverse Outcome Pathway (AOP) framework, the taxonomic domain of applicability (tDOA) defines the species for which the described biological pathway is relevant. Accurately defining the tDOA is critical for the use of AOPs in regulatory decision-making, particularly when extrapolating knowledge from tested to untested species [5]. Two primary approaches for defining the tDOA exist: the Empirical tDOA, based on observed experimental data from specific species, and the Plausible tDOA, which uses computational and bioinformatics evidence to infer applicability across a broader taxonomic range. This guide objectively compares the validation strategies, performance, and confidence levels associated with these two approaches, providing researchers with a clear framework for their application in predictive toxicology and drug development.

Comparative Analysis: Empirical versus Plausible tDOA

The following table summarizes the core characteristics, strengths, and limitations of the empirical and plausible tDOA validation strategies.

Table 1: Comparative Overview of Empirical and Plausible tDOA Validation Strategies

Feature	Empirical tDOA	Plausible tDOA
Definition	Based on direct, observed experimental data from specific species.	Inferred from computational evidence of structural and functional conservation.
Primary Evidence	Data from in vivo and in vitro toxicity tests cited within the AOP-Wiki [5].	Bioinformatics analyses, such as protein sequence and structural conservation via tools like SeqAPASS [5].
Confidence Basis	High confidence for specifically tested species; confidence is directly tied to the quantity and quality of experimental data.	Confidence is derived from the degree of conservation of key biological elements (e.g., proteins, domains, residues) [5].
Taxonomic Scope	Typically narrow, limited to the single or handful of species used in the supporting studies [5].	Can be rapidly expanded to include a wide range of species for which genomic/proteomic data exist.
Resource Intensity	High, requiring extensive laboratory work, animal testing, and time.	Low, leveraging existing and growing public databases for rapid analysis [5].
Best Use Cases	Final validation of an AOP; regulatory submissions for known species; ground-truthing computational predictions.	Early hypothesis generation; expanding the scope of existing AOPs; identifying potentially susceptible untested species.

Experimental Protocols for tDOA Validation

Protocol 1: Establishing Empirical tDOA with In Vivo and In Vitro Data

This protocol outlines the traditional method for defining tDOA based on laboratory evidence.

AOP Identification: Select a defined AOP for evaluation (e.g., AOP 89: nAChR activation leading to colony death/failure in Apis mellifera) [5].
Literature Synthesis: Systematically gather all empirical studies cited in the AOP-Wiki for each Key Event (KE) and Key Event Relationship (KER).
Species Cataloging: Record every species for which empirical data demonstrates the occurrence of a KE or supports a KER.
Evidence Weighting: Assign a level of confidence based on the robustness of the studies (e.g., number of independent studies, dose-response relationships, replication).
tDOA Definition: The compiled list of species constitutes the empirical tDOA for the AOP.
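The cataloging and weighting steps of Protocol 1 could be organized as in the sketch below. The record fields, the `empirical_tdoa` function, and especially the confidence thresholds are assumptions made for illustration; the protocol itself does not prescribe numeric cut-offs, and the input records are placeholders rather than data from the AOP-Wiki.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """One empirical observation supporting a KE or KER (hypothetical schema)."""
    species: str
    element: str         # KE or KER label
    study_id: str
    dose_response: bool  # True if the study reports a dose-response relationship


def empirical_tdoa(records):
    """Group records by species and attach an illustrative confidence label."""
    by_species = defaultdict(list)
    for record in records:
        by_species[record.species].append(record)

    summary = {}
    for species, recs in by_species.items():
        n_studies = len({r.study_id for r in recs})
        has_dose_response = any(r.dose_response for r in recs)
        # Illustrative thresholds only; real weighting is a judgment call.
        if n_studies >= 3 and has_dose_response:
            confidence = "high"
        elif n_studies >= 2 or has_dose_response:
            confidence = "moderate"
        else:
            confidence = "low"
        summary[species] = {
            "elements": sorted({r.element for r in recs}),
            "n_studies": n_studies,
            "confidence": confidence,
        }
    return summary


# Placeholder records for illustration only.
records = [
    EvidenceRecord("species_A", "KE1", "study1", True),
    EvidenceRecord("species_A", "KE2", "study2", False),
    EvidenceRecord("species_A", "KER1", "study3", False),
    EvidenceRecord("species_B", "KE1", "study4", False),
]
summary = empirical_tdoa(records)
print(sorted(summary))  # the species forming the empirical tDOA
```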

Protocol 2: Establishing Plausible tDOA using the SeqAPASS Tool

This protocol details the bioinformatics approach for inferring tDOA through the US Environmental Protection Agency's Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool [5].

Identify Query Proteins: Determine the specific proteins involved in the Molecular Initiating Event (MIE) and subsequent Key Events of the AOP. For AOP 89, this included nine proteins such as the nicotinic acetylcholine receptor [5].
SeqAPASS Level 1 Analysis (Primary Sequence): Input the amino acid sequence of the query protein. The tool identifies orthologs across species based on overall sequence similarity, providing an initial list of taxa where the protein is likely present.
SeqAPASS Level 2 Analysis (Functional Domains): Evaluate the conservation of known functional domains (e.g., ligand-binding domains) within the identified orthologs. This adds evidence for retained protein function.
SeqAPASS Level 3 Analysis (Critical Residues): Assess the conservation of specific amino acid residues known to be critical for protein-ligand interaction or protein function (e.g., residues essential for neonicotinoid binding in nAChR). This is the highest level of evidence for structural conservation.
Triangulate with Empirical Data: Integrate the bioinformatics results with any available empirical data to build a weight-of-evidence case for functional conservation.
Define Plausible tDOA: The taxonomic groups for which structural conservation is demonstrated at Levels 1, 2, and 3 define the biologically plausible tDOA.
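The final step of Protocol 2 amounts to a filter over per-taxon results. The sketch below assumes that each SeqAPASS level has already been reduced to a yes/no conservation call per taxonomic group. That reduction, the function names, and the placeholder taxa are hypothetical and do not represent the tool's own output format.

```python
LEVELS = (1, 2, 3)


def highest_level_supported(calls):
    """Return the highest consecutive level (0 to 3) at which conservation holds."""
    highest = 0
    for level in LEVELS:
        if not calls.get(level, False):
            break
        highest = level
    return highest


def plausible_tdoa(level_calls):
    """Return the taxa with conservation demonstrated at Levels 1, 2, and 3."""
    return sorted(taxon for taxon, calls in level_calls.items()
                  if highest_level_supported(calls) == 3)


# Placeholder calls for illustration only (not real SeqAPASS results).
level_calls = {
    "taxon_A": {1: True, 2: True, 3: True},
    "taxon_B": {1: True, 2: True, 3: False},
    "taxon_C": {1: True},  # Levels 2 and 3 not evaluated
}
print(plausible_tdoa(level_calls))
```

Reporting the highest level reached for excluded taxa, rather than only a pass/fail result, keeps partial evidence visible for the weight-of-evidence step.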

SeqAPASS Workflow for Plausible tDOA

The following table lists key tools and resources essential for conducting tDOA validation studies.

Table 2: Key Research Reagent Solutions for tDOA Validation

Tool / Resource	Function in tDOA Validation
AOP-Wiki (https://aopwiki.org/)	The central repository for AOP knowledge, providing the structured framework and collected empirical evidence on which tDOA is built [5].
SeqAPASS Tool	A publicly available bioinformatics tool that evaluates cross-species protein sequence and structural similarity to provide evidence for the structural conservation of Key Events, informing the plausible tDOA [5].
Ortholog Databases	Databases such as Ensembl or NCBI Orthologs are used to identify and retrieve protein sequences across multiple species, forming the input data for SeqAPASS analysis.
Molecular Cloning & Expression Kits	Essential for empirically validating protein function by expressing orthologs from different species in in vitro systems for functional assays.

Data Visualization: Signaling Pathway and Workflow

AOP 89: nAChR to Colony Failure

The choice between empirical and plausible tDOA strategies is not a matter of selecting a superior option, but of applying the right tool for the specific research or regulatory context. The empirical tDOA provides high-confidence, ground-truthed data but is inherently limited in scope. The plausible tDOA, powered by bioinformatics, offers a powerful and efficient method for extrapolating AOP applicability across the tree of life, though it requires subsequent empirical confirmation for the highest levels of regulatory confidence. The most robust strategy for defining the taxonomic domain of an AOP involves a weight-of-evidence approach that integrates both methodologies, using bioinformatics to generate hypotheses and guide targeted empirical testing, thereby building a comprehensive and defensible assessment of risk across species.

Comparative Analysis of tDOA Across Different AOPs and Biological Systems

The Adverse Outcome Pathway (AOP) framework provides a structured approach to organize biological knowledge into a sequential chain of causally linked events, beginning with a Molecular Initiating Event (MIE) where a chemical stressor interacts with a biological target and culminating in an Adverse Outcome (AO) relevant to risk assessment at the individual or population level [36]. A fundamental principle of this framework is that AOPs are not stressor-specific; rather, they describe biological sequences that can be initiated by any stressor capable of triggering the specific MIE [36]. A critical element within this framework is the taxonomic domain of applicability (tDOA), which defines the taxonomic space across which an AOP is considered biologically plausible [3] [1].

Understanding the tDOA is essential for effective chemical risk assessment, particularly for extrapolating toxicity data from tested to untested species [36]. The central assumption underlying tDOA evaluation is that evolutionary conservation of biological pathways and protein targets across species confers similar susceptibility to chemical stressors [3] [45]. As regulatory toxicology increasingly adopts New Approach Methodologies (NAMs) that reduce reliance on traditional animal testing, accurately defining tDOA has become both more critical and more feasible through computational biology approaches [3] [1] [7]. This comparative analysis examines the current state of tDOA characterization across different AOPs and biological systems, highlighting methodological advances, key challenges, and future research directions.

Methodological Approaches for tDOA Characterization

Computational Tools and Bioinformatics Strategies

The primary computational means of assessing taxonomic relatedness for tDOA determination involves comparing gene or protein sequence and structural similarity across species [3] [1]. Several sophisticated bioinformatics tools have been developed specifically to support these cross-species extrapolations:

The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool, developed by the US Environmental Protection Agency, utilizes protein sequence information to extrapolate chemical susceptibility across diverse species for which protein sequence data are available [3] [1]. The tool operates through multiple tiers of analysis: (1) primary sequence similarity comparison, (2) functional domain conservation evaluation, and (3) assessment of key amino acid residues known to be critical for protein-chemical interaction [45]. By leveraging existing knowledge about chemical-protein interactions in model species, SeqAPASS can predict potential susceptibility in non-target species, thereby informing the tDOA for specific MIEs [1].

The Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool complements SeqAPASS by providing biological pathway-level information from human gene inputs, supporting inferences about pathway conservation across seven commonly used model species: humans (Homo sapiens), mice (Mus musculus), rats (Rattus norvegicus), zebrafish (Danio rerio), fruit flies (Drosophila melanogaster), roundworms (Caenorhabditis elegans), and yeast (Saccharomyces cerevisiae) [3] [1]. This tool accesses information from various biological databases to map gene sets to Reactome pathways and estimate conservation across the specified species [1].
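The general idea of mapping a gene set to pathways and estimating conservation per species can be illustrated with a toy calculation. The sketch below is not G2P-SCAN's algorithm: the function, the fraction-of-genes metric, and all gene, pathway, and species names are placeholders chosen for illustration.

```python
def pathway_conservation(gene_to_pathways, orthologs):
    """Return {pathway: {species: fraction of mapped genes with an ortholog}}.

    gene_to_pathways: {gene: [pathway, ...]}
    orthologs: {gene: set of species in which an ortholog was found}
    """
    pathway_genes = {}
    for gene, pathways in gene_to_pathways.items():
        for pathway in pathways:
            pathway_genes.setdefault(pathway, set()).add(gene)

    species = sorted({s for found in orthologs.values() for s in found})
    result = {}
    for pathway, genes in pathway_genes.items():
        result[pathway] = {
            s: sum(1 for g in genes if s in orthologs.get(g, set())) / len(genes)
            for s in species
        }
    return result


# Placeholder inputs for illustration only.
gene_to_pathways = {"GENE_A": ["pathway_1"], "GENE_B": ["pathway_1"]}
orthologs = {"GENE_A": {"species_A", "species_B"}, "GENE_B": {"species_A"}}
conservation = pathway_conservation(gene_to_pathways, orthologs)
print(conservation["pathway_1"])
```

A fraction per pathway is only one possible summary; it says nothing about whether the conserved genes retain their function, which is why pathway-level results are read alongside protein-level evidence.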

Additional resources such as ECOdrug cover more than 600 eukaryotes, allowing users to identify the human drug targets of more than 1,000 pharmaceuticals together with the associated ortholog predictions [45]. These tools collectively enable researchers to harness the power of comparative genomics, proteomics, and transcriptomics to inform tDOA with increasing precision [45].

Experimental and Empirical Approaches

While computational approaches provide valuable predictions, empirical data remains essential for validating tDOA hypotheses. Traditional approaches have relied on toxicity testing across multiple species to establish the taxonomic boundaries of chemical susceptibility [32]. However, such comprehensive testing is resource-intensive, ethically questionable, and impractical given the vast number of chemicals and species of potential concern [3].

Advanced omics technologies now provide more efficient empirical approaches for tDOA characterization. High-throughput transcriptomics can derive transcriptomic points of departure and identify conserved gene expression patterns across species [45]. The development of cross-species quantitative PCR arrays (e.g., EcoToxChips) enables targeted assessment of pathway conservation and perturbation [45]. Additionally, high-resolution mass spectrometry techniques facilitate comparative analyses of protein expression and post-translational modifications relevant to AOP key events [7].

The AOP-Wiki database serves as a central repository for AOP knowledge, including empirical evidence supporting tDOA [7] [46]. However, a comprehensive mapping of this database revealed that limited empirical evidence has been collected for the tDOA of the majority of AOPs, likely because toxicity and pathway data are typically generated for only a few model organisms [7].

Table 1: Methodological Approaches for tDOA Characterization

Approach Category	Specific Methods/Tools	Key Applications in tDOA Assessment	Strengths	Limitations
Computational/Bioinformatics	SeqAPASS	Predicts protein target conservation and chemical susceptibility across species	High-throughput; can analyze thousands of species simultaneously	Relies on available sequence data; may not capture all functional differences
	G2P-SCAN	Evaluates biological pathway conservation across model species	Provides pathway-level context beyond individual proteins	Limited to seven model species
	ECOdrug	Identifies orthologs for pharmaceutical targets	Comprehensive coverage of drug targets and eukaryotic species	Focused primarily on pharmaceutical targets
Empirical/Experimental	Multi-species toxicity testing	Direct observation of adverse outcomes across taxa	Provides definitive evidence of susceptibility	Resource-intensive; ethically challenging
	Cross-species transcriptomics	Identifies conserved gene expression patterns	Captures functional responses to chemical exposure	Complex data interpretation; requires specialized expertise
	EcoToxChips (qPCR arrays)	Targeted assessment of conserved pathway perturbation	Cost-effective; applicable to many species	Limited to predefined gene sets

Comparative Analysis of tDOA Across AOP Case Studies

Nuclear Receptor-Mediated AOPs

Nuclear receptors represent important targets for many environmental chemicals, and AOPs involving these receptors demonstrate varying tDOA depending on the specific receptor and its evolutionary conservation. The peroxisome proliferator-activated receptor alpha (PPARα) pathway provides an instructive case study for tDOA analysis [1]. Through combined application of SeqAPASS and G2P-SCAN, researchers have demonstrated that the PPARα signaling pathway is highly conserved across vertebrates, with more limited conservation in invertebrates [1]. This pattern aligns with the known functions of PPARα in lipid metabolism and the evolutionary history of nuclear receptors.

Similarly, AOPs involving estrogen receptor 1 (ESR1) activation show a well-conserved tDOA across vertebrate species, particularly among fish, amphibians, and mammals [1]. The SeqAPASS tool has confirmed structural conservation of estrogen receptors across these taxa, supporting the biological plausibility that chemicals activating ESR1 in model fish species would likely cause similar effects in untested fish species [36] [1]. This conservation pattern has direct regulatory implications, as it supports the extrapolation of endocrine disruption data from tested to untested species for chemicals acting through estrogen receptor mechanisms.

In contrast, AOPs involving ecdysone receptor (EcR) activation demonstrate a markedly different tDOA, primarily limited to invertebrates [32]. This receptor plays a critical role in molting and development in arthropods, and the corresponding AOP (ecdysone receptor activation leading to mortality) has a tDOA predominantly encompassing insects and crustaceans [32]. The narrow tDOA for this AOP reflects the specific physiological functions of ecdysone in invertebrates and the absence of ecdysone signaling in vertebrates.

Neurotransmitter Receptor-Mediated AOPs

AOPs involving neurotransmitter receptors illustrate how conserved molecular targets can nonetheless manifest different tDOA due to variations in physiological context and pathway conservation. For example, AOPs linked to activation of the gamma-aminobutyric acid type A receptor subunit alpha-1 (GABRA1) show broad conservation across vertebrate species, with more limited applicability in invertebrates [1]. The GABRA1 protein itself is highly conserved, but downstream pathway elements and the physiological consequences of perturbation may vary across taxa.

Case studies examining chemical interactions with GABRA1 have demonstrated the value of combining multiple lines of evidence for tDOA characterization [1]. While SeqAPASS analysis indicated broad conservation of the protein target across vertebrates and some invertebrates, G2P-SCAN provided additional context regarding the conservation of associated neurological pathways [1]. This complementary approach strengthened the weight of evidence for defining the tDOA and highlighted potential taxonomic boundaries where protein conservation does not guarantee identical adverse outcomes.

Analysis of tDOA Patterns in the AOP-Wiki Database

A comprehensive mapping of the AOP-Wiki database has revealed distinct patterns in tDOA characterization across different biological systems [7]. The analysis identified that AOPs related to diseases of the genitourinary system, neoplasms, and developmental anomalies are the most frequently investigated in the database [7]. However, the extent and quality of tDOA information vary substantially across these AOPs.

The mapping exercise also highlighted significant gaps in tDOA knowledge for certain biological domains. For instance, AOPs related to immunotoxicity and non-genotoxic carcinogenesis, endocrine and metabolic disruption, and developmental and adult neurotoxicity have been identified as priority areas within the EU-funded Partnership for the Assessment of Risks from Chemicals (PARC), owing to both their regulatory importance and the current inadequacy of tDOA characterization [7]. These gaps underscore the need for targeted research to better define taxonomic domains for these critical endpoints.

Table 2: Comparative tDOA Analysis for Selected AOPs

AOP Focus	Molecular Initiating Event	Taxonomic Domain of Applicability	Key Evidence Supporting tDOA	Taxonomic Boundaries/Limitations
PPARα Activation	Ligand activation of PPARα	Primarily vertebrates; limited conservation in invertebrates	High sequence conservation of PPARα in vertebrates; conserved pathway elements in mammals and fish	Limited functional conservation in invertebrates; species-specific differences in downstream gene regulation
Estrogen Receptor Activation	Binding to ESR1	Vertebrates (particularly fish, amphibians, mammals)	Structural conservation of estrogen receptors; conserved vitellogenin response in oviparous vertebrates	Limited relevance to invertebrates with different endocrine systems
Ecdysone Receptor Activation	Ligand binding to EcR	Primarily arthropods and other invertebrates	Functional conservation of molting regulation in insects and crustaceans	Not applicable to vertebrates which lack ecdysone signaling
GABRA1 Activation	Modulation of GABAA receptor	Broad conservation across vertebrates; limited invertebrate applicability	High protein sequence similarity; conserved neurophysiological responses in vertebrates	Differential downstream effects in invertebrates; variations in blood-brain barriers

Research Reagents and Experimental Tools for tDOA Studies

Table 3: Essential Research Reagents and Tools for tDOA Investigations

Reagent/Tool	Type	Primary Function in tDOA Research	Application Examples
SeqAPASS	Bioinformatics tool	Evaluates protein sequence and structural similarity across species to predict chemical susceptibility	Determining conservation of pharmaceutical targets across fish and invertebrate species [1] [45]
G2P-SCAN	Computational pathway analysis tool	Maps human gene sets to biological pathways and evaluates conservation across model species	Assessing pathway-level conservation for estrogen receptor signaling [3] [1]
AOP-Wiki	Knowledge repository	Central database for AOP information, including tDOA evidence and assumptions	Accessing existing knowledge on taxonomic applicability for AOP development [7] [46]
ECOdrug	Database with ortholog prediction	Identifies human drug targets and predicts orthologs across eukaryotic species	Screening pharmaceutical targets for potential ecological relevance [45]
EcoToxChips	qPCR arrays	Measures expression of toxicologically relevant genes across species	Assessing pathway perturbation in multiple species for AOP validation [45]
RefChemDB	Chemical bioactivity database	Provides high-throughput in vitro bioactivity data for chemical-target interactions	Identifying molecular targets for chemicals during AOP development [1]

Visualization of tDOA Assessment Workflow

The following diagram illustrates the integrated computational and empirical approach for tDOA characterization:

Workflow for tDOA Assessment: This diagram illustrates the integrated approach combining computational predictions with empirical validation to define the taxonomic domain of applicability for an Adverse Outcome Pathway.

The comparative analysis of tDOA across different AOPs and biological systems reveals both consistent patterns and important distinctions in taxonomic applicability. Nuclear receptor-mediated AOPs generally show phylogenetically coherent tDOA that reflect the evolutionary history of these receptor families [1] [32]. Neurotransmitter pathways demonstrate more complex patterns, where high molecular conservation does not always translate to identical adverse outcomes across taxa due to differences in physiological context and pathway integration [1].

Substantial challenges remain in tDOA characterization. The limited empirical data for most AOPs beyond a few model species constrains confident tDOA definition [7]. There is also a need to better integrate quantitative aspects into tDOA assessment, moving beyond qualitative descriptions to probabilistic predictions of susceptibility [32] [45]. Additionally, the increasing use of NAMs necessitates continued refinement of bioinformatics approaches to ensure they adequately capture biological complexity.

Future research should prioritize expanding empirical validation of computationally predicted tDOA, particularly for AOPs of high regulatory concern [7] [45]. Development of integrated testing strategies that combine multiple computational and limited empirical approaches could enhance tDOA characterization while respecting ethical and resource constraints [3] [1]. Finally, establishing standardized reporting frameworks for tDOA evidence in the AOP-Wiki would facilitate more consistent and transparent evaluations of taxonomic applicability across different AOPs [7] [46].

As the AOP framework continues to evolve and play an increasingly important role in chemical safety assessment, refined understanding of tDOA will be essential for ensuring appropriate application of AOP-based knowledge to protect both human health and ecological systems across the diversity of species potentially exposed to chemical stressors.

Bridging AOPs with Other New Approach Methodologies (NAMs) for Integrated Risk Assessment

The evolving landscape of chemical risk assessment is witnessing a paradigm shift toward New Approach Methodologies (NAMs) that reduce reliance on traditional animal testing while improving human health protection. Adverse Outcome Pathways (AOPs) serve as a critical organizing framework within this shift, providing a structured representation of causal linkages between molecular initiating events and adverse outcomes at individual or population levels [47]. The integration of AOPs with other NAMs creates a powerful synergy that enhances the predictive capacity and regulatory acceptance of modern risk assessment paradigms. This integration is particularly valuable for addressing complex toxicological endpoints such as endocrine disruption, reproductive toxicity, and chemical mixture effects, where traditional methods often fall short in capturing underlying biological mechanisms [48].

The conceptual foundation for combining AOPs with other NAMs rests on their complementary strengths. While AOPs provide the conceptual framework for understanding toxicity pathways, other NAMs generate the empirical data needed to populate and quantify these pathways. This partnership enables a more mechanistic approach to risk assessment that can keep pace with the growing number of chemicals in commerce—estimated at over ten thousand substances, many lacking complete toxicological profiles [47] [49]. Furthermore, international regulatory agencies including the US Environmental Protection Agency (EPA), European Chemicals Agency (ECHA), and European Food Safety Authority (EFSA) are actively developing frameworks to implement these integrated approaches for regulatory decision-making [47].

Table 1: Core Components of an Integrated AOP-NAM Framework

Component	Description	Primary Function in Risk Assessment
Adverse Outcome Pathways (AOPs)	Structured sequences of biologically plausible events connecting molecular initiators to adverse outcomes	Provide conceptual framework for organizing mechanistic toxicological knowledge
In Vitro Assays	Cell-based systems (2D, 3D, organoids, MPS)	Generate experimental data on specific key events within AOPs
In Silico Models	Computational approaches (QSAR, PBPK, molecular docking)	Predict chemical properties, bioactivity, and toxicokinetics
OMICS Technologies	High-throughput molecular profiling (transcriptomics, proteomics)	Reveal system-wide biological responses and identify potential key events
Integrated Approaches to Testing and Assessment (IATA)	Structured combinations of multiple information sources	Support regulatory decision-making through weight-of-evidence approaches

Fundamental Concepts: AOPs and the Broader NAM Landscape

Adverse Outcome Pathways Framework

An Adverse Outcome Pathway is a structured representation that maps the sequential chain of events beginning with a molecular initiating event (MIE)—such as a chemical binding to a specific biological target—through a series of intermediate key events (KEs), culminating in an adverse outcome (AO) of regulatory concern [50]. Each key event relationship (KER) describes the causal connection between adjacent events in the pathway. The power of the AOP framework lies in its ability to organize fragmented toxicological knowledge into coherent, testable pathways that transcend individual studies or chemical specificities [48]. For example, a developed AOP network for developmental androgen signaling inhibition connects multiple molecular initiating events (including reduced testosterone synthesis, impaired conversion to dihydrotestosterone, and direct androgen receptor antagonism) to the adverse outcome of shortened anogenital distance in male offspring [50] [51].

The AOP framework supports regulatory applications by identifying measurable key events that can be monitored using alternative methods, thus reducing the need for whole-animal testing. Several AOPs have been formally adopted by the Organisation for Economic Co-operation and Development (OECD) and are referenced in test guidelines, demonstrating their growing regulatory relevance [50]. Importantly, AOPs establish modular causality where the same molecular initiating event may lead to different adverse outcomes depending on the biological context, and conversely, different molecular initiators may converge on the same adverse outcome [50] [51].

The Expanding Universe of New Approach Methodologies

New Approach Methodologies encompass a broad spectrum of innovative tools and strategies that aim to modernize chemical safety assessment. According to current definitions, NAMs include "emerging technology, methodology, approach, or combination thereof, having the potential to improve risk assessment for fulfilling critical information gaps and avoid or reduce the reliance on animal studies" [47] [49]. The NAM landscape includes in vitro systems (such as 3D cell cultures, organoids, and microphysiological systems), computational approaches (including QSAR, read-across, and PBPK modeling), OMICS technologies (transcriptomics, proteomics, metabolomics), and high-throughput screening platforms [47].

These methodologies are increasingly being incorporated into Integrated Approaches to Testing and Assessment (IATA), which combine multiple information sources to support hazard identification, hazard characterization, and safety assessment decisions [47]. The OECD provides guidance on developing and using IATA, reflecting international efforts to harmonize the application of these novel approaches in regulatory contexts [47]. A key advantage of NAMs is their ability to inform population variability by identifying susceptible subpopulations such as pregnant females, infants, and occupationally exposed workers, thereby enabling more refined risk assessments that account for individual susceptibility [47].

Methodological Integration: Experimental Protocols and Workflows

AOP-Informed Testing Strategies

The integration of AOPs with experimental NAMs follows a systematic workflow that begins with AOP analysis to identify measurable key events, proceeds through test system selection and experimental implementation, and culminates in data integration for risk assessment conclusions. A practical example of this approach is demonstrated in a case study on pyrethroids, which implemented a tiered testing strategy incorporating both in vitro bioactivity data and toxicokinetic modeling [52]. The experimental workflow progressed through five sequential tiers:

Tier 1: Bioactivity data gathering from ToxCast high-throughput screening assays established initial indicators of biological activity across different tissue and gene categories [52].
Tier 2: Combined risk assessment exploring relative potencies and correlations between in vitro bioactivity data and traditional points of departure such as NOAELs (No Observed Adverse Effect Levels) [52].
Tier 3: Screening and prioritization using toxicokinetic modeling to simulate plasma and tissue concentrations at realistic human exposure levels [52].
Tier 4: Refinement of bioactivity indicators through comparison of in vitro and in vivo points of departure [52].
Tier 5: Risk characterization using Margin of Exposure (MoE) analysis based on internal dose metrics [52].

This tiered approach exemplifies how AOP-informed testing strategies can efficiently generate data for risk assessment while minimizing resource-intensive testing.
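The Tier 5 calculation can be sketched in a few lines. The following is a minimal illustration, assuming the Margin of Exposure is taken as the ratio of an internal-dose point of departure to a modeled internal exposure in the same units. The function names, the example concentrations, and the screening threshold of 100 are assumptions made for illustration and are not taken from the pyrethroid case study [52].

```python
def margin_of_exposure(pod_um: float, exposure_um: float) -> float:
    """Return MoE = point of departure / exposure (same internal-dose units)."""
    if pod_um <= 0 or exposure_um <= 0:
        raise ValueError("PoD and exposure must be positive")
    return pod_um / exposure_um


def characterize(chemicals: dict, threshold: float = 100.0) -> dict:
    """Map each chemical to (MoE, flagged); flagged means MoE is below threshold."""
    out = {}
    for name, values in chemicals.items():
        moe = margin_of_exposure(values["pod_um"], values["plasma_um"])
        out[name] = (moe, moe < threshold)
    return out


# Hypothetical numbers for illustration only; not data from the case study.
example = {
    "chem_A": {"pod_um": 5.0, "plasma_um": 0.01},  # in vitro PoD vs. modeled plasma level
    "chem_B": {"pod_um": 2.0, "plasma_um": 0.5},
}

results = characterize(example)
for name, (moe, flagged) in results.items():
    print(f"{name}: MoE = {moe:.0f}, needs refinement = {flagged}")
```

A small MoE means the modeled exposure is close to the bioactive concentration, so such a chemical would typically be prioritized for further refinement rather than screened out.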

Protocol: AOP-Based Chemical Mixture Assessment Using "Footprinting"

For assessing mixtures of chemicals, the AOP footprinting methodology provides a structured protocol that leverages the AOP framework to identify points of convergence and interaction among mixture components [48]. The step-by-step experimental protocol includes:

Problem Formulation: Identify the adverse outcome of concern and putative AOPs relevant to the mixture components.
AOP Network Development: Compile all known or suspected AOPs contributing to the identified adverse outcome.
Key Event Profiling: For each mixture component, systematically profile activity at all key events within the relevant AOP network using appropriate in vitro or in silico methods.
Footprint Identification: Identify the key events most proximal to the adverse outcome within each AOP where similarity between mixture components can be confidently determined—these constitute the "footprint" for that AOP.
Mixture Assessment: Evaluate combined effects based on activities at the identified footprint key events, considering potential interactions [48].

This methodology enables the use of NAM-based data for mixture risk assessment by focusing evaluation on the most informative points in the toxicity pathway, thus simplifying the complexity of assessing numerous potential interactions across entire AOP networks [48].
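The footprint identification step can be illustrated with a small sketch. It assumes a linear AOP written as an ordered list of key events and a table of activity calls per mixture component, where `None` means that no confident call is available. The data structure, the key event labels, and the rule "all components must have a call" are simplifying assumptions, not the published procedure [48].

```python
from typing import Dict, List, Optional

# activity[key_event][component] is True (active), False (inactive), or None (no data)
Activity = Dict[str, Dict[str, Optional[bool]]]


def find_footprint(key_events: List[str], components: List[str],
                   activity: Activity) -> Optional[str]:
    """Return the key event closest to the adverse outcome at which
    every mixture component has a confident activity call."""
    if not components:
        raise ValueError("at least one mixture component is required")
    for ke in reversed(key_events):  # walk from the adverse outcome back to the MIE
        calls = activity.get(ke, {})
        if all(calls.get(c) is not None for c in components):
            return ke
    return None


def jointly_active(ke: str, components: List[str], activity: Activity) -> List[str]:
    """Components that are active at the footprint key event."""
    return [c for c in components if activity[ke][c]]


# Hypothetical AOP and activity calls, for illustration only.
aop = ["MIE", "KE1", "KE2", "AO"]
mixture = ["chem_A", "chem_B", "chem_C"]
calls: Activity = {
    "MIE": {"chem_A": True, "chem_B": True, "chem_C": False},
    "KE1": {"chem_A": True, "chem_B": True, "chem_C": False},
    "KE2": {"chem_A": True, "chem_B": None, "chem_C": None},
    "AO": {"chem_A": None, "chem_B": None, "chem_C": None},
}

footprint = find_footprint(aop, mixture, calls)
print(footprint, jointly_active(footprint, mixture, calls))
```

In this toy case the footprint is KE1, because data for KE2 and the adverse outcome are incomplete, and only the components active at KE1 would be grouped for the combined-effect evaluation.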

Protocol: Integrating OMICS Data into AOP Development

OMICS technologies provide powerful approaches for populating AOPs with empirical data and discovering novel key event relationships. The standard protocol for OMICS-AOP integration involves:

Experimental Design: Expose relevant in vitro or in vivo model systems to graded concentrations of test chemicals across multiple timepoints.
Molecular Profiling: Conduct transcriptomic, proteomic, or metabolomic analysis using standardized platforms (for transcriptomics, for example, microarrays or RNA sequencing).
Benchmark Dose (BMD) Modeling: Apply computational approaches to derive point of departure (PoD) estimates from OMICS data [47].
Pathway Analysis: Identify significantly perturbed biological pathways and map these to existing AOP frameworks or identify potential new key events.
In Vitro to In Vivo Extrapolation: Use physiologically based pharmacokinetic (PBPK) modeling to translate in vitro bioactivity concentrations to human equivalent doses [47].

The OECD OMICS Reporting Framework (OORF) provides guidance for ensuring data quality and reproducibility throughout this process, addressing one of the significant challenges in using high-dimensional data for regulatory applications [47].
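To make the benchmark dose step concrete, the sketch below assumes that a Hill dose-response model has already been fitted to one gene or pathway and solves it analytically for the dose that produces a chosen benchmark response (BMR). The Hill form, the parameter names, and the example values are assumptions for illustration. Dedicated tools fit several candidate models and also report confidence bounds on the BMD, which this sketch does not attempt.

```python
def hill_response(dose: float, top: float, ec50: float, n: float) -> float:
    """Hill model: change from baseline at a given dose."""
    return top * dose ** n / (ec50 ** n + dose ** n)


def bmd_from_hill(bmr: float, top: float, ec50: float, n: float) -> float:
    """Dose at which the Hill response equals the benchmark response.

    Solving top * d^n / (ec50^n + d^n) = bmr for d gives
    d = ec50 * (bmr / (top - bmr)) ** (1 / n).
    """
    if not 0 < bmr < top:
        raise ValueError("BMR must lie between 0 and the maximal response")
    return ec50 * (bmr / (top - bmr)) ** (1.0 / n)


# Hypothetical fitted parameters (illustration only).
bmd = bmd_from_hill(bmr=0.1, top=1.0, ec50=10.0, n=2.0)
print(f"BMD = {bmd:.3f} (same units as ec50)")
print(f"check: response at BMD = {hill_response(bmd, 1.0, 10.0, 2.0):.3f}")
```

The resulting BMD, or more commonly its lower confidence bound, is what would be carried forward as a candidate point of departure.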

Comparative Analysis: Quantitative Data Integration Across NAM Platforms

Performance Metrics for Different NAM Categories

The utility of different NAMs for populating AOPs varies significantly based on the specific key event being measured and the context of use. The table below summarizes the typical throughput, key event coverage, regulatory acceptance status, and main limitations of major NAM categories when applied to AOP development and use:

Table 2: Performance Metrics of NAMs in AOP Context

NAM Category	Typical Throughput	Key Event Measurement Capability	Regulatory Acceptance Status	Key Limitations
High-Throughput In Vitro Assays	100-10,000 compounds/week	Molecular initiating events & cellular key events	Accepted for screening & prioritization	Limited biological complexity, uncertain in vivo relevance
OMICS Technologies	10-100 compounds/week	Multiple key events simultaneously	Emerging for point of departure derivation	Data interpretation challenges, high dimensionality
Physiologically Based Kinetic Models	Varies by complexity	Interspecies & in vitro to in vivo extrapolation	Growing acceptance for specific applications	Parameter uncertainty, limited validation for novel chemicals
QSAR/Read-Across	1,000+ compounds/day	Molecular initiating event prediction	Established for specific endpoints	Domain of applicability constraints
Microphysiological Systems	10-100 compounds/month	Tissue-level key events	Early stage, limited acceptance	Technical complexity, standardization challenges
AOP Footprinting	Varies by complexity	Key event interactions in mixtures	Conceptual stage, limited implementation	Limited empirical validation

Case Study: Androgen Signaling Disruption AOP Network

A comprehensive AOP network for developmental androgen signaling inhibition demonstrates how quantitative data from various NAMs can be integrated to support regulatory decisions. This network includes three distinct AOPs converging on the adverse outcome of "short anogenital distance in male offspring" [50] [51]. The key events in this network have been measured using multiple NAM platforms:

Molecular Initiating Events: Androgen receptor activity measured using stably transfected transactivation assays such as AR-CALUX (OECD TG 458) and high-throughput yeast-based assays [50].
Cellular Key Events: Steroidogenesis disruption measured using H295R cell assay (OECD TG 456) [50].
Tissue Key Events: Androgen-dependent tissue changes measured using ex vivo organ culture systems [50].
Organism-level Outcomes: Short anogenital distance measured in vivo (OECD TG 443, 421, 422) [50].

Empirical support for this AOP network is rated moderate to high, and most key events in the upstream portion of the pathway can be measured with established in vitro methods [50] [51]. The network is broadly applicable to mammals, with most evidence derived from mouse, rat, and human studies.

The Scientist's Toolkit: Essential Research Reagents and Platforms

Successful implementation of integrated AOP-NAM approaches requires access to specialized research tools and platforms. The following table details key resources that support the experimental workflows described in this guide:

Table 3: Essential Research Reagent Solutions for AOP-NAM Integration

Tool/Reagent Category	Specific Examples	Primary Application	Key Features
High-Throughput Screening Platforms	ToxCast/Tox21 assay battery	Molecular initiating event identification	Standardized assay protocols, extensive reference chemical database
Cell-Based Model Systems	3D organoids, microphysiological systems, spheroids	Tissue-level key event measurement	Enhanced physiological relevance compared to 2D cultures
Computational Toxicology Tools	OECD QSAR Toolbox, CompTox Chemicals Dashboard	Chemical prioritization & read-across	Structured workflows for data gap filling
Toxicokinetic Modeling Software	PK-Sim, httk R package	In vitro to in vivo extrapolation	High-throughput toxicokinetic parameter estimation
OMICS Data Analysis Platforms	BMD Express, Cytoscape	Benchmark dose modeling & network visualization	Integration of dose-response modeling with pathway analysis
AOP Knowledge Bases	AOP-Wiki, AOP-DB	AOP discovery & development	Curated repository of established AOPs

Taxonomic Applicability in AOP Research

The taxonomic domain of applicability represents a critical consideration when extrapolating AOP-based knowledge across species. Most developed AOPs have explicit taxonomic boundaries that define their relevance to specific groups of organisms [50] [51] [53]. For example, the AOP network for developmental androgen signaling inhibition has a documented taxonomic domain of mammals, with most evidence derived from mouse, rat, and human studies [50] [51]. The upstream molecular events in this network (e.g., androgen receptor binding, steroidogenesis inhibition) have broad taxonomic applicability to all mammals and could potentially extend to other vertebrates, while the downstream events specific to perineal development have a narrower applicability domain [50].
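One way to make the relationship between key event domains and the overall pathway domain explicit is to treat each key event's applicability as a set of taxa and take the intersection. The sketch below assumes this simple set model. The key event labels and the taxa assigned to them are hypothetical placeholders, not the documented domains of the androgen signaling network.

```python
from typing import Dict, List, Set


def pathway_tdoa(ke_domains: Dict[str, Set[str]]) -> Set[str]:
    """Taxa for which every key event in the pathway is considered plausible."""
    if not ke_domains:
        return set()
    return set.intersection(*(set(d) for d in ke_domains.values()))


def limiting_key_events(ke_domains: Dict[str, Set[str]]) -> List[str]:
    """Key events whose own domain is no broader than the pathway tDOA."""
    tdoa = pathway_tdoa(ke_domains)
    return [ke for ke, d in ke_domains.items() if set(d) == tdoa]


# Hypothetical domains for illustration: broad upstream, narrow downstream.
domains = {
    "upstream_MIE": {"mammals", "birds", "fish", "amphibians"},
    "intermediate_KE": {"mammals", "birds"},
    "downstream_AO": {"mammals"},
}

print(pathway_tdoa(domains))         # taxa shared by all key events
print(limiting_key_events(domains))  # key events that restrict the domain
```

Under this model the pathway as a whole is only as broadly applicable as its narrowest key event, which mirrors the pattern described above: conserved upstream events, but a downstream outcome that restricts the full AOP to a smaller group.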

Similarly, the AOP for decreased ALDH1A activity leading to female infertility via disrupted meiotic initiation is explicitly applicable to mammals, with evidence primarily from mouse models and supporting human data [53]. The conservation of retinoic acid signaling in germ cell development across mammalian species provides the biological basis for this taxonomic domain [53]. Understanding these taxonomic boundaries is essential for proper application of AOP knowledge in ecological risk assessment where protection goals often extend beyond humans to include wildlife species.

The taxonomic applicability of AOPs has significant implications for chemical safety assessment across regulatory jurisdictions. For pharmaceuticals, agricultural chemicals, and industrial compounds with potential environmental release, defining the taxonomic domain of AOPs enables more informed extrapolation from model test species to species of concern [50] [53]. This approach supports the development of more efficient testing strategies that leverage mechanistic knowledge to reduce animal testing while maintaining environmental protection standards.

The integration of Adverse Outcome Pathways with other New Approach Methodologies represents a transformative advancement in chemical risk assessment. This synergistic approach enables mechanistically informed decisions that can keep pace with the growing number of chemicals requiring evaluation while reducing reliance on animal testing. The case studies and experimental protocols presented in this guide demonstrate practical implementations of this integration across different toxicological endpoints and regulatory contexts.

Future developments in this field will likely focus on quantitative AOP development to support prediction of point of departure values, expansion of AOP networks to address complex adverse outcomes, and enhanced approaches for chemical mixture assessment [48]. Additionally, increasing incorporation of artificial intelligence and machine learning approaches promises to accelerate AOP development and application by identifying novel key event relationships from large-scale toxicological data sources [47]. As these methodologies continue to mature, their regulatory acceptance is expected to grow, ultimately transforming chemical safety assessment into a more efficient, mechanistically grounded, and human-relevant paradigm.

The Role of tDOA in Regulatory Acceptance and the Future of Predictive Toxicology

In the evolving landscape of predictive toxicology, the taxonomic Domain of Applicability (tDOA) has emerged as a critical framework for defining the taxonomic boundaries within which AOPs, and the computational models built on them, can reliably predict chemical toxicity. As regulatory agencies increasingly accept non-animal testing approaches, establishing a biologically defined tDOA provides the scientific confidence needed for adopting these novel methodologies. The tDOA delineates the taxonomic scope (the species, strains, or populations) for which an Adverse Outcome Pathway (AOP) or predictive model is biologically plausible, thereby addressing a fundamental challenge in cross-species extrapolation [4]. This formalization is accelerating a paradigm shift from traditional, observation-based toxicology toward a more predictive, mechanism-driven discipline essential for modern drug development and safety assessment.

The need for such frameworks is underscored by high attrition rates in drug discovery, where safety concerns halt 56% of projects, representing the largest contributor to failure after efficacy [54]. This failure rate, coupled with the ethical and scientific limitations of animal testing, has catalyzed the integration of computational approaches. The global market for AI in predictive toxicology is projected to grow from USD 635.8 million in 2025 to USD 3,925.5 million by 2032, a compound annual growth rate (CAGR) of 29.7% [55], which reflects the strategic importance of these technologies. Within this context, tDOA provides the scientific rigor necessary for regulatory acceptance by ensuring predictions are grounded in conserved biological pathways across defined taxonomic groups.

Defining the Taxonomic Domain of Applicability (tDOA)

Conceptual Framework and Regulatory Significance

The taxonomic Domain of Applicability (tDOA) is a formal boundary within a predictive toxicology model or an Adverse Outcome Pathway (AOP) that specifies the taxonomic groups for which the described biological pathway is valid. It moves beyond simple chemical similarity to encompass the conservation of key biological events across species, thereby providing a biologically plausible basis for extrapolation [4]. This is particularly vital for regulatory applications, where understanding the relevance of a toxicity pathway in humans based on data from model organisms is paramount.

The tDOA framework directly supports the 3Rs principle (Replacement, Reduction, and Refinement) in toxicology testing by providing a scientifically sound basis for using New Approach Methodologies (NAMs) [55] [54]. For regulatory bodies like the U.S. FDA, which has announced plans to reduce or replace animal testing through AI-based toxicity models and other NAMs, clearly defined tDOAs build confidence in these alternative approaches [55]. The establishment of tDOAs enables researchers to make reliable predictions for human toxicity using data from taxonomically relevant species, even when direct human data is unavailable or ethically problematic to obtain.

Methodologies for Establishing tDOA

Establishing a robust tDOA requires multiple computational and experimental approaches that collectively build confidence in cross-species predictions:

Genes-to-Pathways Species Conservation Analysis (G2P-SCAN): This bioinformatics approach assesses the conservation of key molecular pathways across taxonomic groups by analyzing the preservation and functional similarity of genes involved in toxicity pathways [4].
Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS): This computational tool compares protein sequences (e.g., for receptors, enzymes) critical to the toxicological mechanism to evaluate functional conservation and potential susceptibility across species [4].
Bayesian Network Modeling: This statistical approach quantitatively assesses confidence in Key Event Relationships within an AOP network by evaluating the probabilistic dependencies between molecular initiating events, intermediate key events, and adverse outcomes across different taxonomic groups [4].

Table 1: Methodologies for Establishing Taxonomic Domain of Applicability

Methodology	Primary Function	Key Output	Regulatory Utility
Genes-to-Pathways Species Conservation Analysis	Assess pathway conservation across taxa	Identification of evolutionarily conserved toxicity pathways	Supports mechanistic relevance for human translation
Sequence Alignment to Predict Across Species Susceptibility	Compare protein sequences critical to toxicity	Evaluation of functional conservation for key biomolecules	Justifies specific model organism use for human risk assessment
Bayesian Network Modeling	Quantify confidence in Key Event Relationships	Probabilistic assessment of AOP network robustness across species	Provides quantitative uncertainty analysis for regulatory decisions
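The idea behind sequence-based susceptibility prediction can be illustrated with a deliberately simplified sketch: compute percent identity between a reference protein and pre-aligned sequences from other species, then keep the species above a cutoff. This is not the SeqAPASS algorithm. The toy sequences, the species labels, the gap-handling convention, and the 70% cutoff are all assumptions chosen for illustration.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_ref, aligned_query):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def candidate_tdoa(reference: str, others: dict, cutoff: float = 70.0) -> list:
    """Species whose target protein meets the (hypothetical) identity cutoff."""
    return sorted(sp for sp, seq in others.items()
                  if percent_identity(reference, seq) >= cutoff)


# Made-up aligned fragments for illustration; not real protein sequences.
reference = "MKTLLVA-GH"
others = {
    "species_X": "MKSLLVAQGH",
    "species_Y": "MRSAIVAQPH",
}

for species, seq in others.items():
    print(species, round(percent_identity(reference, seq), 1))
print(candidate_tdoa(reference, others))
```

Sequence similarity of this kind is a line of evidence for structural conservation only; whether the conserved protein plays the same functional role in the pathway still needs to be supported separately.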

Computational Approaches in Predictive Toxicology: Performance Comparison

Classical Machine Learning and Deep Learning Approaches

Predictive toxicology leverages a spectrum of AI technologies, each with distinct strengths for specific applications. Classical machine learning algorithms, including random forests, support vector machines, and artificial neural networks (ANNs), currently dominate the market with a projected 56.1% share in 2025 [55]. These methods excel with structured chemical data and established toxicological endpoints, providing interpretable models with well-understood uncertainty boundaries. For instance, Zhao et al. developed an ANN model that achieved 96.32% accuracy in predicting linezolid-induced thrombocytopenia, significantly outperforming traditional logistic regression [56].

Deep learning approaches offer enhanced capabilities for processing complex, high-dimensional data such as molecular structures, omics profiles, and high-content imaging data. While these models capture intricate structure-activity relationships, they typically require larger training datasets and present greater interpretability challenges—a significant consideration for regulatory submissions. The emerging integration of graph neural networks and generative modeling is further expanding predictive capabilities for novel chemical entities [55].

Integrated Approaches and Multi-Omics Analysis

The most advanced predictive frameworks combine multiple computational approaches with experimental data to enhance reliability and regulatory acceptance. Rodríguez-Belenguer et al. demonstrated a methodology that integrates mechanistic information (Molecular Initiating Events based on AOPs) with toxicokinetic data [56]. By combining multiple QSAR models describing simpler biological phenomena with quantitative in vitro-to-in vivo extrapolation (QIVIVE) models, they significantly enhanced prediction sensitivity for complex endpoints like cholestasis.

Multi-omics integration represents another powerful approach, where transcriptomic, proteomic, and metabolomic data are combined with structural information to map complete toxicity pathways. For example, Sung et al. introduced the Multi-Dimensional Transcriptomic Ruler (MDTR), a knowledge-guided tool for quantifying liver toxicity through KEGG pathways in transcriptomic data [56]. MDTR outperformed conventional metrics in detecting dose-dependent hepatotoxicity, demonstrating how pathway-centric models enhance prediction accuracy.

Table 2: Performance Comparison of AI Approaches in Predictive Toxicology

Technology	Key Advantages	Limitations	Exemplary Performance
Classical Machine Learning	Interpretable models, effective with structured data, works with smaller datasets	Limited ability with complex, unstructured data	96.32% accuracy for thrombocytopenia prediction [56]
Deep Learning	Handles complex data structures, identifies intricate patterns	High data requirements, "black box" interpretability challenges	Enhanced prediction of novel chemical entities [55]
Integrated AOP-TK Modeling	Mechanistically grounded, suitable for complex endpoints	Requires comprehensive biological knowledge	Enhanced sensitivity for cholestasis prediction [56]
Multi-Omics Analysis	Pathway-level insight, human-relevant mechanisms	Data integration challenges, computational complexity	Superior hepatotoxicity detection vs. conventional metrics [56]
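The performance figures in the table are reported with different metrics (accuracy in one row, sensitivity in another), and these are not interchangeable. The sketch below computes both from a confusion matrix to show how they can diverge when toxic compounds are rare. The counts are hypothetical and do not come from any of the cited studies.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / total,   # fraction of all calls that are correct
        "sensitivity": tp / (tp + fn),   # fraction of true toxicants detected
        "specificity": tn / (tn + fp),   # fraction of non-toxicants cleared
    }


# Hypothetical imbalanced test set: 50 toxic and 950 non-toxic compounds.
metrics = classification_metrics(tp=40, fp=50, tn=900, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example the model is 94% accurate yet misses one in five toxic compounds, which is why sensitivity is often the more informative metric for safety endpoints.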

Experimental Protocols for Model Development and Validation

Protocol 1: Developing a Cross-Species AOP Network with Extended tDOA

The development of a cross-species AOP network with a defined tDOA involves a systematic, multi-step process as demonstrated in recent research on silver nanoparticle reproductive toxicity [4]:

Data Collection and Literature Mining: Gather existing data from diverse sources including in vitro human cell studies, in vivo animal models, and molecular-to-individual level effects from published literature. The data must fit established AOP criteria, focusing on measurable Key Events and established Key Event Relationships.
AOP Network Construction: Structure the collected information into a preliminary AOP network using standardized AOP frameworks. This involves defining the Molecular Initiating Event (MIE), intermediate Key Events, and Adverse Outcome, with special attention to biological conservation across species.
Confidence Assessment using Bayesian Networks: Apply Bayesian network modeling to quantitatively evaluate the strength and confidence of Key Event Relationships. This statistical approach provides a probabilistic framework for assessing the reliability of the proposed AOP network across taxonomic boundaries.
tDOA Extension via Computational Tools: Utilize in silico approaches including Genes-to-Pathways Species Conservation Analysis and Sequence Alignment to Predict Across Species Susceptibility. These tools analytically extend the biologically plausible tDOA to additional taxonomic groups based on evolutionary conservation of key pathway elements.
Experimental Verification: Conduct targeted in vitro or limited in vivo studies to validate predictions for newly included taxonomic groups, particularly for critical regulatory applications.

This protocol successfully extended the tDOA of a reproductive toxicity AOP for silver nanoparticles from the nematode Caenorhabditis elegans to over 100 taxonomic groups, creating a comprehensive cross-species AOP network applicable to both human toxicology and ecotoxicology risk assessment [4].
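The confidence assessment step can be illustrated with the simplest possible Bayesian network: a linear chain in which each Key Event Relationship is described by two conditional probabilities. The sketch below propagates the probability of occurrence from the Molecular Initiating Event to the Adverse Outcome by marginalizing at each step. The chain structure and all probability values are assumptions for illustration and are not taken from the silver nanoparticle study [4], which may use a different network topology and parameterization.

```python
from typing import List, Tuple


def propagate(p_mie: float, kers: List[Tuple[float, float]]) -> List[float]:
    """Marginal probability of each event along a linear AOP.

    Each KER is (P(downstream | upstream occurs), P(downstream | upstream absent)).
    Returns [P(MIE), P(KE1), ..., P(AO)].
    """
    if not 0.0 <= p_mie <= 1.0:
        raise ValueError("probabilities must lie in [0, 1]")
    marginals = [p_mie]
    p = p_mie
    for p_if_up, p_if_not_up in kers:
        # Law of total probability over the upstream event's two states
        p = p_if_up * p + p_if_not_up * (1.0 - p)
        marginals.append(p)
    return marginals


# Hypothetical conditional probabilities for MIE -> KE1 -> AO.
chain = [(0.9, 0.05), (0.8, 0.10)]
probs = propagate(p_mie=1.0, kers=chain)
print([round(x, 3) for x in probs])
```

Repeating the calculation with conditional probabilities estimated separately for each taxonomic group would give a rough, comparable measure of how well the pathway is supported in each group.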

Protocol 2: Integrated QSAR and Toxicokinetic Modeling for Complex Endpoints

For predicting complex toxicological endpoints like cholestasis, Rodríguez-Belenguer et al. developed a sophisticated integrated methodology [56]:

Develop Low-Level Models (LLMs, not to be confused with large language models): Create multiple QSAR models describing simpler biological phenomena that contribute to the overall toxicological endpoint. These models focus on specific molecular interactions or limited cellular responses.
Incorporate Mechanistic Information: Structure the LLMs within established Adverse Outcome Pathway frameworks, specifically mapping them to relevant Molecular Initiating Events to ensure biological plausibility.
Integrate Toxicokinetic Data: Combine the mechanistic models with toxicokinetic parameters including absorption, distribution, metabolism, and excretion to better reflect in vivo conditions.
Apply Quantitative In Vitro to In Vivo Extrapolation (QIVIVE): Use QIVIVE modeling to translate in vitro effect concentrations to human equivalent doses, incorporating species-specific physiological differences.
Model Validation and Sensitivity Analysis: Rigorously validate the integrated model using external compound sets and perform comprehensive sensitivity analyses to identify the most influential parameters and uncertainty sources.

This protocol demonstrates how integrating multiple modeling approaches with mechanistic knowledge enhances prediction sensitivity for endpoints that are challenging to model with single QSAR approaches.
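A rough sketch of how the pieces fit together is given below. It assumes that each low-level model returns a predicted in vitro potency for one Molecular Initiating Event, that QIVIVE is approximated by simple linear reverse dosimetry (potency divided by the steady-state plasma concentration produced by a unit dose), and that the most sensitive event drives the overall call. The model names, values, and the combination rule are illustrative assumptions and do not reproduce the published method [56].

```python
def oral_equivalent_dose(ac50_um: float, css_um_per_mg_kg_day: float) -> float:
    """Linear reverse dosimetry: dose (mg/kg/day) giving a steady-state
    plasma concentration equal to the in vitro potency."""
    if ac50_um <= 0 or css_um_per_mg_kg_day <= 0:
        raise ValueError("inputs must be positive")
    return ac50_um / css_um_per_mg_kg_day


def integrate_models(predicted_ac50_um: dict, css_um_per_mg_kg_day: float,
                     dose_mg_kg_day: float) -> dict:
    """Combine low-level model outputs into a single endpoint-level call."""
    doses = {mie: oral_equivalent_dose(ac50, css_um_per_mg_kg_day)
             for mie, ac50 in predicted_ac50_um.items()}
    driver = min(doses, key=doses.get)  # most sensitive MIE
    return {
        "equivalent_doses": doses,
        "driver": driver,
        "flagged": dose_mg_kg_day >= doses[driver],  # exposure reaches the bioactive dose
    }


# Hypothetical outputs of two low-level QSAR models and a toxicokinetic estimate.
predictions = {"mie_model_1": 20.0, "mie_model_2": 5.0}  # predicted AC50, in uM
result = integrate_models(predictions, css_um_per_mg_kg_day=2.0, dose_mg_kg_day=3.0)
print(result)
```

Taking the minimum across events favors sensitivity over specificity, which is consistent with the stated goal of the protocol, but other combination rules (for example, requiring several events to be triggered) are equally plausible.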

Visualization of Predictive Toxicology Workflows

Cross-Species AOP Network Development

Diagram Title: Cross-Species AOP Development Workflow

Integrated Computational-Experimental Validation Pipeline

Diagram Title: Integrated Validation Pipeline

The Scientist's Toolkit: Essential Research Reagent Solutions

Table 3: Essential Research Reagents and Platforms for Predictive Toxicology

Reagent/Platform	Function	Application in tDOA Research
ADMET Predictor	Machine learning platform for predicting ADMET properties	Provides in silico predictions for pharmacokinetic and toxicity parameters [55]
Organ-on-a-Chip Systems	Microfluidic devices replicating human organ units	Generates human-relevant toxicity data without species extrapolation [54]
3D Spheroid Cultures	Three-dimensional cell culture models	Provides more physiologically relevant data than 2D cultures for hepatotoxicity assessment [54]
Toxicogenomic Databases	Curated databases of gene expression changes in response to toxins	Training data for AI models and conservation analysis for tDOA [55] [56]
Bayesian Network Software	Statistical modeling platforms	Quantifies confidence in Key Event Relationships in AOP networks [4]
Sequence Alignment Tools	Bioinformatics software for cross-species comparison	Assesses conservation of molecular targets for tDOA definition [4]

Regulatory Acceptance and Future Perspectives

Current Regulatory Landscape and tDOA Integration

Regulatory agencies worldwide are increasingly recognizing the value of well-validated computational approaches in toxicology assessment. The FDA's forward-looking Initiative 2.0 encourages adopting advanced technologies to streamline drug approval processes, with a particular focus on reducing animal testing through New Approach Methodologies (NAMs) [55] [54]. The establishment of the Center for Drug Evaluation and Research (CDER) AI Steering Committee further demonstrates regulatory commitment to facilitating AI integration in toxicology assessment [54].

Within this evolving landscape, clearly defined tDOAs provide the scientific foundation for regulatory acceptance of alternative methods. By explicitly stating the biological boundaries of a model's applicability, tDOAs address key regulatory concerns regarding model interpretability and appropriate use. The integration of tDOA concepts with AOP networks represents a particularly promising approach for regulatory science, as it combines mechanistic understanding with clearly defined applicability domains, thereby supporting more informed risk assessment decisions across multiple species [4].

Future Directions in tDOA Research

The future of tDOA research in predictive toxicology will likely focus on several key areas:

Expansion of Cross-Species AOP Networks: Research will continue to extend the taxonomic domains of existing AOPs, particularly for endpoints with significant regulatory importance such as reproductive toxicity, neurotoxicity, and carcinogenicity [4].
Integration of Real-World Evidence: The use of real-world data from sources like the FDA Adverse Event Reporting System (FAERS) and electronic health records will enhance the validation of tDOA-based predictions and identify potential gaps in current models [56].
Advanced AI for tDOA Definition: Machine learning approaches will increasingly automate and refine tDOA establishment by identifying conserved pathway elements across diverse taxonomic groups and predicting susceptibilities for data-poor species [55] [57].
Standardization and Harmonization: Efforts to standardize tDOA reporting requirements across regulatory agencies will facilitate global acceptance of computational toxicology approaches and support international chemical safety assessment.

As these advancements mature, tDOA-defined models are poised to become central components of integrated testing strategies for regulatory decision-making, ultimately supporting more human-relevant safety assessments while reducing animal testing in accordance with the 3Rs principles.

The establishment of robust taxonomic Domains of Applicability represents a transformative advancement in predictive toxicology, providing the scientific foundation needed for regulatory acceptance of novel computational approaches. By explicitly defining the biological boundaries within which toxicity predictions remain valid, tDOAs address fundamental challenges in cross-species extrapolation and model uncertainty. The integration of tDOA concepts with AOP networks, multi-omics data, and advanced AI modeling creates a powerful framework for predicting chemical toxicity across diverse taxonomic groups while reducing reliance on traditional animal testing. As regulatory agencies continue to modernize their approaches through initiatives like FDA 2.0 and the promotion of New Approach Methodologies, clearly defined tDOAs will play an increasingly critical role in building confidence in computational toxicology approaches. Through continued research, standardization, and validation, tDOA-guided predictive models will accelerate the development of safer pharmaceuticals while supporting both ethical imperatives and scientific progress in toxicological risk assessment.

Conclusion

The systematic evaluation of the Taxonomic Domain of Applicability is paramount for transforming AOPs from descriptive frameworks into reliable, predictive tools for cross-species toxicology and drug development. By integrating foundational principles, advanced bioinformatics methodologies, robust troubleshooting strategies, and rigorous validation, researchers can significantly enhance the confidence in extrapolating AOPs across species. Future directions should focus on expanding the empirical evidence for tDOA, particularly for under-represented taxa, further developing and standardizing computational tools, and fully integrating tDOA-informed AOPs into regulatory paradigms and next-generation risk assessments. This evolution will be crucial for improving the efficiency of drug discovery, reducing late-phase attrition, and ultimately protecting both human and environmental health.