CRED Evaluation Method: Ensuring Reliability and Relevance in Ecotoxicity Data Assessment

Camila Jenkins, Jan 09, 2026

Abstract

This article provides a comprehensive overview of the CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) method, a standardized framework for assessing the reliability and relevance of aquatic ecotoxicity studies. Targeted at researchers, scientists, and drug development professionals, it explores the foundational principles of CRED, detailed application methodologies, common troubleshooting scenarios, and comparative validation against traditional approaches like the Klimisch method. The synthesis highlights CRED's role in enhancing regulatory decision-making and suggests future directions for integrating advanced models and non-animal testing methods in biomedical research.

Understanding CRED: The Foundation for Transparent Ecotoxicity Assessment

Introduction to the CRED Framework and Its Regulatory Importance

The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) framework is a standardized, transparent methodology developed to assess the reliability and relevance of aquatic ecotoxicity studies for regulatory decision-making [1] [2]. Designed to replace the less precise Klimisch method, CRED provides detailed criteria and guidance to minimize expert judgment bias, thereby improving the consistency and reproducibility of environmental risk assessments for chemicals and pharmaceuticals [1]. This document outlines the core principles of the CRED evaluation method, details its application protocols, and discusses its growing importance in regulatory frameworks aimed at deriving Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQS) [2].

Thesis Context: The CRED Evaluation Method in Ecotoxicity Data Research

This application note is framed within a broader thesis positing that the standardization of data evaluation is critical for advancing robust, science-based environmental regulation. The CRED framework serves as a cornerstone for this thesis by providing a systematic tool to bridge the gap between peer-reviewed ecotoxicity science and regulatory policy [1]. Its development addresses a documented need for greater transparency and reduced subjectivity in selecting studies for chemical risk assessment, particularly for data-poor substances like pharmaceuticals and nanomaterials [1] [3]. By offering a clear pathway from data evaluation to regulatory application, CRED enhances the credibility and defensibility of environmental safety decisions.

Foundational Definitions and Core Components

The CRED framework is built on two distinct but complementary assessment pillars: Reliability and Relevance [1].

  • Reliability refers to the inherent scientific quality of a study, assessing the soundness of its experimental design, performance, and data analysis, irrespective of its intended use [1].
  • Relevance refers to the appropriateness of the study for a specific hazard identification or risk characterization purpose (e.g., its taxonomic, endpoint, and exposure scenario alignment with the assessment goal) [1].

A study can be reliable but not relevant for a particular assessment, and vice versa. CRED's core innovation is its detailed, criteria-based approach to evaluating these aspects, moving beyond the vague categories of the historically used Klimisch method [1] [2].

Table 1: Core CRED Evaluation Criteria for Aquatic Ecotoxicity Studies

| Evaluation Dimension | Number of Criteria | Description of Scope | Key Objective |
| --- | --- | --- | --- |
| Reliability | 20 criteria [1] [2] | Covers test design, substance characterization, organism health, exposure control, statistical analysis, and results reporting. | To determine the intrinsic scientific validity and reproducibility of the study. |
| Relevance | 13 criteria [1] [2] | Covers taxonomic and endpoint appropriateness, environmental realism of exposure, and the significance of observed effects. | To judge the usefulness and applicability of the study's data for a specific regulatory question. |

The CRED Evaluation Protocol: A Step-by-Step Application Note

The following protocol details the standardized procedure for applying the CRED method to evaluate individual aquatic ecotoxicity studies.

3.1. Protocol: Application of the CRED Evaluation Framework

  • Primary Objective: To systematically determine the reliability and relevance of an aquatic ecotoxicity study for use in regulatory environmental risk assessment.
  • Principle: A trained assessor scores a study against a predefined set of criteria, using explicit guidance to minimize subjective judgment. The outcome is a documented evaluation that supports the study's inclusion, exclusion, or weighted use in a meta-analysis or regulatory derivation process [1].

3.2. Materials and Software

  • CRED Evaluation Sheet: The official Excel-based tool containing the 33 criteria with accompanying guidance notes [2] [3].
  • Source Document: The full-text peer-reviewed publication or study report to be evaluated.
  • Reporting Guidance: The CRED reporting recommendations (50 criteria across 6 categories) to cross-reference ideal reporting standards [1].

3.3. Experimental Procedure

  • Study Identification and Screening: Identify the study via literature search. Perform an initial, broad relevance check based on title and abstract (e.g., aquatic organism, appropriate chemical) [1].
  • Initial Read-Through: Read the study in full to understand its objectives, methods, and results holistically.
  • Reliability Assessment:
    a. For each of the 20 reliability criteria, consult the guidance in the evaluation sheet.
    b. Determine whether the study fulfills the criterion (e.g., "Yes," "No," "Partly," or "Not Applicable") based solely on the information reported.
    c. Document the justification for each score with specific references to the study text, tables, or figures.
  • Relevance Assessment:
    a. Define the specific regulatory assessment context (e.g., derivation of a freshwater PNEC for a pharmaceutical).
    b. For each of the 13 relevance criteria, judge the study's alignment with the assessment context.
    c. Document the relevance score and justification.
  • Integrated Conclusion: Synthesize the reliability and relevance scores. A study with high scores in both dimensions is a key study for decision-making. A study with high relevance but moderate reliability may be used as supporting evidence, with its limitations clearly stated [1].
  • Reporting: The completed evaluation sheet serves as the audit trail, ensuring the assessment is transparent, consistent, and reproducible for other experts [1].
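
The scoring and documentation steps above can be sketched as a small data model. The following is a minimal Python sketch; the class names, fields, and example entries are all hypothetical, and only the scoring vocabulary, the two assessment dimensions, and the requirement to justify every score come from the protocol:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of one CRED evaluation record. Only the scoring
# vocabulary ("Yes"/"No"/"Partly"/"Not Applicable") and the two
# dimensions (reliability, relevance) come from the protocol above;
# class and field names are illustrative assumptions.

ALLOWED_SCORES = {"Yes", "No", "Partly", "Not Applicable"}

@dataclass
class CriterionScore:
    criterion: str      # e.g. "Exposure concentration verification"
    score: str          # one of ALLOWED_SCORES
    justification: str  # reference to study text, table, or figure

    def __post_init__(self):
        if self.score not in ALLOWED_SCORES:
            raise ValueError(f"invalid score: {self.score!r}")
        if not self.justification:
            raise ValueError("every score needs a documented justification")

@dataclass
class CredEvaluation:
    study_id: str
    context: str  # e.g. "freshwater PNEC, pharmaceutical"
    reliability: list = field(default_factory=list)  # 20 criteria when complete
    relevance: list = field(default_factory=list)    # 13 criteria when complete

    def audit_trail(self):
        """Flatten all scores into rows for a transparent, reviewable report."""
        rows = []
        for dim, scores in (("reliability", self.reliability),
                            ("relevance", self.relevance)):
            for s in scores:
                rows.append((dim, s.criterion, s.score, s.justification))
        return rows

ev = CredEvaluation("Smith2015", "freshwater PNEC, pharmaceutical")
ev.reliability.append(CriterionScore(
    "Test substance purity reported", "Yes", "Section 2.1, Table 1"))
ev.relevance.append(CriterionScore(
    "Endpoint relevant to protection goal", "Partly", "Reproduction endpoint, Fig. 3"))
print(ev.audit_trail())
```

Forcing a justification string onto every score mirrors the audit-trail requirement: the completed record, not just the final categories, is what makes the evaluation reproducible for other experts.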

[Workflow diagram: study identification and initial screening branches into (a) the reliability assessment, scoring 20 criteria with documented justifications, and (b) definition of the regulatory context followed by the relevance assessment, scoring 13 criteria; the resulting reliability and relevance profiles are synthesized into a transparent evaluation report that serves as the audit trail.]

CRED Evaluation Workflow for Ecotoxicity Studies

Validation and Comparative Performance: The Ring Test Protocol

The advantages of the CRED method over the Klimisch approach were established through a formal ring test (inter-laboratory comparison) [1] [2] [3].

4.1. Protocol: Ring Test for Comparing Evaluation Methods

  • Objective: To compare the consistency, accuracy, and user perception of the Klimisch and CRED evaluation methods among expert risk assessors [1].
  • Design: A crossover study where participants evaluated different sets of ecotoxicity studies using each method.

4.2. Experimental Procedure

  • Participant Selection: Risk assessors from industry, academia, consultancy, and government across multiple continents were recruited [1].
  • Study Selection: A diverse set of eight aquatic ecotoxicity studies from peer-reviewed literature was chosen [1].
  • Phase 1 (Klimisch): Each participant evaluated 2 studies using the traditional Klimisch method (categories: 1=reliable without restriction, 2=reliable with restriction, 3=not reliable, 4=not assignable) [1].
  • Phase 2 (CRED): Each participant evaluated 2 different studies using the draft CRED evaluation sheet [1].
  • Data Collection: Participants submitted their evaluations and provided feedback on the methods' clarity, ease of use, and transparency.
  • Analysis: Consistency among assessors was measured for each method. Feedback was analyzed thematically to refine the CRED criteria [1].
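
The consistency measurement in the final step can be illustrated with a toy metric. The sketch below uses simple pairwise percent agreement, which is an assumption for demonstration (the ring test's actual statistics are not detailed here), with made-up ratings:

```python
from itertools import combinations

# Illustrative sketch (not the ring test's actual statistics): measure
# inter-assessor consistency as the fraction of assessor pairs that
# assigned the same category to the same study.

def pairwise_agreement(ratings):
    """ratings: list of category labels, one per assessor, for one study."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        return 1.0
    agree = sum(1 for a, b in pairs if a == b)
    return agree / len(pairs)

# Made-up example: five assessors rate the same study.
klimisch = ["2", "1", "2", "3", "2"]       # scattered categories -> low agreement
cred = ["R2", "R2", "R2", "R1", "R2"]      # structured criteria -> higher agreement

print(pairwise_agreement(klimisch))  # 0.3
print(pairwise_agreement(cred))      # 0.6
```

Agreement metrics of this family (percent agreement, kappa statistics) quantify exactly the property the ring test compared: how often independent assessors reach the same category for the same study.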

Table 2: Key Outcomes from the CRED vs. Klimisch Ring Test

| Performance Metric | Klimisch Method | CRED Evaluation Method | Implication |
| --- | --- | --- | --- |
| Inter-assessor Consistency | Low; high subjectivity due to vague criteria [1]. | Higher; structured criteria reduce arbitrary judgment [1] [3]. | Improves harmonization of assessments across institutions/countries. |
| Transparency | Low; limited guidance for scoring [1]. | High; extensive guidance for each criterion provides an audit trail [1] [2]. | Increases defensibility and allows for peer review of the evaluation itself. |
| Perceived Utility by Assessors | Criticized for being unspecific and leaving too much room for interpretation [1]. | Preferred for being more accurate, applicable, consistent, and transparent [1] [3]. | Facilitates adoption and improves trust in the evaluation process. |

Framework Extensions and the Scientist's Toolkit

The core CRED framework has been adapted to address specific testing domains and emerging contaminants, demonstrating its flexibility and ongoing development [3].

  • NanoCRED: Tailored for the evaluation of ecotoxicity studies on engineered nanomaterials, addressing specific reliability issues like particle characterization, dispersion protocols, and dosimetry [3].
  • EthoCRED: Designed for behavioral ecotoxicity studies, providing criteria to evaluate the reliability and relevance of complex behavioral endpoints, which are increasingly recognized as sensitive indicators of sublethal toxicity [3] [4].
  • CRED for Sediment and Soil: Extends the criteria to address the specificities of tests conducted in solid matrices, which differ fundamentally from aquatic tests [3].

[Diagram: the core CRED framework (aquatic ecotoxicity) with three specialized extensions: NanoCRED (adapted for engineered nanomaterial characterization), EthoCRED (adapted for behavioral methods), and CRED for sediment and soil (adapted for solid-matrix tests).]

CRED Framework and Its Specialized Extensions

Table 3: Essential Research Reagent Solutions for CRED-Aligned Ecotoxicity Testing

| Research Reagent / Material | Function in Ecotoxicity Testing | Importance for CRED Reliability |
| --- | --- | --- |
| Certified Reference Toxicants (e.g., K₂Cr₂O₇, NaCl) | Used in periodic tests to confirm the health and consistent sensitivity of biological cultures. | Demonstrates test organism viability and performance of laboratory procedures, a key reliability criterion. |
| Analytical Grade Test Substance | The chemical of interest, with purity and composition verified. | Accurate test substance characterization is fundamental for reliability; impurities can confound results. |
| Vehicle/Solvent Controls (e.g., acetone, dimethyl sulfoxide) | Used to dissolve poorly soluble substances without causing toxicity themselves. | Proper control selection is critical to isolate the effect of the test substance from artifacts. |
| Defined Culture Media & Food | Provides standardized nutrition for test organisms during culturing and assay. | Ensures organism health and standardization before and during testing, affecting result reproducibility. |
| Water Quality Verification Kits (for pH, conductivity, hardness, oxygen) | Monitors the physico-chemical parameters of exposure media. | Exposure condition stability must be documented to confirm the reported concentration is accurate. |
| Internal Analytical Standards (for chromatography, spectroscopy) | Used to quantify the actual concentration of the test substance in exposure media. | Dosing verification is a core reliability requirement to confirm the exposure scenario. |

Regulatory Importance and Implementation

The CRED framework is transitioning from a scientific proposal to a tool embedded in regulatory practice, underscoring its critical importance.

  • Regulatory Piloting: CRED is being piloted in the revision of the EU's Technical Guidance Document for deriving EQS and in Swiss EQS proposals [2].
  • Integration into Data Systems: The method is used in the European Joint Research Centre's Literature Evaluation Tool and for populating databases like the NORMAN EMPODAT [2].
  • Sector-Specific Adoption: The pharmaceutical industry's iPiE project (Intelligence-led Assessment of Pharmaceuticals in the Environment) is considering CRED for data evaluation [2].
  • Standardization Goal: The framework provides a common language for evaluating studies, facilitating data sharing and acceptance across different regulatory jurisdictions (e.g., REACH, veterinary medicines, plant protection products) [1] [2].

The CRED framework represents a significant advance in the science of ecotoxicity data evaluation. By providing a detailed, transparent, and standardized protocol, it mitigates the bias and inconsistency inherent in expert judgment, thereby strengthening the scientific foundation of environmental risk assessment. Future development will focus on the broader adoption of CRED and its extensions (NanoCRED, EthoCRED) within global regulatory guidelines, training assessors in its use, and potentially expanding its principles to other ecotoxicological domains (e.g., terrestrial toxicology, microbiotoxicity). Its ongoing implementation is key to ensuring that regulatory decisions for chemicals and pharmaceuticals are both protective of the environment and firmly grounded in robust science.

Defining Reliability and Relevance in Ecotoxicity Studies

The regulatory assessment of chemicals relies on high-quality ecotoxicity data to derive environmental quality standards (EQS) and predicted-no-effect concentrations (PNECs). A fundamental step in this process is the evaluation of individual study reliability (the inherent quality of the test report and methodology) and relevance (the appropriateness of the data for a specific hazard or risk assessment)[reference:0]. Historically, the Klimisch method (1997) has been widely used, but it has been criticized for lacking detailed guidance, leading to inconsistent evaluations among experts[reference:1].

To address this, the CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) project developed a transparent, science-based evaluation method. CRED aims to strengthen the consistency and robustness of hazard and risk assessments by providing detailed criteria and guidance for both reliability and relevance evaluations of aquatic ecotoxicity studies[reference:2]. This application note details the CRED framework, its protocols, and its utility for researchers and risk assessors.

The CRED Evaluation Framework: A Structured Approach

The CRED evaluation method is built on two pillars: a set of 20 reliability criteria and 13 relevance criteria[reference:3]. These criteria were refined through an international ring test involving 75 risk assessors from 12 countries, which concluded that CRED was more accurate, consistent, and transparent than the Klimisch method[reference:4][reference:5].

The framework categorizes studies based on the outcome of the evaluation:

  • Reliability: R1 (Reliable without restrictions), R2 (Reliable with restrictions), R3 (Not reliable), R4 (Not assignable)[reference:6].
  • Relevance: C1 (Relevant without restrictions), C2 (Relevant with restrictions), C3 (Not relevant), C4 (Not assignable)[reference:7][reference:8].

The CRED project has also spawned specialized tools for specific domains, including NanoCRED for nanomaterials, EthoCRED for behavioral studies, and adaptations for sediment and soil ecotoxicity studies[reference:9][reference:10].

The core of the CRED method is its detailed checklist. The criteria are designed to be comprehensive, covering all aspects of study design, reporting, and applicability.

| Criterion Category | Description & Key Questions | Evaluation Guidance |
| --- | --- | --- |
| 1. Test Substance | Identity, purity, stability, and concentration verification. Was the test substance properly characterized? | Refer to OECD GLP and product specifications. |
| 2. Test Organism | Species, life stage, source, health status, and acclimation. Were organisms appropriate and healthy? | Check against standard test guidelines (e.g., OECD 201, 202, 203). |
| 3. Experimental Design | Replicates, controls (negative, solvent), randomization, blinding. Was the design statistically sound? | Assess number of replicates, control performance, and randomization procedures. |
| 4. Exposure Conditions | Duration, medium, temperature, pH, lighting, renewal system. Were conditions relevant and stable? | Compare to standard protocol requirements and monitor measured concentrations. |
| 5. Endpoint Measurement | Clarity of endpoint definition, methodology, and timing of measurements. Was the endpoint clearly defined and measured accurately? | Evaluate if methods are standardized and results are presented with appropriate units. |
| 6. Data Reporting & Statistics | Raw data availability, statistical methods, dose-response analysis, confidence intervals. Are data and analyses fully reported? | Check for transparency in data presentation and appropriateness of statistical tests. |
| 7. Compliance with GLP/Standards | Adherence to Good Laboratory Practice (GLP) or standardized test guidelines (OECD, EPA). | GLP compliance increases reliability but is not an automatic pass; study flaws must still be considered[reference:11]. |

Source: Based on CRED evaluation method as described in Moermond et al., 2016[reference:12] and Kase et al., 2016[reference:13].

| Criterion Category | Description & Key Questions | Regulatory Context Consideration |
| --- | --- | --- |
| 1. Biological Relevance | Appropriateness of test species, life stage, and endpoint to the protection goal (e.g., population-relevant effects). Is the endpoint linked to survival, growth, or reproduction? | Defined by the assessment context (e.g., Water Framework Directive prioritizes population-relevant endpoints)[reference:14]. |
| 2. Exposure Relevance | Correspondence between test exposure (route, duration, pattern) and realistic environmental exposure scenarios. Are test concentrations environmentally relevant? | Requires knowledge of predicted environmental concentrations (PECs) and exposure routes. |
| 3. Substance Relevance | Suitability of the tested form (e.g., pure active ingredient vs. formulated product) for the assessment. | Important for product registration and environmental fate considerations. |
| 4. Temporal & Spatial Relevance | Match between test duration and likely exposure period, and test system scale vs. ecosystem scale. | Consideration for acute vs. chronic risk assessments. |

Source: Based on CRED evaluation method as described in Moermond et al., 2016[reference:15].

Experimental Protocol: Applying the CRED Evaluation Method

The following step-by-step protocol details how to perform a CRED evaluation for an aquatic ecotoxicity study.

Protocol: CRED Evaluation for Aquatic Ecotoxicity Studies

Objective: To systematically assess the reliability and relevance of an aquatic ecotoxicity study for use in regulatory hazard or risk assessment.

Materials:

  • Study report or publication to be evaluated.
  • CRED evaluation checklist (available as an Excel tool from the SciRAP website[reference:16]).
  • Relevant standard test guidelines (e.g., OECD, EPA, ISO).
  • Contextual information for relevance assessment (e.g., protection goals, environmental exposure data).

Procedure:

  • Preparation & Familiarization:

    • Clearly define the regulatory context and protection goal for the assessment (e.g., deriving a PNEC under REACH)[reference:17].
    • Thoroughly read the study report, noting any missing information.
  • Reliability Evaluation (20 Criteria):

    • For each of the 20 reliability criteria on the checklist, answer the guided questions.
    • Judge whether the criterion is fulfilled, not fulfilled, or not assignable due to missing information.
    • Base judgments on what is reported in the study, not on assumptions. Refer to standard test guidelines for benchmark expectations.
    • Note: The original CRED draft designated some criteria as "critical," but the final method avoids this designation to prevent automatic failure and retain expert judgment[reference:18].
  • Overall Reliability Categorization:

    • Synthesize the outcomes from Step 2. There is no rigid scoring algorithm.
    • Assign one of four Klimisch-style categories:
      • R1: Reliable without restrictions. All key criteria are fulfilled.
      • R2: Reliable with restrictions. Some minor limitations exist, but the study's core findings are valid.
      • R3: Not reliable. Major flaws invalidate the study's conclusions.
      • R4: Not assignable. Insufficient information is reported to make a judgment[reference:19].
  • Relevance Evaluation (13 Criteria):

    • Evaluate the 13 relevance criteria based on the defined assessment context from Step 1.
    • Determine the biological, exposure, and substance relevance of the study to the specific regulatory question.
  • Overall Relevance Categorization:

    • Assign one of four categories:
      • C1: Relevant without restrictions.
      • C2: Relevant with restrictions.
      • C3: Not relevant.
      • C4: Not assignable[reference:20][reference:21].
  • Documentation & Justification:

    • Record all judgments and their justifications in the CRED checklist. This transparent documentation is crucial for peer review and regulatory acceptance[reference:22].
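
Because the final method deliberately avoids a rigid scoring algorithm and retains expert judgment (see the note in Step 2), any automated categorization can only be a rough pre-screen. The sketch below is an illustrative heuristic whose thresholds are assumptions for demonstration, not part of CRED:

```python
# Illustrative heuristic only: CRED has no rigid scoring algorithm and
# retains expert judgment, so the thresholds below are demonstration
# assumptions, not CRED rules. Input labels follow the protocol's
# "fulfilled" / "not fulfilled" / "not assignable" judgments.

def reliability_category(scores):
    """Suggest an R1-R4 category from per-criterion judgments."""
    n = len(scores)
    not_assignable = scores.count("not assignable")
    not_fulfilled = scores.count("not fulfilled")
    if not_assignable > n // 2:
        return "R4"  # too little reported information to judge
    if not_fulfilled == 0 and not_assignable == 0:
        return "R1"  # reliable without restrictions
    if not_fulfilled <= 2:
        return "R2"  # minor limitations; core findings valid
    return "R3"      # major flaws invalidate the conclusions

print(reliability_category(["fulfilled"] * 20))                        # R1
print(reliability_category(["fulfilled"] * 18 + ["not fulfilled"] * 2))  # R2
print(reliability_category(["not assignable"] * 15 + ["fulfilled"] * 5)) # R4
```

An expert would still override such a pre-screen, for example when a single unfulfilled criterion is serious enough to invalidate the study on its own.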

Visualization: CRED Evaluation Workflow

[Workflow diagram: start CRED evaluation → define assessment context and protection goal → obtain and read study report → evaluate 20 reliability criteria (checklist) → synthesize judgements → assign reliability category (R1-R4) → evaluate 13 relevance criteria (checklist) → synthesize judgements → assign relevance category (C1-C4) → document and justify all evaluations → evaluation complete.]

The Scientist's Toolkit: Essential Reagents & Materials for Ecotoxicity Testing

The reliability of an ecotoxicity study is fundamentally linked to the quality of materials used. The following table lists key research reagent solutions and their functions in standard aquatic tests.

Table 3: Key Research Reagent Solutions for Aquatic Ecotoxicity Testing
| Item | Function & Description | Example in Standard Tests |
| --- | --- | --- |
| Reconstituted Water (e.g., M4, M7) | Serves as the standardized test medium for freshwater organisms. Provides defined concentrations of essential salts (Ca, Mg, Na, K) and buffers pH to ensure reproducibility across labs. | OECD Test Guideline 202 (Daphnia sp. Acute Immobilisation), OECD TG 201 (Algal Growth Inhibition). |
| Natural or Artificial Sea Salt Mixtures | Provides the necessary ionic composition and salinity for testing marine or estuarine species. | OECD TG 203 (Fish Acute Toxicity Test) for marine fish. |
| Solvent Carriers (e.g., Acetone, Methanol, DMSO) | Used to dissolve poorly water-soluble test substances. Must be non-toxic at the concentrations used and have minimal impact on test organism health and chemical bioavailability. | Concentration typically kept below 0.1 mL/L; solvent control required. |
| Nutrient Media for Algae/Cyanobacteria | Contains defined amounts of nitrogen, phosphorus, trace metals, and vitamins to support optimal growth of phytoplankton test species (e.g., Pseudokirchneriella subcapitata). | OECD TG 201 specifies media like OECD MBL or AA. |
| Yeast-Cereal-Trout Chow (YCT) | A standardized, nutritious food source for filter-feeding invertebrates like Daphnia magna during chronic reproduction tests. | OECD TG 211 (Daphnia magna Reproduction Test). |
| Commercial Fish Food Flakes/Pellets | Specified diet for maintaining and feeding fish during acute and chronic tests. Particle size and nutritional content are important. | OECD TG 210 (Fish Early-Life Stage Toxicity Test). |
| Reference Toxicants (e.g., K₂Cr₂O₇, CuSO₄, NaCl) | Used in routine laboratory proficiency checks. A test with a reference toxicant confirms that the test organisms are of normal sensitivity and that the test system is functioning correctly. | Potassium dichromate (K₂Cr₂O₇) is a common reference toxicant for Daphnia acute tests. |

The CRED evaluation method represents a significant advancement in the critical appraisal of ecotoxicity data. By replacing subjective expert judgment with a transparent, criteria-based framework for both reliability and relevance, CRED promotes consistency, reduces bias, and increases the scientific robustness of regulatory decisions[reference:23]. Its structured protocol and available tools (including specialized versions like EthoCRED and NanoCRED) provide researchers, risk assessors, and drug development professionals with a standardized approach to ensure that only high-quality, relevant data inform environmental safety assessments. Widespread adoption of CRED can therefore contribute to better-protected ecosystems and more efficient, trustworthy chemical regulation.

Key Objectives and Scope of the CRED Evaluation Method

In regulatory hazard and risk assessment of chemicals, the evaluation of ecotoxicity study reliability and relevance is foundational. Traditional methods, notably the Klimisch approach, have been criticized for lacking detailed guidance, leading to inconsistent evaluations dependent on expert judgment[reference:0]. The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) project was initiated to address these shortcomings by developing a transparent, structured, and science-based evaluation framework[reference:1]. This article details the key objectives, scope, and practical application of the CRED evaluation method within the broader context of advancing ecotoxicity data research.

Key Objectives

The CRED evaluation method is designed to achieve the following core objectives:

  • Enhance Consistency and Transparency: Provide explicit criteria and detailed guidance to reduce subjectivity and variability in reliability and relevance evaluations across different assessors, regulatory frameworks, and countries[reference:2][reference:3].
  • Improve Reproducibility: Establish a standardized protocol so that evaluations of the same study by different risk assessors yield consistent, reproducible outcomes[reference:4].
  • Offer Comprehensive Guidance: Supply extensive support material that surpasses the limited criteria of the Klimisch method, covering both reliability (20 criteria) and relevance (13 criteria) aspects[reference:5][reference:6].
  • Facilitate Regulatory Harmonization: Serve as a robust tool to harmonize hazard and risk assessments of chemicals internationally, promoting uniformity in regulatory decision-making[reference:7].
  • Promote Adequate Reporting: Accompany the evaluation method with reporting recommendations (50 criteria across 6 categories) to improve the quality and completeness of future ecotoxicity study reports[reference:8].

Scope of Application

The CRED method is explicitly scoped for the evaluation of aquatic ecotoxicity studies[reference:9]. Its application is central to regulatory processes such as deriving Predicted-No-Effect Concentrations (PNECs), Environmental Quality Standards (EQS), and fulfilling requirements under frameworks like REACH and the Water Framework Directive[reference:10]. The method has also been adapted into specialized tools for specific domains, including NanoCRED for nanomaterials, EthoCRED for behavioral studies, and CRED for sediment and soil studies[reference:11][reference:12].

Application Notes & Protocols

The CRED evaluation is a structured process conducted using a dedicated Excel tool. The workflow involves a sequential assessment of predefined criteria, culminating in a summarized classification for both reliability and relevance.

Detailed Step-by-Step Evaluation Protocol
  • Study Preparation & Tool Setup: Obtain the full text of the aquatic ecotoxicity study to be evaluated. Download the latest version of the CRED Excel evaluation tool[reference:13].
  • Initial Data Entry: In the tool, enter basic study identifiers (e.g., author, publication year, test substance, test organism) into the designated "General Information" section.
  • Reliability Evaluation: Systematically review the study against each of the 20 reliability criteria[reference:14]. For each criterion (e.g., "Test organism details," "Exposure concentration verification"), select the appropriate scoring option (e.g., "Yes," "No," "Not Reported," "Not Applicable") based on the information provided in the study report.
  • Relevance Evaluation: Similarly, assess the study against the 13 relevance criteria[reference:15]. Evaluate factors such as the ecological representativeness of the test organism, the environmental realism of exposure conditions, and the appropriateness of the measured endpoint for the intended regulatory purpose.
  • Evaluation Summary: The tool will compile the scores from steps 3 and 4. Based on the pattern of responses, assign a final reliability classification:
    • R1: Reliable without restrictions.
    • R2: Reliable with restrictions.
    • R3: Not reliable.
    • R4: Not assignable (insufficient information)[reference:16].
  • Relevance Classification: Assign a final relevance classification based on the relevance criteria assessment:
    • C1: Relevant without restrictions.
    • C2: Relevant with restrictions.
    • C3: Not relevant.
    • C4: Not assignable[reference:17].
  • Documentation & Reporting: Record the final classifications and any critical remarks in the tool. The completed CRED evaluation sheet serves as the transparent audit trail for the assessment.
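
The documentation step could also be reproduced outside the official Excel tool, for example as a plain CSV audit trail. Everything in this sketch (function name, field names, example rows) is illustrative and not part of the CRED tool itself:

```python
import csv
import io

# Hedged sketch: exporting a completed evaluation as a CSV audit trail,
# loosely mimicking the summary sheet of the CRED Excel tool. All field
# names and example values are assumptions for illustration.

def write_audit_trail(general, rows, fh):
    """general: study identifiers and final R/C classifications.
    rows: (dimension, criterion, score, remarks) tuples from the checklist."""
    writer = csv.writer(fh)
    # One comment line with the general information and final categories.
    writer.writerow(["# " + "; ".join(f"{k}={v}" for k, v in general.items())])
    writer.writerow(["dimension", "criterion", "score", "remarks"])
    writer.writerows(rows)

general = {"author": "Smith", "year": 2015,
           "reliability": "R2", "relevance": "C1"}
rows = [
    ("reliability", "Exposure concentrations verified", "Yes", "measured, Table 2"),
    ("relevance", "Ecologically representative species", "Yes", "D. magna"),
]
buf = io.StringIO()
write_audit_trail(general, rows, buf)
print(buf.getvalue())
```

A flat, versionable text file like this serves the same transparency goal as the completed evaluation sheet: every judgment and its justification remains inspectable by other experts.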

Experimental Protocol: The CRED Ring Test

A pivotal two-phased ring test was conducted to validate the CRED method against the established Klimisch method[reference:18].

Objective: To compare the consistency, accuracy, and user perception of the draft CRED method versus the Klimisch method for evaluating ecotoxicity studies.

Design: A blinded, cross-over design where participants evaluated different studies with each method.

  • Phase I (Nov-Dec 2012): 62 risk assessors evaluated the reliability and relevance of two out of eight preselected ecotoxicity studies using the Klimisch method[reference:19].
  • Phase II (Mar-Apr 2013): 54 assessors evaluated two different studies from the same set of eight using the draft CRED method[reference:20].
  • Studies: The eight studies covered diverse taxonomic groups (cyanobacteria, algae, crustaceans, fish), test designs (acute, chronic), and chemical classes (industrial, pharmaceutical, plant protection products)[reference:21][reference:22].
  • Participants: 75 participants from 12 countries, representing regulatory agencies, consultancies, industry, and academia[reference:23].
  • Outcome Measures: Primary: Consistency in reliability/relevance categorization among assessors. Secondary: User feedback on method clarity, practicality, and perceived confidence via questionnaires[reference:24].

Key Findings: The ring test concluded that risk assessors preferred the CRED method, finding it more transparent, consistent, and less dependent on expert judgment than the Klimisch method[reference:25][reference:26].

Data Presentation

Table 1: Comparative Characteristics of the Klimisch and CRED Evaluation Methods
| Characteristic | Klimisch Method | CRED Method |
| --- | --- | --- |
| Primary Data Type | Toxicity and ecotoxicity | Aquatic ecotoxicity |
| Number of Reliability Criteria | 12–14 (for ecotoxicity) | 20 (evaluation), 50 (reporting) |
| Number of Relevance Criteria | 0 | 13 |
| OECD Reporting Criteria Included | 14 of 37 | 37 of 37 |
| Additional Guidance Provided | No | Yes |
| Evaluation Summary | Qualitative, for reliability only | Qualitative, for reliability and relevance |

Source: Adapted from Kase et al. (2016)[reference:27].

| Category | Number of Criteria | Description / Purpose |
| --- | --- | --- |
| Reliability | 20 | Assess the inherent quality of the study methodology and reporting. Covers test design, substance characterization, organism details, exposure conditions, and statistical analysis. |
| Relevance | 13 | Assess the appropriateness of the study for a specific regulatory hazard or risk assessment context. Covers ecological representativeness, exposure scenario realism, and endpoint significance. |
| Reporting Recommendations | 50 (across 6 categories) | Guidance for authors to ensure future studies contain all information necessary for a CRED evaluation. Categories: General Information, Test Design, Test Substance, Test Organism, Exposure Conditions, Statistical Design & Biological Response[reference:28]. |

Visual Summaries

Diagram 1: CRED Evaluation Workflow

A flowchart illustrating the sequential steps in applying the CRED method to an ecotoxicity study.

Start (obtain study report) → 1. Set up CRED Excel tool → 2. Evaluate 20 reliability criteria → 3. Evaluate 13 relevance criteria → 4. Summarize scores in the tool → 5. Assign reliability category (R1–R4) and 6. Assign relevance category (C1–C4) → 7. Document and report the evaluation → End (evaluation complete).

Diagram 2: Structure of CRED Evaluation Criteria

A hierarchical diagram showing the main components and categories of the CRED evaluation framework.

The CRED evaluation method comprises three components: Reliability Evaluation (20 criteria), Relevance Evaluation (13 criteria), and Reporting Recommendations (50 criteria). The reporting recommendations span six categories: General Information, Test Design, Test Substance, Test Organism, Exposure Conditions, and Statistical & Biological Response.

The Scientist's Toolkit

The following tools and resources are essential for conducting a CRED evaluation.

| Tool / Resource | Function & Description |
| --- | --- |
| CRED Excel Evaluation Tool | The primary operational instrument. Contains the checklist of 20 reliability and 13 relevance criteria, automates scoring, and facilitates the final classification[reference:29]. |
| CRED Guidance Documents (PDF) | Provide essential context, detailed explanations for each criterion, and examples of how to apply them in practice, ensuring consistent interpretation[reference:30]. |
| OECD Test Guidelines (e.g., 201, 210, 211) | International standard test protocols. Serve as the benchmark for evaluating the methodological adequacy of the study under review[reference:31]. |
| Original Ecotoxicity Study Report | The document under evaluation. Must be the complete, peer-reviewed publication or study report to allow assessment against all CRED criteria. |
| Reference Regulatory Guidelines (e.g., REACH, WFD TGD) | Provide the regulatory context necessary for judging the relevance of a study for specific hazard or risk assessment purposes[reference:32]. |
| Specialized CRED Variants (NanoCRED, EthoCRED) | Adapted tools for evaluating studies on nanomaterials or behavioral endpoints, reflecting the evolving scope of the CRED framework[reference:33]. |

Applying CRED Criteria: A Step-by-Step Guide to Ecotoxicity Data Evaluation

The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) is a cornerstone of chemical risk assessment, underpinning regulatory decisions worldwide. These safe concentration limits depend entirely on the quality of the underlying ecotoxicity studies. Historically, evaluating the reliability and relevance of such studies relied heavily on expert judgment, using methods like the Klimisch approach, which was criticized for being unspecific, lacking detailed criteria, and introducing significant inconsistency between assessors [1].

The CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) evaluation method was developed to address these critical shortcomings. Its primary aim is to improve the reproducibility, transparency, and consistency of reliability and relevance evaluations across different regulatory frameworks and individual assessors [1]. For researchers and drug development professionals, whose work may undergo regulatory scrutiny, understanding the CRED criteria is essential for designing robust studies and effectively evaluating literature data.

The method distinguishes between reliability—the inherent scientific quality of a study's design, performance, and analysis—and relevance—the appropriateness of the study for a specific assessment purpose [1]. A study can be reliable but not relevant to a particular question, and vice-versa. This article focuses on the 20 reliability criteria, providing a detailed breakdown of what assessors must check to determine the intrinsic worth of an aquatic ecotoxicity study. This framework is central to a broader thesis advocating for systematic, transparent data evaluation in ecotoxicological research and regulation.

The 20 Reliability Criteria: A Detailed Breakdown for Assessors

The 20 reliability criteria of the CRED method are designed to provide a comprehensive, structured checklist for evaluating the intrinsic quality of an ecotoxicity study. They are grouped into four logical themes for systematic assessment [1].

Experimental Design and Documentation

This group ensures the study is conceptually sound and adequately described.

  • Criterion 1 (Test type and objective): Assessors must check that the type of test (e.g., acute lethality, chronic reproduction) and its scientific or regulatory objective are clearly stated.
  • Criterion 2 (Guideline compliance): Assessors must verify that the test followed a national or international standardized guideline (e.g., OECD, ISO) or Good Laboratory Practice (GLP), and that any deviations are justified.
  • Criterion 3 (Test duration): The reported exposure period must be appropriate for the test type and endpoint.
  • Criterion 4 (Control and replicate design): The study must describe the number of replicates, test vessels, and organisms per replicate. The use of negative and solvent controls (if applicable) must be appropriate and reported.
  • Criterion 5 (Randomization): The assignment of test organisms to treatments and controls should be randomized to avoid systematic bias.

Test Substance and Exposure Characterization

This group verifies that the chemical agent and its exposure conditions are sufficiently defined and controlled.

  • Criterion 6 (Test substance identification): The substance must be unambiguously identified (e.g., CAS number, source, purity, and chemical characterization). For mixtures or materials like nanomaterials, a detailed description is required [3].
  • Criterion 7 (Dosing and concentration verification): The nominal test concentrations must be reported. For studies where concentration stability is questionable (e.g., volatile, degradable compounds), measured concentrations should be provided.
  • Criterion 8 (Exposure medium preparation): The method for introducing the test substance into the exposure medium (e.g., water, sediment) must be described.
  • Criterion 9 (Test system and loading): The volume of test medium, renewal regime (static, renewal, flow-through), and the loading density of organisms must be appropriate to maintain exposure conditions and animal welfare.

Test Organism and Biological Endpoint Integrity

This group assesses the suitability of the biological model and the validity of the response measurements.

  • Criterion 10 (Test organism identification): The organism must be identified to the species level, with details on life stage, source, and cultivation history.
  • Criterion 11 (Organism health/acclimation): Evidence must show that organisms were healthy and acclimatized to laboratory conditions prior to testing.
  • Criterion 12 (Endpoint definition and measurement): The biological endpoint (e.g., mortality, growth, reproduction) must be clearly defined, and the method for its quantification must be described. For behavioural studies, this requires special consideration of equipment validation and measurement objectivity [4].
  • Criterion 13 (Control response acceptability): The response in the negative control must fall within acceptable limits (e.g., mortality < 10% in acute fish tests), demonstrating the health of the organisms and the test system's validity.

Data Analysis, Reporting, and Plausibility

This final group evaluates the statistical robustness and overall credibility of the reported results.

  • Criterion 14 (Data presentation): Individual raw data (e.g., per replicate) for responses at all test concentrations and controls should be available, ideally in tabular form.
  • Criterion 15 (Dose-response relationship): The data should demonstrate a clear relationship between concentration and effect. The statistical model used to describe this relationship (if any) should be stated.
  • Criterion 16 (Statistical methods): Appropriate statistical methods must be used to calculate effect values (e.g., LC50, NOEC) and their variability (e.g., confidence intervals).
  • Criterion 17 (Handling of outliers): The approach to identifying and handling potential outliers in the data must be reported and justified.
  • Criterion 18 (Result precision): The precision of the reported effect values (e.g., confidence intervals) must be considered in relation to the test's purpose.
  • Criterion 19 (Internal consistency): Assessors must check for internal consistency (e.g., do the results in text, tables, and figures align?).
  • Criterion 20 (Discussion of limitations): A balanced discussion of the study's strengths and limitations indicates scientific rigor and transparency.
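The four-theme grouping above can be captured in a small lookup table, useful when building an evaluation sheet programmatically. A minimal Python sketch (the group names paraphrase this section's headings):

```python
# The 20 CRED reliability criteria grouped into the four themes described above.
RELIABILITY_GROUPS = {
    "Experimental design and documentation": range(1, 6),              # criteria 1-5
    "Test substance and exposure characterization": range(6, 10),      # criteria 6-9
    "Test organism and biological endpoint integrity": range(10, 14),  # criteria 10-13
    "Data analysis, reporting, and plausibility": range(14, 21),       # criteria 14-20
}

def group_of(criterion: int) -> str:
    """Return the theme that a given reliability criterion number belongs to."""
    for name, numbers in RELIABILITY_GROUPS.items():
        if criterion in numbers:
            return name
    raise ValueError(f"unknown criterion number: {criterion}")
```

The four ranges partition the numbers 1 through 20 exactly, mirroring the breakdown in the text.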

Table 1: Quantitative Correlation between Fulfilled Criteria and Reliability Categories [5]

| Reliability Category | Mean % of Criteria Fulfilled | Standard Deviation | Sample Size (n) |
| --- | --- | --- | --- |
| Reliable without restrictions | 93% | 12 | 3 |
| Reliable with restrictions | 72% | 12 | 24 |
| Not reliable | 60% | 15 | 58 |
| Not assignable | 51% | 15 | 19 |
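The percentages in Table 1 are the share of criteria judged "Fulfilled". A minimal helper shows the arithmetic; excluding "Not applicable" criteria from the denominator is this sketch's assumption, not a stated CRED rule:

```python
def percent_fulfilled(judgments):
    """Share of applicable criteria judged 'Fulfilled', as tallied in Table 1.

    `judgments` maps criterion number -> one of 'Fulfilled', 'Not fulfilled',
    'Not reported', 'Not applicable'. Dropping 'Not applicable' from the
    denominator is an assumption of this sketch, not official CRED guidance.
    """
    applicable = [j for j in judgments.values() if j != "Not applicable"]
    if not applicable:
        return 0.0
    return 100.0 * applicable.count("Fulfilled") / len(applicable)
```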

Application Notes and Experimental Protocols

Protocol for Conducting a CRED-Based Study Evaluation

This protocol details the systematic process for an assessor to evaluate an aquatic ecotoxicity study using the CRED method [1].

1. Preparation:

  • Secure the full-text study manuscript or report.
  • Utilize the standardized CRED evaluation sheet (typically an Excel-based tool) to record judgments [2].
  • Clearly define the regulatory or assessment context to inform the separate relevance evaluation.

2. Initial Screening:

  • Perform a first read to understand the study's scope, test substance, organism, and endpoints.
  • Confirm the study falls within the domain of aquatic ecotoxicity. For specialised areas (e.g., nanomaterials, behavioural toxicology, sediment tests), consult domain-specific extensions like NanoCRED or EthoCRED [3] [4].

3. Criterion-by-Criterion Assessment:

  • For each of the 20 reliability criteria, locate the relevant information in the study.
  • Judge each criterion as: "Fulfilled," "Not Fulfilled," "Not Reported," or "Not Applicable."
  • For criteria judged "Not Fulfilled" or "Not Reported," provide a brief written justification in the evaluation sheet. This is critical for transparency and for identifying specific data gaps.

4. Overall Reliability Categorization:

  • Synthesize the individual judgments to assign one of four final reliability categories:
    • Reliable without restrictions: All key criteria are fulfilled; the study is scientifically robust.
    • Reliable with restrictions: Some shortcomings exist, but the core findings are considered usable (e.g., for supporting evidence).
    • Not reliable: Fundamental flaws invalidate the study's results for regulatory use.
    • Not assignable: Crucial information is missing, preventing a proper evaluation.

5. Documentation:

  • The completed evaluation sheet serves as an audit trail, documenting the rationale for the final categorization. This ensures the evaluation is transparent and reproducible by a second assessor.
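Steps 3–5 of the protocol can be sketched as a simple record-keeping structure. This is illustrative, not the official Excel tool; the final category remains an expert synthesis, so the sketch only enforces completeness and the written-justification rule:

```python
from dataclasses import dataclass, field

JUDGMENTS = {"Fulfilled", "Not fulfilled", "Not reported", "Not applicable"}
CATEGORIES = {
    "R1": "Reliable without restrictions",
    "R2": "Reliable with restrictions",
    "R3": "Not reliable",
    "R4": "Not assignable",
}

@dataclass
class ReliabilityEvaluation:
    """Audit-trail record for protocol steps 3-5 (a sketch, not the CRED tool)."""
    judgments: dict = field(default_factory=dict)  # criterion number -> judgment
    remarks: dict = field(default_factory=dict)    # criterion number -> justification

    def judge(self, criterion, judgment, remark=""):
        assert 1 <= criterion <= 20 and judgment in JUDGMENTS
        # Step 3 requires a written justification for every criterion judged
        # 'Not fulfilled' or 'Not reported'.
        if judgment in ("Not fulfilled", "Not reported") and not remark:
            raise ValueError("justification required for this judgment")
        self.judgments[criterion] = judgment
        self.remarks[criterion] = remark

    def finalize(self, category):
        # The final category is the assessor's synthesis, not an automatic tally.
        if len(self.judgments) < 20:
            raise ValueError("all 20 criteria must be judged first")
        assert category in CATEGORIES
        return CATEGORIES[category]
```

The completed record doubles as the audit trail called for in step 5.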

Validation Protocol: The CRED Ring Test

The CRED method itself was validated through an international ring test, the protocol of which serves as a model for comparative method assessment [1].

Objective: To compare the consistency, transparency, and user perception of the CRED evaluation method against the traditional Klimisch method.

Design:

  • Participants: Risk assessors from industry, academia, consultancy, and government across Asia, Europe, and North America, most with >5 years of experience [1].
  • Materials: A set of eight diverse aquatic ecotoxicity studies from the peer-reviewed and grey literature.
  • Procedure: The test was conducted in two phases:
    • Phase 1 (Klimisch): Each participant evaluated two studies using the Klimisch method.
    • Phase 2 (CRED): Each participant evaluated two different studies using the draft CRED method and guidance.
  • Data Collection: Participants provided their reliability and relevance categorizations for each study and feedback on the methods' usability.

Analysis:

  • Consistency: Measured by the degree of agreement between assessors for each study and method. Lower variation between assessors indicates higher consistency.
  • Outcome: The ring test concluded that the CRED method yielded more consistent evaluations and was perceived by participants as more accurate, applicable, and transparent than the Klimisch method [1]. Data from this test demonstrated a clear quantitative relationship between the percentage of fulfilled criteria and the assigned reliability category (see Table 1).
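The consistency measure can be illustrated with a pairwise percent-agreement statistic over the categories assigned to a single study by multiple assessors. The data here are synthetic, and the published ring-test analysis may have used a different metric:

```python
from itertools import combinations

def pairwise_agreement(categories):
    """Fraction of assessor pairs that assigned the same category to one study."""
    pairs = list(combinations(categories, 2))
    if not pairs:
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)

# Synthetic example: four assessors, three of whom agree on category R2.
scores = ["R2", "R2", "R2", "R3"]
```

Higher values indicate higher consistency; identical scores across all assessors give 1.0.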

Flowchart of the reliability assessment protocol: start the evaluation (prepare the CRED sheet and full text) → initial screening (identify scope and key elements) → criterion-by-criterion assessment (criteria 1–20). For each criterion: if fulfilled, proceed to the next; if not fulfilled or not reported, document the finding and justification before proceeding. Once all criteria are assessed, synthesize the judgments and assign the final reliability category.

Emerging Protocol: AI-Assisted CRED Appraisal

Recent exploratory work has developed a protocol for leveraging Artificial Intelligence (AI) to aid in the CRED evaluation process, aiming to increase the scale and speed of systematic reviews [6].

Objective: To evaluate the feasibility of using a Large Language Model (LLM) to perform initial CRED reliability assessments.

Design:

  • AI Model: OpenAI's GPT-4 Turbo.
  • Training/Calibration: The model was provided with the 20 CRED criteria questions, and prompt engineering was used to calibrate its responses against expert human judgments.
  • Procedure: For a given full-text manuscript, the AI was asked to evaluate each of the 20 criteria. To manage context, only relevant text sections were provided for each criterion query [6].

Analysis & Outcome:

  • AI assessments were compared to expert judgments for concordance.
  • Results showed promise but highlighted limitations: The AI was partially concordant but tended to judge criteria as "Not Fulfilled" rather than "Not Reported," indicating a lack of domain-specific expectation. It performed well on objective criteria (e.g., GLP compliance) and correctly identified non-applicable criteria [6].
  • Conclusion: AI requires careful parameterization and expert oversight but can be a valuable tool for triaging large volumes of literature within a systematic review framework.
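The per-criterion querying strategy described above might be assembled as follows. The prompt wording and the helper name are purely hypothetical, not taken from the cited protocol:

```python
def build_criterion_prompt(criterion_text, study_excerpt):
    """Compose a single-criterion CRED query for an LLM (hypothetical wording).

    Only the study sections relevant to this criterion are included,
    mirroring the context-management step described above.
    """
    return (
        "You are assessing an aquatic ecotoxicity study against one CRED "
        "reliability criterion.\n"
        f"Criterion: {criterion_text}\n"
        "Answer with exactly one of: Fulfilled, Not fulfilled, Not reported, "
        "Not applicable. Then give a one-sentence justification.\n"
        f"Study excerpt:\n{study_excerpt}"
    )
```

Constraining the answer to the four CRED judgment labels also addresses the reported tendency to conflate "Not fulfilled" with "Not reported", since the model is explicitly offered both options.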

The Scientist's Toolkit: Essential Research Reagents & Materials

When designing or evaluating an ecotoxicity study for CRED compliance, specific materials and their proper documentation are critical. The following table details key reagent solutions and essential materials.

Table 2: Research Reagent Solutions for Aquatic Ecotoxicity Testing

| Item | Function in Ecotoxicity Testing | CRED Evaluation Consideration |
| --- | --- | --- |
| Reference Toxicant (e.g., potassium dichromate, sodium chloride) | A standard substance used periodically to confirm the consistent sensitivity and health of the test organism population over time. | Assessors check if reference toxicant tests were performed and results fell within acceptable historical ranges (supports Criterion 13). |
| Solvent Control Stock (e.g., acetone, dimethyl sulfoxide (DMSO), methanol) | A vehicle to dissolve poorly water-soluble test substances. Must be non-toxic at the concentration used. | The type, purity, and final concentration in test media must be reported. A solvent control group must be included and show no significant effect (Criteria 4, 6, 13). |
| Reconstituted Standardized Test Water (e.g., ISO "Standard Water", EPA "Reconstituted Freshwater") | Provides a consistent, defined medium for tests, eliminating variability from natural water sources. | The recipe or standard followed, including hardness, pH, and ionic composition, must be specified (Criterion 9). |
| Formulated Food for Test Organisms (e.g., algae paste, brine shrimp nauplii, specific pellet diets) | Provides standardized nutrition during culturing and testing, especially for chronic studies. | The type, source, and feeding regimen must be described. Nutritional quality can affect organism health and endpoint sensitivity (Criteria 10, 11). |
| Analytical Grade Test Substance | The characterized chemical of interest used to prepare dosing solutions. | Purity, supplier, lot number, and confirmation of identity (e.g., via CAS No.) are mandatory. For novel materials (e.g., nanoparticles), physico-chemical characterization is required (Criterion 6). |
| Chemical Preservation and Analysis Kits (e.g., for TOC, ammonia, heavy metals) | Used to monitor water quality (e.g., in flow-through systems) and verify exposure concentrations. | Their use supports the assessment of exposure stability (Criterion 7) and test system validity (Criterion 9). Documentation of methods and detection limits is crucial. |

The CRED core method (20 reliability criteria) is extended by NanoCRED (adapts criteria for nanomaterial-specific characterization and behaviour), EthoCRED (adds criteria for behavioural endpoint validity and measurement), and CRED for sediment and soil (adapts criteria for solid-phase testing and bioavailability); the complementary CREED framework covers exposure data.

Within the CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) evaluation method, assessing the relevance of a study is distinct from judging its reliability [1]. While reliability concerns the intrinsic scientific quality of a test report, relevance covers "the extent to which data and tests are appropriate for a particular hazard identification or risk characterization" [1]. This distinction is critical: a study can be methodologically sound (reliable) yet inappropriate for a specific regulatory question (not relevant) [1]. The CRED framework, developed to improve the transparency and consistency of ecotoxicity data evaluation, provides 13 explicit criteria for assessing contextual appropriateness [1] [2]. These criteria guide researchers and risk assessors in determining whether a study's design, test organism, exposure conditions, and measured endpoints align with the specific goals of an environmental risk assessment, such as deriving a Predicted-No-Effect Concentration (PNEC) or an Environmental Quality Standard (EQS) [1] [2].

The 13 Relevance Criteria: Structure and Application

The 13 relevance criteria in the CRED method provide a systematic checklist for evaluators. They ensure that studies selected for regulatory decision-making are not only scientifically credible but also directly applicable to the assessment context. The criteria are designed to be answered with "Yes," "No," or "Not Applicable," leading to an overall relevance judgment [1]. The following table structures these core criteria, their primary assessment focus, and key evaluation questions.

Table 1: The 13 CRED Criteria for Assessing Relevance of Ecotoxicity Studies

| Criterion Number | Criterion Focus | Primary Assessment Question | Scoring Guideline |
| --- | --- | --- | --- |
| 1 | Test organism | Is the test organism relevant to the assessment? | Consider taxonomic group, life stage, and environmental compartment (e.g., freshwater, marine, terrestrial) [1]. |
| 2 | Exposure duration | Is the exposure duration relevant to the assessment? | Align with assessment goals (acute vs. chronic) and organism life cycle [1]. |
| 3 | Biological organization level | Is the level of biological organization (e.g., suborganismal, individual, population) relevant? | Match the endpoint to the protection goal of the assessment (e.g., population-relevant endpoints for long-term risk) [1]. |
| 4 | Measured endpoint | Is the measured endpoint (e.g., mortality, growth, reproduction) relevant to the assessment? | Endpoint should be linked to a critical ecological function or population sustainability [1]. |
| 5 & 6 | Test substance & formulation | Is the tested substance (including its formulation and purity) relevant to the assessment? | The tested form should reflect the substance as it occurs in the environment (e.g., accounting for degradation, transformation) [1]. |
| 7 | Exposure route | Is the exposure route (e.g., waterborne, dietary, sediment) relevant? | Must be the predominant route of exposure in the scenario being assessed [1]. |
| 8 | Test system design | Is the experimental design (e.g., static, flow-through) relevant to environmental exposure? | Should mimic realistic exposure conditions (e.g., pulsed vs. continuous) [1]. |
| 9 | Exposure concentration | Are the tested concentrations relevant to expected environmental levels? | Concentrations should bracket predicted environmental concentrations (PECs) to allow effect estimation [1]. |
| 10 | Control performance | Was the performance of the control group acceptable? | High control mortality or adverse effects can compromise the relevance of the treated groups' response [1]. |
| 11 | Reference/Control substance | Was a reference substance tested, and did it perform as expected? | Verifies the sensitivity and responsiveness of the test system [1]. |
| 12 | Climate/Seasonal relevance | Are the test conditions (e.g., temperature, light regime) relevant to the assessed environment? | Physiological responses can vary with climate; conditions should be environmentally realistic [1]. |
| 13 | Statistical power | Was the statistical power of the test sufficient to detect an effect of regulatory interest? | A test with low power may fail to detect a relevant effect, leading to false conclusions of safety [1]. |

Experimental Protocol for Applying the CRED Relevance Criteria

This protocol provides a stepwise methodology for researchers and risk assessors to systematically evaluate the relevance of aquatic ecotoxicity studies using the CRED framework [1] [3].

Phase 1: Pre-Evaluation Preparation

  • Objective: Define the assessment context and prepare the evaluation materials.
  • Procedure:
    • Define the Regulatory Question: Clearly articulate the purpose of the assessment (e.g., derivation of a freshwater PNEC for a plant protection product, setting a marine EQS for an industrial chemical) [1].
    • Establish Protection Goals: Identify the environmental compartments (freshwater, sediment, marine), taxonomic groups, and ecological levels (individual, population) to be protected [1].
    • Acquire the CRED Tool: Obtain the official Excel-based CRED evaluation sheet, which contains the 13 relevance criteria with detailed guidance text for each [2] [3].
    • Compile the Study: Secure the full text of the ecotoxicity study to be evaluated, including any supplementary materials.

Phase 2: Systematic Criterion Assessment

  • Objective: Apply each of the 13 criteria to the study under evaluation.
  • Procedure:
    • Sequential Review: For each criterion (1 through 13), read the accompanying guidance in the CRED sheet [1].
    • Evidence Extraction: Scrutinize the study's methods and results sections to locate information pertaining to the criterion (e.g., for Criterion 1, identify the test species and its life stage).
    • Contextual Judgment: Compare the extracted study characteristic against the needs of the assessment defined in Phase 1. Ask: "Does this aspect of the study make it suitable for my specific assessment purpose?"
    • Documented Scoring: For each criterion, select "Yes," "No," or "Not Applicable" in the CRED tool.
      • Yes: The study characteristic is fully aligned with the assessment context.
      • No: The study characteristic is misaligned or insufficient for the assessment context.
      • Not Applicable: The criterion does not apply to this particular study or assessment context.
    • Mandatory Commentary: Provide a brief written justification in the tool for every score of "No" or "Not Applicable," citing the specific part of the study that led to the conclusion. This is essential for transparency [1].

Phase 3: Integration and Final Judgment

  • Objective: Synthesize individual criterion scores into an overall relevance conclusion.
  • Procedure:
    • Review Pattern of "No" Scores: Examine which criteria were not met. The overall judgment is not a simple tally but a weight-of-evidence decision based on the importance of the unmet criteria for the specific assessment [1].
    • Make Final Judgment: Based on the pattern, assign one of three final relevance classifications:
      • Relevant: The study's design and outcomes are appropriate for direct use in the assessment.
      • Not Relevant: Critical mismatches preclude the study's use for the intended purpose (e.g., a soil organism test for a water quality standard) [1].
      • Partially Relevant: The study may provide supportive information (e.g., mode-of-action insight) but cannot be used for direct quantitative effect estimation.
    • Document Rationale: Record a comprehensive summary in the CRED tool explaining the basis for the final relevance judgment, referencing the key criterion assessments.
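Phase 3 is explicitly a weight-of-evidence judgment, but the pattern review can be scaffolded. In this sketch, the set of "critical" criteria is an assessor-defined assumption for the given context, not part of the CRED method itself, and the output is only a starting point for the expert judgment:

```python
def review_relevance(scores, critical):
    """Summarize the pattern of 'No' scores for the Phase 3 judgment.

    scores:   criterion number (1-13) -> 'Yes' | 'No' | 'Not applicable'
    critical: set of criterion numbers the assessor deems decisive for this
              assessment context (an assessor-defined choice, not a CRED rule).
    Returns a suggested classification plus the unmet criteria driving it.
    """
    nos = {c for c, s in scores.items() if s == "No"}
    if nos & critical:
        return "Not relevant", sorted(nos & critical)
    if nos:
        return "Partially relevant", sorted(nos)
    return "Relevant", []
```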

Quality Assurance

  • To ensure consistency, a second evaluator should independently assess a subset of studies. Discrepancies in scoring should be discussed, and the guidance text consulted to reach consensus [1]. This process mirrors the ring-test validation that found the CRED method to be more consistent and transparent than previous approaches [1] [2].
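The second-evaluator consistency check can be supported by a small diff of the two evaluators' score sets, flagging exactly which criteria need discussion (a sketch):

```python
def score_discrepancies(eval_a, eval_b):
    """List the criteria on which two independent evaluators disagree."""
    return sorted(
        c for c in set(eval_a) | set(eval_b)
        if eval_a.get(c) != eval_b.get(c)
    )
```

A criterion scored by only one evaluator is also flagged, since `dict.get` returns `None` for the missing entry.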

Visualization of the CRED Relevance Evaluation Workflow

The following diagram illustrates the logical sequence and decision points in applying the CRED relevance criteria.

Workflow: start the evaluation → 1. Define the regulatory assessment context → 2. Obtain the full study text → 3. Access the CRED evaluation tool → 4. For each of the 13 relevance criteria: extract the relevant study information, compare it to the assessment context, score the criterion (Yes / No / N/A), and document the reason for any "No" or "N/A" score → 5. Review the pattern of criterion scores → 6. Make the final relevance judgment (Relevant: aligns with context; Not relevant: critical mismatch; Partially relevant: provides limited support) → 7. Document the full rationale in the tool → evaluation complete.

The Scientist's Toolkit: Essential Materials for CRED Evaluation

Table 2: Research Reagent Solutions for CRED-Based Relevance Assessment

| Item | Function in Evaluation | Key Specification / Notes |
| --- | --- | --- |
| CRED Evaluation Excel Tool | The primary instrument for scoring. Contains the 13 criteria with embedded guidance, fields for scoring, and spaces for mandatory commentary [2] [3]. | Must be downloaded from an official source (e.g., SciRAP website). Requires macros to be enabled for full functionality [3]. |
| Reporting Recommendations Checklist | A complementary tool to the evaluator's sheet. Lists 50 specific reporting criteria across 6 categories (general, test design, substance, organism, exposure, statistics) [1]. | Used proactively by researchers to ensure studies contain all information needed for a reliable and relevant evaluation, streamlining future regulatory use [1]. |
| Guidance Documents & SOPs | Provide detailed instructions for using the CRED tool and interpreting criteria, especially for edge cases [3]. | Includes PDF instructions and may contain framework-specific annexes (e.g., for pharmaceuticals or nanomaterials) [2]. |
| Specialized CRED Adaptations | Frameworks tailored for specific data types where standard aquatic criteria may not fully apply [3]. | NanoCRED: for ecotoxicity studies of nanomaterials [3]. EthoCRED: for behavioral ecotoxicity studies [3]. CRED for sediment/soil: for studies on benthic and terrestrial organisms [3]. |
| Reference Databases & Toxicity Values | Provide context for Criterion 9 (Exposure concentration), allowing comparison of tested concentrations to known effect levels or predicted environmental concentrations (PECs). | Includes databases like NORMAN EMPODAT, which uses CRED for reliability screening [2]. |
| Taxonomic & Ecological References | Inform Criteria 1 (Test organism) and 3 (Biological organization level) by supporting assessment of the ecological realism and protection-goal alignment of the test species and endpoint. | Standard ecological textbooks, field guides, or regulatory lists of standard test species. |

The reliability and relevance of ecotoxicity data are fundamental for deriving robust environmental quality standards (EQS) and predicted‑no‑effect concentrations (PNECs)[reference:0]. The CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) method was developed to replace subjective expert judgment with a transparent, consistent framework[reference:1]. It employs 20 criteria for reliability and 13 for relevance, providing a structured alternative to the traditional Klimisch method[reference:2]. This article details the practical application of the CRED Excel tools, which operationalize this framework for efficient, reproducible study evaluation within regulatory and research contexts.

The CRED evaluation method is supported by freely available Excel‑based tools designed to guide assessors through the systematic appraisal of ecotoxicity studies[reference:3]. These tools, hosted by the SciRAP (Science in Risk Assessment and Policy) initiative, include:

  • Core CRED assessment sheet: A workbook for evaluating the reliability and relevance of aquatic ecotoxicity studies[reference:4].
  • Specialized modules: Adapted tools for nano‑ecotoxicity (NanoCRED) and behavioural studies (EthoCRED)[reference:5].
  • Reporting checklists: Excel‑based checklists to aid researchers in comprehensively reporting study design, conduct, and results[reference:6].

All tools require enabled macros and are accompanied by user guides[reference:7].

Application Notes: Implementing the CRED Evaluation

Pre‑Evaluation Preparation

  • Tool Acquisition: Download the latest CRED Excel toolset from the SciRAP or Swiss Centre for Applied Ecotoxicology websites[reference:8].
  • Study Compilation: Gather the full text of the ecotoxicity study to be evaluated, including any supplementary materials.
  • Assessment Context: Define the specific regulatory or research question (e.g., derivation of a PNEC for a specific chemical) to anchor the relevance evaluation.

Workflow Execution

The evaluation follows a sequential process: first assessing reliability, then relevance, and finally synthesizing an overall confidence score. The Excel tool automates scoring and provides a structured worksheet for recording justifications.

Post‑Evaluation Actions

  • Data Export: Use the tool’s functionality to export summary scores for integration into larger risk‑assessment dossiers or meta‑analyses.
  • Weight‑of‑Evidence: In assessments requiring multiple studies, combine CRED outputs with other lines of evidence to build a robust conclusion.

Detailed Experimental Protocol for Study Evaluation

Protocol 4.1: Reliability Assessment Using the CRED Excel Tool

Objective: To systematically score a study’s internal validity and methodological robustness against 20 predefined criteria.

Materials:

  • CRED Excel tool (Reliability sheet).
  • Complete study manuscript.

Procedure:

  • Enable Macros: Open the Excel file and enable macros if prompted.
  • Enter Study Metadata: Input study identifier, test substance, organism, and endpoint in the designated fields.
  • Criterion‑by‑Criterion Evaluation:
    • For each of the 20 criteria (e.g., "Test organism clearly described," "Exposure concentration verified"), consult the study text.
    • Select the appropriate score from the dropdown menu (typically: 2 = Criterion fully met, 1 = Criterion partially met, 0 = Criterion not met, NA = Not applicable).
    • Mandatory: Provide a concise justification for each score in the adjacent comment cell, citing the relevant text or page number.
  • Automated Scoring: The tool automatically calculates a total reliability score and a percentage of criteria met.
  • Overall Reliability Classification: Based on the total score, assign a final reliability rating:
    • Reliable without restrictions (≥ 80% of applicable criteria met).
    • Reliable with restrictions (60‑79% met).
    • Not reliable (< 60% met).
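The scoring arithmetic above can be sketched in a few lines of Python. This is a hypothetical helper, not part of the official CRED Excel tool; it assumes a partially met criterion (score 1) counts as half of the maximum score of 2 when computing the percentage met, with NA criteria excluded from the denominator.

```python
def classify_reliability(scores):
    """Classify reliability from per-criterion scores.

    scores: one entry per reliability criterion -- 2 (fully met),
    1 (partially met), 0 (not met), or None (not applicable).
    The percentage met is computed over applicable criteria only.
    """
    applicable = [s for s in scores if s is not None]
    if not applicable:
        raise ValueError("no applicable criteria to score")
    pct_met = 100 * sum(applicable) / (2 * len(applicable))
    if pct_met >= 80:
        return "Reliable without restrictions"
    if pct_met >= 60:
        return "Reliable with restrictions"
    return "Not reliable"

# Example: 10 criteria fully met and 10 partially met -> 75% of the
# maximum score, which falls in the 60-79% band.
print(classify_reliability([2] * 10 + [1] * 10))
```

The thresholds (≥ 80%, 60–79%, < 60%) follow the classification bands listed above.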

Protocol 4.2: Relevance Assessment Using the CRED Excel Tool

Objective: To evaluate the study’s applicability to the specific assessment question using 13 criteria.

Procedure:

  • Navigate to Relevance Sheet: Switch to the "Relevance" worksheet within the same Excel file.
  • Define Assessment Question: Re‑enter the specific assessment context.
  • Evaluate Relevance Criteria:
    • Score 13 criteria (e.g., "Appropriate trophic level," "Endpoint relevant to protection goal") using the same 2/1/0/NA scale.
    • Record justifications for each score.
  • Generate Relevance Profile: The tool summarizes which aspects of relevance are fully, partially, or not met, forming a relevance profile for the study.
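The relevance profile generated in the final step amounts to grouping criteria by outcome. A minimal sketch (hypothetical helper and criterion names, not the official tool's output format):

```python
from collections import defaultdict

# Mapping from the 2/1/0/NA scoring scale to profile labels.
SCORE_LABELS = {2: "fully met", 1: "partially met", 0: "not met", None: "not applicable"}

def relevance_profile(scored_criteria):
    """Group relevance criteria by outcome to form a profile.

    scored_criteria: dict mapping criterion name -> 2 / 1 / 0 / None.
    Returns a dict mapping outcome label -> list of criterion names.
    """
    profile = defaultdict(list)
    for criterion, score in scored_criteria.items():
        profile[SCORE_LABELS[score]].append(criterion)
    return dict(profile)
```

Applied to a scored study, the result directly answers which aspects of relevance are fully, partially, or not met.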

Protocol 4.3: Final Study Evaluation and Reporting

Objective: To synthesize reliability and relevance evaluations into a final study confidence statement.

Procedure:

  • Review Integrated Dashboard: Consult the tool’s summary sheet, which displays reliability and relevance scores side‑by‑side.
  • Formulate Conclusion:
    • A study rated as "Reliable without restrictions" and with a high relevance profile is suitable for direct use in regulatory decision‑making.
    • Studies with restrictions in either dimension require explicit discussion of their limitations in the assessment report.
  • Document and Archive: Save the completed Excel file with a unique filename, preserving the full audit trail of scores and justifications.

Data Presentation: CRED Criteria and Scoring

Table 1: Distribution of CRED Criteria by Category

| Category | Number of Criteria (Reliability) | Number of Criteria (Relevance) | Primary Focus |
| --- | --- | --- | --- |
| General Information & Test Design | 5 | — | Study identification, hypothesis, design |
| Test Substance | 3 | 2 | Characterization, dosing, relevance to environmental form |
| Test Organism | 4 | 3 | Species, life-stage, source, appropriateness for assessment |
| Exposure Conditions | 5 | 4 | Duration, medium, renewal, measurement of actual concentrations |
| Statistical & Biological Response | 3 | 4 | Endpoint measurement, statistical analysis, dose-response |
| Overall Assessment | — | — | Synthesis of reliability and relevance scores |
| TOTAL | 20[reference:9] | 13[reference:10] | |

Table 2: Comparative Performance: CRED vs. Klimisch Method

| Metric | CRED Method (Ring-Test Results) | Traditional Klimisch Method |
| --- | --- | --- |
| Perceived Accuracy | Higher[reference:11] | Lower |
| Perceived Consistency | Higher[reference:12] | Lower |
| Transparency | High (explicit criteria & guidance)[reference:13] | Low (reliant on expert judgment)[reference:14] |
| Ease of Application | Structured, with detailed guidance[reference:15] | Variable, less guidance |

Visualization of Workflows and Relationships

Diagram 1: CRED Study Evaluation Workflow

Start Evaluation (Enable Excel Macros) → Enter Study Metadata → Score 20 Reliability Criteria (2/1/0/NA + Justification) → Tool Calculates Reliability Score & Category → Score 13 Relevance Criteria (2/1/0/NA + Justification) → Tool Generates Relevance Profile → Synthesize Final Confidence (Reliability + Relevance) → Document & Archive Completed Assessment

Diagram 2: Ecosystem of CRED Evaluation Tools

The core CRED method (20 reliability + 13 relevance criteria) is operationalized by the CRED Excel Tool (assessment sheets and macros), which is complemented by the SciRAP Reporting Checklists (Excel files). The core method is also adapted for nanomaterial ecotoxicity (NanoCRED Tool) and behavioural ecotoxicity (EthoCRED Tool).

| Item | Function / Purpose | Source / Availability |
| --- | --- | --- |
| CRED Excel Assessment Sheet | Primary tool for scoring reliability and relevance criteria; automates calculations and provides a structured worksheet. | SciRAP website / Swiss Centre for Applied Ecotoxicology[reference:16] |
| CRED Guidance Document (PDF) | Provides detailed explanations of each criterion, scoring examples, and overall methodology. | Available with the Excel tool download[reference:17] |
| SciRAP Reporting Checklists | Excel checklists to aid researchers in designing and reporting studies that meet CRED reliability criteria, facilitating future evaluations[reference:18]. | SciRAP website |
| NanoCRED / EthoCRED Modules | Specialized Excel tools for evaluating studies on nanomaterials or behavioural endpoints, extending the core CRED framework[reference:19]. | SciRAP website |
| Reference Literature | Key publications detailing the CRED method development, ring-test results, and comparative analyses[reference:20][reference:21]. | Open-access journals (e.g., Environmental Sciences Europe) |

The CRED Excel tools transform a rigorous methodological framework into a practical, efficient application for researchers, regulators, and drug‑development professionals. By standardizing the evaluation of ecotoxicity data, these tools enhance the transparency, consistency, and scientific defensibility of environmental risk assessments. Their continued adoption and integration into regulatory guidance documents promise to strengthen the foundation upon which environmental quality standards and safe chemical management are built.

The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQS) is a cornerstone of chemical hazard assessment, relying on the identification of reliable and relevant ecotoxicity studies [7]. Historically, the Klimisch method has been the predominant tool for evaluating study reliability, but its dependence on expert judgment has been shown to introduce bias and inconsistency into environmental risk assessments [8]. This variability can directly impact regulatory decisions, potentially leading to either underestimated environmental risks or unnecessary mitigation measures [8].

This document situates itself within a broader thesis arguing for the CRED evaluation method as a superior, science-based framework. CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) was developed to strengthen the transparency, consistency, and robustness of ecological hazard and risk assessments [7] [8]. By providing detailed, explicit criteria for both reliability and relevance, CRED minimizes subjective interpretation and ensures that all available data—including non-standard and peer-reviewed studies—are evaluated on a sound scientific basis [9]. The following application notes and protocols demonstrate how CRED is applied in practice, using real case examples to illustrate its role in promoting harmonized and defensible regulatory decision-making.

The CRED method is a comprehensive evaluation system consisting of two interconnected components: a set of evaluation criteria for assessors and a set of reporting recommendations for researchers [7]. Its development was informed by OECD test guidelines, existing evaluation methods, and practical regulatory expertise [8].

  • Evaluation Criteria: The core of the method includes 20 criteria for reliability and 13 criteria for relevance [7] [2]. Reliability pertains to the inherent quality and clarity of the study report, ensuring its findings are plausible and reproducible. Relevance assesses the appropriateness of the test organism, endpoint, exposure conditions, and other factors for the specific hazard or risk assessment question [8].
  • Reporting Recommendations: To facilitate high-quality evaluations, CRED also provides 50 reporting criteria across six categories (General Information, Test Design, Test Substance, Test Organism, Exposure Conditions, and Statistical Design & Biological Response) [7]. A well-reported study is more likely to be deemed reliable and suitable for regulatory use.
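Because a well-reported study is more likely to pass reliability screening, reporting completeness itself is worth tracking. A minimal sketch (the six category names follow the text above; the checklist structure and helper function are illustrative, not part of the official CRED materials):

```python
# The six CRED reporting categories, as listed in the text above.
REPORTING_CATEGORIES = [
    "General Information",
    "Test Design",
    "Test Substance",
    "Test Organism",
    "Exposure Conditions",
    "Statistical Design & Biological Response",
]

def reporting_completeness(checklist):
    """Fraction of reporting items marked 'reported' across all categories.

    checklist: dict mapping category name -> list of item statuses
    ('reported' / 'not reported').
    """
    items = [status for statuses in checklist.values() for status in statuses]
    if not items:
        raise ValueError("empty checklist")
    return sum(status == "reported" for status in items) / len(items)
```

A study scoring low on such a completeness check is a likely candidate for the "not assignable" reliability category.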

A major ring-test involving 75 risk assessors from 12 countries concluded that the CRED method was perceived as more accurate, consistent, transparent, and less dependent on expert judgment than the Klimisch method [8]. Table 1 summarizes the key differences between the two approaches.

Table 1: Comparative Analysis of the Klimisch and CRED Evaluation Methods [8]

| Characteristic | Klimisch Method | CRED Method |
| --- | --- | --- |
| Primary Focus | Reliability only | Reliability and relevance |
| Number of Criteria | 12-14 (ecotoxicity) | 20 reliability, 13 relevance |
| Guidance Detail | Limited, high-level | Extensive, with detailed guidance for each criterion |
| Handling of GLP/Standard Studies | Can be automatically favored | Each study is evaluated against detailed criteria regardless of GLP status |
| Evaluation Outcome | Qualitative categorization (e.g., reliable without restrictions) | Qualitative summary for both reliability and relevance, supported by criterion-level scoring |
| Transparency & Consistency | Lower; prone to assessor bias | Higher; structured criteria reduce subjectivity |

Application Notes: Case Study Evaluations

The following cases illustrate the application of the CRED evaluation method to real aquatic ecotoxicity studies. These examples are drawn from a ring-test designed to validate the CRED framework [8].

Case Study 1: Acute Toxicity of an Insecticide to Daphnia magna

  • Study Summary: A 48-hour acute immobilization test with the pyrethroid insecticide deltamethrin on the freshwater crustacean Daphnia magna. The study reported a median effective concentration (EC₅₀) [8].
  • CRED Evaluation Process:
    • Reliability Check: The assessor evaluated the study against the 20 reliability criteria. Key strengths included clear reporting of test concentration verification, water quality parameters, and adherence to OECD test guideline 202. A potential limitation noted was incomplete reporting on the genetic lineage or health status of the daphnid brood stock.
    • Relevance Check: The 13 relevance criteria were applied. The use of D. magna, a standard trophic level representative, and an acute apical endpoint (immobilization) was deemed relevant for a first-tier acute hazard assessment of a pesticide. The exposure duration (48h) was appropriate for the endpoint.
    • Overall Assessment: The study was judged to be of high reliability and high relevance for deriving an acute toxicity value for the hazard characterization of deltamethrin in freshwater environments.

Case Study 2: Chronic Effect of an Antibiotic on a Cyanobacterium

  • Study Summary: A 144-hour chronic growth inhibition test with the antibiotic erythromycin on the cyanobacterium Synechococcus leopoliensis [8].
  • CRED Evaluation Process:
    • Reliability Check: Evaluation revealed a critical flaw: the study did not report data on the stability of the antibiotic in the test system over the 144-hour period. Given erythromycin’s potential for photodegradation or hydrolysis, this lack of analytical confirmation of exposure concentrations was a major deficiency against the reliability criteria concerning test substance characterization and exposure verification.
    • Relevance Check: The test organism, a cyanobacterium, was relevant for assessing effects on primary producers. The endpoint (growth inhibition) was also chronic and ecologically meaningful. However, the relevance was compromised by the reliability concerns.
    • Overall Assessment: Due to the critical gap in exposure verification, the study was categorized as having low reliability. Despite its potential relevance, it was deemed unsuitable for quantitative use in standard setting without analytical confirmation of exposure [8].

Table 2: Summary of Case Studies from the CRED Ring-Test [8]

| Study Ref. | Test Organism | Taxonomic Group | Test Substance | Key Endpoint | CRED Reliability Outcome | CRED Relevance Outcome |
| --- | --- | --- | --- | --- | --- | --- |
| A | Daphnia magna | Crustacean | Deltamethrin (insecticide) | 48-h EC₅₀ (immobilization) | High | High |
| B | Lemna minor | Higher plant | Erythromycin (antibiotic) | 7-day NOEC (growth) | High | High |
| C | Synechococcus leopoliensis | Cyanobacterium | Erythromycin (antibiotic) | 144-h NOEC (growth) | Low (lack of exposure verification) | Moderate |
| D | Scenedesmus vacuolatus | Alga | Tetracycline (antibiotic) | 72-h EC₅₀ (growth) | Moderate | High |

Detailed Experimental Protocol for CRED Evaluation

This protocol provides a step-by-step guide for implementing the CRED evaluation method for an aquatic ecotoxicity study.

Materials and Preparation

  • CRED Evaluation Sheet: Obtain the official Excel-based CRED evaluation tool, which contains all criteria, guidance, and scoring fields [2].
  • Study to be Evaluated: The full text of the ecotoxicity study report or publication.
  • Reference Documents: Relevant OECD test guidelines, species background information, and any specific regulatory guidance for the substance class (e.g., pharmaceuticals, pesticides).

Procedure

Step 1: Study Screening and Identification

  • Read the study thoroughly to understand its objective, design, and results.
  • Identify the test substance, organism, endpoints, and exposure regime.

Step 2: Apply Reliability Criteria (20 items)

  • For each of the 20 reliability criteria, answer the guided questions in the CRED tool.
  • Criteria are grouped into domains: Test Substance Characterization, Test Organism, Exposure Design & Conditions, Endpoint Measurement & Statistics, and Reporting Clarity.
  • Example Criterion (Exposure Design): "Was the test medium renewed appropriately (e.g., static, semi-static, flow-through) for the test substance's stability and the test duration?" [7]
  • Score each criterion based on the provided guidance (e.g., Fully, Partially, Not, Not Reported).

Step 3: Apply Relevance Criteria (13 items)

  • Evaluate the study's appropriateness for the specific assessment context.
  • Criteria cover Biological Relevance (e.g., trophic level, life stage), Exposure Relevance (e.g., route, duration), Endpoint Relevance (e.g., apical vs. subcellular, ecological significance), and Environmental Relevance (e.g., test medium, concentration range) [7] [9].
  • Example Criterion (Endpoint Relevance): "Is the measured endpoint relevant for the protection goal of the assessment (e.g., population-relevant parameters such as survival, growth, reproduction)?" [8]

Step 4: Summarize and Classify

  • The CRED tool synthesizes the ratings into a summary statement on the study's overall reliability and relevance.
  • The final classification is not a simple "reliable/not reliable" but a transparent narrative summary, such as: "The study is considered to be of high reliability with only minor reporting deficiencies. It is highly relevant for assessing chronic plant toxicity in freshwater systems." [8]

Step 5: Decision for Use in Assessment

  • Studies with high reliability and high relevance form the core robust dataset for PNEC/EQS derivation.
  • Studies with high relevance but moderate or low reliability may be used in a weight-of-evidence approach or to identify data gaps but are not typically used for quantitative benchmark derivation.
  • Studies with low relevance to the assessment context are generally excluded, regardless of reliability.
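The acceptance rules in Step 5 amount to a small decision function. A sketch under the stated rules (labels are illustrative; combinations the text leaves open, such as high reliability with moderate relevance, default here to supporting evidence):

```python
def study_use(reliability, relevance):
    """Map CRED reliability/relevance outcomes ('high', 'moderate', 'low')
    to a regulatory use category, following the Step 5 rules above."""
    if relevance == "low":
        # Low relevance excludes a study regardless of reliability.
        return "excluded"
    if reliability == "high" and relevance == "high":
        return "core dataset (PNEC/EQS derivation)"
    if reliability in ("moderate", "low"):
        # Usable relevance but weaker reliability: weight-of-evidence
        # support or identification of a data gap.
        return "weight-of-evidence / data-gap identification"
    return "supporting evidence"
```

Encoding the logic this way makes the acceptance decision reproducible across assessors, mirroring CRED's broader aim.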

Start: Identify Study → Initial Screening & Read-Through → Apply 20 Reliability Criteria → Summarize Reliability Score, and in parallel Apply 13 Relevance Criteria → Summarize Relevance Score → Integrate Reliability & Relevance Summaries → Classify Study for Regulatory Use

Diagram 1: CRED Evaluation Workflow for a Single Study. This flowchart illustrates the stepwise process, from initial screening to final classification.

Advanced Applications and Extensions

The CRED framework has been adapted to address specialized data evaluation needs.

  • NanoCRED: A modified version for evaluating ecotoxicity studies of engineered nanomaterials. It includes additional criteria addressing nano-specific considerations such as particle characterization (size, surface charge, aggregation state), dispersion protocols, and dosimetry, which are crucial for assessing the reliability of nano-ecotoxicity data [3].
  • EthoCRED: A framework developed to guide the reporting and evaluation of behavioral ecotoxicity studies. Behavioral endpoints are increasingly recognized as sensitive and ecologically relevant but require specific evaluation of methodology (e.g., tracking technology, endpoint definition) to ensure reliability [3].
  • CRED for Sediment and Soil: Work is ongoing to extend the CRED principles to studies performed in sediment and soil matrices, incorporating criteria specific to these complex environmental compartments [3].

The Scientist's Toolkit: Research Reagent Solutions

High-quality, standardized materials are fundamental to generating ecotoxicity data that can meet CRED's high reliability standards. The following table details essential reagents and their functions.

Table 3: Key Research Reagent Solutions for Aquatic Ecotoxicity Testing

| Reagent/Material | Function in Ecotoxicity Studies | Importance for CRED Reliability |
| --- | --- | --- |
| Reference toxicants (e.g., KCl, NaCl, CuSO₄) | Used in periodic tests to confirm the consistent sensitivity and health of laboratory test-organism cultures. | Provides quality-control evidence for the test organism criterion, demonstrating organism fitness and standardized response [7]. |
| Analytical-grade test substances & solvents | Ensures test solutions are prepared from materials of known, high purity with minimal confounding impurities. | Critical for the test substance characterization criterion, allowing accurate dosing and reporting of nominal/measured concentrations [7] [9]. |
| Standardized dilution water (e.g., ISO, OECD reconstituted water) | Provides a consistent, defined medium for tests, controlling water hardness, pH, and ion composition. | Addresses the exposure conditions criteria by standardizing the test environment and reducing confounding abiotic stress [7]. |
| Certified Reference Materials (CRMs) for analytics | Used to calibrate instruments (e.g., HPLC, GC-MS) for verifying test substance concentration in solution. | Essential for fulfilling the exposure verification criterion, especially for unstable, volatile, or adsorbing substances, moving data from "nominal" to "measured" concentration status [8]. |
| Formulated food for test organisms (e.g., algae, yeast, cerophyll) | Provides standardized nutrition for organisms during chronic or sub-chronic tests. | Supports the test organism health and handling criteria by ensuring adequate, consistent nourishment, which can affect sensitivity and endpoint variability [7]. |

Decision Logic for Study Acceptance in Hazard Assessment

The final step in evaluation is determining how a study informs a regulatory hazard assessment. The CRED method promotes a transparent decision logic based on the combined reliability and relevance outcomes.

Start → Is reliability high?
  • Yes → Is relevance high? Yes → Accept for core dataset; No → Use as supporting evidence.
  • No → Is reliability moderate? Yes → Is relevance high? (as above); No → Reject for quantitative use and identify as a critical data gap.

Diagram 2: Decision Logic for Regulatory Use Based on CRED Evaluation. This logic tree visualizes the pathway from evaluation outcomes to regulatory application.

  • High Reliability + High Relevance: These studies are accepted into the core dataset for quantitative analysis and benchmark derivation (e.g., PNEC calculation) [8].
  • Moderate/High Relevance + Moderate/Low Reliability: These studies are not used for direct benchmark derivation but may provide supporting evidence in a weight-of-evidence assessment, help define assessment factors, or highlight specific modes of action [9].
  • Low Reliability: Studies are typically rejected for quantitative use, regardless of relevance. A study with high relevance but low reliability highlights a critical data gap that may need to be filled [8].

The CRED evaluation method represents a significant advancement in the critical appraisal of aquatic ecotoxicity data. By replacing subjective judgment with a structured, transparent, and criterion-driven process, CRED enhances the scientific rigor, consistency, and regulatory defensibility of environmental hazard assessments [7] [8]. Its ongoing adoption and adaptation—evidenced by tools like NanoCRED and EthoCRED—demonstrate its utility as a foundational framework within a broader thesis advocating for robust, evidence-based environmental decision-making. The provided protocols and case examples offer researchers and assessors a practical guide for implementing this essential tool.

Overcoming Challenges: Troubleshooting and Optimizing CRED Evaluations

Common Pitfalls in Reliability and Relevance Assessments

The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) necessitates a robust evaluation of the reliability and relevance of underlying ecotoxicity studies[reference:0]. Historically, this evaluation has relied heavily on expert judgment, leading to inconsistencies and potential bias in risk assessments[reference:1]. The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) project was developed to address this critical gap. CRED provides a standardized, transparent method for assessing aquatic ecotoxicity studies, comprising 20 reliability criteria and 13 relevance criteria, accompanied by extensive guidance[reference:2]. This article, framed within a broader thesis on the CRED evaluation method, details common pitfalls in traditional assessment approaches and outlines detailed application protocols to enhance consistency and scientific rigor in ecotoxicity data evaluation.

Common Pitfalls in Traditional Assessments

The widely used Klimisch method, while a historical step forward, exhibits several well-documented shortcomings that serve as common pitfalls in reliability and relevance assessments[reference:3].

  • Over-reliance on GLP and Standard Guidelines: The Klimisch method tends to favor studies conducted under Good Laboratory Practice (GLP) or using OECD/test guidelines, potentially leading assessors to automatically categorize them as reliable even if fundamental flaws (e.g., high control mortality, inappropriate endpoints) exist[reference:4]. This can exclude valid peer-reviewed data from regulatory consideration.
  • Lack of Detailed Guidance and Transparency: The method offers limited criteria for reliability evaluation and no specific guidance for assessing relevance, forcing a heavy dependence on subjective expert judgment[reference:5]. This insufficient guidance is a primary driver of inconsistency among different risk assessors.
  • Inconsistent Categorization Outcomes: Studies have demonstrated that the same ecotoxicity study may be categorized as "reliable with restrictions" by one assessor and "not reliable" by another when using the Klimisch method, leading to disagreements on a study's regulatory usability[reference:6].
  • Neglect of Relevance as a Systematic Dimension: While the Klimisch method focuses primarily on reliability, it lacks a structured framework for evaluating the relevance of a study for a specific hazard identification or risk characterization context[reference:7].
  • "Not Assignable" as a Limitation: The Klimisch category "not assignable" (R4) often results from a lack of reported details, but the method does not systematically guide assessors to identify which specific information is missing, limiting the study's potential for use even as supporting information.

The CRED evaluation method was explicitly designed to mitigate these pitfalls by providing a more detailed, criteria-based, and transparent checklist approach[reference:8].

Application Notes & Protocols for CRED Evaluation

Pre-Evaluation Preparation

  • Define the Assessment Context: Clearly state the regulatory purpose (e.g., EQS derivation under the Water Framework Directive, REACH hazard assessment), since relevance is context-dependent.
  • Acquire the CRED Tool: Obtain the freely available CRED Excel evaluation sheet, which contains the complete list of 20 reliability and 13 relevance criteria with accompanying guidance notes[reference:9].
  • Gather Complete Study Documentation: Secure the full text of the ecotoxicity study, including any supplementary materials.

Step-by-Step Evaluation Protocol

Phase 1: Reliability Assessment

  • Systematic Criteria Review: For each of the 20 reliability criteria (covering test design, substance characterization, organism details, exposure conditions, and statistical analysis), determine if the study fulfills the criterion (Yes/No/Not Applicable).
  • Document Justifications: For each "No" answer, record a brief justification referencing the specific deficiency in the study report (e.g., "analytical verification of exposure concentrations not reported").
  • Overall Reliability Categorization: Based on the pattern of fulfilled and unfulfilled criteria, assign the study to one of four categories:
    • R1: Reliable without restrictions.
    • R2: Reliable with restrictions.
    • R3: Not reliable.
    • R4: Not assignable (insufficient information reported).

Phase 2: Relevance Assessment

  • Contextual Criteria Review: Evaluate the study against the 13 relevance criteria, which assess the appropriateness of the test organism, endpoint, exposure regime, and measured effect in relation to the defined assessment context.
  • Relevance Categorization: Assign a relevance category:
    • C1: Relevant without restrictions.
    • C2: Relevant with restrictions.
    • C3: Not relevant.
    • C4: Not assignable.

Phase 3: Integrated Documentation

  • Complete the Evaluation Sheet: Ensure all criteria fields and justification boxes are filled in the CRED Excel tool.
  • Generate a Summary Statement: Produce a concise narrative summary integrating the reliability and relevance categorizations, highlighting key strengths and critical limitations.
Table 1: Comparison of Klimisch and CRED Evaluation Method Characteristics

| Characteristic | Klimisch Method | CRED Method |
| --- | --- | --- |
| Primary Data Type | Toxicity and ecotoxicity | Aquatic ecotoxicity |
| Number of Reliability Criteria | 12–14 (for ecotoxicity) | 20 (evaluation); 50 (reporting)[reference:10] |
| Number of Relevance Criteria | 0 | 13[reference:11] |
| Guidance Detail | No | Yes, extensive[reference:12] |
| Basis for Evaluation | Qualitative expert judgment | Structured criteria checklist |
Table 2: Ring-Test Results: Reliability Categorization for Selected Studies[reference:13][reference:14]

| Study | Test Organism | Endpoint | Klimisch Method (% of assessors) | CRED Method (% of assessors) |
| --- | --- | --- | --- | --- |
| Study A | Daphnia magna | 48-h EC₅₀ (immobilization) | R1/R2: 57%; R3/R4: 43% | R1/R2: 20%; R3/R4: 80% |
| Study D | Algae (Desmodesmus subspicatus) | 72-h NOEC (growth) | R2: 65%; R3: 35% | R2: 10%; R3: 60%; R4: 30% |
| Study E | Danio rerio (GLP report) | 40-day NOEC (sex ratio) | R1: 44%; R2: 56% | R1: 16%; R2: 21%; R3: 63% |

The data show that CRED typically results in more stringent reliability assessments, as its systematic checklist helps identify flaws (e.g., exposure above solubility limits) that may be overlooked in a less structured evaluation[reference:15].

Detailed Experimental Protocol: The CRED Ring-Test Methodology

The following protocol is adapted from the two-phased ring test used to validate the CRED method[reference:16].

Objective: To compare the consistency, accuracy, and user perception of the CRED evaluation method against the established Klimisch method.

Materials:

  • A set of 8 diverse ecotoxicity studies (peer-reviewed and GLP reports) covering different taxonomic groups (algae, crustaceans, fish) and chemical classes[reference:17].
  • Klimisch method evaluation guidelines.
  • Draft (or final) CRED evaluation sheets and guidance documents.
  • Standardized questionnaire for participant feedback.

Procedure:

  • Participant Recruitment & Training: Recruit risk assessors with varying experience levels from regulatory agencies, industry, and consultancies. Provide standardized training on both evaluation methods.
  • Phase I (Klimisch Evaluation): Randomly assign each participant 2 of the 8 studies. Each participant evaluates the reliability and relevance of their assigned studies using the Klimisch method and categorizes them accordingly.
  • Phase II (CRED Evaluation): After a washout period, assign each participant 2 different studies from the same set. Participants evaluate these using the CRED method.
  • Data Collection: Collect completed evaluation sheets with categorizations and justifications. Administer a questionnaire assessing participants' perception of each method's accuracy, consistency, practicality, and dependence on expert judgment.
  • Data Analysis: Analyze the consistency of categorizations across participants for each method. Calculate percentages of agreement/disagreement. Statistically compare mean categorization scores and analyze qualitative feedback.
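The agreement analysis in the final step can be sketched as a simple tally of category choices per study. This is a hypothetical helper for one consistency metric; the original ring-test analysis also included statistical comparison of mean scores and qualitative feedback.

```python
from collections import Counter

def category_agreement(categorisations):
    """Percentage of assessors selecting each category for one study.

    categorisations: list of category labels (e.g. 'R1'..'R4'),
    one per assessor. Returns {category: percent}.
    """
    if not categorisations:
        raise ValueError("no categorisations provided")
    counts = Counter(categorisations)
    total = len(categorisations)
    return {category: round(100 * n / total, 1)
            for category, n in counts.items()}
```

Running this per study and per method makes the spread of categorizations (and hence each method's consistency) directly comparable.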

Visualization of Evaluation Workflows

Diagram 1: CRED Evaluation Workflow for a Single Study

Title: CRED Reliability & Relevance Assessment Flowchart

Start Evaluation (Define Assessment Context) → Acquire Full Study Report → Phase 1: Reliability Assessment (Review 20 Reliability Criteria, Yes/No/NA) → Assign Reliability Category (R1–R4) → Phase 2: Relevance Assessment (Review 13 Relevance Criteria, Yes/No/NA) → Assign Relevance Category (C1–C4) → Document Justifications & Complete CRED Sheet → Generate Integrated Summary Statement → Evaluation Complete

Diagram 2: Method Comparison: Klimisch vs. CRED

Title: Klimisch vs. CRED Evaluation Process Comparison

Klimisch method: Study Report → Expert Judgment (limited criteria) → Output: reliability category (R1–R4), with no structured relevance output.

CRED method: Study Report → Structured Checklist (20 reliability + 13 relevance criteria) → Output: reliability category (R1–R4) + relevance category (C1–C4) + documented justifications.

Key difference: CRED replaces unstructured expert judgment with a transparent, criteria-based checklist.

| Tool / Resource | Function / Description | Key Utility |
| --- | --- | --- |
| CRED Excel Evaluation Sheet | The primary tool containing the 20 reliability and 13 relevance criteria with guidance and fields for justifications. | Enables standardized, transparent, and documented evaluations[reference:18]. |
| OECD Test Guidelines (e.g., 201, 210, 211) | International standard protocols for conducting ecotoxicity tests. | Provides the benchmark for assessing test design and reporting quality in reliability criteria. |
| ECHA REACH Guidance | Defines regulatory concepts of reliability and relevance and outlines data requirements. | Essential for framing the assessment context and understanding regulatory expectations. |
| SciRAP (Science in Risk Assessment and Policy) Platform | An online resource hosting CRED and related tools (e.g., NanoCRED, EthoCRED) for evaluating different data types. | Facilitates access to the latest evaluation frameworks and training materials[reference:19]. |
| NORMAN Ecotox Database | A database of ecotoxicity studies with associated reliability evaluations. | Useful for benchmarking and accessing previously evaluated data. |

The transition from traditional, expert-judgment-heavy methods like Klimisch to the structured, criteria-based CRED evaluation framework addresses fundamental pitfalls in reliability and relevance assessments. By mitigating over-reliance on GLP, providing explicit guidance, systematically evaluating relevance, and promoting transparency through documentation, CRED enhances the consistency, scientific rigor, and regulatory acceptance of ecotoxicity data. The detailed application notes, protocols, and tools outlined here provide a practical roadmap for researchers, risk assessors, and drug development professionals to implement this robust evaluation method, thereby strengthening the foundation of environmental hazard and risk assessment.

Handling Ambiguous or Incomplete Ecotoxicity Data

The reliability and relevance of ecotoxicity studies are foundational for deriving Predicted‑No‑Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs). However, evaluations are often hampered by ambiguous or incomplete data—missing methodological details, insufficient statistical reporting, or unclear exposure conditions. Such ambiguities introduce inconsistency and bias when assessors rely on subjective expert judgment. The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) project was developed to address this problem by providing a transparent, structured framework for evaluating study reliability and relevance[reference:0]. This Application Note outlines how the CRED method can be systematically applied to identify, classify, and handle ambiguous or incomplete ecotoxicity data within a regulatory or research context.

The CRED evaluation method comprises 20 reliability criteria and 13 relevance criteria, accompanied by detailed guidance[reference:1]. Reliability is scored using four categories:

  • R1 – Reliable without restrictions
  • R2 – Reliable with restrictions
  • R3 – Not reliable
  • R4 – Not assignable

The R4 category is specifically designed for ambiguous or incomplete data. It is applied when critical information needed to assess reliability is missing, such as studies reported only in abstracts, secondary literature, or those with insufficient experimental details[reference:2]. This explicit category forces assessors to document data gaps transparently, rather than making uncertain judgments.

Application Notes for Ambiguous or Incomplete Data

  • Missing test‑setup details: Unclear guideline compliance, lack of Good Laboratory Practice (GLP) statement, or omitted validity criteria.
  • Insufficient test‑substance characterization: Missing purity, CAS number, or formulation details.
  • Inadequate exposure documentation: No verification of test concentrations, poorly described exposure conditions, or absence of chemical‑analysis data.
  • Deficient statistical reporting: Lack of replicates, unspecified statistical methods, or missing concentration‑response curves.

How to Apply CRED Criteria to Ambiguous Data

  • Systematic checklist‑based review: Use the 20 reliability criteria as a checklist. For each criterion, note whether the required information is fully reported (F), partially reported (P), or not reported (N).
  • Identify critical gaps: Criteria deemed “critical” (e.g., test‑substance identification, exposure‑concentration verification, appropriate statistical methods) must be fulfilled for a study to be considered reliable. If any critical criterion is not reported, the study cannot score higher than R2 (Reliable with restrictions) or may be R4 (Not assignable).
  • Assign R4 when information is insufficient: If multiple critical criteria are missing, preventing any meaningful assessment of reliability, assign R4. This flags the study as “not assignable” due to ambiguous/incomplete reporting.
  • Document the gaps: Record which specific criteria are unmet. This documentation is essential for transparency and for informing future data‑curation efforts.
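The checklist review and gap documentation described above can be sketched as a small helper. This is an illustrative sketch, not an official CRED tool; the criterion names are hypothetical placeholders, and the scores use the F/P/N convention from the text.

```python
def gap_report(scores, criteria):
    """List criteria that are only partially reported (P) or not reported (N)."""
    labels = {"P": "partially reported", "N": "not reported"}
    # A criterion absent from the scores dict is treated as not reported
    return [f"{c}: {labels[scores.get(c, 'N')]}"
            for c in criteria if scores.get(c, "N") != "F"]

criteria = ["test_substance_id", "exposure_verification", "statistical_methods"]
scores = {"test_substance_id": "F", "exposure_verification": "N",
          "statistical_methods": "P"}
print(gap_report(scores, criteria))
# ['exposure_verification: not reported', 'statistical_methods: partially reported']
```

Recording the gaps in this structured form supports the transparency and data-curation goals noted above.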

Decision‑Making for Regulatory Use

  • R4 studies should generally be excluded from primary regulatory derivations (e.g., PNEC calculation) but may be retained as “supporting evidence” if the missing information is not critical for the specific assessment purpose.
  • R2 studies can be used with caution, provided the restrictions are clearly stated (e.g., “reliable with restrictions due to lack of chemical‑analysis data”).
  • Communicate limitations: The CRED “report card” format can summarize the reliability/relevance scores and explicitly list the data gaps, ensuring that any subsequent risk assessment acknowledges the uncertainties.

Detailed Protocols

Protocol 1: CRED‑Based Evaluation of Ecotoxicity Studies

Objective: To consistently evaluate the reliability and relevance of an aquatic ecotoxicity study using the CRED method.

Materials:

  • CRED reliability‑criteria checklist (Table 1)
  • CRED relevance‑criteria checklist (Table 2)
  • CRED scoring sheet (Excel template available from the SciRAP tools[reference:3])
  • Study manuscript or report

Procedure:

  • Prepare the checklist. Download the CRED evaluation sheet (Excel file) from the SciRAP website[reference:4].
  • Extract study information. Read the study thoroughly and extract all relevant details on test design, test substance, organism, exposure conditions, and statistical analysis.
  • Score reliability criteria. For each of the 20 reliability criteria, determine whether the information is fully reported (F), partially reported (P), or not reported (N). Use the guidance in the CRED publication to interpret each criterion[reference:5].
  • Assign a reliability category. Based on the scoring:
    • R1: All critical criteria are F.
    • R2: All critical criteria are at least P, but minor flaws exist.
    • R3: One or more critical criteria are N.
    • R4: Insufficient information to score multiple critical criteria (e.g., only an abstract is available).
  • Score relevance criteria. Evaluate the 13 relevance criteria (e.g., appropriateness of test organism, exposure duration, endpoint) for the specific assessment purpose.
  • Generate a report card. Summarize the reliability and relevance scores, list the unmet criteria, and note any ambiguous/missing data.
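The category-assignment rules above can be expressed as a small decision function. This is an illustrative sketch of the logic only, not a replacement for the CRED scoring sheet; which criteria count as critical is an assessor judgment.

```python
def assign_reliability(critical_scores, n_unscorable=0):
    """Map F/P/N scores on critical criteria to a CRED reliability category.

    critical_scores: dict of critical criterion -> 'F', 'P', or 'N'.
    n_unscorable: critical criteria that cannot be scored at all
    (e.g., when only an abstract is available).
    """
    if n_unscorable > 1:
        return "R4"   # not assignable: insufficient information
    if any(s == "N" for s in critical_scores.values()):
        return "R3"   # not reliable: a critical criterion failed
    if all(s == "F" for s in critical_scores.values()):
        return "R1"   # reliable without restrictions
    return "R2"       # reliable with restrictions (minor flaws)

print(assign_reliability({"substance_id": "F", "exposure_verified": "P"}))  # R2
```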

Table 1: CRED Reliability Criteria (Abridged)[reference:6]

| Category | Criterion Number | Criterion (Abbreviated) |
| --- | --- | --- |
| General information | – | Physicochemical parameters of the compound known? |
| Test setup | 1 | Guideline method (e.g., OECD/ISO) used? |
| | 2 | Performed under GLP? |
| | 3 | Validity criteria fulfilled? |
| | 4 | Appropriate controls performed? |
| Test compound | 5 | Test substance identified (name/CAS)? |
| | 6 | Purity or trustworthy source reported? |
| | 7 | Formulation/impurities characterized? |
| Test organism | 8 | Organisms well described (species, life stage, etc.)? |
| | 9 | Organisms from trustworthy source and acclimatized? |
| Exposure conditions | 10 | Experimental system appropriate for test substance? |
| | 11 | Experimental system appropriate for test organism? |
| | 12 | Exposure concentrations below water solubility? |
| | 13 | Correct spacing between concentrations? |
| | 14 | Exposure duration defined? |
| | 15 | Chemical analyses to verify concentrations? |
| | 16 | Biomass loading appropriate? |
| Statistical design & biological response | 17 | Sufficient replicates and organisms per replicate? |
| | 18 | Appropriate statistical methods used? |
| | 19 | Concentration‑response curve observed? |
| | 20 | Sufficient data to check endpoint calculation? |

Table 2: CRED Relevance Criteria (Abridged)[reference:7]

| Criterion Category | Example Questions |
| --- | --- |
| Test organism | Is the test organism relevant to the assessment endpoint (e.g., trophic level, geographic region)? |
| Exposure scenario | Does the exposure duration match the expected environmental exposure? |
| Endpoint | Is the measured endpoint (e.g., mortality, reproduction, growth) relevant to the protection goal? |
| Environmental relevance | Are test conditions (pH, temperature, water hardness) representative of the receiving environment? |

Protocol 2: Handling “Not Assignable” (R4) Studies

Objective: To systematically document and decide on the use of studies that are “not assignable” due to ambiguous or incomplete data.

Procedure:

  • Identify the missing information. List the specific reliability criteria that cannot be scored due to missing information.
  • Contact authors (optional). If the study is recently published, consider contacting the corresponding author to request missing details.
  • Determine criticality. Assess whether the missing information is critical for the specific regulatory purpose. For example, missing chemical‑analysis data may be critical for a volatile compound but less critical for a stable, highly soluble substance.
  • Make a use decision. Based on the criticality:
    • Exclude: If the missing information is critical and cannot be inferred, exclude the study from the primary dataset.
    • Retain as supporting evidence: If the missing information is not critical, retain the study but clearly label it as “supporting evidence” and document the gaps.
  • Document the decision. In the evaluation report, state the rationale for assigning R4 and the decision on use. This transparency is key for audit trails and future re‑evaluations.
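The use decision in Protocol 2 reduces to a simple rule once criticality has been judged. The sketch below is illustrative; the criterion names are hypothetical, and the volatile-compound example follows the criticality discussion above.

```python
def r4_use_decision(missing_criteria, critical_for_purpose):
    """Decide how an R4 study may be used, given which reliability criteria
    are missing and which of those are critical for the assessment purpose."""
    critical_gaps = sorted(set(missing_criteria) & set(critical_for_purpose))
    if critical_gaps:
        return "exclude from primary dataset", critical_gaps
    return "retain as supporting evidence", []

decision, gaps = r4_use_decision(
    missing_criteria=["chemical_analysis", "organism_source"],
    critical_for_purpose=["chemical_analysis"],  # e.g., for a volatile compound
)
print(decision, gaps)  # exclude from primary dataset ['chemical_analysis']
```

Whatever the outcome, the rationale and the listed gaps should be documented in the evaluation report, as the protocol requires.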

Visualization of Workflows

Diagram 1: CRED Evaluation Workflow for Ambiguous Data

Start: Ecotoxicity Study → Extract Study Details → Apply CRED Reliability Checklist (20 criteria) → Are critical criteria fully reported? Yes → R1 (reliable without restrictions); Partially → R2 (reliable with restrictions); No (flaws) → R3 (not reliable); Insufficient information → R4 (not assignable: ambiguous/incomplete). For R4 studies: Document Gaps & Decision Rationale → Determine Regulatory Use: include in primary dataset (gaps non‑critical), exclude from primary dataset (gaps critical), or retain as supporting evidence (gaps moderate).

Diagram 2: Handling of “Not Assignable” (R4) Studies

Study Assigned R4 (Not Assignable) → Identify Missing Information → Assess Criticality for Assessment Purpose (optionally contacting authors for details when criticality is unclear) → Critical gaps: Exclude from Primary Dataset; Non‑critical gaps: Retain as Supporting Evidence → Document in CRED Report Card.

The Scientist’s Toolkit

Table 3: Essential Resources for Handling Ambiguous/Incomplete Ecotoxicity Data

| Tool/Resource | Format/Location | Purpose |
| --- | --- | --- |
| CRED Evaluation Sheet | Excel template (SciRAP website)[reference:8] | Structured checklist for scoring 20 reliability and 13 relevance criteria. |
| CRED Reporting Recommendations | 50‑criteria list (published supplement)[reference:9] | Guidance for researchers on reporting essential study details to avoid ambiguities. |
| CREED Workbook | Excel template (Supporting Information of Di Paolo et al. 2024)[reference:10] | Tool for evaluating exposure‑data usability, adaptable for ecotoxicity data gaps. |
| EthoCRED Framework | Published guidelines[reference:11] | Extension of CRED for behavioral ecotoxicity studies, useful for specialized endpoints. |
| NanoCRED Framework | Published framework[reference:12] | Adaptation of CRED for nanomaterials, addressing specific ambiguities in nano‑ecotoxicity studies. |
| Ring‑Test Data | Supplemental data of CRED publication[reference:13] | Examples of how assessors applied CRED to real studies, illustrating handling of ambiguous data. |
| Chemical Monitoring Databases (e.g., NORMAN) | Online platforms | Source of supplementary exposure data that may help fill gaps in ecotoxicity studies. |
| GLP/Guideline Documents (OECD, ISO) | Official guideline texts | Reference for determining whether a study followed accepted test protocols. |

Ambiguous or incomplete ecotoxicity data pose a significant challenge to consistent risk assessment. The CRED evaluation method provides a standardized, transparent framework to identify and categorize such data through its explicit “not assignable” (R4) category. By following the protocols and using the tools outlined here, researchers and risk assessors can systematically document data gaps, make reproducible use decisions, and communicate uncertainties effectively. This approach aligns with the broader thesis that CRED‑based evaluation enhances the reliability, transparency, and regulatory usability of ecotoxicity data.


References

  • Moermond, C. T. A., Kase, R., Korkaric, M., & Ågerstrand, M. (2016). CRED: Criteria for reporting and evaluating ecotoxicity data. Environmental Toxicology and Chemistry, 35(5), 1297–1309.
  • The CRED tools for ecotoxicity studies. (2025). SciRAP – Science in Risk Assessment and Policy. Retrieved December 2, 2025.
  • Di Paolo, C., Bramke, I., Stauber, J., Whalley, C., et al. (2024). Implementation of the CREED approach for environmental assessments. Integrated Environmental Assessment and Management, 20(4), 1019–1034.

The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) framework was established to address critical inconsistencies in the regulatory evaluation of ecotoxicity studies. It provides a transparent, structured method with 20 reliability criteria and 13 relevance criteria to assess the quality and applicability of aquatic ecotoxicity data, moving beyond the less precise Klimisch method [1] [2]. The core objective of CRED is to improve the reproducibility, consistency, and transparency of data evaluations across different regulatory bodies and assessors, thereby strengthening the scientific foundation for deriving Predicted No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) [1].

However, the unique challenges posed by emerging contaminant classes and novel endpoints necessitate specialized adaptations of the core CRED principles. Two significant extensions have been developed:

  • NanoCRED: Addresses the evaluation of ecotoxicity data for engineered nanomaterials (ENMs), which exhibit fundamentally different behaviors (e.g., agglomeration, dissolution, dynamic exposure concentrations) compared to conventional soluble chemicals [10] [11].
  • EthoCRED: Provides a framework for assessing behavioral ecotoxicity studies, a sensitive and ecologically relevant endpoint that is rarely covered by standardized test guidelines, leading to its underuse in regulatory risk assessment [12] [13].

Both extensions are designed to be interoperable with the original CRED framework, ensuring they can be readily implemented into existing regulatory workflows in the European Union and internationally [10] [13]. This article details the application notes, protocols, and evaluation criteria for these specialized tools within the broader thesis of advancing robust ecotoxicity data evaluation.

Comparative Analysis of CRED Frameworks

The following table summarizes the core architecture and focus of the three interrelated evaluation frameworks.

Table 1: Comparison of CRED, NanoCRED, and EthoCRED Evaluation Frameworks

Framework Feature CRED (Core Framework) NanoCRED (Nanomaterial Extension) EthoCRED (Behavioral Study Extension)
Primary Objective Improve consistency & transparency in evaluating aquatic ecotoxicity studies [1]. Assess regulatory adequacy of ecotoxicity data for nanomaterials [10]. Guide evaluation & reporting of behavioral ecotoxicity data [12].
Core Evaluation Pillars Reliability (20 criteria) and Relevance (13 criteria) [1]. Reliability and Relevance, adapted for nano-specific aspects [10]. Reliability (29 criteria) and Relevance (14 criteria), adapted for behavior [13].
Key Adapted/Novel Criteria Focus General test design, performance, reporting, and statistical analysis [1]. Material characterization (size, coating, purity), exposure characterization (dispersion, stability, agglomeration), and dosimetry [10]. Behavioral endpoint definition, assay validation & sensitivity, environmental realism, individual variability, and population-level relevance [13].
Target Data Sources Standard guideline studies and non-guideline peer-reviewed literature [1]. Guideline, modified guideline, and non-guideline studies on nanomaterials [10]. Diverse behavioral studies, typically from non-guideline, peer-reviewed research [13].
Regulatory Integration Piloted in EU EQS derivation and Swiss assessments [2]. Designed for EU & international regulatory frameworks (e.g., REACH) [10]. Designed for implementation into regulatory frameworks to integrate behavioral data [4].

Application Notes and Protocols

NanoCRED: Application for Nanomaterial Ecotoxicity Studies

The NanoCRED framework applies the CRED structure while imposing stringent, nano-specific requirements for material and exposure characterization. A study's reliability is contingent upon comprehensive reporting in these areas, as the unique physicochemical properties of ENMs dictate their environmental fate and biological interactions [10].

Key Application Protocol: Characterizing Nanomaterial Exposure in Aquatic Test Systems

A reliable nano-ecotoxicity study must characterize the test material and its behavior throughout the exposure period. The following protocol is essential:

  • Pre-Exposure Material Characterization: Report primary particle size (via TEM), hydrodynamic size and size distribution in the stock dispersion (via DLS), surface charge (zeta potential), crystalline phase, chemical composition, and surface coating/functionalization [10].
  • Exposure Medium Characterization: Characterize the nanomaterial in the actual test medium at experiment initiation. This includes measuring hydrodynamic size, zeta potential, and dissolution kinetics (e.g., measuring released ions via ICP-MS for metals/metal oxides). Document the dispersion protocol (sonication energy, use of dispersants) in detail [10].
  • Exposure Stability Monitoring: Monitor and report key parameters throughout the test duration. This includes:
    • Temporal stability: Changes in hydrodynamic size, agglomeration/aggregation state, and settling rates over time.
    • Actual exposure concentrations: Measured concentrations of the particulate form (e.g., via sp-ICP-MS) and dissolved ions in test vessels at multiple time points, not just nominal concentrations.
    • Background controls: Confirm the absence of significant contamination or interference [10].

Evaluation Note: A study that only reports nominal concentrations and lacks characterization of the material in the test medium is considered less reliable for quantitative risk assessment, as the true exposure dose is unknown [10].
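One way to operationalize this note is to compare measured and nominal concentrations over the test duration. The sketch below is illustrative; the ±20% window follows a convention commonly used in OECD aquatic test guidelines for deciding whether results may be based on nominal concentrations, and the concentration values are hypothetical.

```python
def exposure_within_tolerance(nominal, measured, tolerance=0.20):
    """True if every measured concentration stays within +/- tolerance of nominal."""
    return all(abs(m - nominal) / nominal <= tolerance for m in measured)

# Hypothetical time series (mg/L) for a nanomaterial suspension
print(exposure_within_tolerance(1.0, [0.95, 0.90, 0.84]))  # True
print(exposure_within_tolerance(1.0, [0.95, 0.70, 0.55]))  # False: settling/dissolution losses
```

When the check fails, effect concentrations should be expressed against measured rather than nominal exposure, consistent with the dosimetry requirements above.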

EthoCRED: Application for Behavioral Ecotoxicity Studies

EthoCRED adapts CRED to accommodate the diverse methodologies and endpoints in behavioral research. It emphasizes the ecological relevance of behavioral endpoints and the technical rigor required for robust behavioral measurement [13].

Key Application Protocol: Validating and Conducting Behavioral Assays

To ensure reliability, behavioral studies must demonstrate that the chosen assay is a valid, sensitive, and reproducible measure of the target behavior.

  • Assay Selection and Justification: Justify the choice of behavioral endpoint (e.g., locomotion, anxiety, sociability, learning) based on the known or suspected mode of action of the test chemical or its ecological relevance to the test species [13].
  • Assay Validation and Sensitivity:
    • Positive/Negative Controls: Include positive control treatments (e.g., a compound known to alter the behavior) to demonstrate the assay's sensitivity.
    • Habituation: Implement appropriate habituation periods for the test organism to the assay apparatus to reduce novelty stress.
    • Baseline Behavior: Document normal behavioral variance in control populations.
  • Experimental Design:
    • Blinding: The experimenter conducting behavioral scoring or analysis must be blind to the treatment groups to eliminate observer bias.
    • Apparatus & Environment: Standardize and fully describe the testing apparatus, lighting, noise, and other environmental conditions that may influence behavior.
    • Temporal Considerations: Report the time of day of testing, as many behaviors are subject to circadian rhythms.
  • Data Acquisition & Analysis:
    • Use automated tracking systems where possible to enhance objectivity, precision, and the richness of data (e.g., path length, velocity, turn angle, time in zone).
    • If manual scoring is used, demonstrate inter-rater reliability (e.g., Cohen's kappa score).
    • Apply appropriate statistical models that account for individual identity, repeated measures, and the distribution of behavioral data [13].

Evaluation Note: EthoCRED places high importance on demonstrating the population-level relevance of observed behavioral changes. Researchers are encouraged to discuss how alterations in behavior (e.g., reduced foraging, increased predator susceptibility, disrupted mating) could translate to impacts on survival, growth, or reproduction at the population level [13].
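Where manual scoring is used, the inter-rater reliability check mentioned under Data Acquisition & Analysis can be quantified with Cohen's kappa. The sketch below implements the standard two-rater formula; the behavior labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same trials into categories."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

a = ["active", "inactive", "active", "active"]
b = ["active", "inactive", "inactive", "active"]
print(round(cohens_kappa(a, b), 2))  # 0.5
```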

Framework Workflow and Evaluation Pathways

The logical relationship between the core CRED framework and its specialized extensions, as well as the internal evaluation workflow, is visualized below.

Core CRED Framework (20 reliability, 13 relevance criteria) → Challenge: unique nanomaterial properties (agglomeration, dissolution) → NanoCRED Extension (adapts criteria) → Output: evaluated nanomaterial study suitable for risk assessment.

Core CRED Framework → Challenge: diverse and non-standard behavioral endpoints → EthoCRED Extension (adapts and expands criteria) → Output: evaluated behavioral study suitable for risk assessment.

CRED Framework and Its Specialized Extensions

Ecotoxicity Study for Evaluation → 1. Domain Identification (is it a nanomaterial or behavioral study?) → 2. Apply Core CRED Criteria (all studies), plus 3a. Nano-Specific Criteria (NanoCRED) for nanomaterial studies or 3b. Behavior-Specific Criteria (EthoCRED) for behavioral studies → 4. Integrated Assessment (combine core and specific evaluations) → Final Reliability & Relevance Category Assignment.

Workflow for Evaluating Studies with CRED Extensions

The Scientist's Toolkit: Essential Reagents and Materials

Table 2: Key Research Reagent Solutions and Essential Materials

| Item Category | Specific Item / Technique | Function in Protocol | Framework Reference |
| --- | --- | --- | --- |
| Nanomaterial Characterization | Dynamic Light Scattering (DLS) | Measures hydrodynamic particle size distribution and agglomeration state in suspension [10]. | NanoCRED |
| | Transmission Electron Microscopy (TEM) | Provides primary particle size, shape, and aggregation morphology data [10]. | NanoCRED |
| | Zeta Potential Analyzer | Assesses surface charge and colloidal stability of nanomaterial dispersions [10]. | NanoCRED |
| | Inductively Coupled Plasma Mass Spectrometry (ICP-MS) | Quantifies total metal/metal oxide content and dissolution kinetics (in filtrate) [10]. | NanoCRED |
| Exposure System | Sonication Probe & Dispersants | Standardizes the dispersion and de-agglomeration of nanomaterial stock solutions [10]. | NanoCRED |
| Chemical Dosing & Tracking | Passive Dosing Systems (e.g., PDMS, SPME fibers) | Maintains stable, truly dissolved concentrations of hydrophobic test chemicals, overcoming loss issues [13]. | EthoCRED / CRED |
| Behavioral Assay | Automated Video Tracking Software (e.g., EthoVision, Noldus) | Objectively quantifies locomotion, position, and complex behavioral patterns with high throughput [13]. | EthoCRED |
| | Custom Behavioral Arenas | Apparatus tailored to measure specific endpoints (e.g., Y-mazes for learning, open field for anxiety) [13]. | EthoCRED |
| | Positive Control Compounds | Chemicals with known behavioral effects (e.g., ethanol, caffeine) used to validate assay sensitivity [13]. | EthoCRED |
| General Ecotoxicity | Standard Reference Toxicants (e.g., KCl, NaCl, CuSO₄) | Validates the health and sensitivity of test organism batches in regular laboratory tests [1]. | CRED |

Best Practices for Consistent and Transparent CRED Evaluations

The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) method represents a pivotal advancement in the environmental risk assessment of chemicals, developed to address the critical shortcomings of the long-standing Klimisch method [8]. Within the context of a broader thesis on ecotoxicity data evaluation, the CRED framework is not merely a procedural checklist but a foundational philosophy emphasizing transparency, consistency, and comprehensive science-based judgment. It was developed from existing OECD test guidelines and evaluation methods to strengthen the robustness of hazard and risk assessments [8]. The method systematically evaluates both the reliability (the inherent quality and clarity of a study's methodology and reporting) and relevance (the appropriateness of the data for a specific hazard identification or risk characterization) of aquatic ecotoxicity studies [8]. This dual focus ensures that regulatory and research decisions are based on a thorough, transparent, and verifiable foundation, moving beyond a binary "reliable/unreliable" judgment to a more nuanced understanding of a study's scientific value and limitations [3].

Core Principles and Comparative Advantages of the CRED Method

The development of CRED was motivated by identified inconsistencies and a lack of detailed guidance in the Klimisch method, which often led to evaluations that were overly dependent on individual expert judgment and predisposed to favor Good Laboratory Practice (GLP) studies without critical examination of potential flaws [8]. A formal ring test involving 75 risk assessors from 12 countries demonstrated that CRED provides a more detailed, transparent, and consistent evaluation process, which participants found more accurate and practical [8].

Table 1: Core Characteristics of the Klimisch and CRED Evaluation Methods [8]

| Characteristic | Klimisch Method | CRED Evaluation Method |
| --- | --- | --- |
| Primary Data Type | General toxicity and ecotoxicity | Aquatic ecotoxicity |
| Evaluation Dimensions | Reliability only | Reliability and Relevance |
| Number of Reliability Criteria | 12–14 (for ecotoxicity) | 20 evaluation criteria (50 reporting criteria) |
| Number of Relevance Criteria | 0 | 13 criteria |
| Alignment with OECD Reporting | Includes 14 of 37 criteria | Includes all 37 OECD criteria |
| Guidance Provided | Limited | Detailed, step-by-step guidance |
| Result Summary | Qualitative reliability score | Qualitative scores for both reliability and relevance |

The key advantages of the CRED method, as framed within a research thesis, include:

  • Enhanced Transparency: By using a high number of specific criteria and providing explicit guidance, CRED makes the evaluator's reasoning traceable and auditable.
  • Improved Consistency: The structured approach minimizes subjective variance between different assessors, leading to more harmonized data sets for derivation of environmental quality standards [8].
  • Comprehensive Assessment: The separate evaluation of relevance ensures that even a technically reliable study is scrutinized for its applicability to the specific assessment context (e.g., appropriate test species, endpoint, exposure regime).
  • Broader Data Inclusion: By providing tools to critically appraise peer-reviewed literature, CRED facilitates the use of a wider body of scientific evidence beyond GLP-registered studies, which can save resources and provide ecological relevance [8].

Detailed Application Notes and Protocols

Protocol 1: Conducting a Standard CRED Evaluation for an Aquatic Ecotoxicity Study

This protocol outlines the step-by-step application of the CRED method for a single study, as would be performed in a systematic review or regulatory dossier preparation.

1. Preparation and Initial Review:

  • Objective: To gain a general understanding of the study and gather necessary information.
  • Procedure:
    a. Obtain the full text of the ecotoxicity study to be evaluated.
    b. Read the study thoroughly to understand the objectives, test substance, test organism, methodology, results, and conclusions.
    c. Identify the key elements: test organism (e.g., Daphnia magna), endpoint (e.g., 48-h EC50 for immobilization), and test substance information [8].

2. Reliability Evaluation:

  • Objective: To assess the inherent scientific quality and reporting completeness of the study.
  • Procedure:
    a. Use the CRED reliability evaluation sheet, which contains 20 criteria [8].
    b. For each criterion (e.g., "Test concentration verification," "Control performance," "Endpoint determination"), evaluate the study based on the descriptions and guidance provided.
    c. Assign a qualitative judgment for each criterion (e.g., "Fully," "Partially," "Not," or "Not Reported").
    d. Based on the pattern of judgments across all criteria, assign an overall reliability category: Reliable without restrictions, Reliable with restrictions, Not reliable, or Not assignable.

3. Relevance Evaluation:

  • Objective: To determine the appropriateness and usefulness of the study for the specific assessment question.
  • Procedure:
    a. Use the CRED relevance evaluation sheet, which contains 13 criteria [8].
    b. Evaluate criteria such as "Biological relevance of the endpoint," "Appropriateness of the test species," "Environmental relevance of exposure duration," and "Test concentration range."
    c. Assign a judgment for each criterion (e.g., "High," "Medium," "Low," or "Not applicable").
    d. Synthesize judgments to form an overall conclusion on the study's relevance for the assessment context.

4. Documentation and Integration:

  • Objective: To create a transparent record and integrate the evaluation into the broader data set.
  • Procedure:
    a. Document the rationale for each criterion judgment, citing specific sections, figures, or tables from the source study.
    b. Summarize the overall reliability and relevance conclusions.
    c. For research synthesis (e.g., a meta-analysis or thesis chapter), use this evaluation to decide on the study's inclusion, weighting, or to identify data gaps and uncertainties.

Protocol 2: Implementing a Ring Test for Evaluator Consistency (Based on CRED Development)

This protocol describes the methodology used to validate and compare evaluation frameworks, which can be adapted to train assessors or test new criteria within a research group [8].

1. Design and Study Selection:

  • Objective: To create a robust test comparing evaluation methods or evaluator performance.
  • Procedure:
    a. Select a suite of 6-10 ecotoxicity studies that represent a range of qualities (e.g., GLP and non-GLP, different test organisms, various reporting clarity levels) [8].
    b. Ensure studies cover multiple taxonomic groups (e.g., algae, crustaceans, fish) and chemical classes [8].
    c. Randomly assign studies to participants to ensure each study is evaluated independently by multiple assessors.

2. Phased Evaluation Process:

  • Objective: To collect comparative data on evaluation methods.
  • Procedure:
    a. Phase I: Have all participants evaluate their assigned studies using the standard method (e.g., Klimisch method) [8].
    b. Phase II: Have the same participants evaluate a different set of assigned studies using the new method (e.g., CRED method) [8]. A washout period is recommended between phases.
    c. Provide participants with standardized evaluation sheets and guidance documents for both methods.

3. Data Analysis and Consistency Measurement:

  • Objective: To quantify agreement between assessors and perceptions of the method.
  • Procedure:
    a. Calculate the percentage agreement among assessors for the overall reliability categorization of each study for each method.
    b. Use statistical tests (e.g., Fleiss' kappa) to measure inter-rater agreement beyond chance.
    c. Administer a post-test questionnaire to gather qualitative data on the perceived clarity, practicality, and transparency of each method [8].
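Both agreement measures in steps (a) and (b) can be sketched in plain Python. The count matrix below is purely illustrative, not ring-test data:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a studies-by-categories count matrix.

    ratings[i][j] = number of raters placing study i in category j;
    every study must be rated by the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = sum(ratings[0])
    # Observed per-study agreement P_i, then its mean P_bar
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in ratings]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement P_e from the pooled category proportions
    totals = [sum(row[j] for row in ratings) for j in range(len(ratings[0]))]
    grand = n_subjects * n_raters
    p_e = sum((t / grand) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)

def percent_agreement(ratings):
    """Average share of raters choosing the modal category per study."""
    return 100.0 * sum(max(row) / sum(row) for row in ratings) / len(ratings)

# Three studies, ten assessors, categories R1/R2/R3 (illustrative counts):
counts = [[8, 2, 0], [1, 7, 2], [0, 3, 7]]
print(percent_agreement(counts), fleiss_kappa(counts))
```

Percentage agreement alone is inflated by chance when one category dominates, which is why step (b) recommends a chance-corrected statistic such as Fleiss' kappa.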

Table 2: Example Ring Test Results Comparing Evaluator Consistency [8]

Evaluation Method Number of Participants Average Agreement on Reliability Category Participant Perception: Ease of Use Participant Perception: Transparency
Klimisch Method 75 Lower Mixed; faster but more subjective Lower; less guidance provided
CRED Method 75 Higher Favorable; clear criteria save time overall Higher; detailed guidance reduces ambiguity

Visualization of the CRED Evaluation Workflow and Its Role in Research Synthesis

The following diagrams illustrate the procedural workflow of a CRED evaluation and its conceptual integration into a research thesis.

[Workflow diagram] Start: identify the study for evaluation → Phase 1: Preparation (full-text review and element identification) → Phase 2: Reliability Evaluation (assess 20 criteria) → Phase 3: Relevance Evaluation (assess 13 criteria) → Phase 4: Synthesis and Documentation (final categorization and rationale) → Output: evaluated study with reliability and relevance conclusions.

CRED Evaluation Procedural Workflow

[Framework diagram] Primary literature and existing databases feed into the CRED method, the core evaluation engine applied by the thesis. The method produces consistently evaluated and filtered studies, which feed data synthesis (meta-analysis, weight of evidence, species sensitivity distributions). Synthesis generates the thesis output, a refined assessment or new criteria, which in turn informs the framework.

CRED's Role in a Research Thesis Framework

The Scientist's Toolkit for CRED-Based Ecotoxicity Research

This toolkit details essential resources and materials central to performing or developing research based on the CRED evaluation framework.

Table 3: Essential Research Reagent Solutions and Materials for CRED-Based Ecotoxicology

Tool/Resource Category Specific Item or Concept Function in CRED Evaluation & Research
Reference Toxicants Standardized chemical solutions (e.g., Potassium dichromate for Daphnia). Used to assess the health and sensitivity of test organism batches. CRED evaluation checks for reporting of control and reference substance performance data [8].
Standard Test Organisms Cultured strains of Aliivibrio fischeri, Desmodesmus subspicatus, Daphnia magna, Danio rerio [8]. Provide baseline relevance. CRED relevance criteria assess the appropriateness of the test species for the assessment endpoint [8].
OECD Test Guidelines OECD TG 201 (Algae), 202 (Daphnia sp.), 203 (Fish), 210 (Fish Early-Life Stage), etc. Form the methodological benchmark. CRED reliability criteria are aligned with OECD reporting requirements to evaluate protocol adherence [8].
CRED Evaluation Sheets Excel-based scoring sheets with embedded criteria for reliability and relevance [3]. The primary tool for structured evaluation, ensuring all critical aspects of a study are consistently reviewed and documented.
Chemical Analysis Standards Certified reference materials for the test substance (e.g., specific pharmaceutical, industrial chemical) [8]. Essential for verifying exposure concentrations. CRED reliability heavily weights the reporting of measured versus nominal concentrations [8].
Specialized CRED Extensions NanoCRED (for nanomaterials), EthoCRED (for behavioral endpoints), CRED for sediment/soil [3]. Tailored frameworks for evaluating studies in specialized sub-fields, ensuring the core CRED principles are applied to novel data types within a research thesis.
Reporting Guideline SPIRIT 2025 Statement (for trial protocols) [14]. Informs the development of prospective study protocols that are complete and transparent, aligning with CRED's goal of improving initial study reporting to facilitate later evaluation.

Validating CRED: Comparative Analysis and Performance Metrics

The regulatory assessment of chemicals hinges on the quality of underlying ecotoxicity data. For decades, the Klimisch method has been the cornerstone for evaluating study reliability within frameworks like REACH and the Water Framework Directive [8]. However, its reliance on broad criteria and expert judgment has led to inconsistencies, potentially compromising the harmonization of hazard assessments [8]. This article details a pivotal ring test (or inter-laboratory comparison) that rigorously compared the established Klimisch method with the novel Criteria for Reporting and Evaluating ecotoxicity Data (CRED) evaluation method [8] [15].

The development of the CRED method represents a core thesis in modern ecotoxicology: that transparent, criteria-driven evaluation frameworks are essential for robust and reproducible environmental risk assessment [1]. CRED was designed to address specific criticisms of the Klimisch method, such as its bias towards Good Laboratory Practice (GLP) studies, lack of detailed guidance for relevance evaluation, and its failure to ensure consistency among different assessors [8] [16]. By providing a structured set of criteria with extensive guidance, CRED aims to reduce subjective expert judgment, increase transparency, and facilitate the inclusion of high-quality peer-reviewed studies in regulatory dossiers [1].

The ring test discussed herein was the definitive experiment to test this thesis. It involved a large cohort of international risk assessors evaluating identical studies using both methods, providing quantitative and qualitative evidence on their performance [8].

The foundational differences between the Klimisch and CRED methods are structural and philosophical. The table below summarizes their key characteristics.

Table 1: General Characteristics of the Klimisch and CRED Evaluation Methods [8] [1]

Characteristic Klimisch Method CRED Evaluation Method
Primary Focus Reliability evaluation. Reliability and relevance evaluation.
Number of Criteria 12-14 for ecotoxicity (reliability only). 20 reliability criteria & 13 relevance criteria.
Guidance Provided Limited, high-level criteria. Extensive, detailed guidance for each criterion.
Evaluation Output Single reliability score (R1-R4). Separate, qualitative summaries for reliability and relevance.
Basis for Evaluation Heavily weights GLP and guideline compliance. Detailed assessment of experimental design, performance, and reporting against OECD standards.
Scope of Data General toxicology and ecotoxicity. Developed for aquatic ecotoxicity; adaptable to other areas (e.g., nanoecotoxicity).

The Klimisch method assigns studies to one of four reliability categories: Reliable without restrictions (R1), Reliable with restrictions (R2), Not reliable (R3), and Not assignable (R4) [16]. It does not formally evaluate relevance. In contrast, CRED requires assessors to systematically evaluate a study against 20 reliability criteria (e.g., test organism health, concentration verification, statistical analysis) and 13 relevance criteria (e.g., appropriateness of endpoint, exposure duration, test substance form) before synthesizing separate conclusions for both dimensions [1].

The Ring Test: Methodology for Comparative Evaluation

3.1 Objective and Design

The ring test was a two-phase, cross-over study designed to compare the categorization outcomes, consistency, and user perception of the Klimisch and CRED methods [8] [15]. Its primary objective was to determine whether the detailed guidance of the CRED method yielded more consistent and transparent evaluations than the Klimisch method.

3.2 Participants

A total of 75 risk assessors from 12 countries participated, representing industry, academia, consultancy, and governmental institutions. The majority had over five years of experience in study evaluation [8] [1].

3.3 Study Selection and Assignment

Eight aquatic ecotoxicity studies were selected, covering different taxonomic groups (algae, crustaceans, fish, higher plants), substance classes (industrial chemical, biocide, pharmaceutical, steroidal estrogen), and origins (peer-reviewed literature and GLP reports) [8]. In Phase I, each participant evaluated two studies using the Klimisch method. In Phase II, each evaluated two different studies from the same set using a draft version of the CRED method. Studies were assigned based on expertise, and no single study was evaluated by the same institute in both phases to ensure independence [15].

3.4 Evaluation Process

For both phases, participants were asked to assume the studies were being evaluated for deriving Environmental Quality Criteria under the Water Framework Directive [15]. After each evaluation, participants assigned a reliability category (R1-R4) and a relevance category (C1-C3). Following Phase II, participants completed a questionnaire on their perception of both methods [8].

3.5 Consistency Analysis

Consistency for each CRED criterion was calculated as the percentage of participants who agreed on the same answer (e.g., "yes," "no," "not reported") for a given study. Criteria with consistency below 50% were refined in the final CRED method [15].
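A minimal sketch of this per-criterion consistency calculation, using hypothetical assessor answers and the 50% refinement threshold described above:

```python
from collections import Counter

def criterion_consistency(answers):
    """Percent of assessors sharing the most common answer to one criterion."""
    counts = Counter(answers)
    return 100.0 * counts.most_common(1)[0][1] / len(answers)

# Hypothetical answers from ten assessors for two criteria of one study:
answers_by_criterion = {
    "Test concentration verification": ["yes"] * 8 + ["not reported"] * 2,
    "Control performance": ["yes", "no", "no", "yes", "not reported",
                            "no", "yes", "no", "yes", "not reported"],
}

for name, answers in answers_by_criterion.items():
    pct = criterion_consistency(answers)
    flag = "  <- candidate for refinement" if pct < 50 else ""
    print(f"{name}: {pct:.0f}%{flag}")
```

In this invented example the second criterion falls below 50% agreement and would be flagged for clearer wording or guidance.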

Ring Test Results: Quantitative and Qualitative Outcomes

4.1 Participant Demographics and Representation

The ring test engaged a global expert community. The demographic breakdown of participants underscores the test's regulatory relevance.

Table 2: Ring Test Participant Demographics [8]

Represented Sector Percentage of Participants
Industry 39%
Governmental Authority 31%
Academic/Research Institute 16%
Consultancy 14%

Geographic Representation Percentage of Participants
Europe 84%
North America 11%
Asia 5%

Experience level: the majority of participants had more than five years of experience in study evaluation [8].

4.2 Key Findings on Consistency and Categorization

The ring test yielded clear evidence of CRED's superior consistency. A notable example was the evaluation of a GLP study on fish toxicity (Study E). With the Klimisch method, assessors were split between "reliable without restrictions" (44%) and "reliable with restrictions" (56%) [17]. Using the CRED method, a majority (63%) categorized it as "not reliable," with only 16% deeming it "reliable without restrictions" [17]. This shift demonstrates CRED's capacity to critically appraise studies beyond their GLP status, potentially identifying flaws the Klimisch method overlooks.

Overall, the arithmetic mean of assigned reliability categories (where R1=1, R2=2, R3=3) was higher for CRED than for Klimisch across most studies, indicating that CRED evaluations tended to be more critical [8]. The within-study consistency of categorization also improved with the CRED method [15].
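The category averaging described above can be reproduced in a few lines. The assignments below are invented, and excluding "not assignable" (R4) from the mean is an assumption of this sketch:

```python
CODE = {"R1": 1, "R2": 2, "R3": 3}  # R4 ("not assignable") excluded here

def mean_category(assignments):
    """Arithmetic mean of reliability codes; a higher mean = more critical."""
    scored = [CODE[a] for a in assignments if a in CODE]
    return sum(scored) / len(scored)

# Invented assignments for one study evaluated by five assessors per method:
klimisch = ["R1", "R2", "R2", "R1", "R2"]
cred = ["R2", "R3", "R2", "R3", "R3"]
print(mean_category(klimisch), mean_category(cred))  # CRED mean is higher
```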

4.3 Participant Perception and Practicality

Participant feedback strongly favored the CRED method, as summarized below.

Table 3: Ring Test Participant Perception of Evaluation Methods [8] [15]

Perception Aspect Klimisch Method CRED Evaluation Method
Dependence on Expert Judgement High Low
Transparency of Evaluation Low High
Accuracy Moderately accurate Accurate
Inter-assessor Consistency Low High
Guidance Provided Insufficient Sufficient and clear
Time Required for Evaluation Less time Slightly more time, but acceptable
Overall Preference - 86% of participants preferred CRED

Detailed Experimental Protocols

5.1 Protocol for Conducting a CRED Evaluation

Objective: To systematically evaluate the reliability and relevance of an aquatic ecotoxicity study.
Materials: The study to be evaluated; the CRED evaluation worksheet (Excel tool); the CRED guidance document [1] [3].
Procedure:

  • Familiarization: Read the study thoroughly. Consult the CRED guidance to understand the scope of the 20 reliability and 13 relevance criteria [1].
  • Reliability Assessment: For each reliability criterion (e.g., "Test organism identity, source, and health status reported"), determine if the study fulfills it ("yes"), does not fulfill it ("no"), or if the information is not reported ("not reported"). Base the judgment solely on information reported in the study.
  • Reliability Synthesis: Weigh the findings from the reliability assessment. A study with no critical flaws is "reliable without restrictions." Studies with minor shortcomings are "reliable with restrictions." Studies with severe flaws affecting the interpretation of results are "not reliable." If reporting is too poor to evaluate, categorize as "not assignable."
  • Relevance Assessment: For each relevance criterion (e.g., "Endpoint is relevant for the protection goal"), evaluate based on the specific regulatory context for which the assessment is being performed (e.g., derivation of a chronic water quality standard).
  • Relevance Synthesis: Synthesize a relevance conclusion: "relevant without restrictions," "relevant with restrictions," or "not relevant."
  • Documentation: Record the answers for all criteria and the final synthesized conclusions in the worksheet, providing brief justifications. This creates an audit trail.
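The synthesis rules above can be encoded as a sketch. The criterion names and the "critical flaw" set are hypothetical, and the thresholds are illustrative; CRED itself leaves this weighing to expert judgment:

```python
def reliability_category(criteria, critical):
    """Synthesize an overall reliability category from per-criterion answers.

    criteria: dict mapping criterion name -> "yes" / "no" / "not reported"
    critical: set of criterion names whose failure counts as a severe flaw
    """
    n_missing = sum(1 for a in criteria.values() if a == "not reported")
    if n_missing > len(criteria) // 2:
        return "not assignable"              # reporting too poor to evaluate
    if any(criteria.get(c) == "no" for c in critical):
        return "not reliable"                # severe flaw affects interpretation
    if any(a != "yes" for a in criteria.values()):
        return "reliable with restrictions"  # only minor shortcomings
    return "reliable without restrictions"

# Illustrative evaluation (criterion names are hypothetical):
study = {
    "Exposure concentrations verified": "yes",
    "Controls included and reported": "yes",
    "Statistical method appropriate": "not reported",
}
print(reliability_category(study, critical={"Exposure concentrations verified"}))
```

Making the "critical" set explicit is one way to document, in the audit trail, which shortcomings were judged severe enough to reject a study.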

5.2 Protocol for a Validation Ring Test (Inter-laboratory Comparison)

Objective: To assess the reproducibility and robustness of a new ecotoxicity test method or evaluation framework.
Materials: Standardized test protocol; predefined test substances; identical key reagents (if possible); data reporting template.
Procedure:

  • Planning: A coordinating laboratory defines the test protocol, selects test items (e.g., a reference chemical), and recruits at least three independent participating laboratories [18].
  • Standardization: Provide all labs with the detailed protocol, standardized data sheets, and, if feasible, common source materials (e.g., test organism strain, key media components) [19].
  • Blinding: Code test items to blind laboratories to their identity during testing.
  • Execution: Each laboratory performs the test according to the protocol, typically including multiple independent runs (replicates) to assess within-laboratory variability [18].
  • Data Submission: Laboratories submit raw and processed data to the coordinator using the standardized template.
  • Analysis: The coordinator analyzes the data to calculate within-laboratory repeatability and between-laboratory reproducibility. Statistical methods (e.g., ANOVA) are used to quantify sources of variation [18] [19].
  • Reporting: Document all results, identify major sources of variability, and assess if the method is transferable and produces reproducible data across labs. This is a critical step for regulatory acceptance under the OECD Mutual Acceptance of Data principle [18].
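The repeatability/reproducibility calculation in the Analysis step can be sketched as a balanced one-way ANOVA decomposition (ISO 5725-style; the EC50 values below are invented for illustration):

```python
from statistics import mean

def variance_components(results):
    """Within-lab repeatability and between-lab reproducibility variances
    from a balanced layout: results[lab] = list of replicate values."""
    labs = list(results.values())
    p = len(labs)                 # number of laboratories
    n = len(labs[0])              # replicates per laboratory (balanced)
    lab_means = [mean(runs) for runs in labs]
    grand = mean(lab_means)
    ms_within = sum(sum((x - m) ** 2 for x in runs)
                    for runs, m in zip(labs, lab_means)) / (p * (n - 1))
    ms_between = n * sum((m - grand) ** 2 for m in lab_means) / (p - 1)
    s_r2 = ms_within                                # repeatability variance
    s_L2 = max(0.0, (ms_between - ms_within) / n)   # between-lab component
    return s_r2, s_r2 + s_L2                        # (repeatability, reproducibility)

# Hypothetical 48-h EC50 values (mg/L), three labs, three runs each:
ec50 = {"lab1": [1.9, 2.1, 2.0], "lab2": [2.4, 2.6, 2.5], "lab3": [2.0, 2.2, 2.1]}
s_r2, s_R2 = variance_components(ec50)
print(s_r2, s_R2)
```

The reproducibility variance is always at least as large as the repeatability variance, since it adds the between-laboratory component on top of within-laboratory scatter.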

Visualizations of Workflow and Decision Logic

[Workflow diagram: Two-Phase Ring Test] Start: 75 assessors from 12 countries. Phase I (Klimisch evaluation): select eight ecotoxicity studies → assign two studies per assessor → evaluate using the Klimisch method → categorize reliability (R1-R4) → collect data and survey responses. Phase II (CRED evaluation, independent of Phase I): same eight studies → assign two different studies per assessor → evaluate using the draft CRED method → categorize reliability and relevance separately → collect data, survey responses, and consistency analysis. End: comparative analysis of outcomes and perception.

[Decision diagram: CRED Evaluation Decision Logic] An aquatic ecotoxicity study is systematically assessed against the 20 reliability criteria. If reporting is insufficient for evaluation, the study is Not Assignable (R4). Otherwise, if critical methodological flaws are present, it is Not Reliable (R3); if only minor shortcomings are present, it is Reliable with Restrictions (R2); if neither, it is Reliable without Restrictions (R1). Studies reaching a reliability conclusion (R1, R2, or R3) proceed to the 13 relevance criteria.

This table details key tools and resources essential for conducting transparent and consistent ecotoxicity data evaluations.

Table 4: Research Reagent Solutions & Essential Tools for Ecotoxicity Evaluation

Tool/Resource Primary Function Key Features & Utility
CRED Evaluation Excel Worksheet To guide the systematic assessment of a study's reliability and relevance. Contains all 20 reliability and 13 relevance criteria with dropdown menus for evaluation. Creates a transparent, documented audit trail for the assessment process [1] [3].
CRED Guidance Document To provide detailed explanations and examples for each evaluation criterion. Reduces ambiguity and expert judgment, improving consistency between assessors. Essential for correct application of the worksheet [8] [1].
OECD Test Guidelines (TGs) To define internationally agreed standards for testing chemicals. Serve as the benchmark for evaluating if a study's design and methodology are scientifically sound. CRED criteria are aligned with OECD TG requirements [8] [18].
Reporting Recommendations Template To guide researchers in reporting studies that meet regulatory needs. A checklist of 50 items across 6 categories (general info, test design, substance, organism, exposure, statistics). Using it prospectively increases a study's likelihood of being deemed reliable and relevant [1].
NanoCRED & EthoCRED Modules To adapt the CRED principles for specialized data types. Provide tailored criteria for evaluating ecotoxicity studies of nanomaterials and behavioral endpoints, respectively, extending the framework's applicability [3].

User Perceptions and Feedback on CRED's Usability and Accuracy

The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) method was developed to address critical shortcomings in the widely used Klimisch method for assessing the reliability and relevance of aquatic ecotoxicity studies. Framed within a broader thesis on standardizing ecotoxicity data evaluation, CRED provides a transparent, criteria-based framework intended to reduce subjectivity and improve consistency among risk assessors across regulatory frameworks. This document presents application notes and protocols derived from the seminal ring-test study that directly compared user perceptions of the CRED and Klimisch methods, focusing on usability, accuracy, and practical implementation.

Table 1. Reliability Categorization Outcomes (Ring‑Test Results)
Evaluation Method Reliable without Restrictions (%) Reliable with Restrictions (%) Not Reliable (%) Not Assignable (%)
Klimisch (n=121) 8 45 42 6
CRED (n=103) 2 24 54 20

Source: Kase et al. 2016, Fig. 1a.

Table 2. Relevance Categorization Outcomes
Evaluation Method Relevant without Restrictions (%) Relevant with Restrictions (%) Not Relevant (%)
Klimisch 32 61 7
CRED 57 35 8

Source: Kase et al. 2016, Fig. 2a.

Table 3. Consistency of Evaluations (Average ± SD)
Evaluation Type Klimisch Consistency CRED Consistency
Reliability 45 ± 13 % 56 ± 20 %
Relevance 50 ± 18 % 58 ± 19 %

Source: Kase et al. 2016, consistency analysis.

Table 4. Time Required for Evaluation (Percentage of Participants)
Time Slot (min) Klimisch (%) CRED (%)
<20 15 10
20–40 35 40
40–60 30 35
60–180 15 12
>180 5 3

Source: Kase et al. 2016, practicality analysis.

Table 5. Participant Perception of Evaluation Methods
Perception Statement CRED (Agree/Strongly Agree %) Klimisch (Agree/Strongly Agree %)
Accuracy 92 68
Applicability 88 62
Consistency 95 65
Less dependent on expert judgement 90 45
Transparency 94 52
Usefulness of guidance material 96 48

Source: Kase et al. 2016, perception analysis.

Experimental Protocols: The CRED Ring‑Test Methodology

Study Design and Participant Recruitment
  • Two‑phased ring test comparing CRED and Klimisch methods.
  • Participants: 75 risk assessors from 12 countries (regulatory agencies, research institutes, consultancies).
  • Studies: Eight aquatic ecotoxicity studies (including guideline, non‑guideline, and industry reports) covering algae, invertebrate, and fish toxicity endpoints.
Evaluation Procedure
  • Phase I: Participants evaluated two assigned studies using the Klimisch method.
  • Phase II: Participants evaluated two different studies from the same set using the CRED method.
  • Materials: Each participant received:
    • Full study report.
    • Evaluation sheet (Klimisch or CRED criteria).
    • Detailed guidance document for CRED (20 reliability criteria, 13 relevance criteria).
  • Time tracking: Participants recorded time spent per evaluation.
Data Collection and Analysis
  • Categorical outcomes: Reliability (R1–R4) and relevance (C1–C3) categories.
  • Consistency calculation: Percentage of participants assigning the same category for a given study.
  • Perception questionnaire: 5‑point Likert‑scale agreement with statements on accuracy, applicability, consistency, dependence on expert judgement, transparency, and usefulness of guidance.
  • Statistical tests: Non-parametric paired Wilcoxon test for perception data; exact chi-square test for confidence ratings.
Key Findings from Protocol Execution
  • CRED evaluations took slightly longer (40–60 min) but were considered more thorough.
  • The explicit criteria list in CRED led to more frequent detection of study flaws (e.g., exposure concentrations exceeding solubility).
  • Participants reported higher confidence in their evaluations when using CRED.

Visualization of CRED Evaluation Workflow

Diagram 1: CRED Evaluation Process Flow

[Diagram: CRED Evaluation Process Flow] Start: ecotoxicity study report → Step 1: apply 20 reliability criteria → Step 2: apply 13 relevance criteria → Step 3: assign reliability category (R1-R4) → Step 4: assign relevance category (C1-C3) → Decision: is the study suitable for regulatory use? If R1/R2 and C1/C2, include in the dataset; if R3/R4 or C3, exclude or flag as supporting information.

Diagram 2: CRED vs. Klimisch Method Attributes

[Diagram: CRED vs. Klimisch Method Comparison] Klimisch method (1997): limited, text-based criteria; no explicit relevance guidance; lower consistency (45 ± 13%). CRED method (2016): 20 reliability plus 13 relevance criteria; detailed guidance documents; higher consistency (56 ± 20%); perceived as more accurate, transparent, and less dependent on expert judgement.

The Scientist’s Toolkit: Essential Materials for Ecotoxicity Studies

Item Function in Ecotoxicity Evaluation
Test organisms (e.g., Daphnia magna, Desmodesmus subspicatus, Danio rerio) Standardized biological models for measuring toxicity endpoints (growth, reproduction, mortality).
Exposure systems (flow‑through, static, semi‑static) Control of test‑substance concentration and exposure duration; critical for relevance assessment.
Analytical chemistry tools (HPLC, GC‑MS, spectrophotometry) Verification of test‑substance purity, stability, and actual exposure concentrations.
Good Laboratory Practice (GLP) documentation Ensures study reliability by providing traceable records of procedures, data, and quality control.
Statistical software (R, PRISM, OECD QSAR Toolbox) Analysis of dose‑response relationships, calculation of EC/LC/NOEC values, and uncertainty quantification.
CRED evaluation sheets (Excel‑based tools) Structured checklist for applying 20 reliability and 13 relevance criteria to a study report.
Guidance documents (CRED guidance PDF, OECD test guidelines) Provide explicit criteria and examples for consistent evaluation of study reliability and relevance.
Reference databases (NORMAN EMPODAT, ECOTOX) Source of historical ecotoxicity data for comparison and context.

The ring-test data demonstrate that risk assessors perceive the CRED method as superior to the Klimisch method in accuracy, applicability, consistency, transparency, and reduced dependence on expert judgement. The explicit criteria and detailed guidance of CRED lead to more consistent evaluations and a higher detection rate of study flaws, thereby improving the reliability of hazard and risk assessments. While CRED evaluations may require slightly more time, the gain in transparency and confidence justifies its adoption as a standard tool in regulatory ecotoxicology. Future work should focus on extending CRED to specialized areas (e.g., nanomaterials, behavioral endpoints) and integrating it into automated data-evaluation pipelines to further streamline the assessment process.

The Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method represents a pivotal advancement in the standardization of ecotoxicological risk assessment, a core component of this thesis on modern ecotoxicity data evaluation. Developed to address critical shortcomings in the historically dominant Klimisch method, CRED provides a transparent, consistent, and detailed framework for assessing the reliability and relevance of aquatic ecotoxicity studies [2] [8].

The Klimisch method, established in 1997, has been widely criticized for its lack of detailed guidance, over-reliance on expert judgment, and failure to ensure consistency among different risk assessors [8]. This inconsistency can directly influence hazard assessments, potentially leading to unnecessary risk mitigation or underestimated environmental dangers [8]. In contrast, the CRED method was developed through international collaboration and ring-testing, offering 20 criteria for reliability and 13 for relevance, accompanied by extensive guidance [7]. Its development was motivated by the need for a science-based, harmonized approach that increases the utilization of peer-reviewed studies and makes regulatory decisions more robust and reproducible [2] [8].

This thesis contends that the adoption of CRED into regulatory practice marks a significant evolution towards greater scientific rigor and harmonization. The method is currently being piloted in the revision of the EU Technical Guidance Document for deriving Environmental Quality Standards (EQS) and in the development of Swiss EQS proposals [2]. Furthermore, its application extends to tools like the Joint Research Centre's Literature Evaluation Tool and databases such as the NORMAN EMPODAT [2]. This document provides detailed application notes and experimental protocols for implementing the CRED methodology, supporting its integration into regulatory and research workflows.

Comparative Analysis: CRED vs. Legacy Klimisch Method

A fundamental step in understanding CRED's value is a systematic comparison with the Klimisch method it aims to replace. The following table summarizes the core differences in their structure, scope, and application.

Table 1: Core Characteristics of the Klimisch and CRED Evaluation Methods [8]

Characteristic Klimisch Method (1997) CRED Evaluation Method
Primary Data Type General toxicity and ecotoxicity Aquatic ecotoxicity (focus)
Number of Reliability Criteria 12-14 (for ecotoxicity) 20 evaluation criteria (50 reporting criteria)
Number of Relevance Criteria 0 (not formally addressed) 13 evaluation criteria
Guidance for Evaluation Limited, subjective Detailed, extensive guidance provided
Inclusion of OECD Reporting Principles Partial (14 of 37) Complete (37 of 37)
Evaluation Summary Qualitative (Reliability only) Qualitative (Reliability and Relevance)
Bias Towards GLP/OECD Studies Yes, can override flaws [8] Structured criteria reduce this bias
Result Consistency Low, high expert judgement dependence [8] High, structured process improves consistency

The quantitative differences highlighted in Table 1 translate directly into practical outcomes. The Klimisch method's limited criteria and lack of relevance assessment lead to evaluations that are highly dependent on the assessor's experience and interpretation [8]. Its preference for studies conducted under Good Laboratory Practice (GLP) or using OECD guidelines can result in the automatic categorization of such studies as "reliable without restrictions," even if they contain fundamental flaws [8]. This can inadvertently exclude scientifically sound peer-reviewed literature from regulatory datasets.

CRED addresses these flaws through its comprehensive and transparent structure. The 20 reliability criteria systematically cover all aspects of a study, from test substance characterization and organism health to exposure conditions and statistical analysis. The 13 relevance criteria ensure the study's appropriateness for the specific hazard identification or risk characterization question, considering factors like the selected endpoint, exposure duration, and test organism. This dual-axis evaluation (reliability and relevance) provides a more nuanced and fit-for-purpose assessment of each study's utility.
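This dual-axis outcome can be summarized as a fit-for-purpose rule. The sketch below is illustrative rather than part of CRED; it mirrors the common practice of using a study quantitatively only when it is at least "reliable with restrictions" and "relevant with restrictions":

```python
ACCEPT_RELIABILITY = {"reliable without restrictions", "reliable with restrictions"}
ACCEPT_RELEVANCE = {"relevant without restrictions", "relevant with restrictions"}

def regulatory_use(reliability, relevance):
    """Combine the two CRED axes into a single fit-for-purpose decision."""
    if reliability in ACCEPT_RELIABILITY and relevance in ACCEPT_RELEVANCE:
        return "include in dataset"
    if reliability == "not assignable":
        return "flag: reporting insufficient to evaluate"
    return "exclude or use as supporting information"

print(regulatory_use("reliable with restrictions", "relevant without restrictions"))
```

Keeping the two axes separate until this final step preserves the distinction CRED draws between a study's scientific quality and its appropriateness for the assessment question.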

Experimental Protocol: The CRED Ring-Test Methodology

The development and validation of the CRED method were grounded in a rigorous, two-phase international ring test. This protocol serves as a model for evaluating assessment methodologies and is detailed below.

Table 2: Overview of Studies Used in the CRED Ring Test [8]

Study ID Test Organism Taxonomic Group Tested Substance Substance Class Evaluated Endpoint
A Daphnia magna Crustacean Deltamethrin Plant Protection Product EC50 (48h immobilization)
B Lemna minor Higher Plant Erythromycin Pharmaceutical NOEC (7-day growth)
C Synechococcus leopoliensis Cyanobacteria Erythromycin Pharmaceutical NOEC (144h growth)
D Scenedesmus vacuolatus Algae Propiconazole Biocide EC50 (72h growth)
E Danio rerio (Zebrafish) Fish 17α-ethinylestradiol Pharmaceutical/Steroid NOEC (21d reproduction)
F Daphnia magna Crustacean 4-Nonylphenol Industrial Chemical EC50 (21d reproduction)
G Oncorhynchus mykiss (Rainbow Trout) Fish Diclofenac Pharmaceutical NOEC (28d blood plasma)
H Pseudokirchneriella subcapitata Algae Triclosan Biocide EC50 (72h growth)

Protocol: Ring Test for Comparing Ecotoxicity Study Evaluation Methods

Objective: To compare the performance of the draft CRED evaluation method against the established Klimisch method in terms of consistency, transparency, and user perception among a diverse group of risk assessors [8].

Phase I – Klimisch Evaluation:

  • Participant Recruitment: 75 risk assessors from 12 countries were recruited, representing regulatory agencies, industry, and consultancy [8].
  • Study Assignment: Each participant was assigned two of the eight ecotoxicity studies listed in Table 2, based on their area of expertise. Each study was evaluated by multiple independent assessors.
  • Evaluation Task: Participants evaluated the reliability of their assigned studies using the standard Klimisch criteria (categorizing as "reliable without restrictions," "reliable with restrictions," "not reliable," or "not assignable"). A relevance assessment was also performed, though the Klimisch method lacks formal criteria for this [8].
  • Data Collection: Participants documented their final classifications and provided subjective feedback on the method's clarity and ease of use.

Phase II – CRED Evaluation:

  • Participant Group: The same pool of participants was used, but each individual evaluated two studies from Table 2 different from those assessed in Phase I, to prevent bias.
  • Evaluation Task: Participants evaluated their newly assigned studies using a draft version of the CRED method, applying its structured criteria for both reliability (20 criteria) and relevance (13 criteria) [8].
  • Comprehensive Assessment: For each criterion, assessors judged compliance and provided comments. They then gave an overall reliability and relevance score.
  • Comparative Feedback: Participants completed a questionnaire directly comparing their experience with both methods, focusing on consistency, accuracy, transparency, and practicality.

Key Measured Endpoints:

  • Inter-assessor Consistency: The degree of agreement in final reliability/relevance scores among different assessors evaluating the same study.
  • Method Perception: Quantitative and qualitative feedback on the methods' accuracy, applicability, dependence on expert judgment, and time requirements.
  • Outcome Analysis: Comparison of the final regulatory acceptability (e.g., inclusion in a dataset for EQS derivation) of studies based on evaluations from each method.
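Inter-assessor consistency of this kind can be quantified as the fraction of assessor pairs that assign the same classification to a study. A minimal Python sketch, using hypothetical classifications rather than actual ring-test data:

```python
from itertools import combinations

def pairwise_agreement(scores):
    """Fraction of assessor pairs giving the same classification for one study.
    `scores` is the list of classifications from independent assessors."""
    pairs = list(combinations(scores, 2))
    if not pairs:
        return 1.0  # a single assessment trivially agrees with itself
    return sum(a == b for a, b in pairs) / len(pairs)

# Hypothetical ring-test-style data: Klimisch codes from five assessors of one study
klimisch_scores = ["R2", "R2", "R1", "R2", "R3"]
print(pairwise_agreement(klimisch_scores))  # 3 of 10 pairs agree -> 0.3
```

More formal chance-corrected statistics (e.g., Fleiss' kappa) would be appropriate for a full analysis, but simple pairwise agreement already makes the consistency comparison between methods concrete.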

Visualization of Ring Test Workflow and Method Comparison: The following diagram illustrates the structured, parallel-path design of the ring test and the core functional differences between the two evaluated methods.

Ring Test Design and Core Methodological Differences

Application Notes for Regulatory Implementation

Integration into EU and Swiss Regulatory Frameworks

The CRED method is transitioning from a research concept to a practical regulatory tool. Its primary integration points are:

  • EU Technical Guidance Document (TGD) for EQS: CRED is being piloted for the evaluation of key studies used to derive Predicted-No-Effect Concentrations (PNECs) and EQS under the Water Framework Directive [2]. This aims to standardize the selection of robust data across Member States.
  • Swiss EQS Proposals: Switzerland is revising its EQS derivation process and piloting CRED to ensure the transparency and scientific defensibility of its standards [2].
  • Joint Research Centre (JRC) Tools: The JRC's Literature Evaluation Tool has incorporated CRED criteria, facilitating its use for large-scale data screening and review [2].
  • Intelligence-led Assessment of Pharmaceuticals in the Environment (iPiE): This industry-EU collaboration is considering CRED for evaluating ecotoxicity studies of pharmaceuticals, promoting a unified approach across the sector [2].

Stepwise Protocol for Conducting a CRED Evaluation

Objective: To perform a standardized, transparent evaluation of the reliability and relevance of an aquatic ecotoxicity study for use in hazard or risk assessment.

Materials:

  • CRED Evaluation Sheet: The official Excel-based tool or checklist containing the 20 reliability and 13 relevance criteria [2].
  • Study Manuscript: The complete peer-reviewed publication or study report to be evaluated.
  • Relevant Test Guidelines: Applicable OECD, ISO, or EPA test guidelines for context.
  • Supporting Documentation: Any supplementary data or previously published evaluations.

Procedure:

  • Familiarization: Read the study manuscript in full to understand its objectives, design, and results.
  • Reliability Assessment (Criteria 1-20):
    a. For each of the 20 reliability criteria (e.g., "Test substance characterization," "Control performance," "Statistical analysis"), consult the detailed guidance.
    b. Determine whether the study reporting fulfills, partially fulfills, or does not fulfill the criterion, or whether it is not applicable.
    c. Document the rationale for each judgment with specific references to the text, tables, or figures in the manuscript.
  • Relevance Assessment (Criteria 1-13):
    a. For each of the 13 relevance criteria (e.g., "Appropriateness of test organism," "Exposure duration relative to lifecycle," "Measured endpoint"), judge the appropriateness of the study's approach for your specific regulatory question.
    b. Categorize each criterion as relevant, partially relevant, or not relevant.
    c. Justify each categorization based on the assessment context (e.g., the protection goal for EQS derivation).
  • Overall Weight-of-Evidence Judgment:
    a. Synthesize the scores from the reliability and relevance assessments. A study with several critical reliability flaws (e.g., uncontrolled test conditions, inappropriate statistics) may be deemed "not reliable."
    b. Assign an overall reliability classification (e.g., reliable, reliable with restrictions, not reliable) and a relevance classification (e.g., high, medium, low).
    c. Formulate a final conclusion on the study's suitability for the specific assessment (e.g., "suitable as key study," "suitable as supporting study," "not suitable").
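The overall judgment step can be sketched as a simple rule over the per-criterion results. The set of "critical" criteria and the thresholds below are illustrative assumptions; CRED itself leaves the final synthesis to documented expert judgment:

```python
# Hypothetical ids of criteria treated as critical (e.g., uncontrolled test
# conditions, inappropriate statistics); not defined as such by CRED itself.
CRITICAL = {6, 10, 18}

def overall_reliability(judgments, critical=CRITICAL):
    """Derive an overall classification from per-criterion judgments.
    judgments: criterion id -> 'fulfilled' | 'partial' | 'not fulfilled' | 'n/a'."""
    failed = {cid for cid, j in judgments.items() if j == "not fulfilled"}
    if failed & critical:
        return "not reliable"
    if failed or any(j == "partial" for j in judgments.values()):
        return "reliable with restrictions"
    return "reliable"

print(overall_reliability({1: "fulfilled", 6: "not fulfilled"}))  # not reliable
```

Whatever rule is actually applied, encoding it explicitly (and recording the inputs) preserves the audit trail that the procedure above requires.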

Key Considerations:

  • Transparency is Paramount: Every judgment must be traceable to the source material. The completed CRED sheet serves as an audit trail.
  • Separate Reliability from Relevance: A study can be highly reliable (well-conducted and reported) but of low relevance (e.g., wrong endpoint), and vice versa.
  • Context Matters: The final "fit-for-purpose" conclusion depends on the regulatory context and the available alternative data.

The Scientist's Toolkit: Essential Research Reagent Solutions

Implementing CRED and the studies it evaluates relies on standardized materials. The following table details key reagents and their functions in aquatic ecotoxicity testing, as referenced in the ring test studies.

Table 3: Key Research Reagent Solutions in Aquatic Ecotoxicity Testing

| Reagent/Material | Function in Ecotoxicity Testing | Example from CRED Ring Test [8] |
|---|---|---|
| Defined standardized test media | Provides consistent, reproducible water chemistry (hardness, pH, nutrients) for culturing and exposing test organisms, ensuring results are not confounded by variable water quality. | Used in all ring-test studies for algae (Scenedesmus, Pseudokirchneriella), duckweed (Lemna), and daphnids. |
| Reference toxicants | Used in periodic control tests to confirm the consistent sensitivity of cultured test-organism populations (e.g., potassium dichromate for Daphnia, sodium chloride for algae). | Implicit in maintaining healthy, standardized cultures of D. magna, P. subcapitata, etc. |
| Analytical-grade test substance | A substance of known, high purity and characterized identity is essential for establishing accurate concentration-response relationships. | Required for all tested active substances (e.g., erythromycin, diclofenac, 17α-ethinylestradiol). |
| Solvent/vehicle controls | When a test substance must be dissolved in a solvent (e.g., acetone, methanol), a solvent control is required to isolate effects of the substance from those of the solvent. | Likely used for lipophilic substances such as deltamethrin or triclosan in fish and invertebrate tests. |
| Formulated animal feed and algae cultures | Provides standardized nutrition for test organisms during culture and long-term tests (e.g., fish and daphnid reproduction tests). | Essential for the 21-day D. magna and 28-day trout (O. mykiss) tests. |
| Biomarker assay kits | For measuring specific sub-lethal biochemical or physiological endpoints (e.g., vitellogenin ELISA kits for endocrine disruption in fish). | Relevant for the zebrafish vitellogenin endpoint in the 17α-ethinylestradiol study. |

Discussion and Future Perspectives

The systematic implementation of CRED, as outlined in these application notes and protocols, directly addresses the central aim of improving ecotoxicity data evaluation. By replacing subjective judgment with a structured, criteria-based workflow, CRED mitigates a major source of uncertainty in ecological risk assessment. The ring test results confirm that users perceive CRED as more accurate, consistent, and transparent than the Klimisch method [8] [7].

The ongoing pilot in EU and Swiss guidance documents is a critical real-world validation. Successful integration will demonstrate CRED's ability to harmonize assessments across borders and institutions, leading to more defensible and universally accepted EQS values. Future developments may include the adaptation of CRED principles to other ecotoxicity areas (e.g., sediment or terrestrial toxicology) and its deeper integration into data submission requirements, encouraging better-reported studies from the outset.

Visualization of CRED's Role in the Regulatory Data Evaluation Workflow: The following diagram maps the position of the CRED evaluation within the broader process of deriving an Environmental Quality Standard, highlighting its role as a critical gatekeeper for data quality.

[Diagram: Data Collection (all available studies) → CRED Evaluation, driven by the 20 reliability criteria, the 13 relevance criteria, and the detailed guidance and Excel tool [2]. The evaluation yields reliable and relevant studies, excluded studies (with rationale), and a transparent audit trail → Dataset Selection (for EQS derivation) → EQS/PNEC Derivation (application of AFs, SSD) → Regulatory Decision and Implementation.]

CRED's Role in the Regulatory EQS Derivation Workflow

The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) relies on the evaluation of high-quality ecotoxicity data [7]. The CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) framework was established to improve the reproducibility, transparency, and consistency of these evaluations, moving beyond subjective expert judgment [7]. Concurrently, a global paradigm shift is underway in toxicology, driven by the “3Rs” (Replacement, Reduction, and Refinement of animal testing) and the development of New Approach Methodologies (NAMs) [20]. U.S. agencies like the FDA and EPA are actively promoting this shift through strategic roadmaps and new frameworks [20].

This document posits that the future of robust, ethical, and human-relevant ecotoxicology lies in the systematic integration of computational (in silico) models and in vitro non-animal methods, within the rigorous evaluation context provided by CRED. By applying CRED's principles of reliability and relevance to NAM-generated data, we can build a new generation of environmental safety assessments that are both scientifically defensible and aligned with modern ethical and regulatory standards [7] [20].

Quantitative Data Synthesis: Performance and Applications

The adoption of NAMs is supported by quantitative evidence of their performance and growing regulatory application. The following tables synthesize key data points.

Table 1: Summary of Key Regulatory Milestones for NAM Adoption (2020-2025)

| Agency | Initiative/Policy | Year | Key Provision/Impact | Reference |
|---|---|---|---|---|
| U.S. Congress | FDA Modernization Act 2.0 | 2022 | Eliminated mandatory animal testing for drugs; defined "nonclinical tests" to include NAMs. | [20] |
| U.S. FDA | Roadmap to Reducing Animal Testing | 2025 | Plan to phase out animal testing for monoclonal antibodies and other drugs using AI and organoids. | [20] |
| U.S. EPA | Report on Vertebrate Animal Testing | 2024 | Concluded many statutes allow use of NAMs for hazard and risk assessment. | [20] |
| U.S. EPA | New Framework for Eye Irritation | 2024 | Formalized use of non-animal methods for new chemical reviews under TSCA. | [20] |
| U.S. FDA | ISTAND Pilot Program | Ongoing | Accepts novel tools (e.g., MPS, AI algorithms) for drug development. | [20] |

Table 2: Comparative Performance of Selected NAMs vs. Traditional Assays

| Method Category | Specific Test/Model | Typical Application | Reported Accuracy/Performance | Animal Use Reduction |
|---|---|---|---|---|
| Computational (in silico) | QSAR / CATMoS | Predicting acute mammalian toxicity | High predictivity for defined endpoints; used for prioritization. | Can replace in vivo LD50 studies. |
| Biochemical (in chemico) | Direct Peptide Reactivity Assay (DPRA) | Skin sensitization (molecular initiating event) | Integrated into OECD-approved defined approaches. | Replaces or refines guinea pig or mouse tests. |
| Cell-based (in vitro) | KeratinoSens / h-CLAT | Skin sensitization (key events 2 and 3) | Form part of validated in vitro testing batteries. | Replaces or refines guinea pig or mouse tests. |
| Tissue and complex systems | Microphysiological systems (MPS) | Organ-specific toxicity, barrier function | Under validation for specific contexts of use (e.g., ISTAND). | Potential to reduce multi-dose, multi-organ studies. |
| Integrated approaches | OECD TG 497 (skin sensitization) | Hazard classification for skin sensitizers | Relies on in silico, in chemico, and in vitro data without new animal data. | Can lead to full replacement of animal tests. |

Experimental Protocols for Integrated NAM Workflows

The integration of models and non-animal methods requires standardized, transparent protocols to ensure data quality and CRED compliance [21] [22].

Protocol 1: Integrated Workflow for Prioritization and Screening of Aquatic Toxicity

  • 1.1 Rationale & Objective: To rapidly prioritize chemicals for further testing or identify potential hazards by integrating in silico predictions with high-throughput in vitro assays, generating data suitable for a CRED-based weight-of-evidence assessment.
  • 1.2 Materials:
    • Chemical structures (SMILES notations) for test compounds.
    • In silico software (e.g., QSAR Toolbox, OECD QSAR Application Toolbox, commercial platforms).
    • High-throughput in vitro assay kit (e.g., fish cell line-based cytotoxicity, algal growth inhibition microplate assay).
    • Required lab equipment: robotic liquid handler, microplate reader, cell culture facility.
  • 1.3 Procedure:
    • Step 1: In Silico Profiling. For each test chemical, run predictive models for key endpoints (e.g., acute fish toxicity, bioaccumulation potential, mode of action). Document all model names, versions, applicability domains, and prediction confidence scores [7].
    • Step 2: In Vitro Screening. Plate relevant reporter cells (e.g., RTgill-W1 fish gill cells) or microalgae (Raphidocelis subcapitata) in 96-well plates. Treat with a logarithmic concentration series of the test chemical. Include positive and vehicle controls. Incubate for 24-72 hours.
    • Step 3: Endpoint Measurement. Measure viability (e.g., via AlamarBlue) or growth (via fluorescence or optical density) using a plate reader. Calculate inhibition concentrations (e.g., IC50).
    • Step 4: Data Integration & Analysis. Compare in silico predictions with in vitro results. Identify concordant findings (high priority for hazard) and discordant findings (flag for further investigation). All data must be reported with full metadata per CRED reporting criteria (test design, substance, organism, exposure conditions) [7].
  • 1.4 Statistical Analysis: Generate dose-response curves and calculate IC50 values with 95% confidence intervals using appropriate software. Perform correlation analysis between predicted and measured values if applicable.
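For screening purposes, the IC50 in step 1.4 can be approximated by log-linear interpolation between the two tested concentrations that bracket 50% response. A minimal sketch with hypothetical data; a full four-parameter dose-response fit with confidence intervals would be used for regulatory-grade analysis:

```python
import math

def ic50_from_curve(concs, responses):
    """Estimate the IC50 by log-linear interpolation between the two measured
    concentrations bracketing 50% response. Responses (fraction of control)
    are assumed to decrease monotonically with concentration."""
    points = list(zip(concs, responses))
    for (c1, r1), (c2, r2) in zip(points, points[1:]):
        if r1 >= 0.5 >= r2:  # found the bracketing interval
            t = (r1 - 0.5) / (r1 - r2)
            return 10 ** (math.log10(c1) + t * (math.log10(c2) - math.log10(c1)))
    raise ValueError("50% response not bracketed by the tested concentrations")

concs = [0.1, 0.3, 1.0, 3.0, 10.0]           # mg/L, hypothetical series
viability = [0.98, 0.90, 0.70, 0.35, 0.08]   # fraction of control, hypothetical
print(round(ic50_from_curve(concs, viability), 2))  # ≈ 1.87 mg/L
```

Interpolating on the log-concentration axis reflects the logarithmic spacing of the test series; linear interpolation on raw concentrations would bias the estimate upward.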

Protocol 2: Defined Approach for Endocrine Disruption Screening Using NAMs

  • 2.1 Rationale & Objective: To implement the EPA's proposed NAM-based framework for the Endocrine Disruptor Screening Program (EDSP) Tier 1, replacing specific in vivo assays [20].
  • 2.2 Materials:
    • Test chemical.
    • Validated in vitro transactivation assays (e.g., ER/AR/TR CALUX).
    • In silico structural alert databases for endocrine activity.
    • HTS ToxCast/Tox21 assay data (if available from public sources).
  • 2.3 Procedure:
    • Step 1: In Silico Prescreening. Screen chemical structure against curated databases for estrogen, androgen, and thyroid receptor binding alerts. This serves as a prioritization filter.
    • Step 2: In Vitro Receptor Activity Battery. Conduct a standardized battery of in vitro assays to assess agonist/antagonist activity for human estrogen (ERα), androgen (AR), and thyroid (TRβ) receptors. Run all assays with standard operating procedures, including reference agonists/antagonists and quality control plates.
    • Step 3: In Vitro Steroidogenesis Assay. Use the H295R cell line assay (OECD TG 456) to assess the chemical's potential to alter steroid hormone production.
    • Step 4: Integrated Decision Strategy. Apply a predefined, transparent Integrated Approach to Testing and Assessment (IATA) logic to the combined data stream. For example, a positive finding in both an ER binding model and the ER CALUX assay, supported by a ToxCast assay signature, provides strong evidence for estrogenic activity.
  • 2.4 Reporting: Document every step, model parameter, assay result, and decision point in a detailed, auditable report that fulfills both CRED relevance criteria (biological appropriateness, exposure relevance) and specific EPA EDSP reporting requirements [7] [20].
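The predefined IATA logic in step 4 can be expressed as an explicit, auditable decision function. The rule below is a hypothetical illustration of such logic; the actual EDSP decision criteria are defined by EPA guidance:

```python
# Illustrative IATA-style decision rule for estrogenic activity. The evidence
# streams and thresholds are assumptions for demonstration, not EPA's logic.
def estrogenic_call(in_silico_alert, er_calux_positive, toxcast_signature):
    """Combine three boolean evidence streams into a transparent call."""
    evidence = sum([in_silico_alert, er_calux_positive, toxcast_signature])
    if er_calux_positive and (in_silico_alert or toxcast_signature):
        return "strong evidence of estrogenic activity"
    if evidence >= 1:
        return "equivocal - further testing"
    return "no evidence of estrogenic activity"

print(estrogenic_call(in_silico_alert=True, er_calux_positive=True,
                      toxcast_signature=True))
```

Because every branch is explicit, the function itself can be included in the auditable report required in step 2.4, satisfying CRED's transparency expectations for the decision step.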

Visualizing Integration: Workflows and Pathways

Diagram 1: Integrated NAM-CRED Workflow for Ecotoxicity Assessment

[Diagram: Chemical of Concern → Existing Data (animal, eco, phys-chem) and In Silico Profiling (QSAR, read-across, ADMET), the latter informing test selection for In Vitro Screening (cell-based, MPS). All three streams (historical context, prediction uncertainty, experimental dose-response) feed Data Integration & Weight-of-Evidence → CRED Evaluation (20 reliability and 13 relevance criteria applied to the structured data package) → Decision: hazard identified, priority for testing, or sufficient for risk assessment.]

Diagram 2: Computational Modeling Process for Toxicity Prediction

[Diagram: Chemical structure (plus experimental data if available) → Computational Model (QSAR, PBPK, AOP network) → Validation & Uncertainty Quantification, with a feedback loop for model improvement → Predicted toxicity metric with confidence interval.]

A successful integrated NAM strategy relies on both digital and physical research components.

Table 3: Research Reagent Solutions for Integrated NAM-CRED Studies

| Category | Item/Resource | Function/Benefit | Key Consideration for CRED |
|---|---|---|---|
| Computational tools | OECD QSAR Toolbox | Grouping chemicals, filling data gaps via read-across. | Provides a framework to document analog justification, supporting reliability. |
| Computational tools | EPA's CompTox Chemicals Dashboard | Access to high-quality curated structures, properties, and linked bioactivity data. | Sources existing data for relevance evaluation (e.g., assay type, species). |
| Biochemical assays | Direct Peptide Reactivity Assay (DPRA) kit | Measures protein binding, a molecular initiating event for skin sensitization. | Standardized protocol supports reliability; result is relevant to a key AOP. |
| Cell-based assays | RTgill-W1 cell line (rainbow trout gill) | In vitro model for predicting acute fish toxicity and mechanistic studies. | Well characterized; exposure conditions must be reported for relevance [7]. |
| Complex systems | Liver-on-a-chip (MPS) | Models metabolic function and chronic tissue-level response. | Requires detailed characterization (cell sources, media flow) for reliability assessment. |
| Reference materials | CRED evaluation checklist | The 20 reliability and 13 relevance criteria in operational form [7]. | Core tool for self-evaluating any data, in silico or in vitro, before submission. |
| Reporting software | Electronic lab notebook (ELN) | Ensures traceability, protocol adherence, and data integrity [21]. | Essential for documenting all information required by CRED reporting criteria [7]. |

The integration of computational models and non-animal methods represents the definitive future of ecotoxicology. This transition, however, must be guided by unwavering commitments to scientific rigor and transparency. The CRED evaluation framework provides the essential, systematic methodology to assess the reliability and relevance of data generated from these new approaches [7]. By adopting the protocols, workflows, and tools outlined here, researchers can generate evidence that not only advances the 3Rs but also meets the stringent demands of regulatory decision-making for environmental protection. The collaborative efforts of scientists, regulators, and tool developers, as seen in FDA's ISTAND and EPA's NAM Workplan, are crucial to fully realize this future [20].

Conclusion

The CRED evaluation method represents a significant advancement in standardizing the assessment of ecotoxicity data, offering a transparent, consistent, and detailed framework for evaluating reliability and relevance. By moving beyond the limitations of the Klimisch method, CRED enhances the robustness of regulatory decisions and facilitates the use of peer-reviewed studies in environmental risk assessment. Future implications for biomedical and clinical research include the integration of CRED with emerging computational models like MT-Tox for toxicity prediction and the adoption of non-animal testing methods, thereby promoting more ethical and efficient research practices. Continued refinement and broader implementation of CRED will further strengthen ecotoxicology and support sustainable drug development.

References