This article provides a comprehensive comparison of the established Klimisch method and the modern CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) framework for evaluating the reliability and relevance of aquatic ecotoxicity studies. Aimed at researchers, scientists, and drug development professionals, it explores the foundational principles of both methods, details the step-by-step application of CRED's 20 reliability and 13 relevance criteria, and addresses common implementation challenges. Through a validation-focused analysis, it synthesizes evidence from large-scale ring tests demonstrating CRED's superior consistency, transparency, and reduced dependency on expert judgment. The conclusion highlights CRED's role in harmonizing environmental risk assessments and its implications for improving the regulatory acceptance of peer-reviewed data in biomedical and clinical research.
The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) forms the cornerstone of chemical risk assessment worldwide. The integrity of these safety thresholds is entirely dependent on the quality of the underlying ecotoxicity studies [1]. Historically, the Klimisch method, developed in 1997, has been the dominant framework for evaluating the reliability of such data. However, its reliance on broad categories and expert judgment has been widely criticized for introducing bias, inconsistency, and a lack of transparency into regulatory decisions [2] [3]. This comparison guide examines the evolution from the Klimisch method to the modern Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) framework, providing researchers and assessors with objective data and protocols to understand why rigorous, structured data evaluation is a non-negotiable regulatory imperative.
The core distinction between the Klimisch and CRED methods lies in their structure, specificity, and scope. The following table summarizes their key characteristics based on regulatory guidance and comparative research.
Table: Comparison of the Klimisch and CRED Ecotoxicity Data Evaluation Methods
| Feature | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Core Purpose | Categorize study reliability for regulatory inclusion [2]. | Evaluate reliability and relevance to improve assessment transparency and consistency [1] [2]. |
| Evaluation Scope | Primarily reliability (4 categories) [2]. | Separate, detailed evaluation of reliability (20 criteria) and relevance (13 criteria) [1] [4]. |
| Guidance & Specificity | Limited criteria; high reliance on expert judgement [2]. | Extensive guidance for each criterion to reduce subjectivity [1]. |
| Bias Criticism | Favors industry GLP (Good Laboratory Practice) studies; may overlook flaws in guideline studies [1] [2]. | Science-based; evaluates methodological quality independent of GLP status [2] [3]. |
| Outcome Transparency | Provides a final category (e.g., "Reliable with restrictions") [2]. | Identifies specific study strengths and weaknesses against each criterion [1]. |
| Regulatory Adoption | Widely used historically in EU frameworks (e.g., REACH) [2] [3]. | Piloted in EU EQS derivation and Swiss assessments; recommended as a Klimisch replacement [4]. |
A pivotal ring test involving 75 risk assessors from 12 countries directly compared the two methods [2]. Participants evaluated multiple ecotoxicity studies using each framework. The results demonstrated clear, quantitative differences in performance.
Table: Ring Test Results on Evaluation Consistency and User Perception [2]
| Evaluation Metric | Klimisch Method Performance | CRED Method Performance |
|---|---|---|
| Consistency Among Assessors | Lower consistency; same study often assigned to different reliability categories by different experts. | Higher consistency due to detailed criteria and guidance. |
| Perceived Accuracy | Rated lower by participants. | 85% of participants rated it as "accurate" or "very accurate." |
| Perceived Practicality | Criticized for lack of guidance, making evaluations difficult. | 80% found the criteria clear; majority found time requirement acceptable. |
| Transparency of Process | Opaque; dependent on individual expert judgment. | Rated as more transparent and less dependent on expert judgment. |
The ring test also revealed how the granular CRED criteria lead to more nuanced and potentially more protective evaluations. For instance, in one case study evaluating a fish toxicity test, the Klimisch method led most assessors to categorize it as "reliable" (with or without restrictions). In contrast, the CRED method, with its specific criteria, led a majority of assessors (63%) to categorize the same study as "not reliable," primarily due to deficiencies in experimental design and statistical analysis [5] [2].
Furthermore, analysis shows a strong correlation between the number of fulfilled CRED criteria and the final reliability category. Studies categorized as "reliable without restrictions" fulfilled a mean of 93% of criteria, while those deemed "not reliable" fulfilled only 60% [6].
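The relationship between fulfilled criteria and the final category can be illustrated with a minimal tally. This is a sketch only: the criterion names and the 20-item checklist below are hypothetical placeholders, not the official CRED criteria.

```python
# Minimal sketch (hypothetical criterion names): tally how many CRED
# reliability criteria a study fulfils and report the fraction, mirroring
# the "percentage of criteria fulfilled" statistic cited above.
def fulfilment_rate(criteria_results):
    """criteria_results: dict mapping criterion name -> True/False."""
    fulfilled = sum(1 for met in criteria_results.values() if met)
    return fulfilled / len(criteria_results)

# Hypothetical 20-criterion evaluation in which 18 criteria are met.
results = {f"criterion_{i}": (i <= 18) for i in range(1, 21)}
rate = fulfilment_rate(results)
print(f"{rate:.0%} of criteria fulfilled")  # 90% of criteria fulfilled
```

In practice an assessor would still assign the final category by judgment; the fraction simply makes the basis for that judgment auditable.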
The comparative data presented above were generated through a standardized, two-phase ring test protocol [2].
Objective: To compare the consistency, user perception, and outcomes of the Klimisch and CRED evaluation methods.
Materials:
Procedure:
Data Interpretation: Lower inter-assessor consistency in Phase I indicated higher subjectivity in the Klimisch method. Positive feedback and higher consistency in Phase II supported the hypothesis that the CRED method provides a more robust and user-friendly framework [2].
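Inter-assessor consistency of the kind measured in the ring test can be quantified in several ways; a simple sketch is the share of assessors who assigned the most common (modal) category to a given study. The category labels and counts below are hypothetical illustrations, not ring-test data.

```python
from collections import Counter

# Minimal sketch (hypothetical assignments): consistency as the share of
# assessors agreeing with the modal category for one study.
def modal_agreement(categories):
    counts = Counter(categories)
    return counts.most_common(1)[0][1] / len(categories)

# Hypothetical spread of Klimisch categories for one study:
klimisch = ["R1", "R2", "R2", "R3", "R1", "R2"]
# Hypothetical spread of CRED categories for the same study:
cred = ["not reliable"] * 5 + ["reliable with restrictions"]

print(f"Klimisch agreement: {modal_agreement(klimisch):.0%}")  # 50%
print(f"CRED agreement: {modal_agreement(cred):.0%}")          # 83%
```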
The following diagram illustrates the logical workflow for incorporating ecotoxicity studies into regulatory decision-making, highlighting the critical evaluation step where the CRED and Klimisch methods are applied.
Diagram Title: Logical Workflow for Regulatory Ecotoxicity Data Evaluation.
Robust ecotoxicity testing requires standardized reagents and organisms. The following toolkit is essential for generating data suitable for high-reliability evaluation under frameworks like CRED.
Table: Key Research Reagent Solutions for Aquatic Ecotoxicity Testing
| Reagent/Material | Function in Ecotoxicity Testing | Regulatory Context |
|---|---|---|
| Standardized Test Organisms (e.g., Daphnia magna, Danio rerio) | Provides consistent, biologically relevant endpoints (mortality, reproduction, growth). | Required for OECD and EPA guideline studies [2]. |
| Reference Toxicants (e.g., Potassium dichromate, Sodium chloride) | Validates the health and sensitivity of test organisms. | A core reliability criterion in study evaluation [1]. |
| Good Laboratory Practice (GLP) Reagents | Chemicals and materials with full traceability and quality documentation. | Often used in industry-submitted studies [1] [2]. |
| Analytical Grade Test Substances | Ensures exposure concentrations are known and verified. | A fundamental reliability criterion for both Klimisch and CRED evaluations [1] [7]. |
| Formulation Vehicles/Controls (e.g., solvent, clean sediment) | Ensures any observed effects are due to the test substance and not the delivery method. | A solvent or vehicle control group is standard when a carrier is used [1]. |
The transition from the Klimisch method to the CRED framework represents a significant advancement in regulatory toxicology. The evidence shows that CRED's structured, criteria-based approach directly addresses the core regulatory imperative for consistency, transparency, and scientific rigor in data evaluation [2] [4].
For researchers and study authors, adhering to the CRED reporting recommendations—which detail 50 items across six categories, such as test design and statistical analysis—increases the likelihood that their work will be deemed reliable and relevant for regulatory use [1]. For risk assessors and drug development professionals, adopting the CRED method mitigates the bias and inconsistency inherent in expert judgment, leading to more defensible and harmonized risk assessments across different agencies and regions [4]. Furthermore, by providing a clear rubric, CRED facilitates the inclusion of high-quality peer-reviewed literature alongside GLP studies, potentially expanding the data available for assessment and making better use of scientific resources [2].
Ultimately, the choice of evaluation method is not merely academic; it directly influences which studies inform safety thresholds, thereby impacting environmental protection and regulatory decisions. The experimental and perceptual data confirm that the CRED method provides a more robust, transparent, and scientifically sound tool for fulfilling the critical regulatory task of ecotoxicity data evaluation.
The Klimisch method, introduced in 1997, established a foundational framework for evaluating the reliability of toxicological and ecotoxicological data for regulatory purposes [8]. Developed by scientists from the chemical company BASF, the method was designed to bring a systematic, harmonized approach to data assessment for hazard and risk evaluation [9]. Its primary goal was to provide a clear, standardized process for determining which studies could be trusted to inform regulatory decisions, a critical need in chemical safety assessment. This guide will detail the method's core principles, compare its performance with modern alternatives like the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method, and analyze its lasting influence on regulatory science.
The Klimisch method operates on a four-tier scoring system that categorizes studies based on their adherence to established scientific and regulatory standards. The central concept is reliability, defined as the inherent quality of a test report relating to standardized methodology and the clarity of the experimental description [9]. The method places significant weight on studies conducted according to internationally accepted testing guidelines (e.g., OECD, EPA) and Good Laboratory Practice (GLP) [8].
The four reliability categories are summarized in Table 1 below [8] [10].
In regulatory frameworks like the European Union's REACH Regulation, only studies scoring 1 or 2 are typically accepted as standalone evidence to cover a regulatory endpoint. Studies scoring 3 or 4 may still be considered as supporting information within a "weight of evidence" approach [8].
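The REACH acceptance rule described above amounts to a simple partition of studies by Klimisch score. A minimal sketch, using hypothetical study records:

```python
# Minimal sketch (hypothetical study records): apply the REACH-style rule
# that only Klimisch scores 1 and 2 serve as standalone evidence, while
# scores 3 and 4 are retained only as supporting weight-of-evidence input.
studies = [
    {"id": "A", "klimisch": 1},
    {"id": "B", "klimisch": 2},
    {"id": "C", "klimisch": 3},
    {"id": "D", "klimisch": 4},
]

standalone = [s["id"] for s in studies if s["klimisch"] in (1, 2)]
supporting = [s["id"] for s in studies if s["klimisch"] in (3, 4)]
print(standalone, supporting)  # ['A', 'B'] ['C', 'D']
```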
Table 1: The Klimisch Four-Tier Scoring System
| Klimisch Score | Category | Key Assignment Criteria | Typical Regulatory Use |
|---|---|---|---|
| 1 | Reliable without restriction | Guideline study (preferably GLP); detailed documentation. | Primary evidence for hazard/risk assessment. |
| 2 | Reliable with restriction | Guideline study with minor restrictions; well-documented non-guideline study. | Primary evidence; may require expert justification. |
| 3 | Not reliable | Significant methodological deficiencies; insufficient documentation. | Not accepted as primary evidence; supporting info only. |
| 4 | Not assignable | Abstract or secondary literature; insufficient details for evaluation. | Not accepted as primary evidence; supporting info only. |
While the Klimisch method achieved widespread adoption, its limitations spurred the development of more detailed alternatives. A major critique is that its broad categories lack specific, transparent criteria, leading to potential inconsistency between assessors and a perceived bias towards industry-funded GLP studies over peer-reviewed literature [8] [11]. The CRED method was developed explicitly to address these shortcomings by providing a more granular, transparent, and balanced framework [11].
A structured ring test involving 75 risk assessors from 12 countries directly compared the two methods for evaluating aquatic ecotoxicity studies [11]. The results highlight fundamental differences in design and outcome.
Table 2: Feature Comparison: Klimisch vs. CRED Evaluation Methods
| Characteristic | Klimisch Method | CRED Method |
|---|---|---|
| Primary Scope | Toxicological & ecotoxicological data [8]. | Aquatic ecotoxicity data (specialized) [11]. |
| Evaluation Dimensions | Reliability only [11]. | Reliability and relevance [11]. |
| Number of Criteria | 12-14 reliability criteria for ecotoxicity [11]. | 20 reliability and 13 relevance criteria [11]. |
| Guidance Detail | Limited; relies heavily on expert judgement [11]. | Detailed guidance documents and criteria explanations [11]. |
| Output | Qualitative score (1-4) [8]. | Qualitative evaluation for reliability and relevance [11]. |
The ring test revealed that the CRED method produced more consistent evaluations between different assessors. Participants also perceived CRED as less dependent on subjective expert judgment, more accurate, and practical in terms of using clear criteria, despite requiring slightly more time initially [11]. The study concluded that CRED provides a more transparent and science-based evaluation, making it a suitable successor to Klimisch for ecotoxicity assessment [11].
Other tools have also emerged to operationalize or replace the Klimisch approach, including the evaluation frameworks of Durda & Preziosi and Hobbs et al. and the ToxRTool of Schneider et al.
Table 3: Results from a Ring Test Comparing Klimisch and CRED Methods [11]
| Performance Metric | Klimisch Method Outcome | CRED Method Outcome |
|---|---|---|
| Inter-assessor consistency | Lower consistency between different evaluators. | Higher consistency between different evaluators. |
| Perceived dependence on expert judgement | High dependence. | Lower dependence; guided by clear criteria. |
| Perceived accuracy | Moderate. | Higher perceived accuracy. |
| Time required for evaluation | Generally faster. | Slightly longer, but considered a worthwhile trade-off. |
The comparative findings between Klimisch and CRED are grounded in a robust, two-phase ring test experiment [11].
Objective: To compare the categorization consistency, user perception, and practicality of the Klimisch and CRED evaluation methods.
Design & Materials: The test followed a cross-over design. In Phase I, participants evaluated two of eight selected aquatic ecotoxicity studies using the Klimisch method. In Phase II, a different group of participants evaluated a different set of two studies from the same pool using a draft version of the CRED method. This ensured independent evaluation. The eight studies covered various test organisms (e.g., Daphnia magna, algae, fish), chemical classes (pesticides, pharmaceuticals), and included both GLP and non-GLP peer-reviewed literature [11].
Procedure:
Key Outcome: The quantitative analysis of agreement rates and qualitative user feedback demonstrated that the structured, criteria-rich CRED method reduced variability and increased transparency in study evaluation compared to the more subjective Klimisch approach [11].
Despite its limitations, the Klimisch method's integration into major regulatory systems is a testament to its foundational utility.
Evaluating ecotoxicity data, whether by Klimisch, CRED, or other methods, requires an understanding of the standard testing components. Below are key reagents and materials commonly featured in guideline ecotoxicity studies.
Table 4: Key Research Reagents and Materials in Aquatic Ecotoxicity Testing
| Item | Function in Ecotoxicity Testing | Example from CRED Ring Test [11] |
|---|---|---|
| Standard Test Organisms | Well-characterized species with defined sensitivity, enabling reproducible and comparable results across studies. | Daphnia magna (water flea), Lemna minor (duckweed), Rainbow trout. |
| Reference Toxicants | Standard chemicals (e.g., potassium dichromate) used to confirm the health and sensitivity of test organism batches. | Implied in GLP studies; part of quality assurance. |
| Culture Media/Reconstituted Water | Standardized aqueous medium providing essential ions and nutrients for test organisms, controlling water quality variables. | OECD standard reconstituted freshwater for Daphnia or fish tests. |
| Analytical Grade Test Substance | Chemical of interest with verified purity and identity; solutions prepared with precise concentrations for dose-response. | Deltamethrin (pesticide), Erythromycin (pharmaceutical). |
| Solvents/Carriers | Agents like acetone or dimethyl sulfoxide used to dissolve poorly water-soluble test substances; require a solvent control group. | Used for lipophilic test substances. |
| Positive Control Substances | Chemicals with known toxic effects, used to verify the responsiveness of the test system. | Not always required in guideline tests but used in some advanced or in vitro assays. |
The diagrams below illustrate the logical workflows of the Klimisch method and the comparative decision pathways between Klimisch and CRED.
Klimisch Method Evaluation Decision Tree
Conceptual Comparison: Klimisch vs. CRED Evaluation Pathways
In regulatory ecotoxicology, the derivation of safe environmental concentrations, such as Predicted-No-Effect Concentrations (PNECs) or Environmental Quality Standards (EQS), depends entirely on the underlying quality of the ecotoxicity data used [1]. For decades, the predominant tool for judging this quality has been the Klimisch method, developed in 1997 [11]. While it introduced a structured, four-category system for reliability assessment, its widespread application has revealed significant shortcomings [11] [15]. Criticisms center on its lack of detailed criteria, absence of guidance for evaluating relevance, and consequent inconsistency among assessors [11] [16]. These gaps can directly impact regulatory outcomes, potentially leading to the unjustified exclusion of valuable peer-reviewed data or, conversely, to the acceptance of flawed studies [1] [15].
The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) method was developed as a direct, science-based response to these deficiencies [11] [4]. Framed within a broader initiative to improve the transparency and robustness of environmental hazard assessments, CRED provides a more detailed, transparent, and consistent framework for evaluating both the reliability and relevance of aquatic ecotoxicity studies [1] [4]. This comparison guide objectively analyzes the performance of the Klimisch and CRED methods, supported by experimental data from a major international ring test, to illustrate why the evolution from Klimisch to CRED represents a critical advancement for researchers, regulatory scientists, and drug development professionals.
The fundamental differences between the Klimisch and CRED methods lie in their structure, scope, and guiding philosophy. The following table summarizes their core characteristics.
Table 1: Core Characteristics of the Klimisch and CRED Evaluation Methods [11] [1]
| Characteristic | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Scope | General toxicity and ecotoxicity data. | Aquatic ecotoxicity studies (with adaptations for other media). |
| Evaluation Dimensions | Reliability only. | Reliability and Relevance. |
| Number of Criteria | 12-14 reliability criteria. | 20 reliability criteria & 13 relevance criteria. |
| Guidance Provided | Minimal; no detailed guidance on applying criteria. | Extensive guidance text for each criterion. |
| Evaluation Approach | Holistic, highly dependent on expert judgement. | Structured, criteria-driven, with defined questions. |
| Basis for Evaluation | Favors guideline studies and Good Laboratory Practice (GLP). | Based on scientific merit and reporting quality, independent of GLP status. |
| Key Shortcomings | Lack of detail causes inconsistency; no relevance assessment; biases against non-guideline literature. | More time-intensive initially due to detailed assessment. |
The Klimisch method's major gap is its lack of operational detail. It provides a list of criteria (e.g., "test substance characterized") without clarifying what characterization is sufficient, leaving interpretation to the assessor's judgment [11]. This leads to low inter-assessor consistency. Furthermore, it lacks any criteria for evaluating relevance—the appropriateness of a study for a specific regulatory question [17]. Its known bias toward GLP and standard guideline studies can automatically discount methodologically sound and potentially more sensitive peer-reviewed literature [1] [15].
In contrast, CRED is designed for transparency and consistency. Each of its 20 reliability criteria includes specific guiding questions to standardize assessment [1]. Its separate 13 relevance criteria ensure the study's test organism, endpoint, exposure design, and effect concentrations are appropriate for the hazard or risk assessment context [1] [17]. This structured approach reduces arbitrariness.
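CRED's criteria-driven structure can be pictured as a checklist in which every judgment is recorded against an explicit guiding question rather than as one holistic verdict. A minimal sketch follows; the question texts are illustrative paraphrases, not the official CRED wording.

```python
from dataclasses import dataclass
from typing import Optional

# Minimal sketch of a criteria-driven evaluation record: each criterion
# carries its own dimension and guiding question, so unanswered items and
# the basis for each judgement remain visible. Question wording is
# illustrative, not the official CRED text.
@dataclass
class Criterion:
    dimension: str                    # "reliability" or "relevance"
    question: str                     # guiding question for the assessor
    fulfilled: Optional[bool] = None  # None = not yet evaluated

evaluation = [
    Criterion("reliability", "Is the test substance adequately characterized?"),
    Criterion("relevance", "Is the test organism appropriate for the assessment?"),
]
evaluation[0].fulfilled = True
open_items = [c.question for c in evaluation if c.fulfilled is None]
print(f"{len(open_items)} criterion still open")  # 1 criterion still open
```

Recording the evaluation this way is what makes the outcome auditable: a reviewer can see exactly which criteria drove the final reliability and relevance calls.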
The logical workflow of each method further highlights their differences.
To quantitatively compare the methods, a two-phase international ring test was conducted with 75 risk assessors from 12 countries, representing industry, academia, consultancy, and government [11] [1].
Experimental Protocol: The ring test followed the two-phase, cross-over design described earlier, with studies evaluated using the Klimisch method in Phase I and a draft version of the CRED method in Phase II [11].
The results demonstrated a clear performance difference.
Table 2: Key Quantitative Outcomes from the CRED Ring Test [11]
| Evaluation Metric | Klimisch Method Outcome | CRED Method Outcome | Implication |
|---|---|---|---|
| Inter-assessor Consistency | Low. High variation in assigned reliability categories for the same study. | High. Significantly improved agreement among assessors. | CRED reduces subjective interpretation. |
| Perceived Accuracy | Lower. Felt to be less accurate due to vague criteria. | Higher. Perceived as more accurate for decision-making. | Detailed criteria increase confidence. |
| Handling of Relevance | Not addressed. No criteria or guidance provided. | Explicitly addressed. 13 criteria with guidance enabled structured evaluation. | CRED ensures data is fit for purpose. |
| Dependence on Expert Judgement | Very High. Core decisions rely on individual experience. | Reduced. Structured questions guide the assessor. | CRED standardizes the evaluation process. |
| Practicality (Time vs. Benefit) | Initially faster but less informative. | Initially more time-consuming, but perceived as worthwhile for the added transparency and robustness. | CRED's detail is seen as a valuable investment. |
The ring test concluded that the CRED evaluation method was a suitable replacement for the Klimisch method, offering a more detailed, transparent, and consistent approach that could improve harmonization across regulatory frameworks [11] [16].
The CRED framework's structured yet adaptable nature has allowed it to evolve, addressing specific criticisms of the Klimisch method's rigidity, particularly for novel types of studies and substances.
For researchers generating ecotoxicity data intended for regulatory use or high-impact publication, adherence to rigorous standards is paramount. The following toolkit details essential materials and approaches that align with CRED's criteria, facilitating studies that will be evaluated as highly reliable and relevant.
Table 3: Key Research Reagent Solutions for High-Quality Ecotoxicity Studies
| Tool/Reagent | Primary Function | Importance for CRED/Klimisch Evaluation |
|---|---|---|
| OECD/ISO Test Guidelines | Provide standardized protocols for test design, organism culturing, and endpoint measurement. | Following a guideline directly addresses multiple reliability criteria on test design. CRED references all OECD reporting requirements [11]. |
| Certified Reference Materials & Analytical Standards | Ensure accurate characterization and dosing of the test substance (e.g., purity, concentration verification). | Critical for "test substance characterization" criteria. Lack of this data can downgrade reliability in both methods. |
| Good Laboratory Practice (GLP) | A quality system covering the organizational process and conditions for study planning, performance, and reporting. | Klimisch heavily favors GLP studies [11]. CRED views GLP positively but evaluates the scientific merit irrespective of GLP status [1]. |
| Defined Reference Toxicants | Standard chemicals (e.g., potassium dichromate for Daphnia) used to confirm the health and sensitivity of test organisms. | Demonstrates the adequacy of test organism condition and response, a key reliability criterion. |
| Statistical Analysis Software | Enables robust data analysis (e.g., calculating EC50, NOEC, using proper variance tests). | Essential for fulfilling criteria on statistical design and documentation of results. |
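The EC50 calculation mentioned in the table is normally done with dedicated dose-response software (e.g., log-logistic regression). As a stand-in, a minimal sketch below estimates an EC50 by linear interpolation of effect against log10(concentration); the data points are hypothetical.

```python
import math

# Minimal sketch (hypothetical dose-response data): estimate an EC50 by
# linear interpolation of effect vs. log10(concentration), a simple
# stand-in for proper log-logistic model fitting.
def ec50_interpolated(concentrations, effects):
    """concentrations in mg/L (ascending), effects as fractions 0..1."""
    points = list(zip(concentrations, effects))
    for (c1, e1), (c2, e2) in zip(points, points[1:]):
        if e1 <= 0.5 <= e2:  # bracket the 50% effect level
            frac = (0.5 - e1) / (e2 - e1)
            log_ec50 = math.log10(c1) + frac * (math.log10(c2) - math.log10(c1))
            return 10 ** log_ec50
    raise ValueError("50% effect level not bracketed by the data")

conc = [0.1, 1.0, 10.0, 100.0]
effect = [0.05, 0.30, 0.70, 0.95]
print(f"EC50 ≈ {ec50_interpolated(conc, effect):.2f} mg/L")  # EC50 ≈ 3.16 mg/L
```

Interpolating on the log scale reflects the roughly log-linear shape of typical concentration-response curves in the central effect range.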
The comparative analysis demonstrates that the criticisms of the Klimisch method's lack of detail and consistency are well-founded and addressed by the CRED framework. By replacing vague prompts with detailed criteria and guidance, and by formally integrating relevance assessment, CRED reduces arbitrary expert judgment and increases transparency. Experimental ring test data confirms that this leads to greater consistency among assessors [11].
For the scientific and regulatory community, adopting CRED or its specialized derivatives (e.g., NanoCRED) means that decisions are built on a more robust and auditable foundation. It enables the more reliable integration of peer-reviewed science into the regulatory process, which is crucial for assessing emerging contaminants like pharmaceuticals and nanomaterials [4] [15]. As noted in a 2022 analysis, 58% of key studies used to restrict hazardous substances under REACH were non-standard studies [15]. Evaluating such studies fairly requires the precise, structured framework that CRED provides. Moving forward, the continued evolution and adoption of these transparent evaluation methods are essential for ensuring that environmental risk assessments keep pace with scientific advancement and provide a high level of environmental protection.
This guide provides a comparative analysis of the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method against established alternatives, primarily the Klimisch method. Framed within broader research on improving ecotoxicity data evaluation, it presents experimental data from validation ring tests, detailed protocols, and essential tools for researchers and regulatory professionals.
The evaluation of ecotoxicity study reliability and relevance is a cornerstone of environmental risk assessment for chemicals, pharmaceuticals, and biocides. For decades, the predominant method for this task was the one established by Klimisch and colleagues in 1997 [11]. While initially a step forward, the Klimisch method faced increasing criticism for its lack of detailed guidance, absence of structured relevance evaluation, and failure to ensure consistent outcomes among different risk assessors [11] [2]. This inconsistency could directly influence hazard assessments, potentially leading to underestimated environmental risks or unnecessary mitigation measures [11].
The CRED project was initiated in 2012 specifically to address these shortcomings [11] [4]. Its core goals were to strengthen the transparency, consistency, and harmonization of chemical hazard and risk assessments across different regulatory frameworks [11] [16]. A key objective was to increase the utilization of high-quality peer-reviewed studies in regulatory dossiers, moving beyond a reliance solely on industry-submitted, Good Laboratory Practice (GLP) studies [1]. The project produced two main outputs: a detailed evaluation method for assessors and a set of reporting recommendations for researchers to improve study quality and regulatory usability [1].
The following tables provide a structured comparison of the CRED method against the Klimisch method and other contemporary evaluation frameworks, based on data from comparative studies and ring tests.
Table 1: Foundational Characteristics of the Klimisch and CRED Evaluation Methods [19]
| Characteristic | Klimisch Method | CRED Evaluation Method |
|---|---|---|
| Primary Data Type | General toxicity and ecotoxicity | Aquatic ecotoxicity (with adaptations for other fields) |
| Reliability Criteria | 12-14 (for ecotoxicity) | 20 evaluation criteria (linked to 50 reporting criteria) |
| Relevance Criteria | 0 (not formally addressed) | 13 specific criteria |
| Alignment with OECD Reporting | Covers 14 of 37 OECD criteria | Covers all 37 OECD criteria |
| Guidance Provided | Limited or none | Extensive guidance for each criterion |
| Evaluation Summary | Qualitative for reliability only | Qualitative for both reliability and relevance |
Table 2: Comparison of Reliability Evaluation Methods for Ecotoxicity Data [20]
| Feature | Klimisch et al. | Durda & Preziosi | Hobbs et al. | Schneider et al. (ToxRTool) |
|---|---|---|---|---|
| Evaluation Categories | Reliable without/with restrictions, Not reliable, Not assignable | High, Moderate, Low quality, Not reliable, Not assignable | High, Acceptable, Unacceptable quality | Reliable without/with restrictions, Not reliable, Not assignable |
| Number of Criteria | 12-14 | 40 | 20 | 21 |
| Covers Relevance? | No | No | No | A few aspects |
| Additional Guidance | No | Yes | No | Yes |
| Regulatory Status | Recommended in REACH guidance | - | - | - |
Table 3: Key Results from the CRED-Klimisch Ring Test (2012-2013) [11] [5]
| Study & Substance | Klimisch Method Results | CRED Method Results | Key Implication |
|---|---|---|---|
| Study E (Estrone, Fish) | 44% "Reliable without restrictions", 56% "Reliable with restrictions" [5]. | 16% "Reliable without restrictions", 21% "Reliable with restrictions", 63% "Not reliable" [5]. | CRED was more critical of a GLP study with potential flaws, reducing automatic acceptance. |
| Overall Ring Test Findings | Higher inconsistency among assessors; perceived as more dependent on expert judgment. | Higher consistency; perceived as more accurate, transparent, and practical. | CRED's structured guidance reduced subjective variability. |
| Participant Perception | - | Over 70% of participants found CRED "less dependent on expert judgement" and "more accurate" [11]. | Strong user preference for the structured CRED approach. |
The comparative data presented above stems from a two-phased international ring test designed to robustly validate CRED against the Klimisch method [11].
Objective: To compare the categorization outcomes, consistency, and user perception of the CRED and Klimisch evaluation methods [2].
Design & Participants:
Procedure:
Outcome Integration: Feedback and consistency analyses from the ring test were directly used to fine-tune the final CRED method, resulting in the version with 20 reliability and 13 relevance criteria [1] [2].
Table 4: Key Research Reagent Solutions for Ecotoxicity Evaluation
| Tool / Resource | Primary Function | Relevance to CRED/Klimisch Context |
|---|---|---|
| CRED Excel Evaluation Tool [4] [18] | Provides a structured, user-friendly spreadsheet to apply the 20 reliability and 13 relevance criteria to a study. | The primary implementation tool for the CRED method, ensuring systematic and transparent evaluation. |
| OECD Test Guidelines (e.g., 201, 210, 211) [11] | International standard protocols for conducting ecotoxicity tests with algae, daphnia, and fish. | Form the scientific basis for many CRED evaluation criteria; studies adhering to these are more likely to be deemed reliable. |
| GLP (Good Laboratory Practice) Standards | A quality system governing the organizational process and conditions under which non-clinical studies are planned, performed, and archived. | Klimisch method is criticized for overly favoring GLP studies. CRED assesses GLP studies critically based on their actual scientific merit [11]. |
| CRED Reporting Recommendations [1] | A checklist of 50 criteria across six categories (e.g., test design, exposure conditions) for authors to report studies comprehensively. | Aids in prospective improvement of study quality and regulatory usability, complementing the retrospective evaluation method. |
| Specialized CRED Extensions (NanoCRED, EthoCRED) [18] | Adapted evaluation frameworks for nanomaterials and behavioral ecotoxicity studies. | Demonstrates the evolution and domain-specific application of the core CRED principles to new scientific frontiers. |
The following diagrams illustrate the structured workflow of the CRED evaluation method and its integrative role in the regulatory decision-making process.
CRED Evaluation Method Workflow
CRED's Role in Regulatory Assessment Harmonization
The regulatory assessment of chemicals hinges on the quality and appropriateness of ecotoxicity data. For decades, the Klimisch method has served as a common tool for evaluating study reliability but has faced increasing criticism for its lack of detail, insufficient guidance, and inability to ensure consistency among assessors [2]. In response, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) framework was developed to provide a more transparent, consistent, and scientifically robust system for evaluating both the reliability (intrinsic scientific quality) and relevance (suitability for a specific assessment purpose) of studies [1]. This guide objectively compares these two pivotal methodologies, underscoring a fundamental paradigm shift in ecotoxicological data assessment for researchers and regulatory professionals.
In regulatory ecotoxicology, precise definitions are essential for standardized assessment.
A study must be evaluated for both dimensions. A reliable study may be irrelevant for a given assessment (e.g., a high-quality soil toxicity test for an aquatic risk assessment). Conversely, a study that is highly relevant in concept may be judged unreliable due to methodological flaws and thus carry little weight [1].
The following table summarizes the fundamental structural and philosophical differences between the two evaluation frameworks.
Table 1: Foundational Comparison of the Klimisch and CRED Evaluation Methods [2] [11]
| Characteristic | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Scope | General toxicity and ecotoxicity data. | Developed for aquatic ecotoxicity; later adapted for other areas (e.g., behavior, sediments). |
| Core Evaluation Dimensions | Focuses primarily on reliability. Provides no formal criteria for relevance evaluation. | Explicitly and separately evaluates both reliability and relevance. |
| Number of Criteria | 12-14 reliability criteria for ecotoxicity. No structured relevance criteria. | 20 reliability criteria and 13 relevance criteria, each with detailed guidance. |
| Guidance & Transparency | Limited guidance, leading to high dependency on expert judgment and inconsistent outcomes. | Extensive guidance for each criterion improves transparency, reduces arbitrariness, and promotes consistency. |
| Basis for Evaluation | Often criticized for favoring Good Laboratory Practice (GLP) and standard guideline studies, potentially overlooking methodological flaws. | Evaluates study merit based on scientific quality and reporting completeness, regardless of GLP status. |
| Reporting Integration | Not integrated with reporting recommendations. | Linked to 50 specific reporting recommendations across 6 categories to improve study usability. |
| Outcome Categories | Reliability: Reliable without restrictions (R1), Reliable with restrictions (R2), Not reliable (R3), Not assignable (R4). | Provides separate qualitative summaries for reliability and relevance evaluations. |
A pivotal two-phase ring test was conducted with 75 risk assessors from 12 countries to compare the methods empirically [2] [11]. Participants evaluated a set of eight ecotoxicity studies using the Klimisch method in Phase I and the draft CRED method in Phase II.
Table 2: Key Quantitative Results from the CRED Ring Test Illustrating Method Differences [5] [2]
| Study & Context | Klimisch Method Result | CRED Method Result | Implication of Discrepancy |
|---|---|---|---|
| Study E (Fish chronic test with estrone) | 44% rated "Reliable without restrictions" (R1); 56% "Reliable with restrictions" (R2). Mean score: 1.6 (scale R1=1 to R3=3). | 16% rated "Reliable without restrictions"; 21% "Reliable with restrictions"; 63% "Not reliable". Mean score: 2.5. | CRED's detailed criteria led to a significantly more critical reliability assessment, downgrading a GLP study due to methodological flaws the Klimisch method often overlooks. |
| General Ring Test Findings | High inconsistency among assessors. Perceived as less accurate, more dependent on subjective judgment. | Higher consistency among assessors. Perceived as more accurate, transparent, and practical. | The structured, criteria-based approach of CRED reduces evaluator bias and increases harmonization in risk assessments. |
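The mean scores reported for Study E in Table 2 can be reproduced as a weighted average over the published rating distributions. The short sketch below was written for this guide (it is not taken from any cited tool) and maps the categories onto the numeric scale R1=1 to R3=3 used in the table:

```python
# Reproducing the mean reliability scores for Study E from the published
# rating distributions, on the numeric scale R1=1 to R3=3 used in the table.

def mean_reliability_score(distribution):
    """Weighted mean score from a {category_score: fraction_of_assessors} dict."""
    return sum(score * fraction for score, fraction in distribution.items())

klimisch = {1: 0.44, 2: 0.56}       # 44% R1, 56% R2
cred = {1: 0.16, 2: 0.21, 3: 0.63}  # 16% R1, 21% R2, 63% R3

print(round(mean_reliability_score(klimisch), 1))  # 1.6
print(round(mean_reliability_score(cred), 1))      # 2.5
```

The near one-point shift in mean score for the same study quantifies how much more critically CRED evaluators judged it.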
The experimental design of the ring test provides a model for methodological comparison [2] [11].
Objective: To compare the consistency, transparency, and user perception of the Klimisch and CRED evaluation methods.
Design: A two-phase, blinded, cross-over study.
Participants: 75 experienced risk assessors from industry, academia, consultancy, and government across 12 countries.
Study Set: Eight diverse aquatic ecotoxicity studies (chronic/acute, various species/chemicals, peer-reviewed and GLP reports).
Procedure:
The following table details key resources and materials central to conducting and evaluating ecotoxicity studies within the discussed frameworks.
Table 3: Key Reagents and Resources for Ecotoxicity Study Evaluation
| Tool / Resource | Function in Evaluation | Context in Klimisch/CRED |
|---|---|---|
| OECD Test Guidelines | International standard test protocols (e.g., OECD 210: Fish Early-Life Stage). | Klimisch heavily favors guideline studies. CRED uses them as a quality benchmark but assesses adherence critically. |
| Good Laboratory Practice (GLP) | A quality system covering the organizational process and conditions for non-clinical studies. | Klimisch is criticized for equating GLP with high reliability. CRED assesses GLP studies on their detailed methodological merits. |
| CRED Evaluation Excel Tool | A structured spreadsheet implementing the 20 reliability and 13 relevance criteria with guidance. | The primary practical tool for applying the CRED method, promoting standardization [4]. |
| Reporting Checklist | A list of 50 specific items across six categories (test design, organism, exposure, etc.) to ensure complete reporting. | Integrated with the CRED method to prospectively improve study quality and regulatory utility [1]. |
| EthoCRED & NanoCRED Modules | Specialized extensions of the CRED criteria for behavioral ecotoxicity and nanomaterial studies [21] [18]. | Demonstrates the adaptive, domain-specific evolution of the CRED framework beyond the static Klimisch method. |
The following diagrams illustrate the procedural workflow of the CRED method and the conceptual relationship between reliability and relevance in regulatory assessment.
CRED Evaluation Method Workflow
Relationship Between Reliability, Relevance, and Regulatory Assessment
The comparison reveals that the CRED method is a demonstrable successor to the Klimisch method, addressing its core shortcomings through structured, transparent criteria for both reliability and relevance. Empirical ring test data shows CRED leads to more consistent and critically scrutinized evaluations [2]. The ongoing development of specialized modules like EthoCRED for behavioral endpoints confirms the framework's adaptability to evolving scientific frontiers [21] [22]. For researchers, employing CRED's reporting guidelines increases the regulatory utility of their work. For assessors, adopting CRED minimizes bias and promotes harmonized, defensible risk assessments across regulatory frameworks.
The regulatory assessment of chemicals for environmental safety is fundamentally dependent on the quality of ecotoxicity studies. For decades, the Klimisch method, established in 1997, served as the backbone for evaluating study reliability within frameworks like REACH and the Water Framework Directive [11]. However, this method has faced increasing criticism for its lack of detailed guidance, inconsistency among assessors, and failure to formally evaluate a study's relevance for a specific regulatory purpose [11] [1]. These shortcomings can directly impact risk assessments, potentially leading to either unnecessary mitigation measures or the underestimation of environmental threats [11].
In response, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method was developed as a science-based replacement. The CRED project aimed to strengthen the transparency, consistency, and robustness of chemical hazard assessments by providing detailed, structured criteria for evaluating both the reliability and relevance of aquatic ecotoxicity studies [11] [4]. This guide deconstructs the CRED checklist, particularly its 20 reliability criteria, and provides a comparative analysis with the Klimisch method, supported by experimental ring test data.
The core thesis posits that the CRED evaluation method represents a significant advancement over the Klimisch method by introducing a more granular, transparent, and less subjective framework. This shift is designed to reduce the dependency on expert judgment, minimize discrepancies between assessors, and enable the broader inclusion of high-quality peer-reviewed literature in regulatory decision-making [1].
The fundamental differences between the two methods are structural and philosophical. The Klimisch method offers a limited set of 12-14 reliability criteria with no specific guidance for relevance, often leading assessors to make coarse, category-level judgments that can be overly influenced by a study's adherence to Good Laboratory Practice (GLP) [11] [2]. In contrast, CRED expands the analytical lens with 20 explicit reliability criteria and 13 relevance criteria, each accompanied by extensive guidance [1] [23]. This structure encourages a systematic, criterion-by-criterion assessment that separates the intrinsic scientific quality of a study (reliability) from its applicability to a specific regulatory question (relevance) [1].
Table 1: Foundational Comparison of the Klimisch and CRED Evaluation Methods
| Characteristic | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Focus | Reliability of toxicity/ecotoxicity studies | Reliability & Relevance of aquatic ecotoxicity studies |
| Number of Reliability Criteria | 12-14 (for ecotoxicity) | 20 |
| Number of Relevance Criteria | 0 | 13 |
| Guidance Provided | Limited, high-level criteria | Extensive guidance for each criterion |
| Evaluation Output | Qualitative reliability category (R1-R4) | Qualitative reliability and relevance categories |
| Basis for Development | Industry experience and regulatory need | OECD guidelines, existing methods, multi-stakeholder expert input [11] |
| Inclusion of OECD Reporting Criteria | 14 out of 37 | All 37 [11] |
The 20 reliability criteria of CRED provide a comprehensive framework for assessing the inherent quality of an aquatic ecotoxicity study's design, conduct, and analysis. They are designed to be applicable to a broad range of tests, from acute to chronic, across various organisms [1]. These criteria can be conceptually grouped into several key domains of experimental science:
The application of these criteria is not a simple checklist exercise. Assessors evaluate each criterion, and the overall judgment on reliability is informed by the pattern of met and unmet criteria [6]. The method allows for a nuanced conclusion where a study with minor limitations may still be considered "reliable with restrictions," while a study with critical flaws in fundamental design or execution would be deemed "not reliable" [1].
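CRED itself prescribes no mechanical scoring rule, so the following Python sketch is purely illustrative: it shows one way an assessor's pattern of met and unmet criteria could be recorded and summarized. The split between "critical" and "minor" criteria is a hypothetical simplification, not part of the published method.

```python
# Purely illustrative: CRED does not define a numeric scoring formula; the
# overall judgment reflects the pattern of met and unmet criteria. The
# "critical"/"minor" split below is a hypothetical simplification.

def summarize_reliability(criteria):
    """criteria: {criterion_name: (fulfilled, critical)} -> (category, failed list)."""
    failed_critical = [name for name, (met, critical) in criteria.items()
                       if not met and critical]
    failed_minor = [name for name, (met, critical) in criteria.items()
                    if not met and not critical]
    if failed_critical:
        return "not reliable", failed_critical
    if failed_minor:
        return "reliable with restrictions", failed_minor
    return "reliable without restrictions", []

example = {
    "test substance identified (purity reported)": (True, True),
    "exposure concentrations analytically verified": (False, True),
    "number of replicates reported": (True, False),
}
print(summarize_reliability(example))
# ('not reliable', ['exposure concentrations analytically verified'])
```

Recording the failed criteria alongside the category mirrors CRED's documentation requirement: the conclusion is traceable to specific shortcomings rather than a bare score.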
A pivotal two-phase ring test was conducted to empirically compare the performance of the Klimisch and CRED methods [11] [2]. In this test, 75 risk assessors from 12 countries evaluated a set of eight diverse ecotoxicity studies, using one method in the first phase and the other in the second [11] [2].
The results demonstrated that the CRED method led to more conservative and differentiated reliability assessments. For example, in the evaluation of a GLP study on fish toxicity, the Klimisch method resulted in 44% of assessors rating it "reliable without restrictions," while CRED led to only 16% giving the same rating, with 63% categorizing it as "not reliable" due to identified flaws [5]. This indicates CRED's effectiveness in preventing the automatic acceptance of GLP studies and ensuring a more critical appraisal.
Furthermore, the ring test quantified the relationship between the fulfillment of CRED criteria and the final reliability categorization. As shown in the data below, a clear gradient exists: studies rated as "reliable without restrictions" met nearly all criteria on average, while those rated "not reliable" met significantly fewer [6].
Table 2: CRED Criteria Fulfillment Linked to Reliability Categorization [6]
| CRED Reliability Category | Mean % of Criteria Fulfilled | Standard Deviation (percentage points) | Minimum % | Maximum % |
|---|---|---|---|---|
| Reliable without restrictions | 93% | 12 | 79% | 100% |
| Reliable with restrictions | 72% | 12 | 47% | 90% |
| Not reliable | 60% | 15 | 21% | 90% |
| Not assignable | 51% | 15 | 21% | 64% |
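One point worth drawing out of Table 2 is that the observed fulfillment ranges overlap considerably. The illustrative snippet below encodes the table's minimum-maximum ranges to show that a given fulfillment percentage is often compatible with several categories, which is why the final judgment rests on which criteria were unmet rather than on a raw percentage:

```python
# Encodes the min-max fulfillment ranges observed in the ring test (Table 2).
# Illustrative only: shows that a given fulfillment percentage can be
# compatible with several reliability categories at once.

RANGES = {
    "reliable without restrictions": (79, 100),
    "reliable with restrictions": (47, 90),
    "not reliable": (21, 90),
    "not assignable": (21, 64),
}

def compatible_categories(pct_fulfilled):
    """All categories whose observed range contains the given percentage."""
    return [cat for cat, (low, high) in RANGES.items() if low <= pct_fulfilled <= high]

print(compatible_categories(85))
# ['reliable without restrictions', 'reliable with restrictions', 'not reliable']
```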
Participant feedback strongly favored the CRED method. A majority of risk assessors found CRED to be more accurate, consistent, transparent, and less dependent on expert judgment than the Klimisch method. Importantly, they also found it practical in terms of time needed for evaluation [11] [2].
The comparative data supporting CRED's efficacy were generated through a rigorously designed ring test. The protocol ensured robust, independent comparisons between the two evaluation methods [11] [2].
1. Study Selection & Participant Recruitment:
2. Two-Phase, Independent Evaluation Design:
3. Data Collection & Analysis:
4. Outcome Integration:
Diagram Title: Two-Phase Ring Test Workflow for CRED vs. Klimisch Method Comparison
The CRED method provides a structured pathway for assessors, reducing ad-hoc decision-making. The following diagram illustrates the logical workflow an evaluator follows when applying the CRED checklist to an ecotoxicity study.
Diagram Title: CRED Evaluation Workflow: From Reliability to Relevance Judgment
Implementing and evaluating studies aligned with CRED principles requires specific tools and resources. The following table details key items essential for conducting robust ecotoxicity research and for performing systematic study evaluations.
Table 3: Research Reagent & Evaluation Toolkit for CRED-Aligned Ecotoxicology
| Tool/Reagent Category | Specific Item/Resource | Primary Function in Research/Evaluation |
|---|---|---|
| Reference Guidelines | OECD Test Guidelines (e.g., 201, 210, 211) | Provide internationally standardized protocols for conducting tests, forming the benchmark for methodological reliability [11]. |
| Evaluation Framework | CRED Excel Tool (freely available) | Provides the structured checklist of 20 reliability and 13 relevance criteria with guidance to ensure consistent, transparent study evaluation [4]. |
| Test Organisms | Cultured strains of Daphnia magna, Danio rerio (zebrafish), Pseudokirchneriella subcapitata (algae) | Represent standardized, ecologically relevant trophic levels (invertebrates, vertebrates, primary producers) for assessing chemical impacts. |
| Analytical Chemistry | Chemical reference standards, HPLC/MS systems | For verifying the identity, purity, and measured concentration of the test substance in exposure media—a core CRED reliability criterion. |
| Water Quality Control | Conductivity meters, pH meters, dissolved oxygen probes | For monitoring and reporting key exposure parameters, ensuring test conditions are within acceptable ranges for the test organisms. |
| Statistical Software | Programs capable of ANOVA, regression, and EC/LC50 calculation (e.g., R, GraphPad Prism) | For appropriate data analysis, a critical component of both conducting studies and evaluating the plausibility of reported results. |
| Specialized Extensions | NanoCRED, EthoCRED, CRED for sediment/soil frameworks | Adapted CRED criteria for evaluating studies on nanomaterials, behavioral endpoints, and non-aqueous exposure matrices [18]. |
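As a concrete instance of the EC/LC50 calculation mentioned in the statistical-software row above, the sketch below estimates an EC50 by log-linear interpolation between the two bracketing test concentrations. A regulatory evaluation would normally use a fitted dose-response model (e.g., log-logistic regression in R or Prism); the concentrations and effect data here are invented for illustration.

```python
import math

# Hypothetical data; a regulatory evaluation would normally fit a
# dose-response model rather than interpolate.

def ec50_interpolated(concs, effects):
    """EC50 by log-linear interpolation between the concentrations bracketing 50% effect.

    concs: ascending test concentrations; effects: % effect at each concentration.
    """
    points = list(zip(concs, effects))
    for (c_lo, e_lo), (c_hi, e_hi) in zip(points, points[1:]):
        if e_lo <= 50 <= e_hi:
            frac = (50 - e_lo) / (e_hi - e_lo)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    raise ValueError("50% effect is not bracketed by the tested concentrations")

concentrations = [1.0, 3.2, 10.0, 32.0, 100.0]  # mg/L (invented)
effect_pct = [2, 10, 35, 70, 95]                # % immobilised (invented)
print(round(ec50_interpolated(concentrations, effect_pct), 1))  # 16.5
```

Whether such a point estimate is plausible given the raw data is exactly the kind of check a CRED reliability criterion on statistical analysis asks the evaluator to make.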
The deconstruction of the CRED checklist reveals a comprehensive, systematic, and transparent framework specifically designed to address the deficiencies of the historically dominant Klimisch method. By expanding from a handful of general criteria to 20 detailed reliability and 13 relevance criteria, CRED transforms study evaluation from an exercise in expert judgment into a structured, documentable process. Empirical ring test data confirm that this structure leads to more consistent, conservative, and critical appraisals of study quality, effectively demoting poorly performed studies regardless of their GLP status while providing clear reasoning for such decisions.
For researchers and drug development professionals, adopting CRED's companion reporting recommendations (comprising 50 criteria across six categories) during study design and manuscript preparation proactively ensures that resulting publications will contain the necessary information for favorable regulatory evaluation [1] [23]. As the CRED method gains traction in regulatory pilots and evolves into specialized toolkits like NanoCRED and EthoCRED, it is establishing itself as the contemporary standard for achieving robust, reproducible, and harmonized environmental safety assessments [4] [18].
The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) hinges on the use of reliable and relevant ecotoxicity studies [1]. Historically, evaluating these studies has relied heavily on expert judgment, leading to potential bias and inconsistency, where different risk assessors may draw divergent conclusions from the same data [1] [2]. To address this, the Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) evaluation method was developed as a science-based successor to the widely used but criticized Klimisch method [2].
The Klimisch method, established in 1997, categorizes study reliability simply as "reliable without restrictions," "reliable with restrictions," "not reliable," or "not assignable" [2]. While foundational, it has been criticized for lacking detailed guidance, promoting inconsistency, and exhibiting bias towards industry-sponsored Good Laboratory Practice (GLP) studies, potentially at the expense of relevant peer-reviewed literature [1] [2]. In contrast, the CRED method provides a structured, transparent framework with explicit criteria for evaluating both the reliability (intrinsic scientific quality) and relevance (appropriateness for a specific assessment purpose) of aquatic ecotoxicity studies [1]. This guide objectively compares these two methods, focusing on the application of CRED's 13 relevance criteria to align studies with assessment goals.
The fundamental differences between the Klimisch and CRED evaluation methods are structural and philosophical. The table below summarizes their core characteristics:
Table 1: Foundational Comparison of the Klimisch and CRED Evaluation Methods
| Feature | Klimisch Method | CRED Evaluation Method |
|---|---|---|
| Primary Focus | Reliability of studies (especially guideline/GLP studies) [2]. | Reliability and relevance of all aquatic ecotoxicity studies [1]. |
| Evaluation Categories | Reliability only: R1, R2, R3, R4 [2]. | Separate scores for Reliability (20 criteria) and Relevance (13 criteria) [1] [2]. |
| Guidance Level | Limited, leading to high dependence on expert judgement [2]. | Extensive, with detailed guidance for each criterion to improve consistency [1]. |
| Scope of Application | Primarily ecotoxicity studies; encourages use of guideline studies [1]. | Designed for aquatic ecotoxicity but adapted for nanomaterials (NanoCRED), behavior (EthoCRED), and sediment/soil studies [18]. |
| Outcome Transparency | Low; provides a final category with minimal documentation on how it was reached [2]. | High; requires documented assessment against specific criteria, making the evaluation process traceable [1]. |
A core advancement of the CRED method is its explicit and detailed evaluation of relevance, which is defined as "the extent to which data and tests are appropriate for a particular hazard identification or risk characterisation" [2]. This is distinct from reliability. A study can be perfectly reliable (well-conducted and reported) but irrelevant for a specific regulatory question (e.g., a soil organism test for a water quality standard) [1].
CRED's relevance evaluation is based on 13 criteria designed to match the study to the assessment goal [1] [2]. These criteria can be grouped into key domains:
The evaluator assesses each criterion, and the overall relevance is categorized as "relevant without restrictions" (C1), "relevant with restrictions" (C2), or "not relevant" (C3) [2].
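How the two separately reported dimensions might feed a weight-of-evidence decision can be sketched as follows. The combination rule below is an assumption made for illustration, not part of the published CRED method, which reports reliability and relevance separately:

```python
# Assumption for illustration: CRED reports reliability and relevance
# separately; the combination rule here is not part of the published method.

USABLE_RELIABILITY = {"R1", "R2"}  # reliable without / with restrictions
USABLE_RELEVANCE = {"C1", "C2"}    # relevant without / with restrictions

def usable_for_assessment(reliability, relevance):
    """A study carries weight only if it is both reliable and relevant."""
    return reliability in USABLE_RELIABILITY and relevance in USABLE_RELEVANCE

# A well-conducted soil test (R1) is still irrelevant (C3) for an aquatic EQS:
print(usable_for_assessment("R1", "C3"))  # False
print(usable_for_assessment("R2", "C2"))  # True
```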
A rigorous two-phase ring test was conducted to compare the performance of the Klimisch and CRED methods. In this test, 75 risk assessors from 12 countries evaluated a set of eight ecotoxicity studies [2]. The results highlight significant differences in consistency and outcome.
The ring test revealed that the CRED method led to more conservative and differentiated reliability assessments. For example, in one fish toxicity study, the Klimisch method led evaluators to categorize the study as reliable (with or without restrictions), while the CRED method resulted in most evaluators (63%) categorizing it as "not reliable" [5]. Furthermore, the CRED method significantly improved consistency among evaluators. The table below summarizes key quantitative findings from the ring test:
Table 2: Summary of Ring Test Results Comparing Klimisch and CRED Methods [5] [2]
| Metric | Klimisch Method | CRED Evaluation Method | Implication |
|---|---|---|---|
| Average Consensus on Reliability | Lower consistency among evaluators [2]. | Higher consistency among evaluators [2]. | CRED reduces subjective interpretation. |
| Categorization of a Sample Fish Study | 44% "Reliable without restrictions", 56% "Reliable with restrictions" [5]. | 16% "Reliable without restrictions", 21% "Reliable with restrictions", 63% "Not reliable" [5]. | CRED prompts more critical appraisal of study flaws. |
| Perceived Uncertainty in Evaluation | Higher perceived uncertainty [2]. | Lower perceived uncertainty [2]. | Detailed guidance increases assessor confidence. |
| Average Time for Evaluation | Reported as slightly less time-consuming [2]. | Reported as slightly more time-consuming but worthwhile [2]. | CRED's thoroughness requires an initial time investment. |
The methodology of the ring test provides a template for comparative method validation [2].
The following diagram illustrates the logical workflow for evaluating a study using the CRED method, highlighting the distinct but interconnected assessment of reliability and relevance.
CRED Evaluation Workflow: Reliability and Relevance Assessment
Conducting and evaluating high-quality ecotoxicity studies requires standardized materials. The following table lists key reagent solutions and their functions in aquatic testing protocols.
Table 3: Key Research Reagent Solutions for Aquatic Ecotoxicity Testing
| Reagent/Material | Function in Ecotoxicity Testing | Example Use & Importance |
|---|---|---|
| Reconstituted Standardized Test Water | Provides a consistent, defined medium for exposure tests. | Used in OECD guidelines (e.g., 201, 210, 211) to control water chemistry (hardness, pH), ensuring reproducibility and allowing comparison across studies [2]. |
| Reference Toxicant Stock Solution | Serves as a positive control to validate test organism health and sensitivity. | A standard substance (e.g., potassium dichromate for Daphnia) is tested regularly to confirm that control organism responses are within an accepted historical range. |
| Test Substance Vehicle/Solvent | Dissolves or disperses poorly soluble substances for accurate dosing. | Solvents like acetone or dimethyl sulfoxide (DMSO) must be non-toxic at the concentrations used and include a solvent control in the experimental design. |
| Formulated Algal or Zooplankton Feed | Provides nutrition for chronic tests with primary producers or consumers. | Specific algal species (e.g., Pseudokirchneriella subcapitata) or prepared suspensions are essential for reproduction and growth tests (e.g., OECD 211, Daphnia reproduction). |
| Preservative/Fixative Solution | Stabilizes biological samples for endpoint analysis. | Used to fix organisms for later counting (e.g., in algal tests) or to preserve tissue samples for sub-lethal biochemical or histological analysis. |
| Analytical Grade Test Chemical | The substance of interest with known and documented purity. | High and verified purity is critical for defining the exact exposure concentration, especially for substances with impurities that may be toxic. |
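The reference-toxicant check described in the table above is essentially a control-chart test. The following sketch (with invented EC50 values) compares a current result against the laboratory's historical mean plus or minus two standard deviations, a commonly used acceptance band:

```python
import statistics

# Invented EC50 history for a reference toxicant (e.g., potassium dichromate
# with Daphnia); the mean +/- 2 SD band is a commonly used acceptance range.

def within_control_limits(historical_ec50s, current_ec50, n_sd=2):
    """Check a new reference-toxicant result against the historical control band."""
    mean = statistics.mean(historical_ec50s)
    sd = statistics.stdev(historical_ec50s)
    return mean - n_sd * sd <= current_ec50 <= mean + n_sd * sd

history = [1.10, 0.95, 1.05, 1.20, 0.90, 1.00]  # mg/L, invented values
print(within_control_limits(history, 1.08))  # True: organisms respond as expected
print(within_control_limits(history, 1.60))  # False: sensitivity shift, investigate
```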
The regulatory assessment of chemicals hinges on the reliable and transparent evaluation of toxicological data. For decades, the Klimisch method, introduced in 1997, served as the international standard for assessing study reliability, categorizing data into four scores from "reliable without restriction" to "not assignable" [8]. Its structured approach aimed to harmonize evaluations but has faced criticism for lacking detailed criteria, being overly reliant on expert judgment, and potentially favoring industry-funded Good Laboratory Practice (GLP) studies [16] [8].
In response, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) framework was developed to strengthen consistency and transparency [16]. CRED provides a more granular system for evaluating both the reliability (internal scientific validity) and relevance (appropriateness for the assessment question) of aquatic ecotoxicity studies [18]. A pivotal ring test demonstrated that risk assessors perceive CRED as more transparent, accurate, and consistent than the Klimisch method [16]. This guide provides a practical, comparative navigation of the CRED Excel-based tools and documentation process, contextualized within the broader methodological shift from Klimisch to CRED for robust environmental risk assessment.
The core difference between the Klimisch and CRED methods lies in their structure and granularity. Klimisch offers a high-level, score-based categorization, while CRED deconstructs the evaluation into detailed, criteria-driven checklists.
Table 1: Core Comparison of Klimisch and CRED Evaluation Methods
| Feature | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Output | A single reliability score (1-4) [8]. | Separate categorizations for Reliability (R1-R4) and Relevance (C1-C3), plus an overall adequacy judgment [16]. |
| Evaluation Basis | Broad categories centered on compliance with testing guidelines and GLP [10] [8]. | Detailed criteria across multiple domains: test substance characterization, test organism, experimental design, and statistical analysis [16]. |
| Transparency | Low; the rationale for a score is not systematically documented. | High; scoring is based on explicit answers to a standardized checklist. |
| Handling of Relevance | Not formally addressed within the scoring system. | Explicitly evaluated as a core component, independent of reliability. |
| Key Criticism | Lacks detail, subjective, may undervalue non-GLP but scientifically sound studies [16] [8]. | More time-consuming initially due to its comprehensive nature. |
| Regulatory Adoption | Widely embedded in EU systems (e.g., REACH, IUCLID) [10] [8]. | Recommended as a superior replacement; specialized versions (e.g., NanoCRED, EthoCRED) are being developed [18]. |
The comparative performance of CRED and Klimisch was empirically tested in a two-phased international ring test involving 75 risk assessors from 12 countries [16].
3.1 Experimental Protocol
3.2 Results and Performance Data The ring test generated quantitative data on the performance and perception of the two methods.
Table 2: Ring Test Performance Metrics for Klimisch vs. CRED [16]
| Metric | Klimisch Method | CRED Method | Interpretation |
|---|---|---|---|
| Inter-assessor Consistency | Lower | Higher | CRED's detailed criteria reduced subjective interpretation, leading to more uniform evaluations across different experts. |
| Perceived Accuracy | Less accurate | More accurate | Participants felt CRED's structured approach captured study quality more precisely. |
| Perceived Transparency | Less transparent | More transparent | The explicit checklist made the evaluation process and rationale clear and auditable. |
| Dependence on Expert Judgement | High | Lower | CRED's objective questions reduced the burden of personal judgment. |
| Practicality & Time Needed | Faster, but less detailed | Slightly more time, but deemed practical | While CRED took more time initially, assessors considered the investment worthwhile for the improved output quality. |
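Inter-assessor consistency of the kind summarized in Table 2 can be quantified in several ways; one simple option is pairwise percent agreement, the fraction of assessor pairs assigning the same category to a study. The ratings in the sketch below are invented, chosen only to mimic the qualitative ring-test pattern (dispersed Klimisch ratings, clustered CRED ratings):

```python
from itertools import combinations

# Invented ratings chosen to mimic the qualitative ring-test pattern:
# Klimisch ratings scatter across categories, CRED ratings cluster.

def pairwise_agreement(ratings):
    """Fraction of assessor pairs that assigned the same category."""
    pairs = list(combinations(ratings, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

klimisch_ratings = ["R1", "R2", "R1", "R2", "R1", "R3"]  # hypothetical
cred_ratings = ["R3", "R3", "R3", "R2", "R3", "R3"]      # hypothetical

print(round(pairwise_agreement(klimisch_ratings), 2))  # 0.27
print(round(pairwise_agreement(cred_ratings), 2))      # 0.67
```

More sophisticated agreement statistics (e.g., Fleiss' kappa) correct for chance agreement, but the pairwise measure suffices to illustrate the direction of the effect.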
The CRED evaluation is operationalized through a dedicated Excel workbook, which structures the documentation process. The following diagram illustrates the core workflow for evaluating a study using this tool.
CRED Excel Workflow for Study Evaluation
4.1 Initial Setup and Study Input
4.2 Executing the Reliability Evaluation
4.3 Executing the Relevance Evaluation
4.4 Final Integration and Documentation
The transition from Klimisch to CRED represents a fundamental shift from a summary judgment to an analytical, criterion-based process. The following diagram contrasts the logical structure of the two evaluation frameworks.
Structural Comparison of Evaluation Frameworks
Conducting robust data evaluations requires a suite of tools and resources beyond the core scoring sheet.
Table 3: Essential Toolkit for Ecotoxicity Data Evaluation
| Tool/Resource | Function & Purpose | Source/Access |
|---|---|---|
| CRED Excel Workbook | The primary tool for executing the CRED evaluation, containing the checklists and scoring logic [18]. | Available for download from the SciRAP website [18]. |
| ECOTOX Database | A comprehensive, publicly available database of ecotoxicity studies used by the U.S. EPA to screen open literature for risk assessments [7]. | U.S. EPA ECOTOXicology database. |
| ToxRTool (Excel) | An alternative tool that provides detailed questions to guide the assignment of a Klimisch score, improving its consistency [10]. | Developed by ECVAM (EURL ECVAM) [10]. |
| IUCLID Software | The regulatory data management software under the EU's REACH regulation, which has the Klimisch score as a standard data field [10] [8]. | European Chemicals Agency (ECHA). |
| Guideline Documents (OECD, EPA) | Define standard test methods and reporting requirements, serving as the benchmark for reliability criteria [8] [7]. | OECD, U.S. EPA, and other regulatory body websites. |
| Specialized CRED Extensions | Tailored checklists for novel data types: NanoCRED for nanomaterials, EthoCRED for behavioral studies, and CRED for sediment and soil [18]. | Scientific literature and the SciRAP platform [18]. |
The CRED method, with its transparent, criterion-based Excel tool, represents a significant advancement over the Klimisch system for the evaluation of ecotoxicity data. Empirical evidence shows it enhances consistency and reduces subjectivity among assessors [16]. For researchers and regulators, mastering the CRED Excel sheet is not merely an administrative task but a fundamental practice for ensuring the scientific integrity of hazard and risk assessments.
The future of data evaluation is moving towards greater integration and automation. Methods like the Database-Calibrated Assessment Process (DCAP), which automates the derivation of toxicity values from large databases, seek to increase efficiency [14]. Furthermore, the rise of AI-based toxicity prediction models using sources like ToxCast data highlights the growing need for rigorously evaluated, high-quality training data [25]. In this evolving landscape, robust, transparent evaluation frameworks like CRED are more critical than ever. They provide the essential foundation of trust in primary data, which in turn supports the development of reliable next-generation assessment methodologies [25] [14].
The regulatory assessment of chemicals hinges on the quality of underlying ecotoxicity data. For decades, the Klimisch method, established in 1997, has been the standard for evaluating study reliability within frameworks like REACH [8] [4]. While it introduced a systematic approach, this method has faced sustained criticism for its lack of detailed guidance, inconsistency among assessors, and potential bias toward industry-funded Good Laboratory Practice (GLP) studies [16] [11]. This has created a pressing need for a more robust, transparent, and scientifically rigorous framework.
In response, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method was developed [18]. CRED represents a paradigm shift, moving from a simplistic scoring system to a comprehensive evaluation and reporting framework. Its core thesis is that transparent evaluation and detailed reporting are inseparable for improving the reliability and regulatory utility of ecotoxicity studies [1]. This guide provides a comparative analysis for researchers and assessors, detailing how CRED's structured criteria—particularly its 50 reporting requirements—offer a superior path from data generation to defensible regulatory decision-making.
The fundamental differences between the Klimisch and CRED methods are structural and philosophical. The table below summarizes their core characteristics.
Table 1: Foundational Comparison of the Klimisch and CRED Evaluation Methods
| Characteristic | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Purpose | Assign a reliability score for regulatory acceptance [8]. | Evaluate reliability and relevance to strengthen hazard/risk assessment transparency [16] [1]. |
| Evaluation Scope | Reliability only [11]. | Separate, detailed evaluation of Reliability (20 criteria) and Relevance (13 criteria) [1]. |
| Output | A single score (1-4): Reliable without/with restriction, Not reliable, Not assignable [8]. | Qualitative summary of reliability and relevance, identifying specific strengths and weaknesses [11]. |
| Guidance Detail | Limited criteria; high dependence on expert judgment [16]. | Extensive guidance for each criterion to minimize subjectivity [1]. |
| Basis for Evaluation | Heavily weighted toward guideline compliance and GLP [11]. | Detailed assessment of study design, conduct, analysis, and reporting quality, regardless of GLP status [16]. |
| Associated Reporting Aid | None. | 50 reporting criteria across six categories to guide study authors [1]. |
A critical quantitative distinction is the granularity of assessment. The Klimisch method, often implemented via tools like ToxRTool, uses a checklist (e.g., 21 criteria for in vivo studies) to arrive at one of four scores [26]. In contrast, CRED decomposes the evaluation into 33 discrete criteria (20 for reliability, 13 for relevance), each with specific guidance [1]. Furthermore, CRED's forward-looking component includes 50 reporting criteria designed to ensure studies contain all information necessary for a definitive evaluation [1].
The development and validation of the CRED method were supported by a two-phase international ring test involving 75 risk assessors from 12 countries [16] [11].
The ring test generated compelling performance data in favor of the CRED method.
Table 2: Ring Test Results Comparing Klimisch and CRED Method Performance
| Performance Metric | Klimisch Method Outcome | CRED Method Outcome | Interpretation |
|---|---|---|---|
| Inter-assessor Consistency | Lower consistency in reliability categorization among assessors [16]. | Higher consistency in evaluations across different assessors [16]. | CRED's detailed criteria reduce subjective interpretation. |
| Perceived Accuracy | Viewed as less accurate due to overly broad criteria [11]. | 85% of participants rated CRED as "accurate" or "very accurate" [11]. | Detailed guidance improves confidence in evaluation outcomes. |
| Perceived Transparency | Criticized for lack of transparency in scoring decisions [16]. | 88% of participants found CRED to be "transparent" or "very transparent" [11]. | Explicit criteria make the evaluation process auditable. |
| Dependence on Expert Judgement | High dependence, leading to variability [11]. | Perceived as less dependent on individual expert judgement [16]. | Standardization reduces personal bias. |
| Evaluation Time | Generally faster due to simpler criteria [11]. | Slightly more time-consuming, but participants found the time investment "acceptable" given the improved output [11]. | The benefit of thoroughness outweighs the time cost. |
| Handling of Non-Guideline Studies | Tends to downgrade non-guideline or non-GLP studies [11]. | Enables structured, fair evaluation of all studies based on scientific merit [1]. | Promotes inclusion of high-quality peer-reviewed literature. |
The data demonstrate that CRED addresses the core shortcomings of the Klimisch method. By providing a structured, transparent, and detailed framework, it reduces arbitrariness and builds a more reliable foundation for risk assessment decisions [16].
The CRED framework establishes a logical progression from assessing existing studies to guiding the creation of new, high-quality data. The following diagram illustrates this integrated workflow for a risk assessor or study author.
CRED Evaluation and Reporting Cycle
A cornerstone of CRED is its proactive set of 50 reporting criteria, designed to ensure studies are published with sufficient detail for regulatory use [1]. These criteria transform evaluation checkpoints into authoring guidelines, covering six essential categories:
Table 3: Overview of CRED's 50 Reporting Criteria for Ecotoxicity Studies
| Reporting Category | Number of Criteria | Purpose and Key Examples |
|---|---|---|
| 1. General Information | 6 | Ensure traceability and context. Examples: Funding source, declaration of conflicts, precise citation [1]. |
| 2. Test Design | 8 | Document the experimental blueprint. Examples: Clear study objective, test type (acute/chronic), duration, loading rate (for sediment tests) [1]. |
| 3. Test Substance | 7 | Characterize the chemical agent. Examples: Substance identity, source, purity, chemical analysis of exposure concentrations, stability information [1]. |
| 4. Test Organism | 7 | Describe the biological model. Examples: Species name, life stage, source, cultivation conditions, feeding regime, health status [1]. |
| 5. Exposure Conditions | 13 | Detail the test environment. Examples: Test vessel type, media composition, temperature, pH, light, aeration, renewal method for water/sediment [1]. |
| 6. Statistical Design & Biological Response | 9 | Guarantee robust data analysis. Examples: Number of replicates, sample size, randomization procedure, statistical methods, raw data availability, dose-response curves [1]. |
Adherence to these criteria by study authors directly addresses the common shortcomings that lead to studies being categorized as "not reliable" or "not assignable" under older systems. It bridges the gap between academic publishing and regulatory data needs.
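The six-category structure maps naturally onto a machine-checkable manuscript checklist. A minimal sketch, assuming a simple dict of reported items per category; the category keys follow Table 3, but the individual item names are invented for illustration and are not the official 50-item CRED list:

```python
# Hypothetical required-reporting map (item keys are illustrative).
REQUIRED = {
    "general_information": ["funding_source", "conflicts_declared"],
    "test_design": ["objective", "test_type", "duration"],
    "test_substance": ["identity", "purity", "measured_concentrations"],
    "test_organism": ["species", "life_stage", "source"],
    "exposure_conditions": ["temperature", "pH", "renewal_method"],
    "statistics_and_response": ["replicates", "statistical_methods"],
}

def missing_items(reported: dict) -> dict:
    """Return, per category, required items absent from a manuscript."""
    return {
        cat: [item for item in items if item not in reported.get(cat, ())]
        for cat, items in REQUIRED.items()
        if any(item not in reported.get(cat, ()) for item in items)
    }

draft = {
    "general_information": ["funding_source"],   # conflicts not declared
    "test_design": ["objective", "test_type", "duration"],
    "test_substance": ["identity", "purity", "measured_concentrations"],
    "test_organism": ["species", "life_stage", "source"],
    "exposure_conditions": ["temperature", "pH", "renewal_method"],
    "statistics_and_response": ["replicates", "statistical_methods"],
}
print(missing_items(draft))  # {'general_information': ['conflicts_declared']}
```

Run before submission, such a check flags exactly the gaps that would later push an evaluation toward "not assignable".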
Adopting the CRED framework requires specific tools and resources. The following toolkit is essential for researchers conducting studies and assessors evaluating them.
Table 4: Research Reagent Solutions & Essential Tools for CRED Implementation
| Tool / Resource | Type | Primary Function | Access / Notes |
|---|---|---|---|
| CRED Evaluation Excel Tool | Software Tool | Guides the evaluator through all 33 reliability and relevance criteria with embedded guidance, producing a standardized summary [4]. | Freely available for download from project websites [4]. |
| CRED 50 Reporting Checklist | Reporting Template | Serves as a checklist for authors to ensure all necessary methodological and results data are included in a manuscript or supplementary information [1]. | Found within the CRED publications and tools [1]. |
| ToxRTool | Software Tool | An Excel-based tool that implements a criteria checklist to assign a Klimisch score (1, 2, or 3). Useful for understanding the traditional evaluation path [10]. | Developed by ECVAM; available online [10]. |
| OECD Test Guidelines | Reference Standards | Define standardized test methods for the toxicity testing of chemicals. CRED's reliability criteria are aligned with these guidelines [11]. | Essential reference for designing studies and evaluating guideline compliance. |
| NanoCRED & EthoCRED | Specialized Frameworks | Domain-specific adaptations of CRED for evaluating ecotoxicity studies of nanomaterials and behavioral endpoints, respectively [18]. | Extend the CRED principles to emerging and complex endpoints. |
| GLP Principles | Quality System | A set of rules for non-clinical safety studies. While CRED does not require GLP, understanding its principles is important for evaluating study conduct [11]. | Provides a benchmark for data recording and study management practices. |
The evidence from comparative ring tests and methodological analysis clearly supports CRED as a scientifically superior successor to the Klimisch method. By decoupling and deeply evaluating reliability and relevance, CRED introduces necessary rigor and transparency into chemical safety assessments [16]. For study authors, the integrated 50 reporting criteria provide a clear roadmap for producing work that meets both high scientific and regulatory standards, maximizing its impact and utility.
The CRED framework is not static; it is evolving to meet new challenges. Specialized versions like NanoCRED (for nanomaterials) and EthoCRED (for behavioral ecotoxicity) demonstrate its adaptability [18]. Furthermore, CRED is being piloted in the revision of major regulatory guidance documents, such as the EU's Technical Guidance Document for Environmental Quality Standards, signaling its growing acceptance as a new best practice [4]. For the scientific and regulatory community, adopting CRED represents a decisive step toward more consistent, transparent, and scientifically defensible environmental risk assessments.
The regulatory assessment of chemicals relies heavily on high-quality ecotoxicity data to determine Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQS) [1]. For decades, the primary tool for evaluating the reliability of such data has been the Klimisch method, established in 1997 [16]. This method assigns studies to one of four categories: "Reliable without restrictions," "Reliable with restrictions," "Not reliable," or "Not assignable" [8]. While it standardized evaluations, the Klimisch method has been criticized for its lack of detailed guidance, its potential bias toward industry-funded Good Laboratory Practice (GLP) studies, and its failure to ensure consistent judgments among different risk assessors [16] [11].
To address these shortcomings, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method was developed. CRED provides a more transparent, detailed, and structured framework for evaluating both the reliability and relevance of aquatic ecotoxicity studies [1]. This walkthrough compares both methods within the context of a broader thesis on modernizing ecotoxicity data evaluation, highlighting how CRED's systematic approach enhances consistency and decision-making in environmental risk assessment.
The core difference between the methods lies in their structure and granularity. The following table summarizes their foundational characteristics.
Table 1: Foundational Characteristics of the Klimisch and CRED Evaluation Methods
| Characteristic | Klimisch Method | CRED Method |
|---|---|---|
| Primary Focus | Reliability of toxicological & ecotoxicological data [8]. | Reliability and relevance of aquatic ecotoxicity data [16] [1]. |
| Number of Criteria | 12-14 reliability criteria for ecotoxicity studies [11]. | 20 reliability and 13 relevance criteria [1] [4]. |
| Guidance Detail | Limited, high-level descriptions [16]. | Extensive guidance for each criterion [1]. |
| Evaluation Output | Single reliability score (1-4) [10]. | Separate, qualitative summaries for reliability and relevance [11]. |
| Basis | General scientific and guideline acceptance [8]. | OECD test guidelines and comprehensive reporting standards [11]. |
The Klimisch method is a high-level screening tool. An evaluator assesses a study against broad descriptors to assign one of four scores: 1 (reliable without restriction), 2 (reliable with restrictions), 3 (not reliable), or 4 (not assignable) [10] [8].
Its application often depends on expert judgment, leading to potential inconsistency [11].
CRED evaluation is a detailed, two-part process requiring separate assessments for reliability and relevance [1].
A key conceptual advancement in CRED is the clear separation between reliability (inherent quality) and relevance (fitness for purpose). A study can be highly reliable but irrelevant for a specific assessment, and vice-versa [1].
A direct comparison was made possible through a two-phase international ring test involving 75 risk assessors from 12 countries [16] [11].
The ring test yielded clear, quantitative differences in consistency and user perception.
Table 2: Ring Test Results on Evaluator Agreement and Perception
| Evaluation Aspect | Klimisch Method Outcome | CRED Method Outcome | Implication |
|---|---|---|---|
| Agreement on Reliability Category | Low consensus among evaluators [11]. | Higher consensus among evaluators [16] [11]. | CRED's detailed criteria reduce subjective interpretation. |
| Perceived Consistency | Rated lower by participants [16]. | 85% of participants found it more consistent [16]. | Improves harmonization across regulators and regions. |
| Perceived Transparency | Rated lower by participants [16]. | 90% of participants found it more transparent [16]. | Makes the basis for regulatory decisions clearer. |
| Dependence on Expert Judgement | High [11]. | Reduced dependence [16]. | Limits bias and promotes objective, criteria-driven assessment. |
| Time Required | Perceived as quicker but less thorough. | Perceived as slightly more time-consuming but more accurate [16]. | Investment in time yields higher-quality, defensible evaluations. |
The data also showed a correlation between the percentage of fulfilled CRED criteria and the final reliability category assigned, demonstrating the method's internal logic.
Table 3: Percentage of Fulfilled CRED Criteria per Assigned Category [6]
| CRED Reliability Category | Mean % of Criteria Fulfilled | Standard Deviation (%) | Range (Min-Max) |
|---|---|---|---|
| Reliable without restrictions | 93% | 12 | 79% - 100% |
| Reliable with restrictions | 72% | 12 | 47% - 90% |
| Not reliable | 60% | 15 | 21% - 90% |
| Not assignable | 51% | 15 | 21% - 64% |
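The means in Table 3 can anchor a quick plausibility check on an evaluation. The nearest-mean lookup below is purely an illustration built from the ring-test data; CRED itself assigns categories through criterion-by-criterion expert review, not a numeric cutoff:

```python
# Ring-test mean fraction of fulfilled criteria per category (Table 3).
# Nearest-mean classification is a sketch, not part of the CRED method.
CATEGORY_MEANS = {
    "Reliable without restrictions": 0.93,
    "Reliable with restrictions": 0.72,
    "Not reliable": 0.60,
    "Not assignable": 0.51,
}

def nearest_category(frac_fulfilled: float) -> str:
    """Category whose ring-test mean is closest to the observed fraction."""
    return min(CATEGORY_MEANS,
               key=lambda c: abs(CATEGORY_MEANS[c] - frac_fulfilled))

print(nearest_category(0.95))  # Reliable without restrictions
print(nearest_category(0.55))  # Not assignable (0.55 is nearer 0.51 than 0.60)
```

Note the wide, overlapping ranges in Table 3: a fraction alone cannot decide the category, which is exactly why CRED documents which criteria failed rather than reporting only a percentage.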
For researchers and assessors, applying CRED involves specific tools and considerations.
Table 4: Essential Research Reagent Solutions & Tools for CRED Evaluation
| Tool / Resource | Function / Purpose | Source / Availability |
|---|---|---|
| CRED Excel Evaluation Tool | Structured spreadsheet to guide evaluators through all 33 reliability and relevance criteria, ensuring no criterion is overlooked. | Freely available for download [4]. |
| CRED Reporting Template | A checklist of 50 reporting criteria across 6 categories (general info, test design, substance, organism, exposure, statistics) to guide study authors. | Provided in project publications [1]. |
| OECD Test Guidelines | Foundational reference for acceptable test methodologies (e.g., OECD 210: Fish Early-Life Stage Test). Essential for assessing protocol adherence. | OECD website. |
| NanoCRED & EthoCRED | Specialized adaptations of CRED for evaluating ecotoxicity studies of nanomaterials and studies measuring behavioral endpoints, respectively [18]. | Scientific publications [18]. |
| Purpose Statement Framework | Guided process to define the specific regulatory assessment goal before evaluating relevance. A core, mandatory step in the CRED process. | Detailed in CRED guidance [1]. |
The case study data demonstrates that CRED provides a more robust, consistent, and transparent framework than the Klimisch method for evaluating aquatic ecotoxicity studies. Its structured approach reduces arbitrariness, and its explicit relevance assessment ensures data is fit for purpose. CRED is gaining regulatory traction, being piloted in the revision of EU EQS guidance and used in databases like NORMAN EMPODAT [4].
The principles of CRED are also expanding into new areas, with specialized adaptations such as NanoCRED for nanomaterials and EthoCRED for behavioral endpoints demonstrating the framework's adaptability [18].
For researchers, using the CRED reporting criteria when designing and publishing studies maximizes the likelihood their work will be deemed reliable and relevant for regulatory use [1]. For risk assessors, adopting the CRED evaluation method mitigates the inconsistencies of the past and leads to more harmonized, defensible, and scientifically rigorous environmental hazard and risk assessments [16].
The evaluation of ecotoxicity data is a cornerstone of environmental hazard and risk assessment, informing the derivation of safe chemical concentrations such as Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) [23]. For decades, the Klimisch method has been the default framework for assessing data reliability [8]. However, its reliance on broad categories and lack of detailed guidance have been criticized for introducing subjectivity and inconsistency among assessors [11]. In response, the CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) method was developed to provide a more transparent, structured, and comprehensive tool [23]. This guide objectively compares these two pivotal methods, focusing on their performance in managing evaluator subjectivity and distinguishing the separate concepts of reliability and relevance, supported by experimental data from a major ring test [2].
The foundational difference between the Klimisch and CRED methods lies in their structure, scope, and granularity. The Klimisch method, introduced in 1997, is a scoring system that categorizes studies into four reliability tiers [8]. It primarily emphasizes compliance with standardized test guidelines (like OECD methods) and Good Laboratory Practice (GLP) [10]. While revolutionary for its time, it offers minimal explicit criteria for making these categorical judgments and provides no framework for evaluating the relevance of a study to a specific assessment question [11].
In contrast, the CRED method, finalized in 2016, is built on a detailed checklist approach. It decomposes the evaluation process into 20 specific reliability criteria and 13 specific relevance criteria, each accompanied by extensive guidance text [23] [1]. This design explicitly separates the assessment of a study's intrinsic scientific quality (reliability) from its applicability to a particular regulatory context (relevance) [1]. Furthermore, CRED includes 50 reporting recommendations to guide researchers in generating studies that are easier to evaluate reliably [23].
Table 1: Core Characteristics of the Klimisch and CRED Evaluation Methods
| Characteristic | Klimisch Method | CRED Method |
|---|---|---|
| Primary Focus | Reliability of data for regulatory use [8]. | Reliability and relevance of aquatic ecotoxicity studies [23]. |
| Number of Criteria | 4 broad categories (scores 1-4) [8]. | 20 reliability criteria + 13 relevance criteria [23]. |
| Guidance Detail | Minimal; heavily dependent on expert judgment [11]. | Extensive guidance for each criterion [2]. |
| Relevance Evaluation | Not formally addressed [11]. | Structured evaluation with defined categories (C1-C3) [2]. |
| Basis for Evaluation | Adherence to guidelines (OECD, GLP) is paramount [8] [10]. | Detailed assessment of test design, performance, and analysis [1]. |
| Output | Single reliability score (1, 2, 3, or 4) [8]. | Separate reliability and relevance summaries, with documented rationale for each criterion [1]. |
A definitive two-phase international ring test was conducted to compare the performance of the Klimisch and CRED methods empirically [11] [2].
Experimental Protocol: In two phases, 75 risk assessors from 12 countries independently evaluated the same set of aquatic ecotoxicity studies using both the Klimisch and CRED methods, and subsequently reported on their experience via questionnaires [11] [2].
Key Results on Consistency and Subjectivity: The ring test data revealed significant differences in assessor agreement. A clear example was the evaluation of a GLP study on fish toxicity (Study E). Using the Klimisch method, assessors' ratings were split between "reliable without restrictions" (44%) and "reliable with restrictions" (56%) [5]. With the CRED method, the assessment was more critical and more dispersed: only 16% found it "reliable without restrictions," 21% "reliable with restrictions," and a majority of 63% categorized it as "not reliable" [5]. This demonstrates that CRED's detailed criteria can prevent the automatic acceptance of GLP studies and uncover methodological flaws, leading to more varied but potentially more accurate evaluations.
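How concentrated the assessors' verdicts were on Study E can be summarized with a simple dispersion measure. Normalized Shannon entropy (lower = verdicts cluster more on one category) was not a statistic used in the ring test; it is applied here only as a back-of-envelope check on the reported splits:

```python
import math

def normalized_entropy(shares: list[float]) -> float:
    """Shannon entropy of a category distribution, scaled to [0, 1]."""
    h = -sum(p * math.log(p) for p in shares if p > 0)
    return h / math.log(len(shares))

klimisch_study_e = [0.44, 0.56]        # R1 / R2 split under Klimisch
cred_study_e = [0.16, 0.21, 0.63]      # R1 / R2 / R3 split under CRED

print(round(normalized_entropy(klimisch_study_e), 2))  # 0.99
print(round(normalized_entropy(cred_study_e), 2))      # 0.83
```

Although the CRED verdicts span three categories, a clear majority (63%) converges on a single one, so the distribution is actually more concentrated than the near-even Klimisch split.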
Overall, the quantitative analysis showed that the mean percentage of criteria fulfilled consistently aligned with the final reliability category assigned by CRED. Studies rated "reliable without restrictions" met an average of 93% of criteria, while those rated "not reliable" met only 60% on average [6]. This strong correlation validates CRED's internal logic and provides transparency, as the final category is directly traceable to specific, documented shortcomings [6].
Table 2: Selected Ring Test Results Comparing Assessor Consistency
| Study | Key Characteristics | Klimisch Method Results (Distribution) | CRED Method Results (Distribution) |
|---|---|---|---|
| Study E [5] | GLP report on fish chronic toxicity. | R1: 44%, R2: 56% | R1: 16%, R2: 21%, R3: 63% |
| Study D [2] | Peer-reviewed study on algal toxicity. | R1: 20%, R2: 60%, R3: 20% | R1: 0%, R2: 92%, R3: 8% |
| General Trend | Across all studies. | Higher rate of R1/R2 assignments; greater assessor disagreement. | More critical; final category strongly correlates with % of fulfilled criteria [6]. |
Participant Perception: Post-test questionnaires indicated that a majority of risk assessors found the CRED method to be more accurate, consistent, and transparent than the Klimisch method [2]. They perceived it as less dependent on subjective expert judgment and considered its detailed criteria practical for use, despite a slightly higher initial time investment [2].
The following reagents and materials are fundamental to conducting standardized aquatic ecotoxicity tests, which form the basis for studies evaluated by both Klimisch and CRED methods.
Table 3: Key Reagents and Materials for Aquatic Ecotoxicity Testing
| Reagent/Material | Function in Ecotoxicity Testing |
|---|---|
| Reconstituted Standardized Test Water (e.g., ISO, OECD recipes) | Provides a consistent, defined medium for exposing aquatic organisms, ensuring reproducibility and minimizing background variability. |
| Reference Toxicants (e.g., Potassium dichromate, Sodium chloride) | Used in periodic positive control tests to verify the health and sensitivity of test organism populations. |
| Culture Media for Algae/Cyanobacteria (e.g., OECD TG 201 medium) | Supplies essential nutrients for sustained growth of phytoplankton test species in chronic tests. |
| Formulated Sediment | Provides a standardized substrate for testing benthic organisms (e.g., Chironomus riparius) in sediment toxicity tests. |
| Analytical Grade Test Substance | Essential for preparing accurate stock and test solutions; requires verification of purity and concentration (e.g., via HPLC, GC-MS). |
| Solvents/Carriers (e.g., Acetone, Dimethyl sulfoxide) | Used to dissolve poorly water-soluble test substances. Must be non-toxic to test organisms at the concentrations used. |
Comparison of Ecotoxicity Study Evaluation Workflows
Experimental Design of the CRED Validation Ring Test
Within regulatory ecotoxicology, the robust evaluation of available data is fundamental for environmental hazard and risk assessment. The classification of a study as "not assignable"—indicating insufficient experimental detail for a proper reliability judgment—poses a significant challenge, often leading to the exclusion of potentially valuable information [8]. For decades, the Klimisch method (1997) has been the benchmark for this task, but its broad categories and reliance on expert judgment have been criticized for a lack of transparency and consistency [11]. In response, the Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) method was developed to provide a more detailed, structured, and transparent framework [1]. This guide provides a comparative analysis of these two pivotal methodologies, focusing on their efficacy in handling incomplete or poorly reported studies, supported by experimental data and practical tools.
The core distinction between the Klimisch and CRED methods lies in their structure, granularity, and scope. The following table summarizes their foundational characteristics.
Table 1: Foundational Characteristics of the Klimisch and CRED Evaluation Methods
| Characteristic | Klimisch Method | CRED Method |
|---|---|---|
| Primary Focus | Reliability of toxicological and ecotoxicological data [8]. | Reliability and relevance of aquatic ecotoxicity data [11]. |
| Number of Criteria | 12-14 reliability criteria for ecotoxicity [11]. | 20 reliability and 13 relevance criteria [1]. |
| Guidance Provided | Minimal; heavily dependent on expert judgment [11]. | Extensive guidance for each criterion to improve consistency [1]. |
| Evaluation Output | Single, composite reliability score (1-4) [8]. | Separate, qualitative summaries for reliability and relevance [11]. |
| Handling of 'Not Assignable' | A defined category (#4) for studies with insufficient experimental details [8]. | Detailed criteria identify specific missing information, reducing unclassifiable outcomes [11] [1]. |
| Bias Toward Guideline Studies | Strong; explicitly favors GLP and OECD guideline studies [11] [8]. | Mitigated; evaluates study based on detailed scientific criteria, not just compliance [1]. |
The "not assignable" category in the Klimisch system is a terminal classification for studies lacking sufficient detail, often found in short abstracts or secondary literature [8]. While these studies can be used as supporting information, their exclusion from primary analysis can impact risk assessments, especially for data-poor substances.
The CRED method fundamentally reframes this problem. Instead of a single "unusable" bucket, its 20 reliability criteria systematically dissect a study's design, performance, and reporting [1]. A study missing critical information fails specific criteria, but other well-reported aspects can still be evaluated. This granularity allows assessors to determine if a study is "reliable with restrictions" based on identified flaws, rather than wholly "not assignable." Furthermore, the companion CRED reporting recommendations (50 criteria across six categories) provide a proactive template for researchers to ensure all essential information is reported, preventing studies from becoming "not assignable" in the first place [1].
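This reframing can be caricatured as two evaluation functions over the same partially reported study. The sketch below is deliberately simplified (the criterion labels are invented, and real Klimisch scoring involves expert judgment, not a one-line rule); it illustrates only the structural difference between a terminal verdict and named findings:

```python
# A partially reported study: True = criterion met, None = not reported.
study = {
    "species_identified": True,
    "measured_concentrations": None,   # not reported
    "replicates_stated": True,
    "controls_described": None,        # not reported
    "statistics_appropriate": True,
}

def klimisch_style(study: dict) -> str:
    # Caricature: any missing critical detail collapses to category 4.
    if any(v is None for v in study.values()):
        return "4 (not assignable)"
    return "1 or 2 (reliable)"

def cred_style(study: dict) -> list[str]:
    # Each criterion is judged on its own; gaps become named findings.
    return sorted(k for k, v in study.items() if v is None)

print(klimisch_style(study))  # 4 (not assignable)
print(cred_style(study))      # ['controls_described', 'measured_concentrations']
```

The second function's output is actionable: an assessor can weigh the specific gaps, and an author can supply exactly the missing information.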
A two-phase international ring test was conducted to compare the methods objectively [11] [1].
Ring Test Experimental Protocol: Across two phases, 75 risk assessors from 12 countries evaluated a common set of ecotoxicity studies with both methods and then rated each method's accuracy, consistency, transparency, and time demands via questionnaire [11] [1].
Key Quantitative Results: The ring test data demonstrated CRED's advantages in handling study quality and consistency.
Table 2: Key Quantitative Outcomes from the CRED vs. Klimisch Ring Test [11]
| Outcome Measure | Klimisch Method Result | CRED Method Result | Implication |
|---|---|---|---|
| Agreement on Reliability Category | Lower inter-assessor consistency. | Higher inter-assessor consistency. | CRED produces more reproducible evaluations. |
| Perceived Accuracy | 64% of participants rated it as "accurate" or "very accurate." | 88% of participants rated it as "accurate" or "very accurate." | CRED is perceived as more scientifically rigorous. |
| Perceived Consistency | 49% rated it as "consistent" or "very consistent." | 85% rated it as "consistent" or "very consistent." | Detailed criteria reduce subjective judgment. |
| Time Required for Evaluation | Reported as generally shorter. | Initially longer, but considered a worthwhile investment for the clarity gained. | CRED's detail requires more upfront time but reduces downstream uncertainty. |
The fundamental difference in approach between the two methods is illustrated in their evaluation pathways.
Diagram 1: A comparison of the evaluation workflows for the Klimisch and CRED methods.
Successful application of these evaluation frameworks is supported by specific tools and resources.
Table 3: Research Reagent Solutions for Ecotoxicity Data Evaluation
| Tool / Resource | Primary Function | Key Utility in Handling Incomplete Data |
|---|---|---|
| CRED Excel Workbook [18] | Provides a structured template to apply the 20 reliability and 13 relevance criteria. | Guides assessors to identify exactly which information is missing or inadequate, moving beyond a simple "not assignable" label. |
| CRED Reporting Recommendations [1] | A checklist of 50 criteria across six categories (e.g., test design, exposure conditions) for reporting studies. | Proactive solution. Enables researchers to generate complete, regulatorily-useful data, preventing "not assignable" classifications. |
| ToxRTool [8] | An assistive tool developed to standardize the process of assigning a Klimisch score. | Helps ensure more consistent application of the Klimisch criteria among different assessors. |
| ADORE Benchmark Dataset [28] | A curated machine-learning dataset of acute aquatic toxicity data. | Serves as a reference for well-structured, documented data; useful for training and validating evaluation approaches. |
| OECD No. 54 (Under Revision) [29] | Provides international guidance on the statistical analysis of ecotoxicity data. | Updated statistical guidance helps assessors properly evaluate the validity of data analysis, a key reliability criterion. |
The CRED methodology has proven adaptable, evolving into specialized tools for emerging challenges and other environmental compartments, demonstrating its utility as a core framework.
Diagram 2: The evolution of the CRED framework into specialized and related evaluation tools.
Based on the comparative analysis, specific strategies can be formulated for different professional roles to improve the handling of data quality issues.
Table 4: Strategic Recommendations for Handling Data Evaluation Challenges
| Role | Primary Challenge | Recommended Strategy | Rationale & Expected Outcome |
|---|---|---|---|
| Risk Assessor / Regulator | Inconsistent interpretation of "not assignable" or "reliable with restrictions" studies. | Adopt the CRED evaluation method as the primary tool for critical study appraisal [11] [1]. | The detailed criteria and guidance minimize subjective judgment, leading to more transparent, consistent, and defensible evaluations. |
| Researcher / Study Author | Conducting valuable research that is later deemed "not assignable" for regulatory use due to incomplete reporting. | Use the CRED reporting recommendations as a checklist when designing, conducting, and publishing ecotoxicity studies [1]. | Proactively ensures all necessary information is reported, maximizing the regulatory utility and impact of published research. |
| Data Curator / Database Manager | Integrating studies of highly variable reporting quality into usable databases. | Structure database fields to capture the metadata specified by CRED and CREED [27]. | Facilitates efficient filtering and evaluation of data by end-users and supports the application of structured evaluation frameworks. |
| All Professionals | Navigating the transition from the established Klimisch method to more detailed frameworks. | For legacy Klimisch evaluations, document the specific reasons for a score (especially 3 or 4). For new assessments, pilot CRED on a subset of studies to build proficiency. | Creates an audit trail for past decisions and smooths the learning curve for more robust future evaluations. |
The "not assignable" classification represents a significant hurdle in ecological risk assessment, potentially sidelining scientifically informative data. The Klimisch method, while historically valuable, addresses this with a blunt categorical tool that relies heavily on expert judgment. In contrast, the CRED method provides a more sophisticated toolkit: its evaluation criteria diagnose specific reporting deficiencies, and its reporting recommendations prevent those deficiencies from occurring. Empirical evidence from the ring test confirms that CRED offers superior consistency, transparency, and accuracy [11]. The subsequent development of specialized CRED tools for nanomaterials, behavioral endpoints, and soil/sediment studies underscores its robustness and adaptability as the contemporary standard [18]. For researchers and assessors aiming to maximize the utility of all available data while ensuring scientific rigor, adopting the CRED framework is the most effective strategy for handling incomplete data and advancing the field of regulatory ecotoxicology.
In ecotoxicity data evaluation, professionals face a critical dilemma: the imperative for thorough, transparent assessments versus the constant pressure of project deadlines. For decades, the Klimisch method has served as the regulatory backbone for evaluating study reliability [11]. However, its reliance on expert judgment and lack of detailed guidance have led to inconsistencies, where the same study might be categorized differently by separate assessors [11]. This inconsistency can directly impact hazard assessments, potentially leading to underestimated environmental risks or unnecessary mitigation measures [11].
The Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) method was developed to address these shortcomings by providing a more structured, transparent, and detailed framework [16] [23]. While CRED enhances the robustness and harmonization of assessments, its comprehensive nature raises questions about practical time investment within tight project schedules [16]. This guide objectively compares both methods, providing researchers and drug development professionals with the experimental data and workflow insights needed to make informed decisions that balance scientific rigor with project management realities.
The core structural differences between the Klimisch and CRED methods define their application, thoroughness, and time requirements. The following table summarizes their key characteristics.
Table 1: Foundational Characteristics of the Klimisch and CRED Evaluation Methods [11] [23] [1]
| Characteristic | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Scope | General toxicity and ecotoxicity data [8]. | Focus on aquatic ecotoxicity studies (with adaptations for nano, soil, sediment, and behavioral studies) [23] [18]. |
| Evaluation Dimensions | Reliability only [11]. | Reliability and Relevance evaluated independently [23]. |
| Number of Criteria | 12-14 criteria for reliability (ecotoxicity) [11]. | 20 reliability criteria and 13 relevance criteria [23]. |
| Guidance & Transparency | Minimal guidance; high dependence on expert judgement [11]. | Extensive guidance provided for each criterion; promotes transparency and reduces subjectivity [16]. |
| Bias Toward GLP Studies | Strongly favors studies conducted under Good Laboratory Practice (GLP) or standardized guidelines [11] [8]. | Provides a balanced evaluation of guideline and peer-reviewed non-GLP studies based on scientific merit [11]. |
| Outcome | A single reliability score (1-4) [8]. | Qualitative summary for both reliability and relevance, supporting a weight-of-evidence approach [11]. |
A major two-phased international ring test provided direct comparative data on the performance and perception of both methods [16] [11]. Seventy-five risk assessors from 12 countries, representing industry, academia, consultancy, and government, evaluated a set of eight aquatic ecotoxicity studies [11]. The results clearly favored the CRED approach.
Table 2: Ring Test Results Comparing Method Performance and Perception (75 Participants) [16] [11]
| Evaluation Aspect | Klimisch Method | CRED Method | Implication for Risk Assessment |
|---|---|---|---|
| Consistency Among Assessors | Low. High variability in scoring the same study [11]. | High. Improved alignment in evaluations [16]. | Reduces arbitrariness, leading to more harmonized regulatory decisions across regions [16]. |
| Perceived Accuracy | Moderately accurate. | Rated as more accurate by participants [16]. | Increases confidence in the derived data set for PNEC/EQS derivation [23]. |
| Handling of Non-GLP Studies | Often downgraded or excluded [11]. | Enables structured integration based on scientific quality [11]. | Broadens the usable data pool, potentially reducing animal testing and saving resources [11]. |
| Transparency & Documentation | Low. Reasoning for scores is often not documented [11]. | High. Criteria-based evaluation forces clear documentation [23]. | Creates an auditable trail, facilitating peer review and regulatory acceptance [1]. |
| Time Investment (Per Study) | Lower (approx. 15-30 min). Due to fewer criteria and reliance on "expert judgment" [16]. | Higher (approx. 40-60 min). Due to comprehensive criteria assessment [16]. | Initial time cost is offset by reduced need for re-evaluation and clearer justification [16]. |
The definitive comparison between the Klimisch and CRED methods stems from a rigorously designed international ring test [11]. The protocol below details how this validation was conducted.
Objective: To compare the categorization, consistency, and practicality of the Klimisch and CRED evaluation methods and to gather user perception data [11].
Phase I (Klimisch Evaluation): Each assessor evaluated two of the eight studies using the Klimisch method [11].
Phase II (CRED Evaluation): Each assessor then evaluated two different studies using the draft CRED method, ensuring no study was assessed by the same person under both methods [11].
Participant Profile: The 75 participants were risk assessors with varied expertise, with the majority having over five years of experience in study evaluation [1]. They represented a global cross-section of stakeholders from 12 countries [16].
Data Analysis: Consistency was measured by comparing categorization outcomes for the same study across different assessors. Participant feedback on both methods' accuracy, ease of use, and time requirements was collected via questionnaire [16] [11].
Integrating a thorough method like CRED into project timelines requires strategic planning. The following workflow and strategies illustrate how to manage this process efficiently.
Diagram: A Tiered Workflow for Efficient Study Evaluation [30]
Key Time Management Strategies:
Adopt a Tiered Approach: Not all studies require the same level of scrutiny. Implement the workflow above [30]. Use a quick Klimisch-based triage to filter out clearly unreliable or non-relevant studies. Reserve the full CRED evaluation for studies that are critical for your assessment endpoint (e.g., key studies for deriving a PNEC) [30]. This aligns with modern proposals for integrating reliability assessment into ecological risk assessment [30].
Parallelize Evaluation Tasks: For large projects, evaluation work can be distributed among team members. The structured, criteria-based nature of CRED minimizes subjectivity, making it more suitable for parallel work followed by calibration discussions than the highly judgmental Klimisch method [16].
Leverage Reporting Recommendations Proactively: The CRED project includes 50 reporting criteria designed to guide researchers in publishing more robust and regulatory-ready studies [23]. Promoting these standards within your organization or to collaborators can reduce the time spent retrospectively seeking clarifications or downgrading studies for poor reporting.
Factor in Time Savings from Reduced Re-Evaluation: While CRED takes longer per study initially, its transparency and consistency reduce the likelihood of contentious re-evaluations during regulatory review or peer consultation, avoiding significant project delays later [16] [4].
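The tiered strategy above can be sketched as a short triage routine. The field names, Klimisch score coding, and routing labels below are illustrative assumptions, not part of either method:

```python
# Illustrative sketch of a tiered evaluation triage (assumed field names and
# routing labels; not prescribed by Klimisch or CRED).

def triage(study: dict) -> str:
    """Decide how much evaluation effort a study warrants."""
    # Quick Klimisch-style screen: filter out clearly unusable studies early.
    if study["klimisch_score"] == 3:        # "not reliable"
        return "exclude"
    if not study["relevant_endpoint"]:      # wrong endpoint for this assessment
        return "exclude"
    # Reserve the full CRED evaluation for key studies (e.g., PNEC drivers).
    if study["key_study"]:
        return "full CRED evaluation"
    return "supporting evidence (brief review)"

studies = [
    {"klimisch_score": 3, "relevant_endpoint": True,  "key_study": False},
    {"klimisch_score": 2, "relevant_endpoint": True,  "key_study": True},
    {"klimisch_score": 2, "relevant_endpoint": False, "key_study": False},
]
print([triage(s) for s in studies])
# → ['exclude', 'full CRED evaluation', 'exclude']
```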
A high-quality ecotoxicity evaluation relies on both methodological frameworks and concrete scientific tools. The following table details key resources referenced in the CRED criteria and broader evaluation context.
Table 3: Key Research Reagent Solutions and Essential Materials for Ecotoxicity Evaluation
| Item / Solution | Function in Evaluation | Relevance to CRED/Klimisch |
|---|---|---|
| OECD Test Guidelines (e.g., OECD TG 201, 210, 211) | Provide internationally standardized protocols for testing chemicals on algae, daphnia, and fish [11]. | Form the primary benchmark for evaluating test design reliability in both methods. CRED checks alignment with all relevant guideline points [11]. |
| Good Laboratory Practice (GLP) | A quality system covering the organizational process and conditions for non-clinical safety studies [11]. | Klimisch heavily favors GLP studies [8]. CRED evaluates GLP as one aspect among many, focusing on scientific conduct over compliance alone [11]. |
| Reference/Control Substances (e.g., K₂Cr₂O₇ for Daphnia) | Used to confirm that the sensitivity and health of test organisms are within acceptable historical ranges [30]. | Critical for assessing test validity (a key reliability criterion in CRED). Poor control performance can invalidate a study [30]. |
| Analytical Grade Test Substance | A substance of known identity, purity, and stability is essential for defining exposure concentrations [23]. | CRED's reliability criteria specifically evaluate the reporting of test substance characterization (purity, formulation), which is often lacking in peer-reviewed studies [23] [1]. |
| Appropriate Statistical Software | Necessary for calculating endpoints (EC50, NOEC) and verifying the correctness of reported statistical analyses [23]. | CRED's reliability assessment includes an evaluation of whether statistical methods are clearly described and appropriately applied [23]. |
| CRED Evaluation Excel Tool | A freely available spreadsheet that guides the assessor through all 33 criteria with embedded guidance [4]. | The primary implementation tool for the CRED method, standardizing the evaluation process and documentation [4]. |
The choice between the Klimisch and CRED methods is fundamentally a trade-off between expediency and comprehensive reliability. For rapid screening or when evaluating clearly guideline-compliant GLP studies, the Klimisch method offers speed. However, for building defensible, transparent, and scientifically robust risk assessments—particularly when incorporating data from the peer-reviewed literature—the CRED method is superior in reducing bias and inconsistency [16] [11].
The future of the field points toward the adoption and adaptation of structured frameworks like CRED. Its principles are already being extended into specialized areas through NanoCRED (for nanomaterials), EthoCRED (for behavioral studies), and CRED for sediment and soil [18]. Furthermore, integration with systematic review principles is being explored to further objectify the evaluation process [30].
Effective time management, therefore, does not mean choosing the fastest method, but rather strategically applying the most appropriate level of scrutiny. By using a tiered workflow, promoting better reporting, and leveraging the CRED tool, researchers and assessors can deliver high-quality, transparent evaluations that meet scientific and regulatory standards without compromising project timelines.
This guide provides a comparative analysis of the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) methodology against the established Klimisch method. Framed within the broader thesis of advancing ecotoxicity data evaluation, we objectively compare their performance, integration with major regulatory frameworks, and supporting experimental data to inform researchers and regulatory professionals.
The Klimisch method, established in 1997, has been a cornerstone for evaluating the reliability of ecotoxicity studies for regulatory hazard and risk assessment [11]. However, it has been criticized for its lack of detail, insufficient guidance, and failure to ensure consistency among assessors [11]. In response, the CRED evaluation method was developed to strengthen the transparency, consistency, and robustness of these assessments [11] [1].
The table below summarizes the fundamental differences between the two approaches.
Table 1: Core Characteristics of the Klimisch and CRED Evaluation Methods [11]
| Characteristic | Klimisch Method | CRED Evaluation Method |
|---|---|---|
| Primary Scope | Toxicity and ecotoxicity data | Aquatic ecotoxicity studies |
| Evaluation Dimensions | Reliability only | Reliability and Relevance |
| Number of Reliability Criteria | 12-14 for ecotoxicity | 20 criteria |
| Number of Relevance Criteria | 0 | 13 criteria |
| Alignment with OECD Reporting | Includes 14 of 37 OECD criteria | Includes all 37 OECD criteria |
| Guidance Provided | Limited, high-level | Detailed, extensive guidance for each criterion |
| Result Summary | Qualitative reliability categorization (e.g., "reliable without restrictions") | Qualitative summary for both reliability and relevance |
| Inherent Bias | Favors GLP (Good Laboratory Practice) and standard guideline studies [11] | Designed to evaluate all studies based on scientific merit, promoting use of peer-reviewed literature [11] |
A key advancement of CRED is its formal separation and detailed evaluation of reliability (the intrinsic scientific quality of a study) and relevance (the appropriateness of the study for a specific hazard identification or risk assessment purpose) [1]. This separation is crucial, as a reliable study may not be relevant for a particular assessment, and vice-versa [1].
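This separation can be sketched as a simple gate: a study informs a given assessment only if it passes on both dimensions. The relevance category labels below are assumed to mirror the reliability wording for illustration:

```python
# Sketch of the reliability/relevance separation. A study is usable for a
# specific assessment only if it is both reliable and relevant; category
# labels for relevance are assumed here, not quoted from CRED.

def usable(reliability: str, relevance: str) -> bool:
    reliable = reliability in {"reliable without restrictions",
                               "reliable with restrictions"}
    relevant = relevance in {"relevant without restrictions",
                             "relevant with restrictions"}
    return reliable and relevant

# A reliable study may be irrelevant for this purpose, and vice versa.
print(usable("reliable with restrictions", "not relevant"))                # → False
print(usable("not reliable", "relevant without restrictions"))             # → False
print(usable("reliable with restrictions", "relevant with restrictions"))  # → True
```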
A two-phased international ring test was conducted to empirically compare the performance of the CRED and Klimisch methods [11].
The ring test demonstrated that CRED leads to more nuanced and consistent evaluations. Data showed a clear correlation between the percentage of fulfilled CRED criteria and the final reliability category assigned to a study [6].
Table 2: Ring Test Results on Consistency and Perceived Utility [11]
| Evaluation Aspect | Klimisch Method | CRED Evaluation Method | Outcome |
|---|---|---|---|
| Inter-assessor Consistency | Lower | Higher | CRED's detailed criteria reduced subjectivity. |
| Perceived Accuracy | Less accurate | More accurate | |
| Perceived Consistency | Less consistent | More consistent | |
| Dependence on Expert Judgement | More dependent | Less dependent | |
| Transparency of Evaluation | Less transparent | More transparent | |
| Time Required for Evaluation | Perceived as shorter | Perceived as practical and feasible | CRED took longer but was deemed worth the effort. |
Table 3: Relationship Between CRED Criteria Fulfillment and Assigned Reliability Category [6]
| Reliability Category (CRED) | Mean % of Fulfilled Criteria | Standard Deviation | Sample Size (n) |
|---|---|---|---|
| Reliable without restrictions | 93% | 12 | 3 |
| Reliable with restrictions | 72% | 12 | 24 |
| Not reliable | 60% | 15 | 58 |
| Not assignable | 51% | 15 | 19 |
The data shows a gradient: studies categorized as more reliable fulfilled a higher percentage of CRED's detailed criteria. The considerable standard deviation within categories, especially "Not reliable," highlights that studies can be deficient in different ways, which CRED's transparent scoring helps to document [6].
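The fulfillment percentages in Table 3 can be reproduced from a per-criterion record. A minimal sketch, assuming a flat criterion-to-judgement mapping (criterion names are placeholders; under CRED, the reliability category itself remains a qualitative judgement rather than a function of this percentage):

```python
# Tally the share of CRED reliability criteria marked as fulfilled.
# Criterion names are hypothetical placeholders; CRED defines 20
# reliability criteria for aquatic ecotoxicity studies.

N_RELIABILITY_CRITERIA = 20

def fulfillment_rate(evaluation: dict) -> float:
    """Percentage of reliability criteria judged as fulfilled."""
    fulfilled = sum(1 for judgement in evaluation.values()
                    if judgement == "fulfilled")
    return 100 * fulfilled / N_RELIABILITY_CRITERIA

# A hypothetical evaluation record: 14 of 20 criteria fulfilled.
evaluation = {f"criterion_{i}": "fulfilled" for i in range(1, 15)}
evaluation.update({f"criterion_{i}": "not fulfilled" for i in range(15, 21)})

print(f"{fulfillment_rate(evaluation):.0f}% of criteria fulfilled")  # → 70%
```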
REACH guidance requires the evaluation of study reliability and relevance but has historically relied on the Klimisch method, which offers limited detail [11]. CRED aligns with REACH's goals by providing the missing granularity.
OECD test guidelines are the international standard for generating regulatory ecotoxicity data [31]. CRED is explicitly designed for synergy with these guidelines.
For pharmaceutical, chemical, and agrochemical companies, integrating CRED into internal SOPs standardizes the data evaluation process across global teams.
The following workflow diagrams the CRED evaluation process and its integration into a regulatory assessment system.
CRED Evaluation and Integration Workflow
Implementing the CRED methodology or conducting studies designed to meet its criteria requires specific conceptual and practical tools.
Table 4: Key Research Reagent Solutions for CRED-Aligned Work
| Tool / Solution | Function in CRED Context | Relevance to Framework Integration |
|---|---|---|
| CRED Evaluation Excel Sheet [18] | The primary implementation tool. Provides structured worksheets to score the 20 reliability and 13 relevance criteria, automatically calculating scores and categories. | Ensures standardized application across an organization, creating a reproducible audit trail for REACH dossiers or internal reviews. |
| CRED Reporting Template [1] | A checklist of 50 reporting criteria across six categories (general info, test design, substance, organism, exposure, statistics). | Used prospectively to design studies that will meet regulatory evaluation standards. Aligns with OECD guideline principles. |
| OECD Test Guidelines (e.g., 201, 210, 211) [31] | The foundational test protocols for generating aquatic ecotoxicity data. CRED's reliability criteria are based on these. | Mandatory for regulatory studies under REACH and other frameworks. CRED acts as the bridge between guideline conduct and guideline evaluation. |
| NanoCRED & EthoCRED Modules [18] | Specialized adaptation of CRED principles for nanomaterial ecotoxicity and behavioral endpoint studies. | Provides a ready-made evaluation framework for emerging science, facilitating the inclusion of non-standard data into regulatory assessments (e.g., for novel materials). |
| (Q)SAR Assessment Framework (QAF) [32] | A principled guide for evaluating computational predictions, such as Quantitative Structure-Activity Relationships. | While distinct from CRED, it operates on similar principles of transparency and criteria-based assessment. Its use alongside CRED supports a holistic, weight-of-evidence approach under REACH. |
The choice between Klimisch and CRED is not merely methodological but strategic, impacting data inclusivity, regulatory defensibility, and assessment consistency.
Table 5: Strategic Comparison for Framework Integration
| Integration Goal | Performance with Klimisch Method | Performance with CRED Method | Verdict |
|---|---|---|---|
| Achieving Consistency Across Assessors | Poor. High dependence on expert judgment leads to significant variability [11]. | Excellent. Structured criteria and guidance drastically reduce subjective interpretation [11]. | CRED Superior |
| Transparency & Auditability | Low. Categorical outcome provides no insight into the reasoning behind the score. | High. Detailed scoring and documentation explain the final category, supporting regulatory dialogue. | CRED Superior |
| Harmonization Across REACH, OECD, SOPs | Limited. Functions as a high-level gatekeeper without fostering alignment. | Strong. Built on OECD criteria and provides tools that can be embedded into SOPs for end-to-end harmonization. | CRED Superior |
| Utilizing Peer-Reviewed Literature | Biased Against. Inherent favoritism towards GLP guideline studies can exclude valid published science [11]. | Facilitates. Evaluates all studies on scientific merit, helping to address data gaps with available research [11]. | CRED Superior |
| Implementation Speed & Ease | Fast. Simple checklist allows for quick, superficial evaluation. | Moderate. Requires more initial time for detailed assessment, but tools streamline the process. | Context Dependent |
Conclusion: Experimental data and practical experience confirm that the CRED evaluation method is a scientifically robust and operationally suitable replacement for the Klimisch method [11]. Its structured, transparent, and detailed approach directly addresses the core weaknesses of its predecessor. For organizations aiming to integrate their practices with REACH, OECD guidelines, and robust internal SOPs, adopting CRED represents a strategic upgrade. It enhances the credibility, consistency, and defensibility of ecotoxicity assessments while facilitating a broader and more efficient use of available scientific data. The ongoing development of specialized modules (NanoCRED, EthoCRED) ensures its continued relevance for evaluating emerging scientific frontiers within established regulatory frameworks [18].
The evaluation of ecotoxicity data is a cornerstone of environmental risk assessment for chemicals, pesticides, and pharmaceuticals. For decades, the Klimisch method has been the predominant tool for assessing the reliability of such studies, categorizing them as "reliable without restrictions," "reliable with restrictions," "not reliable," or "not assignable" [2]. However, its reliance on expert judgement and lack of detailed, transparent criteria have led to inconsistencies in hazard assessments [2]. In response, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method was developed to provide a more structured, transparent, and consistent framework for evaluating both the reliability and relevance of aquatic ecotoxicity studies [2].
This evolution mirrors a broader shift across life sciences towards digital transformation, where cloud-based platforms, electronic laboratory notebooks (ELNs), and advanced analytics are replacing paper-based systems and manual data handling [33] [34]. This article compares the CRED and Klimisch methodologies and demonstrates how digital tools and templates can be leveraged to implement the more robust CRED process efficiently. By integrating structured evaluation criteria with modern data management systems, researchers can achieve greater harmonization, transparency, and efficiency in environmental hazard and risk assessment.
A pivotal two-phased ring test provided direct experimental comparison between the Klimisch and CRED evaluation methods [2]. Seventy-five risk assessors from 12 countries evaluated multiple ecotoxicity studies, first using the Klimisch method and then using a draft version of the CRED method [2]. The results highlight fundamental differences in application and outcome.
Table: Core Characteristics of Klimisch and CRED Evaluation Methods
| Feature | Klimisch Method (1997) | CRED Evaluation Method (2016) |
|---|---|---|
| Primary Focus | Reliability of studies (mainly methodology and reporting clarity) [2]. | Reliability and relevance of studies for a specific assessment context [2]. |
| Evaluation Criteria | Limited, high-level criteria; heavy dependence on expert judgement [2]. | 20 detailed reliability criteria and 13 relevance criteria with guided considerations [2]. |
| Guidance Provided | Minimal; lacks specific guidance for consistent relevance evaluation [2]. | Comprehensive guidance for applying each criterion to promote consistency [2]. |
| Bias Toward Study Type | Favors Good Laboratory Practice (GLP) and OECD guideline studies, potentially overlooking flaws [2]. | Evaluates all studies (GLP, guideline, peer-reviewed) against the same detailed scientific criteria [2]. |
| Outcome Transparency | Low; provides only a final category (e.g., R1, R2) [2]. | High; evaluators document judgements on each criterion, creating an audit trail [2]. |
| Perceived Consistency | Low; leads to discrepancies between different risk assessors [2]. | High; structured criteria reduce subjective variation [2]. |
The quantitative data from the ring test, particularly from a study on fish toxicity with estrone (Study E), underscores the impact of methodological choice. When using the Klimisch method, 44% of participants categorized the study as "reliable without restrictions," and 56% as "reliable with restrictions" – with none rating it "not reliable" [5]. In stark contrast, when the same study was evaluated using the CRED method, only 16% found it "reliable without restrictions," 21% "reliable with restrictions," and a majority of 63% categorized it as "not reliable" [5]. The mean scores for conclusive categories were 1.6 (Klimisch) versus 2.5 (CRED) on a scale where 1 is most reliable and 3 is not reliable [5]. This demonstrates that the CRED method's detailed scrutiny leads to a stricter, more conservative assessment, which could significantly alter the data set used for final risk characterization.
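The reported mean scores follow directly from the category distributions when the conclusive categories are coded 1 to 3; a quick arithmetic check:

```python
# Reproducing the mean reliability scores for Study E from the reported
# category fractions (1 = reliable without restrictions,
# 2 = reliable with restrictions, 3 = not reliable).

def mean_score(distribution: dict) -> float:
    """Weighted mean of category codes; weights are fractions of assessors."""
    return sum(code * fraction for code, fraction in distribution.items())

klimisch = {1: 0.44, 2: 0.56, 3: 0.00}
cred     = {1: 0.16, 2: 0.21, 3: 0.63}

print(round(mean_score(klimisch), 1))  # → 1.6
print(round(mean_score(cred), 1))      # → 2.5
```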
Experimental Protocol: Two-Phased Ring Test for Method Comparison [2]
Diagram 1: Experimental Workflow for CRED vs. Klimisch Ring Test Comparison
The CRED method's strength—its detailed criteria—can also be a practical challenge, requiring more time for initial evaluation compared to the simpler Klimisch approach [2]. This is where digital tools transform the process from a burdensome manual checklist into an integrated, efficient, and scalable workflow. The transition in life sciences from paper-based silos to connected digital platforms is critical for managing complex data [34].
Table: Digital Solutions for Streamlining the CRED Evaluation Process
| Digital Tool Category | Function in CRED Process | Example & Benefit |
|---|---|---|
| Electronic Laboratory Notebooks (ELNs) & Cloud Platforms | Provides the digital backbone for hosting evaluation templates, capturing data, and linking related studies [33] [34]. | Revvity Signals BioELN or IDBS Polar: Enable creation of standardized CRED evaluation forms, ensuring all criteria are addressed. They connect experimental data, protocols, and evaluation results in one searchable location, ending manual data hunting [33] [34]. |
| Structured Digital Templates | Encodes the 20 reliability and 13 relevance criteria into a standardized digital form, ensuring consistency and completeness [2]. | A custom template within an ELN with mandatory fields, dropdown menus, and guided text prompts. This reduces evaluator variance and ensures an audit trail for every judgement [2]. |
| Data Integration & Automation | Automates the ingestion of study metadata (e.g., test organism, concentration, endpoints) from electronic reports into the evaluation template. | Using application programming interfaces (APIs) to pull data from PDFs or instrument outputs directly into the ELN platform, minimizing manual entry errors and saving time [35] [34]. |
| Advanced Analytics & Visualization | Facilitates the comparison and weighting of multiple evaluated studies for final risk assessment. | Tools like TIBCO Spotfire (powering Revvity's VitroVivo) can visualize the "scores" or outcomes from multiple CRED evaluations, helping scientists identify key studies and patterns [33]. |
Diagram 2: Digital Transformation Pathway for Ecotoxicity Data Evaluation
Implementing this digital pathway directly addresses the major time sink noted in traditional biopharma development, where scientists waste up to 20% of their time on data administration [34]. A cloud-based digital backbone not only streamlines the evaluation of individual studies but also creates a FAIR (Findable, Accessible, Interoperable, Reusable) repository of evaluated ecotoxicity data [33]. This repository becomes a powerful asset for future assessments, predictive modeling, and regulatory submissions.
Adopting the digital CRED process requires a combination of specific software platforms and data science techniques. The following toolkit outlines the key components.
Table: Research Reagent Solutions for Digital CRED Evaluation
| Category | Item/Platform | Function in Digital CRED Process |
|---|---|---|
| Core Digital Infrastructure | Cloud-based ELN/Platform (e.g., Revvity Signals BioELN [33], IDBS Polar [34]) | Serves as the primary environment for housing CRED templates, linking raw study data, and capturing evaluation results. Ensures version control and collaborative review. |
| Specialized Analysis & Visualization | Advanced Analytics Software (e.g., TIBCO Spotfire within Signals VitroVivo [33]) | Transforms evaluation results and underlying ecotoxicity data into interactive visualizations. Helps identify trends, compare studies, and support weight-of-evidence decisions. |
| Data Handling & Processing | Probabilistic Imputation Methods (e.g., PMILD for mixed-type data [36]) | Addresses missing or incomplete data in historical studies undergoing evaluation. Uses models like Gaussian Copula to estimate missing values, reducing bias from data gaps [36]. |
| Automation & Interoperability | Application Programming Interfaces (APIs) | Enables automated, bidirectional data flow between instruments, databases, and the central ELN platform. Eliminates error-prone manual transcription [35] [34]. |
| Standardized Digital Templates | Custom CRED Evaluation Form | A pre-configured digital form within the ELN that codifies all CRED criteria. Guides the evaluator step-by-step, ensures completeness, and generates a standardized output report. |
Example Digital CRED Workflow:
Diagram 3: Integrated Digital Workflow for CRED-Based Study Evaluation
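A custom CRED evaluation form of this kind can be modeled as a structured record with mandatory fields. The sketch below assumes hypothetical field names and judgement values; it is not an official CRED schema:

```python
# Minimal sketch of a structured digital CRED evaluation record, as might be
# encoded in an ELN template. Field names and allowed judgement values are
# illustrative assumptions.

from dataclasses import dataclass, field

ALLOWED = {"fulfilled", "not fulfilled", "not reported", "not applicable"}

@dataclass
class CredEvaluation:
    study_id: str
    evaluator: str
    reliability: dict = field(default_factory=dict)  # 20 criteria
    relevance: dict = field(default_factory=dict)    # 13 criteria

    def validate(self) -> bool:
        """Mandatory-field check: every criterion judged with an allowed value."""
        if len(self.reliability) != 20 or len(self.relevance) != 13:
            raise ValueError("all 33 criteria must be judged")
        for judgement in list(self.reliability.values()) + list(self.relevance.values()):
            if judgement not in ALLOWED:
                raise ValueError(f"invalid judgement: {judgement!r}")
        return True

record = CredEvaluation(
    study_id="Study-E",
    evaluator="assessor-01",
    reliability={f"R{i}": "fulfilled" for i in range(1, 21)},
    relevance={f"V{i}": "fulfilled" for i in range(1, 14)},
)
print(record.validate())  # → True
```

Encoding the criteria as mandatory fields is what turns the checklist into an audit trail: an incomplete or malformed evaluation fails validation instead of silently entering the dossier.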
The transition from the Klimisch method to the CRED framework represents a significant advancement toward scientific rigor and regulatory transparency in ecotoxicity assessment. While the CRED method demands more detailed scrutiny, this very detail is what reduces inconsistency and bias [2]. The key to making this superior methodology practical and scalable lies in its integration with modern digital tools.
By implementing the CRED criteria within cloud-based ELNs, structured digital templates, and automated data pipelines, organizations can overcome the initial time investment hurdle. This creates a seamless workflow that not only ensures more reliable chemical safety assessments but also builds a reusable, queryable knowledge asset. For researchers, scientists, and drug development professionals, leveraging this combination of robust methodology and digital technology is no longer optional; it is essential for conducting efficient, defensible, and cutting-edge environmental risk assessments in the modern era [33] [34].
The regulatory processes governing environmental chemicals, pharmaceuticals, and biocides depend on a foundational element: reliable and relevant ecotoxicity data. Predicted-no-effect concentrations (PNECs) and environmental quality standards (EQSs) are derived from this data to define safe chemical thresholds [1]. However, the studies underlying these critical decisions must first be evaluated for their scientific quality and appropriateness—a process historically dependent on variable expert judgment [1]. This subjectivity introduces inconsistency and bias, potentially leading to divergent regulatory outcomes for the same substance [2].
For decades, the primary tool for this evaluation has been the method established by Klimisch and colleagues in 1997. While pioneering, this method has faced mounting criticism for its lack of detail, insufficient guidance, and inherent bias toward industry-funded, Good Laboratory Practice (GLP) studies, often at the expense of peer-reviewed literature [1] [2]. Its broad categories (“reliable without restrictions,” “reliable with restrictions,” “not reliable,” “not assignable”) offer limited granularity, leaving considerable room for interpretation and inconsistent application among assessors [2].
In response, the CRED (Criteria for Reporting and Evaluating Ecotoxicity Data) project developed a robust, transparent alternative. The CRED evaluation method provides a detailed framework with explicit criteria and guidance for assessing both the reliability (intrinsic scientific quality) and relevance (appropriateness for a specific assessment purpose) of aquatic ecotoxicity studies [1]. To empirically validate CRED against the established Klimisch standard, a comprehensive international ring test was conducted. This analysis presents the results from that pivotal exercise, involving 75 risk assessors from 12 countries, to objectively compare the performance, consistency, and utility of the two methodological paradigms [2].
The ring test was a meticulously designed, two-phased exercise aimed at generating direct, comparable evidence on the application of the Klimisch and CRED methods [2].
To ensure independence and avoid bias, no assessor evaluated the same study using both methods, and within participating institutions, different individuals were assigned to different phases [2]. Participants were instructed to assume a common regulatory context—deriving Environmental Quality Criteria under the EU Water Framework Directive—to standardize the relevance evaluation [2]. Following each phase, assessors completed detailed questionnaires on their experience, perception of the methods, and identified uncertainties [2].
Table 1: Ring Test Experimental Protocol Overview
| Aspect | Description |
|---|---|
| Objective | Compare consistency, transparency, and usability of the Klimisch and CRED evaluation methods [2]. |
| Participants | 75 risk assessors from 12 countries, representing industry, academia, consultancy, and government [2]. |
| Study Set | 8 peer-reviewed aquatic ecotoxicity studies, covering various organisms and endpoints [2]. |
| Phase I | Evaluate 2 studies per assessor using the Klimisch method [2]. |
| Phase II | Evaluate 2 different studies per assessor using the draft CRED method [2]. |
| Output Metrics | 1) Final reliability/relevance categorization. 2) Criteria-level consistency analysis. 3) Participant feedback on usability [2]. |
The ring test yielded quantitative data on evaluation consistency and qualitative feedback on user experience, providing a multi-dimensional comparison.
3.1 Quantitative Outcomes: Measuring Consistency
The primary quantitative measure was the consistency of agreement among assessors when evaluating the same study. The CRED method demonstrated a clear and significant advantage. For reliability evaluations, the rate of complete agreement among assessors was approximately 20% higher when using CRED compared to the Klimisch method [2]. This striking improvement in consensus is directly attributable to CRED's detailed criteria and guidance, which reduce ambiguity and the need for subjective expert judgment [2].
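The ring test's headline metric, the rate of complete agreement among assessors, can be sketched as a simple computation: for each study, did every evaluator assign the same category? The data and function below are illustrative only, not the ring test's actual scoring protocol.

```python
def complete_agreement_rate(evaluations):
    """Fraction of studies on which all assessors assigned the same category.

    `evaluations` maps a study ID to the list of reliability categories
    (e.g., "R1"-"R4") chosen by its assessors. Illustrative sketch only.
    """
    agreed = sum(1 for categories in evaluations.values() if len(set(categories)) == 1)
    return agreed / len(evaluations)

# Hypothetical ring-test-style data: three assessors per study.
klimisch_evals = {
    "study_A": ["R2", "R2", "R1"],  # disagreement
    "study_B": ["R2", "R2", "R2"],  # complete agreement
    "study_C": ["R3", "R2", "R2"],  # disagreement
}
print(complete_agreement_rate(klimisch_evals))  # 1 of 3 studies in full agreement
```

The same function applied to CRED-phase evaluations would yield the higher agreement rate reported in the ring test; the value of such a metric is that it is computed per study and can be aggregated across the whole study set.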
Table 2: Key Quantitative Results from the Ring Test
| Metric | Klimisch Method | CRED Evaluation Method | Implication |
|---|---|---|---|
| Assessor Agreement on Reliability | Lower consistency rate [2]. | ~20% higher consistency rate [2]. | CRED produces more reproducible and uniform evaluations. |
| Final Reliability Categorization | Tendency to cluster in "reliable with restrictions" (R2) [2]. | More differentiated spread across categories [2]. | CRED allows for more nuanced discrimination of study quality. |
| Criteria Structure | 4 broad reliability categories; no formal relevance criteria [1]. | 20 reliability & 13 relevance criteria with detailed guidance [1]. | CRED's structure enables transparent, criterion-by-criterion assessment. |
3.2 Qualitative Feedback: User Perception and Practicality
Participant questionnaires revealed a strong preference for the CRED framework. A majority of the 75 assessors perceived the CRED method as more accurate, applicable, consistent, and transparent than the Klimisch method [1]. Key user experience findings included:
The core strength of the CRED method lies in its structured and transparent design. It decouples the evaluation of reliability from relevance, recognizing that a methodologically sound study may not be pertinent for a specific regulatory question, and vice versa [1].
4.1 Reliability Criteria
The 20 reliability criteria are organized into key domains: test design, test substance characterization, test organism information, exposure conditions, and statistical design/biological response [1]. Each criterion includes specific guidance for evaluation, moving beyond a simple checklist to a reasoned assessment. For example, instead of a vague notion of "adequate controls," CRED asks evaluators to specifically check if control groups met validity criteria such as acceptable survival or growth rates [1].
4.2 Relevance Criteria
The 13 relevance criteria ensure the study is fit-for-purpose, considering factors such as the appropriateness of the tested organism, exposure duration, measured endpoints, and the environmental compartment addressed relative to the regulatory goal [1]. This formalizes a critical step that was previously ad-hoc under the Klimisch approach.
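CRED's decoupling of reliability from relevance can be mirrored in a digital evaluation record that never collapses the two judgments into one score. The structure below is a hypothetical sketch; the criterion names are illustrative paraphrases, not the official wording of the 20 reliability and 13 relevance criteria.

```python
from dataclasses import dataclass, field

@dataclass
class CredStyleEvaluation:
    """Hypothetical record keeping reliability and relevance strictly separate."""
    study_id: str
    reliability: dict = field(default_factory=dict)  # criterion -> "fulfilled" / "not fulfilled" / "not reported"
    relevance: dict = field(default_factory=dict)

    def open_issues(self):
        # Reported side by side, never merged into a single composite score.
        return {
            "reliability_issues": [c for c, v in self.reliability.items() if v != "fulfilled"],
            "relevance_issues": [c for c, v in self.relevance.items() if v != "fulfilled"],
        }

ev = CredStyleEvaluation(
    study_id="study-001",
    reliability={
        "control validity (survival/growth)": "fulfilled",
        "measured exposure concentrations": "not reported",
    },
    relevance={"organism appropriate for EQS derivation": "fulfilled"},
)
print(ev.open_issues())
```

Keeping the two dictionaries separate reflects the method's core insight: a study can be fully relevant for an EQS derivation yet unreliable (or vice versa), and each verdict needs its own documented justification.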
Table 3: Comparison of Evaluation Criteria: Klimisch vs. CRED
| Evaluation Dimension | Klimisch Method | CRED Evaluation Method |
|---|---|---|
| Reliability Assessment | 4 broad categories (R1-R4). Lacks detailed criteria and guidance [2]. | 20 specific criteria with extensive guidance across test design, substance, organism, exposure, and statistics [1]. |
| Relevance Assessment | No formal criteria or categories provided [2]. | 13 specific criteria to evaluate appropriateness of organism, endpoint, exposure, etc., for the assessment goal [1]. |
| Guidance & Transparency | Minimal guidance, high reliance on expert judgment [1]. | Detailed guidance for each criterion promotes transparency and reduces arbitrariness [1]. |
| Reporting Link | Not integrated with reporting standards. | Paired with 50-category reporting recommendations to improve primary study quality [1]. |
4.3 Visualizing the Evaluation Workflow
The following diagram illustrates the logical workflow and decision-making process underpinning the CRED evaluation method, highlighting its systematic nature.
CRED Evaluation Decision Workflow
4.4 The Scientist's Toolkit: Essential Resources for Implementation
Adopting the CRED method is facilitated by several key tools and resources developed alongside the framework.
Table 4: Research Reagent Solutions for Ecotoxicity Evaluation
| Tool/Resource | Function | Source/Availability |
|---|---|---|
| CRED Excel Evaluation Tool | Structured spreadsheet with all 33 reliability & relevance criteria, guidance text, and scoring fields to standardize the evaluation process [18]. | Available for download from the SciRAP website [18]. |
| OECD Test Guidelines | Internationally agreed test protocols (e.g., OECD 210 Fish Early-Life Stage) defining standard methods; serve as a key benchmark for reliability evaluation [1]. | OECD Publishing. |
| Good Laboratory Practice (GLP) | A quality system covering the organizational process and conditions for non-clinical safety studies; often a positive factor in reliability assessment [1]. | OECD Principles of GLP. |
| CRED Reporting Checklist | A list of 50 reporting criteria across 6 categories to guide authors in preparing ecotoxicity studies that meet regulatory needs [1]. | Included in the CRED publication suite [1]. |
| Domain-Specific CRED Extensions | Adapted evaluation criteria for specialized areas (e.g., NanoCRED for nanomaterials, EthoCRED for behavioral studies) [18]. | Published in relevant scientific journals [18]. |
The ring test evidence strongly supports the CRED method as a superior, science-based replacement for the Klimisch approach. Its adoption carries significant implications.
5.1 Enhancing Regulatory Consistency and Transparency
By standardizing the evaluation process, CRED mitigates the risk of divergent hazard classifications for the same chemical in different regulatory jurisdictions or by different assessors [2]. Its transparency allows stakeholders to see precisely why a study was accepted or rejected, building trust in regulatory decisions.
5.2 Expanding the Data Universe for Risk Assessment
A major critique of the Klimisch method is its implicit bias toward GLP-registered studies, which can marginalize valuable peer-reviewed science [1] [2]. CRED's objective, criteria-driven approach levels the playing field, allowing any well-conducted and well-reported study, regardless of its origin, to be recognized as reliable. This can enrich datasets for data-poor substances, potentially reducing animal testing and accelerating assessments [2].
5.3 Visualizing Method Comparison and Outcomes
The following diagram synthesizes the core differences between the Klimisch and CRED methods and maps them to the outcomes observed in the ring test, illustrating the cause-and-effect relationship between methodological design and practical results.
Method Design and Its Impact on Evaluation Outcomes
5.4 Fostering Higher-Quality Ecotoxicological Science
The CRED project pairs its evaluation method with comprehensive reporting recommendations [1]. By providing authors with a clear checklist of essential information, these recommendations have a prospective effect: improving the completeness and clarity of future ecotoxicity studies, which in turn facilitates their reliable evaluation and regulatory use [1].
The international ring test, engaging 75 assessors across 12 countries, provides compelling empirical evidence that the CRED evaluation method represents a significant advancement over the Klimisch standard. CRED delivers on the essential needs of modern regulatory science: greater consistency through detailed criteria, enhanced transparency via explicit guidance, and improved robustness by enabling the inclusion of all scientifically valid data. Its adoption across regulatory frameworks for chemicals, pharmaceuticals, and biocides would harmonize hazard assessment practices, strengthen the scientific foundation of environmental protection measures, and ultimately lead to more reliable safety determinations for ecosystems worldwide.
The regulatory assessment of chemicals relies on the evaluation of ecotoxicity studies to determine environmental risks and set safety thresholds, such as Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) [1]. For years, the Klimisch method, established in 1997, has been the backbone of this reliability evaluation process [11]. It categorizes studies as "reliable without restrictions," "reliable with restrictions," "not reliable," or "not assignable" [11]. However, this method has faced growing criticism for its lack of detailed guidance, its failure to ensure consistency among different assessors, and its focus solely on reliability while neglecting relevance [11] [1]. These shortcomings can lead to discrepancies in hazard assessments, potentially resulting in underestimated environmental risks or unnecessary regulatory measures [11].
In response, the Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) method was developed to strengthen the transparency, consistency, and scientific robustness of evaluations [11] [4]. This comparison guide objectively analyzes the performance of the CRED method against the traditional Klimisch approach, focusing on experimental data that quantifies their ability to harmonize assessments across different evaluators. The context is a broader thesis on modernizing ecotoxicity data evaluation, positioning CRED as a science-driven successor designed to reduce reliance on subjective expert judgment and improve the integration of peer-reviewed literature into regulatory frameworks [1] [4].
The fundamental difference between the Klimisch and CRED methods lies in their structure, scope, and guidance. The table below summarizes their core characteristics.
Table 1: Core Characteristics of the Klimisch and CRED Evaluation Methods [11]
| Characteristic | Klimisch Method (1997) | CRED Method (2016) |
|---|---|---|
| Primary Scope | General toxicity and ecotoxicity data | Aquatic ecotoxicity studies (with extensions for nano, sediments, soil, and behavior) |
| Reliability Criteria | 12-14 criteria for ecotoxicity studies | 20 detailed criteria |
| Relevance Criteria | Not formally included | 13 specific criteria |
| Guidance Provided | Minimal; high reliance on expert judgment | Extensive guidance for each criterion |
| Evaluation Output | Single reliability category (e.g., "reliable with restrictions") | Qualitative summary for both reliability and relevance |
| Basis for Evaluation | Favors guideline studies (e.g., OECD, EPA) and Good Laboratory Practice (GLP) | Evaluates inherent scientific quality, independent of GLP status |
The Klimisch method offers a basic checklist but lacks instructions, leading to inconsistent interpretations [11]. It has also been criticized for a bias that favors standardized guideline studies from industry, potentially excluding methodologically sound, peer-reviewed literature from regulatory consideration [11] [1].
In contrast, CRED provides a more granular and transparent framework. It separately evaluates reliability (the inherent scientific quality of the study) and relevance (the appropriateness of the study for a specific assessment purpose) [1]. This separation is critical, as a study can be highly reliable but irrelevant for a particular regulatory question, and vice versa [1]. The method includes comprehensive guidance for applying its 33 total criteria, aiming to minimize subjective judgment [11].
To quantitatively compare the two methods, a two-phase international ring test (a type of inter-laboratory comparison) was conducted [11].
The ring test was designed to mirror real-world evaluation scenarios and measure consistency between assessors [11].
The ring test provided clear quantitative and qualitative evidence of CRED's superior performance in harmonizing evaluations.
Table 2: Key Quantitative Outcomes from the CRED Ring Test [11]
| Evaluation Metric | Klimisch Method Outcome | CRED Method Outcome | Implication |
|---|---|---|---|
| Agreement on Reliability Category | Lower agreement among evaluators | Higher agreement among evaluators | CRED reduces discrepancy between different assessors. |
| Perceived Dependence on Expert Judgement | High | Low | CRED's detailed guidance reduces subjective interpretation. |
| Perceived Accuracy & Consistency | Moderate | High | Assessors trusted CRED evaluations more. |
| Practicality (Time & Use of Criteria) | Rated as less practical | Rated as practical | Despite more criteria, CRED was found user-friendly. |
| Handling of Non-Guideline Studies | Often downgraded or excluded | Fairly evaluated based on scientific merit | CRED facilitates use of peer-reviewed literature. |
The data shows that CRED's structured approach successfully reduced variability in study categorization. Participant feedback strongly favored CRED, with 85-90% agreeing it was more accurate, consistent, and transparent than the Klimisch method [11]. Notably, CRED achieved this while also being rated as practical in terms of time and effort required [11].
The CRED framework is not static. Its core principles have been adapted to address specialized testing challenges, demonstrating its robustness and flexibility.
These extensions confirm CRED's role as a living framework. They also highlight a key advantage: CRED evaluates the inherent scientific quality of a study, making it adaptable to novel test substances and endpoints beyond traditional chemical testing [1] [17].
The following diagrams illustrate the structural and procedural differences between the two evaluation methods and the design of the validating ring test.
Comparison of Ecotoxicity Study Evaluation Workflows
Experimental Design of the CRED Ring Test
Implementing a robust evaluation requires specific tools and resources. The following table details key "reagent solutions" for conducting ecotoxicity study evaluations based on the CRED framework and related methodologies.
Table 3: Research Reagent Solutions for Ecotoxicity Data Evaluation
| Tool / Resource | Function in Evaluation | Key Features / Notes |
|---|---|---|
| CRED Excel Tool [18] [4] | Provides a structured, user-friendly interface to apply the 20 reliability and 13 relevance criteria. | Ensures systematic evaluation; includes guidance prompts; facilitates data recording and visualization. |
| OECD Test Guidelines (e.g., 201, 210, 211) [11] | Serve as the benchmark for standardized test methodology against which study design is compared. | Essential for understanding minimum reporting and performance requirements for guideline tests. |
| CRED Reporting Recommendations [1] | A checklist of 50 criteria across 6 categories for reporting aquatic ecotoxicity studies. | Used prospectively to improve study reporting or retrospectively to assess completeness. |
| NanoCRED & EthoCRED Modules [18] [17] | Specialized criteria and guidance for evaluating studies on nanomaterials and behavioral endpoints. | Addresses specific methodological challenges not covered in the core CRED method. |
| Good Laboratory Practice (GLP) Documentation | Provides information on study traceability and quality assurance processes. | While CRED does not favor GLP, its documentation can help verify reported procedures. |
| Statistical Analysis Software | Critical for independently verifying reported statistical results and endpoint calculations (e.g., EC50, NOEC). | Reduces reliance on the author's analysis alone; essential for plausibility checks. |
The experimental data from the international ring test provides clear evidence that the CRED evaluation method significantly reduces discrepancies between evaluators compared to the legacy Klimisch method. By replacing a sparse checklist with a detailed, guidance-rich framework that separately assesses reliability and relevance, CRED minimizes subjective judgment and promotes transparency [11] [1].
This advancement supports a critical shift in regulatory ecotoxicology: enabling the consistent and scientifically defensible integration of high-quality peer-reviewed literature alongside standardized guideline studies [4]. The subsequent development of specialized modules like NanoCRED demonstrates the framework's adaptability to emerging scientific frontiers [18] [17]. For researchers, scientists, and drug development professionals, adopting the CRED methodology represents a move toward more consistent, transparent, and robust environmental hazard and risk assessments.
The regulatory evaluation of ecotoxicity studies is a cornerstone of environmental hazard and risk assessment for chemicals, informing decisions from pesticide approval to water quality standards [16]. For decades, this process has relied heavily on the method established by Klimisch and colleagues in 1997. While a significant step forward at the time, the Klimisch method has faced increasing criticism for its lack of detailed guidance and its failure to ensure consistency among different risk assessors, leading to evaluations that are highly dependent on individual expert judgment [16] [11].
In response, the Criteria for Reporting and Evaluating ecotoxicity Data (CRED) method was developed to strengthen the consistency, transparency, and scientific robustness of these critical evaluations [16] [1]. CRED provides a structured framework with explicit criteria for assessing both the reliability (inherent scientific quality) and relevance (appropriateness for a specific assessment) of aquatic ecotoxicity studies [1]. The core thesis of modern ecotoxicity evaluation research posits that structured, transparent methods like CRED are necessary successors to older, judgment-dependent frameworks like Klimisch's to achieve harmonized and credible regulatory decisions [11].
A comprehensive, two-phased ring test was conducted to directly compare the Klimisch and CRED evaluation methods. The test involved 75 risk assessors from 12 countries, representing industry, academia, consultancy, and governmental institutions [16] [11]. Participants evaluated a set of eight ecotoxicity studies, applying the Klimisch method in the first phase and the CRED method in the second; each assessor worked on different studies in the two phases to ensure independent assessments.
The ring test revealed significant differences in how the two methods categorized the same studies, highlighting the impact of methodological structure on final outcomes.
Table 1: Comparative Study Categorization from Ring Test (Selected Studies)
| Study Description | Klimisch Method Categorization | CRED Method Categorization | Implication |
|---|---|---|---|
| Study E: GLP report on fish toxicity of estrone [5] | 44% "Reliable without restrictions"; 56% "Reliable with restrictions" [5] | 16% "Reliable without restrictions"; 21% "Reliable with restrictions"; 63% "Not reliable" [5] | CRED's detailed criteria led to stricter reliability assessment of a GLP study. |
| General Trend across 8 studies | Higher tendency to categorize as "Reliable" (with or without restrictions) [11] | More nuanced distribution, with increased identification of "Not reliable" studies [11] | CRED reduces automatic acceptance of guideline studies and enables critical flaw identification. |
The fundamental design differences between the two frameworks translate directly into user experience and perceived reliability.
Table 2: Structural and Perceptual Comparison of Klimisch vs. CRED Methods
| Comparison Aspect | Klimisch Method | CRED Evaluation Method | Source |
|---|---|---|---|
| Core Focus | Primarily reliability evaluation. | Explicit, separate evaluation of reliability and relevance. | [11] [1] |
| Number of Criteria | 12-14 reliability criteria for ecotoxicity; no formal relevance criteria. | 20 reliability criteria and 13 relevance criteria, with extensive guidance. | [11] [1] |
| Guidance & Transparency | Limited guidance, leaving much to expert interpretation. | Detailed guidance for each criterion to improve consistency. | [16] [11] |
| Basis for Evaluation | Heavily reliant on expert judgment; favors GLP and standard guideline studies. | Based on OECD test guidelines and reporting requirements; structured checklist reduces bias. | [11] [1] |
| Risk Assessor Perception (from ring test) | Less consistent, less transparent, more dependent on expert judgement. | More accurate, consistent, transparent, and practical in terms of criteria use and time needed. | [16] [11] |
| Output | Qualitative reliability category (e.g., "reliable without restrictions"). | Qualitative reliability and relevance categories, supported by transparent criteria scoring. | [11] |
The pivotal evidence comparing the Klimisch and CRED methods comes from a formally conducted ring test. The methodology is described below [11] [1].
Objective: To compare the categorization outcomes, consistency, and practical utility of the Klimisch and CRED evaluation methods for assessing the reliability and relevance of aquatic ecotoxicity studies.
Phase I (Klimisch Evaluation):
Phase II (CRED Evaluation):
Study Selection: The eight studies covered a range of test organisms (algae, crustaceans, fish, higher plants), chemical classes (industrial chemical, biocide, pharmaceutical, steroidal estrogen), and both peer-reviewed and industry GLP reports [11].
Data Analysis: For each study, the distribution of final reliability/relevance categories was compared between methods. The consistency among evaluators was assessed, and feedback on the practicality, transparency, and perceived accuracy of each method was collected via questionnaire [16] [11].
The following diagram contrasts the high-level processes of the Klimisch and CRED evaluation methods, illustrating the difference between a judgment-dependent path and a criteria-driven path.
Diagram Title: Klimisch vs. CRED Evaluation Workflow
This diagram places the CRED and Klimisch methods within the broader context of data evaluation for regulatory risk assessment, showing the integration of tools and the overarching goal of robust decision-making.
Diagram Title: Data Evaluation in Regulatory Risk Assessment
The development of the CRED method has been accompanied by practical tools and resources designed to aid researchers and risk assessors.
Table 3: Research Reagent Solutions & Key Resources for Ecotoxicity Study Evaluation
| Tool / Resource Name | Type | Primary Function | Source / Availability |
|---|---|---|---|
| CRED Evaluation Criteria & Guidance | Methodological Framework | Provides the 20 reliability and 13 relevance criteria with detailed guidance for evaluating aquatic ecotoxicity studies. | Published in [1]; available via SciRAP. |
| CRED Reporting Recommendations | Reporting Checklist | A set of 50 criteria in 6 categories to guide researchers in reporting studies that meet regulatory needs, improving usability. | Published in [1]; available via SciRAP. |
| SciRAP Online Platform | Web-Based Tool | Hosts the CRED criteria and other evaluation frameworks in a color-coding tool to facilitate structured, transparent evaluations and generate summary profiles. | Freely available at http://www.scirap.org [37]. |
| NanoCRED Framework | Specialized Framework | An adaptation of CRED for evaluating the reliability and relevance of ecotoxicity studies for nanomaterials [18]. | Available via SciRAP platform [18] [37]. |
| EthoCRED Framework | Specialized Framework | A framework developed to guide the reporting and evaluation of behavioural ecotoxicity studies [18]. | Described in Bertram et al., 2024 [18]. |
| CRED for Sediment and Soil | Specialized Framework | Expanded criteria for the evaluation of ecotoxicity studies performed with sediment and soil organisms. | Described in Casado‐Martinez et al., 2024 [18]. |
The derivation of Predicted-No-Effect Concentrations (PNECs) and Environmental Quality Standards (EQSs) relies fundamentally on the evaluation of the reliability and relevance of underlying ecotoxicity studies [1]. For decades, the Klimisch method served as the benchmark for this task, assigning studies to categories such as "reliable without restrictions" or "not reliable" [3]. However, its reliance on expert judgment and lack of detailed, transparent criteria led to inconsistencies, where different assessors could reach divergent conclusions on the same study [1]. This need for greater reproducibility and transparency catalyzed the development of next-generation, criteria-based tools.
This article examines three pivotal tools that represent this evolution: CRED (Criteria for Reporting and Evaluating Ecotoxicity Data), ToxRTool (Toxicological data Reliability Assessment Tool), and SciRAP (Science in Risk Assessment and Policy). Framed within the broader thesis of advancing beyond Klimisch, we objectively compare their designs, applications, and performance, supported by experimental data. The analysis confirms that while all three tools aim to systematize evaluation, CRED is distinguished by its specificity for ecotoxicity, its separation of reliability and relevance, and its validation through comprehensive ring-testing [1] [4].
The following table summarizes the core characteristics, objectives, and outputs of the CRED, ToxRTool, and SciRAP evaluation tools.
Table 1: Foundational Comparison of CRED, ToxRTool, and SciRAP
| Feature | CRED (Criteria for Reporting & Evaluating Ecotoxicity Data) | ToxRTool (Toxicological data Reliability Assessment Tool) | SciRAP (Science in Risk Assessment and Policy) |
|---|---|---|---|
| Primary Focus | Aquatic ecotoxicity studies (with extensions for nano, behavior, sediment) [18] [1] | Reliability of in vivo and in vitro (eco)toxicological data [38] | Reliability & relevance of ecotoxicity, in vivo, in vitro, and epidemiological studies [39] |
| Core Objective | Replace Klimisch for ecotoxicity; improve transparency, consistency, and regulatory usability [1] [4] | Operationalize Klimisch categories with a standardized checklist; improve harmonization under REACH [3] [38] | Provide structured, transparent evaluation for regulatory risk assessment, bridging to systematic review principles [3] |
| Evaluation Output | Separate, detailed assessments for Reliability (20 criteria) and Relevance (13 criteria) [1] [4] | Numerical score leading to a Klimisch category (1, 2, or 3) [3] [38] | Separate scores for Reporting Quality and Methodological Quality (Reliability), plus Relevance [3] [40] |
| Key Innovation | Explicit, extensive criteria with guidance; ring-tested against Klimisch; linked reporting recommendations [1] | Binary scoring system with weighted "key" criteria; Excel-based tool [3] | Modular design for different study types; separates reporting from methodology; explores "risk of bias" [3] [40] |
CRED was developed explicitly to address the shortcomings of the Klimisch method for aquatic ecotoxicity data. It provides a granular set of criteria—20 for reliability and 13 for relevance—accompanied by extensive guidance to minimize ambiguity [1]. Its development was informed by a ring test directly comparing it to the Klimisch method [4].
ToxRTool, an Excel-based tool, was created to bring consistency to the application of the Klimisch categories themselves. It uses a checklist of 21 criteria across five study areas, applying a binary (yes/no) scoring system with higher weight given to certain "red" or key criteria. The final score maps to Klimisch categories 1, 2, or 3 [3] [38].
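ToxRTool's binary, weighted scheme can be illustrated in a few lines. The thresholds, criterion names, and the exact key-criterion rule below are assumptions for illustration; the actual scores and cut-offs are defined in the published Excel tool.

```python
def toxrtool_style_category(answers, key_criteria, cat1_cutoff=18, cat2_cutoff=13):
    """Map binary criterion answers to a Klimisch-style category (1, 2, or 3).

    Hypothetical sketch: every "red" key criterion must be met for category 1,
    mirroring ToxRTool's weighting of key criteria. Cut-offs are assumed values.
    """
    score = sum(answers.values())
    keys_met = all(answers.get(k, 0) == 1 for k in key_criteria)
    if keys_met and score >= cat1_cutoff:
        return 1  # reliable without restrictions
    if score >= cat2_cutoff:
        return 2  # reliable with restrictions
    return 3      # not reliable

answers = {f"criterion_{i}": 1 for i in range(1, 22)}  # 21 criteria, all met
answers["criterion_5"] = 0                              # one non-key criterion fails
print(toxrtool_style_category(answers, key_criteria=["criterion_1", "criterion_2"]))  # -> 1
```

The sketch makes the critique concrete: the entire evaluation collapses to one integer, so the specific strengths and weaknesses behind a category 2 verdict are invisible unless documented separately.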
SciRAP offers a suite of tools for various study types. Its approach involves evaluating reporting quality and methodological quality (together constituting reliability) separately from relevance. This separation allows for a nuanced understanding of whether a study is poorly reported or poorly conducted [3] [40]. SciRAP tools are also being adapted for specific areas like nanomaterials (SciRAPnano) [40].
The most direct empirical comparison between the old and new paradigms comes from a ring test conducted during the development of CRED.
The ring test was designed to compare the performance of the established Klimisch method with the draft CRED evaluation method [1].
The ring test provided quantitative and qualitative evidence of CRED's advantages.
Table 2: Key Results from the CRED vs. Klimisch Ring Test [1]
| Comparison Metric | Klimisch Method Outcome | CRED Method Outcome | Implication |
|---|---|---|---|
| Evaluator Consistency | Lower consistency between assessors | Higher consistency between assessors | CRED's detailed criteria reduce subjective interpretation. |
| Perceived Accuracy | Rated lower by participants | Rated higher by participants | Assessors had more confidence in evaluations made with CRED. |
| Perceived Applicability | Rated lower | Rated higher | CRED's specific guidance was found more useful for ecotoxicity studies. |
| Perceived Transparency | Rated lower | Rated significantly higher | The explicit criteria and guidance make the evaluation process more traceable. |
| Overall Preference | - | Majority of risk assessors preferred CRED | Supports CRED as a suitable replacement for Klimisch in ecotoxicity. |
The data shows that risk assessors found the CRED method more accurate, applicable, consistent, and transparent than the Klimisch method [1]. This preference underscores a shift towards tools that provide structured guidance over those reliant on unspecified expert judgment.
The tools differ fundamentally in their evaluation logic and final output. The following diagram illustrates the conceptual workflow of the CRED evaluation process, highlighting its detailed, criteria-driven pathway.
CRED Detailed Evaluation Workflow
CRED's strength is its parallel, detailed assessment. It does not produce a single composite score but separate, well-documented judgments for reliability and relevance. This allows a study to be deemed, for instance, highly relevant for a specific assessment but only of medium reliability due to statistical limitations, each with clear justification [1].
ToxRTool is primarily a reliability filter that outputs a Klimisch category. Its binary scoring can be efficient but has been criticized for potentially oversimplifying complex study quality issues, as numerical scores may not fully communicate specific strengths and weaknesses [3].
SciRAP introduces a key conceptual separation between the quality of reporting and the quality of methodology. A study can be well-conducted but poorly reported, or vice-versa. This distinction is crucial for a fair assessment, especially for non-guideline studies from the scientific literature [3].
The diagram below provides a high-level comparison of the primary output and focus of each tool's evaluation process.
Comparison of Evaluation Tool Outputs
Implementing these evaluation frameworks requires both conceptual and practical tools. The following table details essential "research reagent solutions" – key resources and materials critical for conducting or evaluating modern ecotoxicity studies effectively.
Table 3: Essential Toolkit for Ecotoxicity Study Execution and Evaluation
| Tool / Resource | Primary Function | Role in Evaluation Context |
|---|---|---|
| OECD Test Guidelines | Standardized protocols for toxicity testing (e.g., OECD 203, 210). | The gold standard for methodological quality. Compliance is a key criterion in all evaluation tools [1] [3]. |
| Good Laboratory Practice (GLP) | A quality system covering the organizational process and conditions for non-clinical studies. | Often a benchmark for reliability in Klimisch/ToxRTool. CRED and SciRAP evaluate specific methodological elements beyond GLP compliance [1] [3]. |
| Chemical Analysis Tools (e.g., HPLC, GC-MS) | To measure and verify the actual concentration of the test substance in exposure media. | Critical for assessing exposure validity. All tools include criteria for test substance characterization and exposure verification [1]. |
| Negative & Positive Controls | Reference groups to validate test system health and responsiveness. | Fundamental for assessing study validity. Their inclusion and results are explicit reliability criteria in CRED and other tools [1]. |
| Statistical Analysis Software (e.g., R, GraphPad Prism) | To calculate effect concentrations (EC/LC/NOEC) and associated confidence intervals. | Essential for deriving robust results. The appropriateness of statistical methods is a core element of reliability evaluation [1]. |
| Reporting Checklists (e.g., CRED's 50 reporting criteria) | Guides to ensure all essential study information is documented [1]. | Improves reporting quality upfront. Used prospectively by researchers and retrospectively by evaluators to locate critical information. |
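A reporting checklist can be applied prospectively as a simple completeness check. The sketch below uses the six CRED reporting categories named in the toolkit table, but the per-item criteria are placeholders for illustration, not the actual wording of CRED's 50 reporting criteria.

```python
# Minimal sketch of a reporting-completeness check. Category names follow
# the six CRED reporting categories; the individual items are hypothetical
# placeholders, not the real CRED criteria text.
checklist = {
    "general information": ["test substance identified", "source cited"],
    "test design": ["replicates stated", "controls described"],
    "test substance": ["purity reported"],
    "test organism": ["species and life stage given"],
    "exposure conditions": ["measured concentrations reported"],
    "statistics": ["method for effect concentrations named"],
}

def completeness(reported: set[str], checklist: dict[str, list[str]]) -> float:
    """Fraction of checklist items documented in the study report."""
    items = [item for group in checklist.values() for item in group]
    return sum(item in reported for item in items) / len(items)

reported = {"test substance identified", "replicates stated",
            "purity reported", "measured concentrations reported"}
print(f"{completeness(reported, checklist):.0%}")  # 4 of 8 items -> 50%
```

Used retrospectively by an evaluator, the same tally locates exactly which information is missing rather than yielding only a pass/fail verdict.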
Choosing the most appropriate tool depends on the specific regulatory context and assessment goals.
A significant trend is the specialization of tools for emerging challenges. Both CRED and SciRAP have spawned dedicated modules for nanomaterials (NanoCRED and SciRAPnano, respectively), addressing unique aspects like particle characterization and dosimetry, which are poorly covered by generic guidelines [18] [40].
The move "Beyond Klimisch" is characterized by a shift from opaque expert judgment to transparent, criteria-based evaluation. CRED, ToxRTool, and SciRAP all contribute to this paradigm, yet they serve overlapping but distinct niches.
Experimental evidence demonstrates that CRED provides a more consistent, transparent, and preferred framework for evaluating aquatic ecotoxicity studies compared to the legacy Klimisch method [1]. Its detailed, separate handling of reliability and relevance makes it particularly suited for deriving robust environmental quality standards. ToxRTool offers a pragmatic, harmonized implementation of the Klimisch categories for general toxicological data. SciRAP provides a flexible, multi-domain platform suitable for integrated assessments and is evolving to incorporate systematic review principles.
The future of ecotoxicity data evaluation lies in the continued refinement and contextual application of these structured tools, supported by specialized modules for novel contaminants, ultimately ensuring that environmental risk assessments are built upon a foundation of reliably evaluated and clearly relevant science.
The evaluation of ecotoxicity data forms the scientific cornerstone for environmental risk assessments (ERA) of chemicals and pharmaceuticals. For decades, the Klimisch method [2] has been a default tool for classifying study reliability, yet it has faced sustained criticism for lacking detail, being susceptible to bias, and failing to ensure consistency among assessors [1] [2]. In response, the Criteria for Reporting and Evaluating Ecotoxicity Data (CRED) was developed as a transparent, criteria-rich framework designed to improve the reproducibility and consistency of reliability and relevance evaluations [1] [18].
This comparison guide analyzes the tangible impact of CRED, contrasting it with the Klimisch method through empirical data and examining its integration into evolving European regulatory guidance and pharmaceutical industry projects. The shift towards CRED represents more than a methodological update; it is part of a broader movement to incorporate high-quality academic research into regulatory decision-making, thereby strengthening the scientific basis for environmental protection [41] [42].
A comprehensive ring test involving 75 risk assessors from 12 countries provides the most direct experimental comparison of the two methods [2]. Participants evaluated identical sets of aquatic ecotoxicity studies using both frameworks.
The ring test demonstrated systematic differences in how studies are categorized, with CRED typically applying more stringent reliability assessments.
Table: Ring Test Results for Klimisch vs. CRED Evaluation Methods [5] [2]
| Study & Description | Klimisch Method Categorization | CRED Method Categorization | Key Implication |
|---|---|---|---|
| Study E (Industry GLP report on fish toxicity) [5] | 44% "Reliable without restrictions" (R1), 56% "Reliable with restrictions" (R2). Mean score: 1.6 (R1=1, R2=2). | 16% R1, 21% R2, 63% "Not reliable" (R3). Mean score: 2.5. | Highlights CRED's stricter scrutiny; majority found study unreliable despite GLP status. |
| Overall Ring Test Consistency | Lower consistency among evaluators. High reliance on expert judgment due to vague criteria [2]. | Higher consistency. Structured criteria and guidance reduced variability among assessors [2]. | CRED enhances harmonization across different assessors and institutions. |
| Perceived Method Utility | Seen as less transparent, less accurate, and more dependent on subjective judgment [2]. | Rated as more accurate, applicable, consistent, and transparent by participants [1] [2]. | User feedback supports CRED as a practical and superior tool for regulatory evaluation. |
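The mean scores for Study E in the table above follow directly from the category percentages, with categories scored R1=1, R2=2, R3=3. A short calculation reproduces them:

```python
# Reproducing the Study E mean reliability scores from the ring-test
# percentages reported above (scoring: R1=1, R2=2, R3=3).
def mean_score(fractions: dict[int, float]) -> float:
    """Weighted mean of category scores; fractions should sum to ~1."""
    return sum(score * frac for score, frac in fractions.items())

klimisch = {1: 0.44, 2: 0.56}        # 44% R1, 56% R2
cred = {1: 0.16, 2: 0.21, 3: 0.63}   # 16% R1, 21% R2, 63% R3

print(round(mean_score(klimisch), 1))  # 1.6
print(round(mean_score(cred), 1))      # 2.5
```

The nearly one-category shift in the mean (1.6 vs. 2.5) for the same GLP study is the clearest single number behind the claim that CRED applies stricter scrutiny.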
The core differences between the methods are foundational, extending beyond specific outcomes to their structure and philosophy.
Table: Structural Comparison of Klimisch and CRED Evaluation Frameworks [1] [20] [2]
| Feature | Klimisch Method | CRED Method | Advantage of CRED |
|---|---|---|---|
| Primary Focus | Reliability only [2]. | Reliability (20 criteria) and Relevance (13 criteria) evaluated separately [1]. | Enables nuanced assessment; a reliable study may not be relevant for a specific regulatory question. |
| Guidance & Transparency | Limited guidance; 12-14 brief criteria for ecotoxicity with no detailed guidance [20] [2]. | Extensive guidance for each criterion. Includes reporting recommendations (50 criteria across 6 categories) [1]. | Reduces arbitrary expert judgment, increases transparency and reproducibility. |
| Bias Potential | Criticized for favoring industry-sponsored Good Laboratory Practice (GLP) studies [1] [2]. | Criteria are substance- and study-type neutral; GLP is not an automatic pass [2]. | Promotes unbiased evaluation of all studies, including academic peer-reviewed research [41]. |
| Evaluation Categories | Reliable without restrictions (R1), Reliable with restrictions (R2), Not reliable (R3), Not assignable (R4) [2]. | Uses same categories (R1-R4) for reliability; adds relevance categories (C1-C3) [2]. | Maintains regulatory familiarity while adding essential relevance dimension. |
The empirical data supporting CRED's advantages were generated through a meticulously designed two-phase ring test [2].
The following diagram contrasts the decision-making pathways of the two evaluation methods.
CRED principles are increasingly embedded in the regulatory lifecycle for pharmaceuticals, from development to post-authorization.
Effectively applying the CRED framework or preparing studies for CRED-compliant evaluation requires specific tools and resources.
Table: Essential Toolkit for CRED-Based Ecotoxicity Research and Evaluation
| Tool / Resource | Function | Source/Availability |
|---|---|---|
| CRED Evaluation Excel Sheet | Structured spreadsheet with all 33 reliability and relevance criteria and scoring guidance. Facilitates standardized evaluation [18]. | Available for download from the SciRAP website [18]. |
| CRED Reporting Checklist | A list of 50 specific reporting criteria across 6 categories (general info, test design, substance, organism, exposure, statistics) [1]. Serves as a guide for designing and writing studies. | Included in the CRED publication [1] and associated resources [18]. |
| Specialized CRED Extensions | Adapted frameworks for novel areas: NanoCRED (nanomaterials), EthoCRED (behavioral studies), CRED for sediment and soil [18]. | Details in respective peer-reviewed publications [18]. |
| OECD Test Guidelines | Foundational test protocols (e.g., OECD 210, 211). CRED criteria align with and expand upon these standards [1] [2]. | OECD website. |
| Regulatory Guidance Documents | Contextual documents: EMA's Environmental Risk Assessment Guidelines, EU Variation Regulation (EU) 2024/1701 [43] [42]. | EMA and EC official websites. |
While not yet formally mandated in all guidelines, CRED's principles align with and support key EU regulatory shifts.
The pharmaceutical industry applies CRED-like principles in specific areas to enhance data quality and regulatory acceptance.
The experimental evidence demonstrates that the CRED method offers a more consistent, transparent, and detailed framework for ecotoxicity data evaluation compared to the Klimisch method. Its real-world impact is growing through its role in improving regulatory submissions and supporting the integration of academic science into chemical safety assessments.
The future trajectory points toward broader formal adoption. The ongoing revision of the EU pharmaceutical legislation (the Pharma Package) and the implementation of new regulations like the Variation Regulation create opportunities for embedding robust evaluation standards [43] [44]. Furthermore, the development of specialized CRED tools for nanomaterials, behavioral toxicology, and terrestrial systems ensures its continued relevance for assessing next-generation chemicals and pharmaceuticals [18]. For researchers and assessors, adopting CRED is not merely a procedural change but an active step towards more reproducible, credible, and impactful environmental safety science.
The comparative analysis underscores that the CRED method represents a significant evolution over the Klimisch approach, offering a more granular, transparent, and consistent framework for ecotoxicity data evaluation. While the Klimisch method provided a foundational system, its reliance on expert judgment often led to inconsistencies. CRED, with its detailed criteria for both reliability and relevance, addresses these shortcomings, as validated by extensive ring testing. For biomedical and clinical researchers, particularly in drug development, adopting CRED can enhance the robustness of environmental risk assessments, facilitate the acceptance of high-quality peer-reviewed data in regulatory dossiers, and contribute to better-informed environmental safety profiles for new substances. The future direction points toward broader integration of CRED into global regulatory guidelines and its adaptation for emerging data types, promising greater harmonization in ecological hazard assessment.