Integrative Biomedical Research (Journal of Angiotherapy) | Online ISSN 3068-6326
REVIEWS (Open Access)

Comprehensive Review of Foundation Toxicity Models Integrating In Vivo, In Vitro, and Chemical Knowledge for Unified Risk Prediction

Tajmin Khanam 1*, Lilufar Yeasmin 1, Mst. Farzina Akter 1

+ Author Affiliations

Integrative Biomedical Research 10 (1) 1-14 https://doi.org/10.25163/biomedical.10110733

Submitted: 07 March 2026 | Revised: 22 April 2026 | Published: 08 May 2026


Abstract

Toxicology, perhaps more than many scientific disciplines, seems to be standing at a quiet but decisive turning point. For decades, the field relied heavily on in vivo experimentation—robust and deeply informative, yet increasingly constrained by ethical concerns, cost, and, not insignificantly, questions of human relevance. As chemical exposure grows in both scale and complexity, these traditional approaches begin to feel, if not outdated, then at least insufficient. This review explores the emergence of “foundation toxicity models,” an evolving class of integrative frameworks that attempt to bring together in vivo observations, in vitro mechanistic insights, and chemical structural knowledge into unified predictive systems. At the core of this transition lies a shift from observation to anticipation. Advances in quantitative high-throughput screening, artificial intelligence, and multi-omics technologies have made it possible to deconstruct toxicity into measurable biological perturbations. Mechanistic frameworks such as Adverse Outcome Pathways provide a conceptual scaffold, linking molecular initiating events to organism-level outcomes. Meanwhile, computational approaches—from classical QSAR models to deep learning and graph-based architectures—are increasingly capable of capturing the nonlinear complexity inherent in biological systems. And yet, the path forward is not entirely smooth. Challenges related to data heterogeneity, model interpretability, and the persistent difficulty of in vitro-to-in vivo extrapolation remain significant. Still, there is a sense—perhaps cautious, but growing—that these limitations are not insurmountable. Rather, they represent transitional constraints within a rapidly evolving paradigm. Taken together, the convergence of experimental and computational toxicology suggests a future in which toxicity is not merely detected after the fact, but predicted in advance. Foundation toxicity models, in this context, do not simply extend existing methods—they begin to redefine how chemical risk is understood, evaluated, and, ultimately, prevented.

Keywords: Predictive toxicology; Adverse Outcome Pathways; qHTS; Artificial intelligence; IVIVE

1. Introduction

For much of modern toxicology, the evaluation of chemical safety has relied—almost unquestioningly—on in vivo experimentation. Animal models, particularly rodents, have served as the foundational systems through which hazard identification and dose–response relationships are inferred. And yet, as indispensable as these models once seemed, their limitations have become increasingly difficult to ignore. The sheer scale of contemporary chemical production—where thousands of new compounds are introduced annually—has exposed a structural mismatch between testing capacity and regulatory demand (National Research Council, 2007). Traditional in vivo studies, while methodologically rigorous, are slow, costly, and ethically burdensome. It is not uncommon for a single regulatory assessment to require years of experimentation, thousands of animals, and substantial financial investment, often without guaranteeing direct human relevance.

More troubling, perhaps, is the persistent issue of translational uncertainty. Biological responses observed in animal systems do not always align with human physiology, largely due to interspecies differences in metabolism, signaling pathways, and organ-specific sensitivities (Gottmann et al., 2001). This disconnect introduces a degree of ambiguity into risk assessment—one that becomes particularly consequential in drug development and environmental health decision-making. In this light, the long-standing “gold standard” begins to appear less like a definitive benchmark and more like a constrained approximation of biological reality.

It is within this context that toxicology has begun—gradually but decisively—to redefine itself. Over the past two decades, a conceptual shift has emerged, moving away from descriptive, endpoint-driven animal testing toward mechanism-based, human-relevant approaches. Central to this transformation is the integration of in vitro and in silico methodologies, which together promise not only higher throughput but also deeper mechanistic insight (Kavlock et al., 2009; Sun et al., 2012). High-throughput screening technologies, particularly quantitative high-throughput screening (qHTS), have enabled the rapid evaluation of thousands of compounds across diverse biological targets, generating datasets of unprecedented scale and resolution (Shukla et al., 2010).

At the same time, advances in computational toxicology have opened new avenues for predictive modeling. Early efforts in this domain—such as quantitative structure–activity relationship (QSAR) modeling—were grounded in the premise that chemical structure encodes biological activity (McKinney, 1985). While promising, these models were often constrained by limited data quality and simplistic assumptions (Cronin & Schultz, 2003). However, with the advent of machine learning and improved data curation practices, contemporary models are increasingly capable of capturing nonlinear relationships and multidimensional interactions within toxicological systems (Zhu et al., 2008; Merlot, 2010).

What is emerging, then, is not merely a replacement of one methodology with another, but rather a convergence—a synthesis of diverse data streams into what might be termed “foundation toxicity models.” These models aim to integrate chemical descriptors, in vitro assay outputs, and in vivo outcomes into unified predictive frameworks. In doing so, they attempt to bridge a longstanding gap: the disconnect between molecular-level perturbations and organism-level adverse effects. This challenge—often referred to as the in vitro to in vivo extrapolation (IVIVE) problem—remains one of the central obstacles in modern toxicology (Thomas et al., 2013).

The development of mechanistic frameworks such as Adverse Outcome Pathways (AOPs) has provided a conceptual scaffold for addressing this gap. AOPs map the progression from a molecular initiating event through a series of key biological events to an adverse outcome at the organism or population level (Ankley et al., 2010). In principle, this structured representation allows for the alignment of in vitro assay data with in vivo toxicity endpoints, thereby enhancing both interpretability and predictive accuracy. Yet, despite their conceptual elegance, AOP-based models still face challenges in terms of completeness, validation, and integration with quantitative modeling approaches.

It is worth noting that this shift toward integrative modeling is not occurring in isolation. Large-scale collaborative initiatives—such as the Tox21 program—have played a pivotal role in accelerating methodological innovation and data sharing (Schmidt, 2009). These efforts reflect a broader recognition that the complexity of toxicological systems cannot be adequately addressed through isolated experimental approaches. Instead, what is required is a systems-level perspective—one that embraces heterogeneity, leverages computational power, and remains grounded in biological plausibility.

Historically, the roots of this transformation can be traced back several decades. Early computational efforts, including expert systems and predictive toxicology evaluation projects, hinted at the possibility of in silico hazard prediction (Benfenati & Gini, 1997; Bristol et al., 1996). Similarly, foundational in vitro assays—such as neutral red uptake and colorimetric viability tests—demonstrated that cellular responses could serve as proxies for organismal toxicity (Borenfreund & Puerner, 1985; Mosmann, 1983). While these methods were initially limited in scope, they laid the groundwork for the high-throughput, data-rich paradigms that define contemporary toxicology.

Still, the path forward is not without uncertainty. The integration of heterogeneous data—spanning chemical space, biological systems, and temporal scales—introduces both technical and conceptual challenges. Data quality remains a persistent concern, as inconsistencies in experimental design and reporting can undermine model reliability (Gottmann et al., 2001). Moreover, the selection of appropriate molecular descriptors and statistical algorithms requires careful consideration, particularly given the risk of overfitting and reduced generalizability (Cronin & Schultz, 2003).

And yet, despite these challenges, the trajectory is unmistakable. Toxicology is moving toward a future in which predictive models—grounded in mechanistic understanding and supported by diverse data streams—play a central role in risk assessment. Foundation toxicity models, in this sense, represent both a culmination of past efforts and a starting point for new methodological innovations.

This review, therefore, seeks to examine the evolution and current state of these integrative approaches. Specifically, it explores how in vivo, in vitro, and chemical knowledge can be harmonized within unified predictive frameworks; evaluates the role of machine learning and molecular descriptors in enhancing model performance; and considers the extent to which mechanistic frameworks such as AOPs can improve IVIVE. In doing so, it aims not only to synthesize existing knowledge but also to reflect—perhaps cautiously—on the future direction of toxicity testing in an increasingly data-driven scientific landscape.

2. Methodology

2.1 Literature Identification and Conceptual Scope

This narrative review was designed to synthesize the evolving landscape of predictive toxicology, with a particular focus on integrative “foundation toxicity models.” Rather than adopting a strictly systematic approach, the methodology followed a structured narrative framework, allowing for conceptual depth and interdisciplinary integration. Literature was identified through targeted searches of major scientific databases, including PubMed, Scopus, and Web of Science, using combinations of keywords such as “computational toxicology,” “AOP,” “IVIVE,” “qHTS,” and “machine learning in toxicity prediction.”

Priority was given to studies that contributed to the conceptual or methodological development of predictive toxicology, particularly those addressing the integration of in vivo, in vitro, and in silico data streams. Foundational reports—such as those from the National Research Council (2007, 2009)—were included to contextualize the paradigm shift toward 21st-century risk assessment frameworks.

2.2 Inclusion Criteria and Thematic Structuring

The inclusion criteria emphasized peer-reviewed articles that addressed mechanistic modeling, computational prediction, and data integration in toxicology. Studies were selected not only for methodological rigor but also for their relevance to bridging experimental and computational domains. Particular attention was given to research describing quantitative structure–activity relationships (QSAR), machine learning models, and deep learning architectures, as well as experimental approaches such as qHTS and organotypic in vitro systems (Sun et al., 2012; Zhu et al., 2008). To ensure coherence, the selected literature was organized into thematic domains: (i) historical evolution of toxicity testing, (ii) computational modeling approaches, (iii) data integration and high-throughput screening, (iv) mechanistic frameworks such as AOPs, and (v) translational challenges including IVIVE and PBPK modeling. This thematic structuring allowed for a layered discussion, moving from foundational concepts to emerging technologies.

2.3 Data Interpretation and Synthesis Strategy

Given the heterogeneity of available data, a qualitative synthesis approach was adopted. Rather than aggregating quantitative outcomes, the review focused on identifying recurring conceptual patterns, methodological innovations, and areas of convergence across studies. For instance, multiple sources highlighted AOPs as a unifying framework linking molecular events to adverse outcomes (Ankley et al., 2010; Jarabek & Hines, 2019). Similarly, advances in machine learning and deep learning were consistently associated with improved predictive performance, particularly when combined with high-quality training datasets (Gianquinto et al., 2026).

2.4 Limitations of the Methodological Approach

It is important to acknowledge that narrative reviews inherently involve a degree of subjectivity in study selection and interpretation. While efforts were made to include a representative and balanced set of sources, the absence of formal meta-analytic techniques limits the ability to quantify effect sizes or directly compare model performance across studies. Nevertheless, the chosen approach allows for a more flexible and integrative discussion—one that reflects the interdisciplinary nature of modern toxicology.

3. AI and Computational Modeling for Adverse Drug Event and Toxicity Prediction

3.1 Mechanistic Scaffolding and the Quiet Rise of Adverse Outcome Pathways

If a single conceptual anchor is holding together the increasingly complex landscape of modern toxicology, it is perhaps the Adverse Outcome Pathway (AOP) framework. Not because it solves everything—it does not—but because it offers, at the very least, a structured way of thinking about causality in biological systems. An AOP traces a chain of events beginning with a Molecular Initiating Event (MIE)—a chemical interaction at the molecular level—and progressing through a cascade of Key Events (KEs), eventually culminating in an Adverse Outcome (AO) that carries regulatory significance (Ankley et al., 2010). The integration of mechanistic AOP frameworks with PBPK-based kinetic modeling provides a structured basis for predictive toxicology (Figure 1).
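
To make this layered structure concrete, the sketch below encodes a toy AOP as a directed graph in Python using networkx. It is a minimal illustration of the MIE-to-KE-to-AO chain only; the pathway content is hypothetical and not drawn from any curated AOP.

```python
# Minimal sketch: an AOP's event chain as a directed graph.
# The events below are hypothetical placeholders, not a curated AOP.
import networkx as nx

aop = nx.DiGraph()
aop.add_edges_from([
    ("MIE: nuclear receptor activation", "KE1: altered lipid-metabolism gene expression"),
    ("KE1: altered lipid-metabolism gene expression", "KE2: hepatic triglyceride accumulation"),
    ("KE2: hepatic triglyceride accumulation", "AO: liver steatosis"),
])

# Traverse from the molecular trigger to the apical outcome.
for upstream, downstream in nx.bfs_edges(aop, "MIE: nuclear receptor activation"):
    print(f"{upstream}  -->  {downstream}")
```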

This layered organization has proven particularly useful in a field historically fragmented by disconnected observations. In vitro assays, for instance, often generate highly specific but context-poor data. AOPs allow these isolated signals to be repositioned within a broader biological narrative—one that links molecular perturbation to systemic dysfunction. In this sense, foundation toxicity models do not merely aggregate data; they attempt to tell a story about toxicity, albeit a probabilistic one. Yet, even here, a certain caution is warranted. The mechanistic clarity that AOPs promise is, at times, more aspirational than complete. Biological systems rarely unfold in linear pathways, and feedback loops, redundancy, and compensatory mechanisms complicate even the most carefully constructed frameworks. Still, when combined with physiologically based pharmacokinetic (PBPK) models—which simulate absorption, distribution, metabolism, and excretion (ADME)—AOPs begin to approximate something closer to real biological behavior (Kreutz et al., 2024; Gianquinto et al., 2026).

Importantly, these models also help address one of toxicology’s most persistent challenges: the failure of animal systems to predict human-specific toxicities, particularly hepatotoxicity and cardiotoxicity. By incorporating human-relevant systems—such as stem cell–derived organoids and 3D tissue cultures—the reliance on interspecies extrapolation is reduced, though not entirely eliminated (Combes & Balls, 2011; Sariya et al., 2025). The resulting framework, imperfect as it may be, represents a meaningful step toward more reliable human risk prediction.

3.2 A Paradigm in Transition: From Observation to Anticipation

For decades, toxicology functioned as a largely observational science. Harm was identified after exposure, cataloged through pathological endpoints, and interpreted retrospectively. That approach—while foundational—now feels increasingly misaligned with the speed and complexity of modern chemical innovation. The gap between chemical generation and safety evaluation has widened to the point where traditional methods can no longer keep pace (Sun et al., 2012).

What is emerging instead is a shift—subtle at first, but now unmistakable—toward predictive toxicology. This transition is not simply technological; it is philosophical. Rather than asking what damage has occurred, the field is beginning to ask what damage is likely to occur, and why. Artificial intelligence (AI) and computational modeling sit at the center of this shift, offering tools capable of integrating molecular structure, biological response data, and clinical observations into unified predictive systems (Jarabek & Hines, 2019; Gianquinto et al., 2026).

Still, the promise of AI must be approached with measured optimism. Predictive accuracy, while improving, is not uniformly reliable across all toxicity endpoints. Complex, multi-factorial conditions—such as drug-induced liver injury—continue to challenge even the most advanced models. And yet, the trajectory is clear: toxicology is evolving into a discipline that not only interprets biological harm but anticipates it, constructing what might cautiously be described as a “toxicological blueprint” of human physiology (Sun et al., 2012).

Figure 1. Integration of Adverse Outcome Pathways (AOP) and PBPK Modeling for Mechanistic Toxicity Prediction. A conceptual framework illustrating how chemical exposure initiates molecular events that propagate through key biological pathways to produce adverse outcomes. Physiologically based pharmacokinetic (PBPK) modeling provides ADME-based context, enabling IVIVE translation for human-relevant risk prediction.

Figure 2. Modeling Human Diversity Using Virtual Populations in Computational Toxicology. A simplified schematic showing how PBPK models integrated with AI simulate diverse human populations across age, genetics, health status, and environmental exposure. These virtual populations enable improved prediction of developmental and reproductive toxicity (DART) and support human-relevant risk assessment.

3.3 The Computational Evolution: From QSAR Foundations to Deep Learning Architectures

The roots of computational toxicology are, in many ways, deceptively simple. Early QSAR models operated on the premise that chemical structure encodes biological activity—a principle that remains valid, though far from sufficient (Ziemba, 2025). These models established the first quantitative links between molecular descriptors and toxicological outcomes, marking an important, if limited, step toward in silico prediction.

However, classical QSAR approaches struggled with the inherent complexity of biological systems. Nonlinear interactions, context-dependent effects, and so-called “activity cliffs” exposed the limitations of models built on relatively simple statistical relationships (Gianquinto et al., 2026). The emergence of machine learning (ML) began to address these shortcomings. Algorithms such as Support Vector Machines and Random Forests introduced flexibility, enabling the analysis of high-dimensional datasets with improved predictive performance (Sun et al., 2012; Zhang et al., 2025). The real transformation, however, arrived with deep learning (DL). Unlike earlier methods, DL architectures do not rely on predefined features; instead, they learn hierarchical representations directly from raw data. Models such as DeepTox demonstrated the power of this approach, achieving state-of-the-art performance through multi-task learning across multiple toxicity endpoints (Mayr et al., 2016).
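
As a concrete illustration of the machine-learning stage of this evolution, the sketch below trains a random forest classifier on Morgan fingerprints, broadly in the spirit of the fingerprint-based models cited above. It assumes RDKit and scikit-learn are available; the SMILES strings and toxicity labels are toy placeholders, not a curated dataset.

```python
# Minimal sketch: fingerprint-based toxicity classification.
# SMILES and labels are toy placeholders for illustration only.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

def fingerprint(smiles: str, n_bits: int = 2048) -> np.ndarray:
    """Encode a molecule as a Morgan (circular) fingerprint bit vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits)
    return np.array(fp)

smiles = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O", "ClCCl"]  # toy molecules
labels = [0, 0, 0, 1]                                           # toy toxic/non-toxic labels

X = np.array([fingerprint(s) for s in smiles])
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
print(model.predict_proba([fingerprint("CCCl")])[:, 1])  # predicted P(toxic) for a query
```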

More recently, Graph Neural Networks (GNNs) have redefined how molecular data are represented, treating compounds as relational graphs rather than static descriptors. This shift allows for a more nuanced understanding of chemical interactions, capturing both local and global structural features. At the same time, emerging approaches—such as Vision Transformers—are beginning to integrate visual and numerical representations of chemical space, hinting at a future where models can seamlessly navigate multiple data modalities (Zhang et al., 2025; Gianquinto et al., 2026).
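
The "molecule as relational graph" idea can be illustrated without any learning machinery at all. The sketch below extracts the atom and bond structure that a GNN would operate on, assuming RDKit; the message-passing layers themselves are omitted.

```python
# Minimal sketch: a molecule as a graph of atoms (nodes) and bonds (edges).
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)O")  # acetic acid as a small example

nodes = [(atom.GetIdx(), atom.GetSymbol()) for atom in mol.GetAtoms()]
edges = [
    (bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), str(bond.GetBondType()))
    for bond in mol.GetBonds()
]

print("nodes:", nodes)  # [(0, 'C'), (1, 'C'), (2, 'O'), (3, 'O')]
print("edges:", edges)  # [(0, 1, 'SINGLE'), (1, 2, 'DOUBLE'), (1, 3, 'SINGLE')]
```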

3.4 Data as Foundation: qHTS, Omics, and the Challenge of Integration

If computational models are the engines of predictive toxicology, then data—high-quality, large-scale, and diverse—are the fuel. Programs such as Tox21 and ToxCast have fundamentally reshaped the data landscape, generating millions of measurements through quantitative high-throughput screening (qHTS) (Sun et al., 2012; Paul Friedman et al., 2025). These platforms allow researchers to dissect complex toxicity endpoints into discrete, measurable perturbations at the molecular and cellular levels. Yet, the real challenge lies not in data generation, but in data integration. Modern toxicity models are increasingly multi-modal, combining chemical descriptors with transcriptomic, proteomic, and metabolomic data (Jarabek & Hines, 2019; Gianquinto et al., 2026). This integration enables the identification of molecular “signatures” of toxicity—patterns that emerge long before clinical symptoms become apparent.
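
As a concrete example of how a qHTS concentration–response series is reduced to a usable potency estimate, the sketch below fits a Hill curve to recover an AC50. The data points are synthetic placeholders, not Tox21 or ToxCast measurements.

```python
# Minimal sketch: deriving an AC50 from a concentration-response series.
# The data points are synthetic, for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, top, ac50, n):
    """Hill equation: fractional response as a function of concentration."""
    return top * conc**n / (ac50**n + conc**n)

conc = np.array([0.01, 0.1, 1.0, 10.0, 100.0])        # µM, synthetic
response = np.array([0.02, 0.08, 0.35, 0.80, 0.97])   # normalized activity, synthetic

params, _ = curve_fit(hill, conc, response, p0=[1.0, 1.0, 1.0])
top, ac50, n = params
print(f"AC50 ≈ {ac50:.2f} µM (Hill slope {n:.2f})")
```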

Adding another layer of complexity, real-world data sources such as Electronic Health Records (EHRs) and pharmacovigilance databases (e.g., FAERS) are being incorporated into predictive frameworks. These datasets introduce variability—genetic, environmental, and clinical—that is largely absent from controlled laboratory studies (Laurent, 2026; Kreutz et al., 2024). While this variability complicates modeling, it also enhances relevance, bringing predictions closer to real-world outcomes.

3.5 Bridging Scales: IVIVE, PBPK, and the Persistent Data Gap

Perhaps the most persistent—and conceptually challenging—problem in toxicology is the translation from in vitro findings to in vivo outcomes. Cells in culture, after all, exist in highly controlled environments, devoid of systemic interactions, immune responses, and metabolic complexity (Zhang et al., 2018; Kim & Choi, 2026). Bridging this gap requires not just data, but interpretation. Here, the integration of AOP frameworks with PBPK modeling becomes particularly powerful. AOPs provide the mechanistic narrative, while PBPK models supply the quantitative context, simulating how a chemical moves through the human body (Kreutz et al., 2024). Together, they enable In Vitro-to-In Vivo Extrapolation (IVIVE), translating effective concentrations observed in vitro into human-equivalent doses.

This translation is critical for risk assessment. Metrics such as the Bioactivity-Exposure Ratio (BER) offer a way to contextualize hazard within real-world exposure scenarios, helping regulators determine whether observed bioactivity is likely to manifest in practice (Paul Friedman et al., 2025; Kim & Choi, 2026). Still, uncertainty remains. IVIVE is not a single step but a chain of assumptions, each introducing potential error. Recognizing—and quantifying—this uncertainty is as important as the predictions themselves.
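
A minimal numerical sketch of this translation follows: a steady-state reverse-dosimetry step converts an in vitro AC50 into an administered equivalent dose (AED), which is then compared against an exposure estimate to form a BER. All values are illustrative assumptions; operational workflows, such as the httk package cited in Table 4, derive the steady-state concentration from measured clearance and plasma binding.

```python
# Minimal sketch: steady-state IVIVE and a bioactivity-exposure ratio.
# All numbers are illustrative assumptions, not measured values.
ac50_uM = 5.0            # in vitro potency (µM), hypothetical
css_per_dose_uM = 2.0    # steady-state plasma conc. (µM) per 1 mg/kg/day, hypothetical
exposure_mgkgday = 1e-3  # estimated human exposure (mg/kg/day), hypothetical

# Administered equivalent dose: the external dose expected to produce the
# in vitro bioactive concentration at steady state (linear kinetics assumed).
aed_mgkgday = ac50_uM / css_per_dose_uM

# BER > 1 suggests bioactivity occurs only above estimated exposure levels.
ber = aed_mgkgday / exposure_mgkgday
print(f"AED = {aed_mgkgday:.2f} mg/kg/day, BER = {ber:.0f}")
```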

3.6 Opening the Black Box: Explainability and Trust in AI Systems

As AI models grow in complexity, so too does the challenge of interpreting their predictions. Deep learning systems, while powerful, often function as “black boxes,” producing outputs without transparent reasoning. In a regulatory context, this opacity is not merely inconvenient—it is unacceptable.

Explainable AI (XAI) has emerged as a response to this challenge. Techniques such as the Contrastive Explanation Method (CEM) aim to identify the specific features driving a model’s prediction, distinguishing between factors that contribute to toxicity (“pertinent positives”) and those that mitigate it (“pertinent negatives”) (Sharma et al., 2023). This level of interpretability is not just a technical enhancement; it is a prerequisite for trust. Regulators, clinicians, and researchers must be able to assess not only whether a prediction is accurate, but whether it is biologically plausible (Ward et al., 2021; Laurent, 2026). Without this transparency, even the most accurate models risk remaining confined to academic exploration rather than practical application.
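
The general idea of post-hoc attribution can be sketched with permutation importance, a simpler and widely used stand-in for methods such as CEM: features whose random shuffling most degrades performance are the ones the model relies on. The data below are synthetic, with two "fingerprint bits" deliberately made predictive.

```python
# Minimal sketch: permutation importance as a simple attribution method.
# Synthetic data; bits 3 and 17 are constructed to drive the label.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 32)).astype(float)  # toy binary "fingerprints"
y = (X[:, 3] + X[:, 17] > 1).astype(int)              # toxicity driven by bits 3 and 17

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=20, random_state=0)

top_bits = np.argsort(result.importances_mean)[::-1][:3]
print("most influential fingerprint bits:", top_bits)  # expect 3 and 17 near the top
```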

3.7 Modeling Human Diversity: Life-Stages, Susceptibility, and Virtual Populations

One of the more compelling advantages of computational toxicology lies in its ability to simulate diversity—something traditional animal models struggle to capture. Human populations are heterogeneous, shaped by genetics, age, health status, and environmental exposure. Capturing this variability is essential for meaningful risk assessment. PBPK models, combined with AI frameworks, enable the creation of “virtual populations,” in which parameters can be adjusted to reflect different life stages and susceptibility windows (Kreutz et al., 2024). This is particularly important in areas such as developmental and reproductive toxicity (DART), where timing of exposure can be as critical as dose. Computational toxicology frameworks increasingly incorporate virtual populations to capture human variability and improve predictive accuracy (Figure 2). Agent-based models (ABMs) further extend this capability, simulating complex biological processes such as embryonic development. When combined with human-induced pluripotent stem cell (hiPSC)-derived systems, these models offer a level of human relevance that was previously unattainable (Knudsen et al., 2020; Ziemba, 2025). Still, these approaches remain computationally intensive and require careful validation, particularly when extrapolating to population-level outcomes.
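
A minimal sketch of the virtual-population idea follows: physiological parameters are sampled across simulated individuals and propagated through a one-compartment steady-state calculation, exposing the spread in internal dose that a single "average human" model would hide. The distributions are illustrative assumptions, not calibrated values.

```python
# Minimal sketch: a Monte Carlo "virtual population" for internal dose.
# Parameter distributions are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000   # virtual individuals
dose = 1.0   # mg/kg/day, unit oral dose

# Inter-individual variability in hepatic clearance (L/h/kg), lognormal by assumption.
clearance = rng.lognormal(mean=np.log(0.5), sigma=0.4, size=n)

# Steady-state plasma concentration (mg/L) under linear kinetics.
css = dose / (24.0 * clearance)
print(f"median Css = {np.median(css):.3f} mg/L; "
      f"95th percentile = {np.percentile(css, 95):.3f} mg/L")
```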

3.8 Toward Integration: A One Health Perspective in Toxicity Prediction

Looking ahead, the trajectory of toxicology appears to be converging toward a more integrated, systems-level perspective. The concept of “One Health”—which recognizes the interconnectedness of human, animal, and environmental health—provides a useful framework for this integration (Magurany et al., 2023; Jarabek & Hines, 2019). Foundation toxicity models, situated at the intersection of AI, multi-omics, and advanced in vitro systems, are uniquely positioned to support this vision. They offer the possibility of more ethical testing paradigms, reduced reliance on animal models, and more precise, individualized risk assessments.

And yet, challenges remain. Data standardization, reproducibility, and regulatory acceptance continue to pose significant barriers. Perhaps more fundamentally, there is the question of balance—how to integrate increasingly complex computational tools without losing sight of biological reality. Still, the direction is difficult to ignore. Toxicology is no longer merely a science of observation. It is becoming, gradually but decisively, a science of prediction—one that seeks not only to understand harm, but to anticipate and prevent it.

 

4. Progress Toward a Unified Predictive Framework

4.1 From Observation to Prediction: A Gradual but Defining Shift

The transition from traditional toxicology—anchored in observing apical endpoints—to a predictive, mechanism-driven discipline is no longer speculative; it is, perhaps somewhat unexpectedly, becoming operational. What once appeared as an aspirational framework outlined in early 21st-century reports is now materializing through the convergence of computational modeling, high-throughput experimentation, and structured biological knowledge (Sun et al., 2012).

This evolution, however, has not been abrupt. Rather, it has unfolded through incremental refinements—each addressing a specific limitation of earlier approaches. At its core lies the idea of a “toxicological blueprint,” where chemical structure, biological perturbation, and system-level outcomes are no longer treated as isolated observations but as interconnected layers of a predictive system. Still, one might hesitate to call this transformation complete. The integration is ongoing, and the reliability of such models, while improving, remains context-dependent.

4.2 The Computational Landscape: Expanding Complexity and Performance

The progression of computational models reflects, quite clearly, a shift in both ambition and capability. Early QSAR approaches, grounded in linear relationships, provided an essential starting point but were inherently constrained in capturing the nonlinear dynamics of biological systems (Gianquinto et al., 2026). The introduction of machine learning—particularly Support Vector Machines and Random Forests—marked a notable improvement, allowing for more robust classification across high-dimensional chemical spaces (Sun et al., 2012; Zhang et al., 2025). Yet, it is with deep learning that the field begins to approach something closer to biological realism. Models such as DeepTox, for instance, demonstrated that multi-task learning could simultaneously predict multiple toxicity endpoints with remarkable accuracy (Mayr et al., 2016). More recent architectures—Graph Neural Networks and Vision Transformers—extend this capability further by capturing structural relationships and multimodal representations without relying on manual feature engineering (Hong & Kwon, 2025; Sariya et al., 2025).

As summarized in Table 1, the computational landscape now encompasses a spectrum of methods ranging from classical statistical models to advanced AI architectures, each contributing distinct strengths to toxicity prediction. These models, taken together, suggest that predictive performance is no longer limited by algorithmic capacity alone, but increasingly by the quality and integration of input data (Gianquinto et al., 2026).

4.3 Data as Infrastructure: The Quiet Power of Standardized Resources

If algorithms provide the analytical framework, it is data that ultimately determines the limits of prediction. The emergence of large-scale, standardized databases has fundamentally reshaped this landscape. Initiatives such as Tox21 and ToxCast have generated extensive bioactivity datasets, enabling models to learn from a breadth of chemical–biological interactions that would have been inconceivable only decades ago (Sun et al., 2012; Paul Friedman et al., 2025). But the significance of these resources extends beyond scale. They introduce a level of standardization that allows for reproducibility—an often underappreciated but essential component of predictive modeling. As detailed in Table 2, databases such as ChEMBL, PubChem, and DrugBank provide curated chemical and pharmacological data, while specialized repositories like SIDER and LTKB contribute clinically relevant adverse outcome information (Amorim et al., 2024; Zhang et al., 2025).

Importantly, this convergence of datasets enables multimodal integration. Chemical descriptors can now be analyzed alongside transcriptomic signatures, clinical outcomes, and even real-world patient variability. This integration begins to address what has long been described as the “data gap”—the disconnect between controlled experimental systems and real-world biological complexity (Zhang et al., 2018). Still, integration introduces its own challenges, particularly in harmonizing heterogeneous data formats and ensuring consistency across sources.

4.4 Organ-Specific Toxicity: Navigating the Complexity of Biological Response

Perhaps one of the most persistent challenges in toxicology is the accurate prediction of organ-specific toxicity. Liver injury, cardiotoxicity, and neurotoxicity, for example, often emerge from complex, multi-factorial processes that are difficult to replicate in simplified experimental systems. Traditional animal models, while informative, have frequently failed to predict these outcomes with sufficient reliability (Gianquinto et al., 2026; Amorim et al., 2024). Computational tools are beginning—cautiously—to fill this gap. As illustrated in Table 3, models such as DILIrank and CUPID leverage both experimental and clinical data to improve prediction of hepatotoxicity and cardiotoxicity, respectively (Gianquinto et al., 2026). These tools do not eliminate uncertainty, but they do offer a more human-relevant perspective, particularly when integrated with mechanistic frameworks like Adverse Outcome Pathways (AOPs) (Ankley et al., 2010; Jarabek & Hines, 2019).

Table 1. Computational Models and Algorithms for Toxicity Prediction. This table summarizes the principal computational approaches used to link chemical structure with biological outcomes. It highlights the evolution from classical QSAR to advanced deep learning and mechanistic models, emphasizing their inputs, outputs, and predictive strengths in modern toxicology.

| Algorithm Type | Specific Method | Primary Application | Key Feature | Input Data Type | Output Type | Primary Strength | References |
|---|---|---|---|---|---|---|---|
| Classical QSAR | Linear Regression | Endpoint prediction | Statistical correlation | Molecular descriptors | Continuous toxicity value | Simplicity, speed | Gianquinto et al. (2026) |
| Machine Learning | Support Vector Machine (SVM) | Toxicity classification | Handles high-dimensional data | Chemical fingerprints | Binary (toxic/non-toxic) | Robust performance | Sun et al. (2012) |
| Machine Learning | Random Forest | Organ-specific toxicity | Ensemble learning | Engineered features | Classification | Interpretability | Zhang et al. (2025) |
| Deep Learning | Deep Neural Networks (DeepTox) | Multi-endpoint prediction | Automated pattern extraction | Large HTS datasets | Multiple toxicity endpoints | High predictive accuracy | Mayr et al. (2016) |
| Deep Learning | Convolutional Neural Networks (CNN) | Image-based analysis | Spatial feature recognition | Molecular images | Feature maps | Automatic feature extraction | Zhang et al. (2025) |
| Deep Learning | Graph Neural Networks (GNNs) | Molecular modeling | Captures relational structure | Atom–bond graphs | Toxicity classes | Structural precision | Gianquinto et al. (2026) |
| Deep Learning | Vision Transformers (ViT) | Multimodal integration | Attention mechanisms | Images + tabular data | Integrated risk scores | Global context modeling | Sariya et al. (2025) |
| Mechanistic Modeling | PBPK Models | Dosimetry and ADME | Compartmental simulation | Physiological/kinetic data | Organ concentration | Human relevance | Kreutz et al. (2024) |
| Mechanistic Modeling | Agent-Based Models (ABM) | Morphogenesis simulation | Cellular interaction modeling | Cell behavior rules | Developmental risk outcomes | Mechanistic depth | Knudsen et al. (2020) |
| Explainable AI | Contrastive Explanation Method (CEM) | Model interpretation | Identifies causal features | SMILES embeddings | Mechanistic insights | Transparency | Sharma et al. (2023) |

Table 2. Key Databases and Resources for Computational Toxicology. This table outlines major publicly available databases that support computational toxicology. These repositories provide standardized chemical, biological, and clinical datasets essential for model development, validation, and regulatory applications.

| Database Name | Content Scope | Data Scale | Primary Use | Key Feature | Developer/Organization | Accessibility | References |
|---|---|---|---|---|---|---|---|
| Tox21 | HTS bioactivity | ~10,000 chemicals | Pathway evaluation | Robotic screening | NIH/EPA/FDA | Open access | Sun et al. (2012) |
| ToxCast | Bioassay library | 1,800+ chemicals | Hazard profiling | Multi-endpoint assays | US EPA | Open access | Paul Friedman et al. (2025) |
| ChEMBL | Bioactive molecules | 2.5M+ records | Target identification | Curated bioactivity data | EMBL-EBI | Open access | Amorim et al. (2024) |
| PubChem | Chemical structures | 115M+ compounds | Structural search | Large-scale diversity | NIH/NCBI | Open access | Amorim et al. (2024) |
| DrugBank | Drugs and targets | 3,500+ targets | Clinical pharmacology | Pathway integration | University of Alberta | Open access | Zhang et al. (2025) |
| DSSTox | Standardized chemical lists | Extensive | QSAR modeling | Structure-searchable data | US EPA | Open access | Gianquinto et al. (2026) |
| SIDER | Drug side effects | 1,400+ drugs | ADE prediction | Adverse reaction frequency | EMBL | Open access | Zhang et al. (2025) |
| ToxValDB | Toxicity values | 200,000+ records | Reference dose estimation | Standardized PoDs | US EPA | Open access | Paul Friedman et al. (2025) |
| LTKB | Liver toxicity | ~1,000 drugs | DILI assessment | FDA-aligned labels | US FDA | Open access | Gianquinto et al. (2026) |
| MoleculeNet | ML benchmarks | Standard datasets | Model validation | Curated splits | Stanford/Industry | Open access | Gianquinto et al. (2026) |


Moreover, emerging approaches such as agent-based modeling provide a way to simulate complex biological processes—such as embryonic development—in silico. These models, while still in developmental stages, suggest that toxicity prediction may eventually move beyond static endpoints toward dynamic system simulations (Knudsen et al., 2020). Yet, one must acknowledge that these models remain computationally intensive and require extensive validation before widespread adoption.

4.5 Next-Generation Risk Assessment: Between Promise and Practicality

The question of whether New Approach Methodologies (NAMs) can replace—or at least meaningfully supplement—traditional toxicology is no longer theoretical. Evidence from recent case studies suggests that bioactivity-based predictions can, in many cases, provide protective estimates of risk. Specifically, analyses have shown that points of departure derived from NAMs are often equal to or lower than those obtained from animal studies, indicating a conservative and potentially safer approach (Paul Friedman et al., 2020; Kim & Choi, 2026).

As shown in Table 4, case studies across multiple endpoints—including endocrine disruption, mutagenicity, and cardiotoxicity—demonstrate encouraging predictive performance, with models such as DeepTox and CardioToxNet achieving high accuracy metrics (Mayr et al., 2016; Amorim et al., 2024). These findings suggest that integrated computational frameworks can support risk prioritization with a level of confidence that was previously unattainable. However, a critical limitation persists—the so-called “kinetic bottleneck.” In vitro systems, while informative, lack the metabolic and physiological context of a living organism (Coecke et al., 2005). Addressing this requires the integration of Physiologically Based Pharmacokinetic (PBPK) modeling and In Vitro-to-In Vivo Extrapolation (IVIVE), which together attempt to translate cellular responses into human-relevant exposure scenarios (Kreutz et al., 2024; Jarabek & Hines, 2019). Even so, these translations involve assumptions that must be carefully evaluated.
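
The protectiveness comparison described above reduces to a simple per-chemical check, sketched below with synthetic numbers: a NAM-derived point of departure (PoD) is deemed protective when it falls at or below the corresponding animal-study value.

```python
# Minimal sketch: fraction of chemicals with protective NAM-derived PoDs.
# PoD values are synthetic placeholders, not study data.
import numpy as np

rng = np.random.default_rng(1)
n_chem = 448  # mirrors the size of the ToxCast library case study

in_vivo_pod = rng.lognormal(mean=1.0, sigma=1.0, size=n_chem)            # mg/kg/day
nam_pod = in_vivo_pod * rng.lognormal(mean=-0.7, sigma=0.8, size=n_chem)  # mg/kg/day

protective = np.mean(nam_pod <= in_vivo_pod)
print(f"fraction of chemicals with protective NAM PoD: {protective:.0%}")
```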

4.6 Explainability and Trust: Opening the Black Box

As predictive models become more sophisticated, their interpretability becomes increasingly important. The “black box” nature of many AI systems poses a significant barrier to regulatory acceptance. It is not sufficient for a model to be accurate; it must also be understandable. Explainable AI (XAI) approaches, such as the Contrastive Explanation Method, attempt to address this issue by identifying the features that drive model predictions (Sharma et al., 2023). By distinguishing between causal and non-causal features, these methods provide insights into the underlying biological mechanisms, allowing researchers to assess whether predictions are not only statistically valid but also biologically plausible. This transparency is essential for building trust among scientists, clinicians, and regulators alike. Without it, even the most advanced models risk remaining confined to research settings rather than being integrated into decision-making frameworks (Laurent, 2026).

4.7 Toward Integration: A Measured Path Forward

Taken together, the findings presented in this section suggest that toxicology is undergoing a profound transformation. The integration of computational models, high-throughput data, and mechanistic frameworks offers a pathway toward more predictive, human-relevant risk assessment. And yet, it would be premature to describe this transformation as complete. Challenges related to data quality, standardization, and model validation remain significant. Moreover, the balance between model complexity and interpretability continues to shape the direction of the field. Still, the trajectory is clear. Toxicology is evolving—perhaps unevenly, but undeniably—toward a unified predictive framework. One that, if realized, may finally reconcile the long-standing tension between efficiency, ethics, and scientific rigor in chemical safety assessment.

5. Limitations of the Study

Despite the rapid progress described, several limitations remain. First, the integration of heterogeneous datasets—spanning chemical descriptors, biological assays, and clinical observations—introduces challenges in standardization and reproducibility. Variability in experimental design and data quality can significantly influence model performance. Second, while computational models have improved predictive accuracy, their interpretability remains limited, particularly in deep learning systems, raising concerns for regulatory acceptance. Additionally, in vitro systems, although human-relevant, lack the physiological complexity of whole organisms, necessitating reliance on IVIVE and PBPK modeling, which introduce additional assumptions. Finally, many predictive models remain insufficiently validated across diverse chemical spaces and populations. These limitations suggest that, while promising, current approaches should be viewed as complementary to, rather than fully substitutive for, traditional toxicological methods.

Table 3. Toxicological Endpoints and Representative Predictive Tools. This table presents key toxicity endpoints alongside corresponding assays, regulatory guidelines, and computational tools. It highlights how predictive modeling supports regulatory decision-making and early hazard identification.

| Toxicity Endpoint | Proxy Assay | Regulatory Guideline | Key Database | Representative Tool | Computational Approach | Impact Status | References |
|---|---|---|---|---|---|---|---|
| Acute oral toxicity | Rodent models | OECD TG 420/423 | ToxValDB | OPERA | Multi-task ML | Regulatory input | Gianquinto et al. (2026) |
| Hepatotoxicity | HepG2/HepaRG cells | ICH M3(R2) | LTKB | DILIrank | Logistic regression | Decision support | Gianquinto et al. (2026) |
| Cardiotoxicity | hERG inhibition | ICH S7B/E14 | hERG databases | CUPID | Random forest | Lead prioritization | Gianquinto et al. (2026) |
| Mutagenicity | Ames test | OECD TG 471 | ISSTOX | VEGA | Consensus QSAR | Alternative method | Ziemba (2025) |
| Skin sensitization | Hapten binding | OECD TG 497 | SkinSensDB | ToxTree | Structural alerts | Validated method | Gianquinto et al. (2026) |
| Neurotoxicity | MEA recordings | OECD TG 442 | NeuTox-2.0 | NeuTox | Multi-modal DL | High-throughput screening | Zhang et al. (2025) |
| Nephrotoxicity | HK-2 cells | ICH M3(R2) | TOXRIC | LASSO model | Linear regression | Early screening | Zhang et al. (2025) |
| DART | Virtual embryo | OECD TG 414 | ToxRefDB | CompuCell3D | Agent-based modeling | Emerging research | Knudsen et al. (2020) |
| Endocrine disruption | ER/AR binding | OECD TG 455 | EADB | CERAPP | Deep learning | Predictive alerts | Gianquinto et al. (2026) |
| Respiratory toxicity | Ciliary function | ICH S7A | ToxCast | Naïve Bayes | Classification | Exploratory stage | Gianquinto et al. (2026) |

Table 4. Case Study Performance and Next-Generation Risk Assessment (NGRA) Outcomes. This table summarizes representative case studies applying integrated testing strategies and new approach methodologies (NAMs). It demonstrates model performance metrics and highlights their role in advancing next-generation risk assessment frameworks.

| Case Study / Chemical | Endpoint | Model Used | Primary Metric | Secondary Metric | Data Complexity | Key Outcome | References |
|---|---|---|---|---|---|---|---|
| Benzophenone | Endocrine | QIVIVE + 1C model | 80% within 10× | – | Human cell-based | Protective PoD | Magurany et al. (2023) |
| HC Yellow No. 13 | Hepatosteatosis | PBPK (GastroPlus) | – | – | In vitro (hSKP-HPC) | No expected risk | Sepehri et al. (2025) |
| Benzophenone-4 | Systemic | Population PBPK | – | – | In vitro datasets | Validated safety | Ebmeyer et al. (2024) |
| Substituted phenols | Reproductive | HTTK R-package | – | – | Uterotrophic assays | Protective NOAEL | Chang et al. (2015) |
| ToxCast 448 library | Acute oral | HTTK IVIVE | 89% protective | – | Bioactivity vs in vivo | Conservative PoD | Paul Friedman et al. (2020) |
| DeepAmes | Mutagenicity | Ensemble DL/ML | 0.840 (Accuracy) | 0.740 (F1) | 11.5k compounds | Regulatory potential | Amorim et al. (2024) |
| CardioToxNet | Cardiotoxicity | Multi-task DNN | 0.930 (Accuracy) | 0.860 (F1) | 12.6k compounds | Improved screening | Amorim et al. (2024) |
| DeepTox | Multi-endpoint | Deep neural network | 0.967 (AUC) | 0.926 (Accuracy) | Tox21 dataset | Benchmark model | Mayr et al. (2016) |
| ADMETlab 2.0 | Multi-endpoint | MGA model | 0.853 (Accuracy) | 0.549 (MCC) | Diverse datasets | Integrated profiling | Amorim et al. (2024) |
| Clinical prediction | Trial success | Multi-task DL | Improved AUC-ROC | – | Clinical datasets | Performance depends on size | Sharma et al. (2023) |

(– indicates a metric not reported for that case study.)


6. Conclusion

The trajectory of toxicology is shifting—gradually, but unmistakably—toward predictive, integrative frameworks. Foundation toxicity models, combining computational intelligence with mechanistic biological insight, offer a compelling path forward. Yet, this transition remains incomplete. Challenges in data integration, model validation, and biological interpretation persist. Still, the convergence of AI, high-throughput screening, and mechanistic frameworks suggests a future where toxicity can be anticipated with greater precision and relevance. If carefully developed and critically evaluated, these approaches may redefine risk assessment—not as a retrospective exercise, but as a predictive science grounded in human biology.

Author Contributions

T.K. conceptualized and designed the study and led the overall manuscript development. T.K. and L.Y. conducted the literature search, screening, and synthesis of evidence related to predictive toxicology, artificial intelligence, and foundation toxicity models. M.F.A. contributed to data interpretation, particularly in computational toxicology, IVIVE frameworks, and adverse outcome pathway integration, and provided critical revisions to the manuscript. All authors contributed to drafting, reviewing, and final approval of the manuscript and agree to be accountable for all aspects of the work.

Acknowledgements

The authors gratefully acknowledge the academic and research support provided by the National Institute of Textile Engineering and Research and the University of Dhaka. The authors also extend sincere appreciation to the global scientific community whose published research and technological advancements contributed to the development of this review.

References


Amorim, A. M. B., Piochi, L. F., Gaspar, A. T., Preto, A. J., Rosário-Ferreira, N., & Moreira, I. S. (2024). Advancing drug safety in drug development: Bridging computational predictions for enhanced toxicity prediction. Chemical Research in Toxicology, 37(5), 827–849. https://doi.org/10.1021/acs.chemrestox.3c00352           

Ankley, G. T., Bennett, R. S., Erickson, R. J., Hoff, D. J., Hornung, M. W., Johnson, R. D., Mount, D. R., Nichols, J. W., Russom, C. L., Schmieder, P. K., & Serrano, J. A. (2010). Adverse outcome pathways: A conceptual framework to support ecotoxicology research and risk assessment. Environmental Toxicology and Chemistry, 29(3), 730–741. https://doi.org/10.1002/etc.34     

Benfenati, E., & Gini, G. (1997). Computational predictive programs (expert systems) in toxicology. Toxicology, 119(3), 213–225.

Borenfreund, E., & Puerner, J. A. (1985). Toxicity determined in vitro by morphological alterations and neutral red absorption. Toxicology Letters, 24, 119–124.

Bristol, D. W., Wachsman, J. T., & Greenwell, A. (1996). The NIEHS predictive-toxicology evaluation project. Environmental Health Perspectives, 104(Suppl 5), 1001–1010.

Chang, X., Kleinstreuer, N., Ceger, P., Hsieh, J.-H., Allen, D., & Casey, W. (2015). Application of reverse dosimetry to compare in vitro and in vivo estrogen receptor activity. Applied In Vitro Toxicology, 1, 33–44. https://doi.org/10.1089/aivt.2014.0005            

Coecke, S., Ahr, H., Blaauboer, B. J., Bremer, S., Casati, S., Castell, J., Combes, R., Corvi, R., Crespi, C. L., Cunningham, M. J., Elaut, G., Eletti, B., Freidig, A., Gennari, A., Ghersi-Egea, J-F., Guillouzo, A., Hartung, T., Hoet, P., Ingelman-Sundberg, M., ... Worth, A. (2005). Metabolism: A bottleneck in in vitro toxicological test development. The Report and Recommendations of ECVAM Workshop 54. ATLA, 34, 49–84.

Combes, R. D., & Balls, M. (2011). Integrated testing strategies for toxicity employing new and existing technologies. ATLA, 39, 213–225.

Cronin, M. T. D., & Schultz, T. W. (2003). Pitfalls in QSAR. Journal of Molecular Structure: THEOCHEM, 622(1–2), 39–51.

Ebmeyer, J., Najjar, A., Lange, D., Boettcher, M., Voß, S., Brandmair, K., et al. (2024). Next generation risk assessment: An ab initio case study to assess the systemic safety of the cosmetic ingredient, benzyl salicylate, after dermal exposure. Frontiers in Pharmacology, 15, 1345992. https://doi.org/10.3389/fphar.2024.1345992 

Gianquinto, E., Bersani, M., Armando, L., Davani, L., Cena, C., De Simone, A., & Spyrakis, F. (2026). Toward integrative predictive toxicology: Advanced methods for drug toxicity and safety prediction. Wiley Interdisciplinary Reviews: Computational Molecular Science, 16(1), e70065. https://doi.org/10.1002/wcms.70065 

Gottmann, E., Kramer, S., Pfahringer, B., & Helma, C. (2001). Data quality in predictive toxicology: Reproducibility of rodent carcinogenicity experiments. Environmental Health Perspectives, 109(5), 509–514.

Hong, K., & Kwon, H. (2025). Toxicity prediction using vision transformers and multimodal feature fusion. Scientific Reports, 15, 9572. https://doi.org/10.1038/s41598-025-95720-5  

Jarabek, A. M., & Hines, D. E. (2019). Mechanistic integration of exposure and effects: Advances to apply systems toxicology in support of regulatory decision-making. Current Opinion in Toxicology, 16, 83–92. https://doi.org/10.1016/j.cotox.2019.09.001       

Kavlock, R. J., Austin, C. P., & Tice, R. R. (2009). Toxicity testing in the 21st century: Implications for human health risk assessment. Risk Analysis, 29(4), 485–487.         

Kim, D., & Choi, J. (2026). Application of in vitro new approach methodologies data to chemical risk assessment: Current status and perspectives toward next generation risk assessment. Frontiers in Toxicology, 8, 1754231. https://doi.org/10.3389/ftox.2026.1754231       

Knudsen, T. B., Baker, N. C., Sipes, N. S., & Zurlinden, T. J. (2020). Predictive DART: In silico models and computational intelligence for developmental and reproductive toxicity. Current Opinion in Toxicology, 23-24, 119–126. https://doi.org/10.1016/j.cotox.2020.11.001       

Kreutz, A., Chang, X., Davis, H. H., & Wetmore, B. (2024). Toxicokinetic variability, physiologically based toxicokinetic model, in vitro-in vivo extrapolation, new approach methodologies, life-stage. Human Genomics, 18, 129. https://doi.org/10.1186/s40246-024-00691-9       

Laurent, E. G. (2026). Artificial intelligence for adverse drug event prediction: Integrative multi-modal modeling, clinical translation, and regulatory alignment in pharmacovigilance. Frontline Medical Sciences and Pharmaceutical Journal, 6(3), 15–19.

Magurany, K. A., Chang, X., Clewell, R., Coecke, S., Haugabrooks, E., & Marty, S. (2023). A pragmatic framework for the application of new approach methodologies in one health toxicological risk assessment. Toxicological Sciences, 192(2), 155–177. https://doi.org/10.1093/toxsci/kfad012             

Mayr, A., Klambauer, G., Unterthiner, T., & Hochreiter, S. (2016). DeepTox: Toxicity prediction using deep learning. Frontiers in Environmental Science, 3, 80. https://doi.org/10.3389/fenvs.2015.00080   

McKinney, J. D. (1985). The molecular basis of chemical toxicity. Environmental Health Perspectives, 61, 5–10.

Merlot, C. (2010). Computational toxicology—a tool for early safety evaluation. Drug Discovery Today, 15(1–2), 16–22.

Mosmann, T. (1983). Rapid colorimetric assay for cellular growth and survival. Journal of Immunological Methods, 65(1–2), 55–63.

National Research Council. (2007). Toxicity testing in the 21st century: A vision and a strategy. National Academies Press.

Paul Friedman, K., Gagne, M., Loo, L. H., Karamertzanis, P., Netzeva, T., Sobanski, T., Franzosa, J. A., Richard, A. M., Lougee, R. R., Gissi, A., Lee, J. Y., Angrish, M., Dorne, J. L., Foster, S., Raffaele, K., Bahadori, T., Gwinn, M. R., Lambert, J., Whelan, M., Rasenberg, M., Barton-Maclaren, T., & Thomas, R. S. (2025). Utility of in vitro bioactivity as a lower bound estimate of in vivo adverse effect levels and in risk-based prioritization. Toxicological Sciences, 205(1), 74–105. https://doi.org/10.1093/toxsci/kfaf019             

Paul Friedman, K., Gagne, M., Loo, L. H., Karamertzanis, P., Netzeva, T., Sobanski, T., et al. (2020). Utility of in vitro bioactivity as a lower bound estimate of in vivo adverse effect levels and in risk-based prioritization. Toxicological Sciences, 173(2), 202–225. https://doi.org/10.1093/toxsci/kfz201

Sariya, D., Gupta, A., Asif, M., & Rai, C. (2025). Advances in preclinical toxicology: Bridging in vitro, in vivo, and translational gaps. Biopress Journal of Advanced Pharmacology, 1(1), 75–86. https://doi.org/10.5281/zenodo.17101419 

Schmidt, C. W. (2009). TOX 21: New dimensions of toxicity testing. Environmental Health Perspectives, 117(8), A348–A353.

Sepehri, S., De Win, D., Heymans, A., Van Goethem, F., Rodrigues, R. M., Rogiers, V., et al. (2025). Next generation risk assessment of hair dye HC yellow no. 13: Ensuring protection from liver steatogenic effects. Regulatory Toxicology and Pharmacology, 159, 105794. https://doi.org/10.1016/j.yrtph.2025.105794   

Sharma, B., Chenthamarakshan, V., Dhurandhar, A., Pereira, S., Hendler, J. A., Dordick, J. S., & Das, P. (2023). Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations. Scientific Reports, 13(1), 4908. https://doi.org/10.1038/s41598-023-31169-8  

Shukla, S. J., Huang, R., Austin, C. P., & Xia, M. (2010). The future of toxicity testing. Drug Discovery Today, 15(23–24), 997–1007.

Sun, H., Xia, M., Austin, C. P., & Huang, R. (2012). Paradigm shift in toxicity testing and modeling. The AAPS Journal, 14(3), 473-480. https://doi.org/10.1208/s12248-012-9358-1    

Thomas, R. S., Philbert, M. A., Auerbach, S. S., Wetmore, B. A., Devito, M. J., Cote, I., Rowlands, J. C., Whelan, M. P., Hays, S. M., & Andersen, M. E. (2013). Incorporating new technologies into toxicity testing. Toxicological Sciences, 136(1), 4–18.

Ward, D. K., et al. (2021). Explainability modules integrate feature importance analysis to identify clinically salient predictors. Frontiers in Pharmacology.

Zhang, Q., Li, J., Middleton, A., Bhattacharya, S., & Conolly, R. B. (2018). Bridging the data gap from in vitro toxicity testing to chemical safety assessment through computational modeling. Frontiers in Public Health, 6, 261. https://doi.org/10.3389/fpubh.2018.00261       

Zhang, R., Wen, H., Lin, Z., Li, B., & Zhou, X. (2025). Artificial intelligence-driven drug toxicity prediction: Advances, challenges, and future directions. Toxics, 13, 525. https://doi.org/10.3390/toxics13070525           

Zhu, H., Rusyn, I., Richard, A., & Tropsha, A. (2008). Improved QSAR models using cell viability data. Environmental Health Perspectives, 116(4), 506–513.

Ziemba, B. (2025). Advances in cytotoxicity testing: From in vitro assays to in silico models. International Journal of Molecular Sciences, 26, 11202. https://doi.org/10.3390/ijms262211202