PFAS Toxicological Prioritisation Data

Data Science and Analytics

Tags and Keywords

PFAS, Half-life, Toxicokinetic, Machine Learning, Animals

Free

About

This dataset offers predictions of the toxicokinetic half-lives ($t_{1/2}$) of per- and polyfluoroalkyl substances (PFAS) in multiple species. These man-made chemicals are often detected in body tissues, yet the toxicokinetics of most remain uncharacterised. Half-life information is vital for exposure reconstruction and for extrapolating results from toxicological studies. The predictions come from an ensemble machine learning method, specifically a random forest, trained on existing in vivo measured half-lives across four species: human, monkey, rat, and mouse. The model achieves an accuracy of 86.1% and incorporates mechanistically motivated descriptors, including physiological factors such as kidney geometry, as surrogates for renal transporter expression.

Columns

The data includes 30 columns, covering chemical identifiers, physical-chemical predictions, physiological parameters, and the final half-life classifications. Key columns include:
  • CASRN: Chemical Abstract Service Registry Number, with 6,603 unique values.
  • DTXSID: Distributed Structure-Searchable Toxicity (DSSTox) identifier.
  • Species: The organism for which the prediction is made (9 unique values observed, including Cattle).
  • Type: Kidney type, such as Unipapillary (the most common value, at 67%) or Multirenculated.
  • LogP_pred, LogVP_pred, LogWS_pred, LogKOA_pred: Predicted values for Octanol-Water Partition Coefficient, Vapour Pressure, Water Solubility, and Octanol-Air Partition Coefficient, respectively.
  • AVERAGE_MASS: Average mass of the chemical.
  • GlomTotSA_KW_ratio, ProxTubDiam: Physiological descriptors potentially relating to renal transporter expression.
  • ClassPredFull: The predicted half-life classification bin (1 to 4).
  • CLtot.Lpkgbwpday: Predicted total clearance (the column name suggests units of L/kg body weight/day).
  • Css.mgpL: Predicted steady-state concentration (the column name suggests units of mg/L).
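
As a quick sanity check before analysis, the header of the downloaded file can be compared against the column names quoted above. The following is a minimal sketch using only the Python standard library; the helper name `missing_columns` and the placeholder sample row are illustrative assumptions, not part of the dataset.

```python
import csv
import io

# Column names quoted in the listing; the real file has 30 columns in total.
EXPECTED_COLUMNS = {
    "CASRN", "DTXSID", "Species", "Type", "AVERAGE_MASS",
    "LogP_pred", "ClassPredFull",
}

def missing_columns(csv_file):
    """Return the expected columns that are absent from the CSV header, sorted."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return sorted(EXPECTED_COLUMNS - header)

# Placeholder header and row for illustration only; these are not real records.
sample = io.StringIO(
    "CASRN,DTXSID,Species,Type,AVERAGE_MASS,LogP_pred,ClassPredFull\n"
    "000-00-0,DTXSID0000000,Human,Unipapillary,400.0,3.0,4\n"
)
print(missing_columns(sample))
```

With the real file, the same function could be called on `open(path, newline="")`, where `path` points to the downloaded CSV. The file's encoding is not stated in the listing, so it may need to be specified explicitly.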

Distribution

The data product is a single CSV file (S2_Dawson et al._ML PFAS_HL_101322.csv) of approximately 123.14 MB. It contains 357,000 valid records and 30 columns. The model's applicability domain encompasses 3,890 compounds. Half-lives are classified into four discrete bins, each represented by its predicted bin median: 4.9 hours, 2.2 days, 33 days, and 3.3 years.
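
For numerical work it can help to express the four bin medians in a common unit. The sketch below converts them to hours. It assumes that bins 1 to 4 correspond to the medians in the order listed (shortest to longest) and that a year is 365.25 days; neither assumption is stated in the listing, and the function name is hypothetical.

```python
HOURS_PER_DAY = 24.0
DAYS_PER_YEAR = 365.25  # assumption: Julian year, not specified in the listing

# Assumed mapping: bin number -> predicted bin median, in hours.
BIN_MEDIAN_HOURS = {
    1: 4.9,                                   # 4.9 hours
    2: 2.2 * HOURS_PER_DAY,                   # 2.2 days
    3: 33 * HOURS_PER_DAY,                    # 33 days
    4: 3.3 * DAYS_PER_YEAR * HOURS_PER_DAY,   # 3.3 years
}

def median_half_life_hours(class_pred):
    """Map a ClassPredFull value (1 to 4) to its bin-median half-life in hours."""
    # Accept ints as well as CSV strings such as "4" or "4.0".
    return BIN_MEDIAN_HOURS[int(float(class_pred))]

for bin_number in sorted(BIN_MEDIAN_HOURS):
    print(bin_number, round(median_half_life_hours(bin_number), 1))
```

Because the predictions are categorical, these medians are coarse stand-ins for a half-life value and should not be read as compound-specific estimates.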

Usage

This data allows for the tentative extrapolation and prioritisation of PFAS compounds based on their predicted toxicokinetic characteristics. Ideal applications include:
  • Toxicology: Providing necessary chemical-specific half-life knowledge for interpreting toxicological studies.
  • Exposure Assessment: Aiding in the reconstruction of human and animal exposure scenarios.
  • Regulatory Prioritisation: Identifying compounds likely to exhibit long half-lives in key species, flagging them for further testing or control.
  • Environmental Modelling: Inputting toxicokinetic parameters into environmental fate models.
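
The regulatory prioritisation use case can be sketched as a simple streaming filter over the CSV: collect the identifiers of compounds predicted to fall in the longest half-life bin for a chosen species. This is an illustrative sketch only. It assumes the `Species`, `DTXSID`, and `ClassPredFull` columns described earlier, that a compound may appear in more than one row per species, and that species labels can be matched case-insensitively. The function name and sample rows are hypothetical.

```python
import csv
import io

def long_half_life_ids(csv_file, species="human", longest_bin=4):
    """Collect DTXSIDs predicted in the longest half-life bin for one species."""
    flagged = set()  # a set removes duplicates if a compound has several rows
    for row in csv.DictReader(csv_file):
        if row["Species"].strip().lower() != species.lower():
            continue
        if int(float(row["ClassPredFull"])) == longest_bin:
            flagged.add(row["DTXSID"])
    return sorted(flagged)

# Placeholder rows for illustration only; these are not real records.
sample = io.StringIO(
    "DTXSID,Species,ClassPredFull\n"
    "ID_A,Human,4\n"
    "ID_B,Human,2\n"
    "ID_C,Rat,4\n"
)
print(long_half_life_ids(sample))
```

Reading row by row keeps memory use low, which may matter for a file of roughly 123 MB.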

Coverage

The predictions specifically model $t_{1/2}$ across four core species: human, monkey, rat, and mouse, drawing on the limited in vivo data available for eleven PFAS. For human predictions, 56% of PFAS are classified in the longest half-life bin (Bin 4: >2 months). The mechanistic descriptors examined include details on dosing adjustment (e.g., IV), average mass, and kidney geometry parameters. Coverage is limited to predicted toxicokinetic endpoints, which depend on the characteristics of the chemical and on the species and physiological context.
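
A per-species breakdown of the predicted bins, such as the share in Bin 4 quoted above, can be approximated by tallying `ClassPredFull` by `Species`. The sketch below counts rows, whereas the figure in the listing may have been computed per compound, so the two need not match. The function name and sample rows are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict

def bin_shares(csv_file):
    """Return {species: {bin: fraction of that species' rows}}."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        counts[row["Species"]][int(float(row["ClassPredFull"]))] += 1
    return {
        species: {b: n / sum(bins.values()) for b, n in sorted(bins.items())}
        for species, bins in counts.items()
    }

# Placeholder rows for illustration only; these are not real records.
sample = io.StringIO(
    "Species,ClassPredFull\n"
    "Human,4\n"
    "Human,4\n"
    "Human,1\n"
    "Human,2\n"
)
print(bin_shares(sample))
```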

License

CC0: Public Domain

Who Can Use It

  • Toxicologists: For understanding chemical persistence in biological systems.
  • Environmental Scientists: For assessing bioaccumulation and risk.
  • Regulatory Agencies: For screening and managing PFAS chemicals.
  • Pharmacokinetic Modelers: For parameterising PBPK or exposure models.

Dataset Name Suggestions

  • PFAS Toxicokinetic Half-Life Predictions
  • Machine Learning Predicted PFAS Half-Lives
  • Multi-Species PFAS Half-Life Model Outputs
  • PFAS Toxicological Prioritisation Data

Attributes

Listing Stats

Views: 2
Downloads: 0
Listed: 11/12/2025
Region: Global
Universal Data Quality Score (UDQS): 5 / 5
Version: 1.0
