Result: ChatGPT 100,000 Patient 24-Month In Silico Phase III 5-Arm Pancreatic Cancer Clinical Trial Triplicate

Title:
ChatGPT 100,000 Patient 24-Month In Silico Phase III 5-Arm Pancreatic Cancer Clinical Trial Triplicate
Publisher Information:
Zenodo
Publication Year:
2025
Collection:
Zenodo
Document Type:
Report
Language:
English
DOI:
10.5281/zenodo.16415815
Rights:
Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode
Accession Number:
edsbas.C041B58C
Database:
BASE

Further Information

Inquiry: Is it possible for ChatGPT to simulate three reproducible 100,000-patient pancreatic ductal adenocarcinoma (PDAC) Phase III clinical trial reports? If so, can the results be internally and externally validated, cross-verified using other AI models, and compared both clinically and financially to other trials?

Concept: Five arms based on the Daraxonrasib + Mitazalimab + liposomal Irinotecan drug combination, baseline characteristics, and patient archetypes were identified from a prior study: doi.org/10.5281/zenodo.15735068. Six artificial intelligence models were then used across the clinical trial pipeline: o3ph: ChatGPT o3-pro Research, g25p: Google Gemini 2.5 Pro, grk4: Grok 4, grk3: Grok 3 Think, o3pr: ChatGPT o3-pro, and ops4: Opus 4 Extended. o3ph generated the ICH E3-aligned trial reports and log files, plus internal and external validations. g25p, grk4, grk3, o3pr, and ops4 provided cross-verifications that highlighted trial-to-trial and model-to-model correlations. g25p used 24 generations to produce a virtual trials overview, while o3ph provided a meta-analysis of pooled and scored data versus relevant virtual and on-site trials. o3ph also provided a financial assessment and value proposition of USD estimates against Phase II and Phase III studies. ops4 provided visualizations written in Python for the majority of the sections.

Results: 100,000 individual patients generated from three separate o3ph conversations followed the multiplicative hazard ratios and per-arm monthly hazards set in the prompt. Key variables were independent of each other, which yielded distributions of uncensored results. Log file cumulative effects of the censored 100,000 patients yielded expected results in OS by Arm (A > D > E), ≥G3 AE (A > D > E), and PFS (A > D > E). Baseline characteristics by metric across trials were in close alignment, and internal validations between log files and trial reports exhibited similar performance.
External validation vs. a Flatiron Health ...
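
The Results above describe patients whose outcomes follow per-arm monthly hazards scaled by multiplicative hazard ratios, with follow-up ending at 24 months. A minimal sketch of that idea, assuming a constant hazard (exponential survival times) and administrative censoring at the end of follow-up, might look like the following. All numeric values, the patient-level hazard ratios, and the function names are hypothetical; the record does not state the parameters or the method actually used in the study's prompt.

```python
import random

# Hypothetical values for illustration only; the record does not state the
# per-arm hazards or hazard ratios that were set in the study's prompt.
BASE_MONTHLY_HAZARD = {"A": 0.04, "D": 0.06, "E": 0.08}
FOLLOW_UP_MONTHS = 24


def simulate_patient(arm, hazard_ratio, rng):
    """Return (observed_months, died) for one patient under a constant hazard."""
    # Multiplicative model: the arm's monthly hazard is scaled by the ratio.
    hazard = BASE_MONTHLY_HAZARD[arm] * hazard_ratio
    # A constant hazard implies an exponentially distributed survival time.
    event_time = rng.expovariate(hazard)
    if event_time > FOLLOW_UP_MONTHS:
        # Still alive at the end of follow-up: administratively censored.
        return FOLLOW_UP_MONTHS, False
    return event_time, True


def simulate_arm(arm, n_patients, seed):
    """Simulate one arm reproducibly from a fixed seed."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_patients):
        # Hypothetical patient-level hazard ratio, drawn independently.
        hazard_ratio = rng.choice([0.8, 1.0, 1.25])
        results.append(simulate_patient(arm, hazard_ratio, rng))
    return results


def survival_at_end(results):
    """Fraction of patients censored (alive) at the end of follow-up."""
    return sum(1 for _, died in results if not died) / len(results)


if __name__ == "__main__":
    for arm in ("A", "D", "E"):
        outcome = simulate_arm(arm, n_patients=5000, seed=1)
        print(arm, round(survival_at_end(outcome), 3))
```

With these assumed hazards, a lower monthly hazard gives a higher fraction of patients alive at 24 months, which is the kind of arm ordering an internal validation of log files against trial reports would check. Fixing the seed is one simple way to make repeated runs comparable.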