End-to-End Pancreatic Ductal Adenocarcinoma Digital Twin Clinical Trial Proposals
Question: Can artificial intelligence use online clinical trial literature to generate well-informed pancreatic ductal adenocarcinoma (PDAC) digital twin trial proposals?

Findings: AI successfully generated 40 meta-analyses in top PDAC research areas, at an average length of 10,196 words. One verification per meta-analysis was performed using AI automation, and the combined 408,081-word dataset was used to generate reports. The reports were separately verified and further visualized by AI. A second dataset consisting of the reports was used by five different AI models to yield five proposals. The proposals were evaluated by five AI judges to determine the top three. Proposal deliverable completion reached a maximum score of 9.60/10, while the top citation proficiency score was 9.20. Prospective trial impact reached a peak score of 8.94, while funding probabilities ranged from 8.54 to 8.66.

Design: The framework consisted of 7 AI model combinations, each used according to its model-specific advantages: o3re (ChatGPT o3 Research), g25p (Google Gemini 2.5 Pro Preview), son4 (Sonnet 4 Extended), grk3 (xAI Grok 3), o3pr (ChatGPT o3-pro), ops4 (Opus 4 Extended), and o3ch (ChatGPT o3). o3re autonomously searched the web and constructed detailed meta-analyses for Dataset 1. g25p verified quantitative data across meta-analyses and reports, and also generated reports. son4 and ops4 produced visualizations in Python for reports and proposals, while grk3 fixed code when necessary. The 6 reports were combined to form Dataset 2, which served as the input for the o3pr, ops4, g25p, o3ch, and grk3 proposals. The same 5 models served as judges for the combined proposals, and average scores were used to determine the top 3 proposals. AI run time for the core experiments was approximately 14 hours, with paper completion in 27 days.

Results: One verification for each meta-analysis was conducted by g25p at an average accuracy of 95%.
The same model verified one table for each of the six reports at a ...
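
The judging step described under Design (five models each score every proposal, and the average score decides the top three) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `rank_proposals`, the data layout, and every score in `placeholder_scores` are assumptions made up for the example and are not results from the study.

```python
from statistics import mean

# Model abbreviations taken from the abstract; these five both wrote and judged proposals.
MODELS = ["o3pr", "ops4", "g25p", "o3ch", "grk3"]


def rank_proposals(scores, top_n=3):
    """Return the top_n (proposal, average score) pairs, highest average first.

    `scores` maps each proposal name to a dict of {judge name: score}.
    This layout is a hypothetical choice for the sketch.
    """
    averages = {proposal: mean(by_judge.values()) for proposal, by_judge in scores.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# PLACEHOLDER scores only (not data from the paper): one row per proposal,
# one column per judge, in the order given by MODELS.
placeholder_rows = {
    "o3pr": [9, 8, 9, 8, 9],
    "ops4": [7, 8, 7, 8, 7],
    "g25p": [8, 8, 8, 8, 8],
    "o3ch": [6, 7, 6, 7, 6],
    "grk3": [7, 7, 7, 7, 7],
}
placeholder_scores = {
    proposal: dict(zip(MODELS, row)) for proposal, row in placeholder_rows.items()
}

top_three = rank_proposals(placeholder_scores)
for proposal, average in top_three:
    print(f"{proposal}: {average:.2f}")
```

Averaging across all judges, including each model's score for its own proposal, follows the abstract's statement that the same 5 models served as judges; whether self-scores were treated differently is not stated in the text shown here.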