Result: Benchmarking large language models for identifying transcription factor regulatory interactions.

Title:
Benchmarking large language models for identifying transcription factor regulatory interactions.
Authors:
Noel L; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Hsiao YW; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
He Y; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.; Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Hung A; Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Cui X; Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Ray E; Department of Surgery, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Moore JH; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Peng PC; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Huang X; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States.
Source:
Bioinformatics (Oxford, England) [Bioinformatics] 2026 Jan 02; Vol. 42 (1).
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: Oxford University Press Country of Publication: England NLM ID: 9808944 Publication Model: Print Cited Medium: Internet ISSN: 1367-4811 (Electronic) Linking ISSN: 13674803 NLM ISO Abbreviation: Bioinformatics Subsets: MEDLINE
Imprint Name(s):
Original Publication: Oxford : Oxford University Press, c1998-
References:
Proteomics. 2021 Dec;21(23-24):e2000034. (PMID: 34314098)
Database (Oxford). 2022 Sep 16;2022:. (PMID: 36124642)
Genome Res. 2011 May;21(5):645-57. (PMID: 21324878)
Curr Opin Genet Dev. 2017 Apr;43:110-119. (PMID: 28359978)
BioData Min. 2023 Jul 13;16(1):20. (PMID: 37443040)
Eur Radiol. 2024 May;34(5):2817-2825. (PMID: 37794249)
Nucleic Acids Res. 2018 Jan 4;46(D1):D380-D386. (PMID: 29087512)
PLoS Comput Biol. 2023 Sep 28;19(9):e1011511. (PMID: 37769024)
Cell. 2018 Feb 8;172(4):650-665. (PMID: 29425488)
Bioinform Adv. 2022 Mar 08;2(1):vbac016. (PMID: 36699385)
BMC Genomics. 2013 Feb 06;14:84. (PMID: 23387820)
NAR Genom Bioinform. 2023 Sep 13;5(3):lqad083. (PMID: 37711605)
Nucleic Acids Res. 2021 Jan 8;49(D1):D104-D111. (PMID: 33231677)
ACM BCB. 2018 Aug-Sep;2018:1-10. (PMID: 31061989)
Genome Res. 2019 Aug;29(8):1363-1375. (PMID: 31340985)
Nucleic Acids Res. 2025 Jan 6;53(D1):D1016-D1028. (PMID: 39565209)
Genomics Proteomics Bioinformatics. 2020 Apr;18(2):120-128. (PMID: 32858223)
Nucleic Acids Res. 2023 Nov 10;51(20):10934-10949. (PMID: 37843125)
PLOS Digit Health. 2023 Feb 9;2(2):e0000198. (PMID: 36812645)
J Am Med Inform Assoc. 2023 Jun 20;30(7):1237-1245. (PMID: 37087108)
Grant Information:
Glazer Foundation Award; R21CA280458 United States NH NIH HHS; R00 CA256519 United States CA NCI NIH HHS; Jim and Eleanor Department of Surgery Research Award; U01 AG066833 United States NH NIH HHS; R01 CA151610 United States CA NCI NIH HHS; U01 AG066833 United States AG NIA NIH HHS; R00CA256519 United States NH NIH HHS; 2R01CA151610 United States NH NIH HHS; Samuel Oschin Cancer Institute Research Development Fund; R21 CA280458 United States CA NCI NIH HHS
Substance Nomenclature:
0 (Transcription Factors)
Entry Date(s):
Date Created: 20251212 Date Completed: 20260105 Latest Revision: 20260109
Update Code:
20260109
PubMed Central ID:
PMC12766914
DOI:
10.1093/bioinformatics/btaf653
PMID:
41386265
Database:
MEDLINE

Further Information

Motivation: Transcription factors (TFs) and their target genes form regulatory networks that control gene expression and influence diverse biological processes and disease outcomes. Although multiple computational methods and curated databases have been developed to identify TF-target interactions, they often require specialized expertise. Large language model (LLM) chatbots offer a more accessible alternative for querying TF-target interactions. In this study, we benchmarked four prominent LLMs, Anthropic's Claude 3.5 Sonnet, Google's Gemini 1.0 Pro, OpenAI's GPT-4o, and Meta's Llama 3 8B, using 8432 literature-curated human TF-target interactions. We examined four regulatory categories: bidirectional, ambiguous, self-regulated, and unidirectional interactions.
Results: Under single-turn queries, Claude 3.5 Sonnet and GPT-4o outperformed the others, with balanced accuracies reaching 50.0 ± 7.6% (GPT-4o, self-regulated) and 48.2 ± 1.0% (Claude 3.5 Sonnet, unidirectional). Zero-temperature settings generally enhanced reproducibility, and multi-turn prompting improved performance for most models, increasing Claude 3.5 Sonnet's accuracy on self-regulated pairs by 32.6%. Excluding TF-target pairs with all unknown regulation types also generally improved accuracy, with unidirectional regulation reaching nearly 70% balanced accuracy in some cases. We also benchmarked Anthropic's Claude 3.5 Sonnet, Google's Gemini 2.0 Flash, OpenAI's GPT-4o, and Meta's Llama 3 using 5148 experimentally derived TF-target interactions. Claude 3.5 Sonnet consistently outperformed the other models across conditions. Our findings highlight that prompt engineering and strategic use of model parameters consistently influence LLM chatbots' performance on TF-target identification. This study establishes a benchmarking framework and demonstrates the potential of pre-trained general-purpose LLMs to support regulatory biology research, especially for researchers without extensive computational expertise.
Availability and Implementation: The literature-based TF-target interaction ground truth was obtained from the TRRUST v2 human dataset (www.grnpedia.org/trrust). The experimentally derived TF-target interaction ground truth was obtained from the TFLink Homo sapiens small-scale interaction table (https://tflink.net/). Processed TF-target interaction data and the analytical pipeline have been compiled as an interactive Python notebook and are available at https://github.com/pengpclab/LLM-TF-interactions.
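The abstract describes issuing single-turn queries (at zero temperature for reproducibility) and scoring chatbot replies against four regulatory categories. The following is a minimal sketch of what such a query-and-scoring step might look like; the prompt wording, function names, and category labels are illustrative assumptions, not the authors' exact protocol or code.

```python
# Hypothetical sketch of a single-turn TF-target query and reply scoring.
# Prompt text and the "unknown" fallback label are assumptions for illustration.

CATEGORIES = ("bidirectional", "ambiguous", "self-regulated", "unidirectional")

def build_single_turn_prompt(tf: str, target: str) -> str:
    """Compose one self-contained question about a single TF-target pair."""
    return (
        f"Does the human transcription factor {tf} regulate the gene {target}? "
        f"Answer with exactly one of: {', '.join(CATEGORIES)}, or unknown."
    )

def parse_category(reply: str) -> str:
    """Map a free-text chatbot reply to one of the four regulatory
    categories; any unrecognized answer is treated as 'unknown'."""
    text = reply.lower()
    for category in CATEGORIES:
        if category in text:
            return category
    return "unknown"
```

In practice the prompt would be sent through a chat-completion API with the temperature parameter set to 0, matching the zero-temperature setting the study reports as improving reproducibility; the parsed labels would then be compared against the TRRUST or TFLink ground truth to compute balanced accuracy.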
(© The Author(s) 2025. Published by Oxford University Press.)