Biomedical Relation Extraction via Adaptive Document-Relation Cross-Mapping and Concept Unique Identifier

Publication date: Jan 09, 2025

Document-level biomedical relation extraction (Bio-RE) aims to identify relations between biomedical entities across long texts and is a crucial subfield of biomedical text mining. Existing Bio-RE methods struggle with cross-sentence inference, which is essential for capturing relations that span multiple sentences. Previous methods also often overlook the incompleteness of documents and do not integrate external knowledge, which limits contextual richness. In addition, the scarcity of annotated data hampers model training. Recent advances in large language models (LLMs) have inspired us to address all of these issues for document-level Bio-RE. Specifically, we propose a document-level Bio-RE framework that combines LLM Adaptive Document-Relation Cross-Mapping (ADRCM) fine-tuning with Concept Unique Identifier (CUI) Retrieval-Augmented Generation (RAG). First, to address data scarcity, we introduce the Iteration-of-REsummary (IoRs) prompt, which generates Bio-RE task-specific synthetic data by guiding ChatGPT to focus on entity relations and by iteratively refining the generated data. Next, we propose ADRCM fine-tuning, a novel fine-tuning recipe that establishes mappings across different documents and relations, enhancing the model’s contextual understanding and cross-sentence inference capabilities. Finally, at inference time, a biomedical-specific RAG approach, named CUI RAG, uses CUIs as indexes for entities, which narrows the retrieval scope and enriches the relevant document contexts. Experiments on three Bio-RE datasets (GDA, CDR, and BioRED) show that the proposed method achieves state-of-the-art performance compared with related work.
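The abstract does not give implementation details for CUI RAG, so the following is only a minimal sketch of the general idea of indexing documents by CUI to narrow the retrieval scope. It is not the authors' method. The function names (`build_cui_index`, `retrieve`), the choice to require both entity CUIs, the token-overlap ranking, the toy passages, and the placeholder identifiers such as `C_ETHANOL` (not real UMLS CUIs) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cui_index(corpus):
    """Map each CUI to the set of document ids annotated with it."""
    index = defaultdict(set)
    for doc_id, doc in corpus.items():
        for cui in doc["cuis"]:
            index[cui].add(doc_id)
    return index


def retrieve(index, corpus, head_cui, tail_cui, query_text, top_k=2):
    """Return up to top_k document ids that mention BOTH CUIs.

    Candidates are narrowed by CUI first, then ranked by a simple
    token-overlap score (an assumed stand-in for a real retriever).
    Ties are broken by document id so the order is deterministic.
    """
    candidates = index.get(head_cui, set()) & index.get(tail_cui, set())
    query_tokens = set(query_text.lower().split())

    def overlap(doc_id):
        doc_tokens = set(corpus[doc_id]["text"].lower().split())
        return len(query_tokens & doc_tokens)

    return sorted(candidates, key=lambda d: (-overlap(d), d))[:top_k]


# Toy corpus: made-up passages with hypothetical placeholder CUIs.
corpus = {
    "d1": {"text": "toy passage mentioning ethanol and hepatitis together",
           "cuis": {"C_ETHANOL", "C_HEPATITIS"}},
    "d2": {"text": "toy passage mentioning ethanol only",
           "cuis": {"C_ETHANOL"}},
    "d3": {"text": "another toy passage with hepatitis and ethanol",
           "cuis": {"C_ETHANOL", "C_HEPATITIS"}},
}

index = build_cui_index(corpus)
hits = retrieve(index, corpus, "C_ETHANOL", "C_HEPATITIS", "ethanol hepatitis")
print(hits)
```

Under these assumptions, the CUI filter excludes `d2` before any text scoring happens, which is the sense in which an identifier-based index narrows the retrieval scope. The retrieved passages would then be added to the model's input context.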

Concepts: Biotechnology, Ethanol, Exacerbates, Gmail, Recipes

Keywords: ADRCM, Based, Bio, Biomedical, CDR, Document, Entities, Entity, Fine, GDA, Prompt, RAG, Relation, Relations, Tuning

Semantics

Type Source Name
drug DRUGBANK Guanosine
drug DRUGBANK Coenzyme M
disease MESH alcohol dependence
drug DRUGBANK Ethanol
disease MESH uncertainty
drug DRUGBANK Spinosad
drug DRUGBANK Gold
drug DRUGBANK Chlorhexadol
disease MESH hallucinations
drug DRUGBANK Trestolone
drug DRUGBANK Alpha-1-proteinase inhibitor
drug DRUGBANK Clotiazepam
disease MESH hepatitis
drug DRUGBANK Hexadecanal
drug DRUGBANK Serine
