Carlos
  • Updated: March 11, 2026
  • 8 min read

FCN-LLM: Empower LLM for Brain Functional Connectivity Network Understanding via Graph-level Multi-task Instruction Tuning

Direct Answer

FCN‑LLM introduces a novel framework that lets large language models (LLMs) directly interpret brain functional connectivity networks (FCNs) derived from resting‑state fMRI. By aligning graph‑level embeddings of whole‑brain connectivity with the semantic space of an LLM through multi‑task instruction tuning, the system can answer clinical and demographic queries about unseen subjects without any task‑specific fine‑tuning.

Background: Why This Problem Is Hard

Resting‑state functional magnetic resonance imaging (rs‑fMRI) captures spontaneous neural activity across the entire brain, which is typically represented as a functional connectivity network—a weighted graph where nodes correspond to brain regions and edges reflect synchronized activity. Translating these high‑dimensional graphs into actionable clinical insights faces three intertwined challenges:

  • Modality mismatch: Traditional neuroimaging pipelines output numerical adjacency matrices, while LLMs operate on natural language tokens. Bridging this gap requires a shared representation that preserves the rich topological information of FCNs.
  • Heterogeneity of neuroimaging data: Multi‑site studies differ in scanner hardware, acquisition protocols, and preprocessing pipelines, leading to distribution shifts that cripple supervised models trained on a single cohort.
  • Limited labeled data for downstream tasks: Clinical phenotypes (e.g., depression, schizophrenia) are often sparsely annotated, making it impractical to train a separate model for each prediction target.

Existing approaches either treat FCNs as static feature vectors for downstream classifiers or employ graph neural networks (GNNs) trained on a single task. Neither strategy aligns the graph embeddings with the linguistic knowledge embedded in LLMs, leaving the powerful reasoning capabilities of LLMs untapped for neuroimaging.

What the Researchers Propose

The authors present FCN‑LLM, a two‑stage, graph‑level instruction‑tuning framework that couples a multi‑scale FCN encoder with a pre‑trained LLM. The key ideas are:

  1. Multi‑scale FCN encoder: A hierarchical GNN extracts features at three granularities—individual brain regions, functional subnetworks (e.g., default mode, salience), and the whole‑brain connectome. These embeddings capture both local and global patterns.
  2. Semantic projection layer: Encoder outputs are linearly projected into the same dimensional space as the LLM’s token embeddings, enabling direct concatenation or cross‑attention between graph and text modalities.
  3. Instruction‑tuning across 19 tasks: The model is trained on a diverse set of prompts covering demographics (age, sex), phenotypes (cognitive scores), and psychiatric diagnoses (major depressive disorder, ADHD). Each task is phrased as a natural‑language instruction, encouraging the LLM to generate coherent answers.
  4. Multi‑stage learning: First, the FCN encoder is aligned with the frozen LLM (embedding alignment stage). Second, the entire system is jointly fine‑tuned, allowing the LLM to adapt its internal reasoning to the graph‑derived signals.
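The staged schedule can be sketched as a simple parameter-selection rule. The module names below are hypothetical labels for the three components described above, not identifiers from the paper:

```python
def trainable_parameters(stage: int, modules: tuple) -> list:
    """Return which modules are updated at each training stage (schematic).

    Stage 1 (embedding alignment): the LLM is frozen; only the FCN encoder
    and the projection layer learn to map graphs into the LLM's space.
    Stage 2 (joint fine-tuning): every module, including the LLM, is updated.
    """
    if stage == 1:
        return [m for m in modules if m in ("fcn_encoder", "projection")]
    return list(modules)

modules = ("fcn_encoder", "projection", "llm")
stage1 = trainable_parameters(1, modules)  # LLM excluded
stage2 = trainable_parameters(2, modules)  # everything trains
```

Freezing the LLM first is the standard way to avoid destroying its pre-trained language knowledge while the graph embeddings are still poorly aligned.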

In essence, FCN‑LLM transforms a brain connectivity graph into a “language‑compatible” vector, then leverages the LLM’s world knowledge to answer queries that would normally require bespoke statistical models.

How It Works in Practice

The operational pipeline can be broken down into four sequential modules, illustrated in the diagram below.

[Figure: FCN-LLM workflow diagram]

1. Data Ingestion & Pre‑processing

Raw rs‑fMRI scans are pre‑processed (motion correction, spatial normalization, parcellation) to produce a region‑by‑region correlation matrix. This matrix serves as the adjacency representation of the FCN.
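As a rough sketch of this step, with a synthetic region-by-time BOLD matrix standing in for real preprocessed data (dimensions are illustrative):

```python
import numpy as np

T, R = 200, 90                     # time points, parcellated brain regions (assumed)
rng = np.random.default_rng(0)
bold = rng.standard_normal((T, R))  # stand-in for a preprocessed BOLD signal matrix

# Pearson correlation between every pair of regional time series yields
# the R x R adjacency matrix of the functional connectivity network.
fcn = np.corrcoef(bold, rowvar=False)

# Fisher z-transform is a common normalization before graph encoding;
# the diagonal (self-correlation of 1.0) is zeroed first to keep it finite.
np.fill_diagonal(fcn, 0.0)
fcn_z = np.arctanh(fcn)
```

Real pipelines would obtain `bold` from motion-corrected, normalized, parcellated scans; the correlation step itself is the standard construction.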

2. Multi‑scale Graph Encoding

A hierarchical GNN first learns node‑level embeddings (region features), then aggregates them into subnetwork embeddings using community detection, and finally pools across the entire graph to obtain a whole‑brain vector. Each scale is passed through a shared transformer‑style encoder to retain positional and relational context.
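The region → subnetwork → whole-brain pooling can be sketched as below. The community assignment, embedding dimension, and mean-pooling operator are illustrative stand-ins for the paper's learned GNN components:

```python
import numpy as np

R, D, K = 90, 64, 7            # regions, embedding dim, subnetworks (assumed)
rng = np.random.default_rng(1)
node_emb = rng.standard_normal((R, D))   # region-level embeddings from a GNN layer

# Hypothetical community assignment: each region belongs to one of K
# functional subnetworks (e.g. default mode, salience, visual, ...).
communities = np.arange(R) % K

# Subnetwork embeddings: mean-pool the regions inside each community.
subnet_emb = np.stack([node_emb[communities == c].mean(axis=0) for c in range(K)])

# Whole-brain embedding: pool across all regions.
brain_emb = node_emb.mean(axis=0)

# Concatenate the scales into one multi-scale graph vector.
multiscale = np.concatenate([subnet_emb.reshape(-1), brain_emb])
```

In the actual framework the community structure would come from a community-detection step and each scale would pass through a shared transformer-style encoder rather than a plain mean.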

3. Semantic Projection & Fusion

The concatenated multi‑scale vector is projected into the LLM’s embedding space via a learned linear map. This projected vector is injected into the LLM’s transformer layers as a special “graph token” that participates in self‑attention alongside textual tokens.
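A schematic of the projection-and-fusion step, with random weights standing in for the learned linear map and arbitrary dimensions (a 512-d graph vector, a 768-d LLM hidden size) chosen purely for illustration:

```python
import numpy as np

G, H = 512, 768                # multi-scale graph dim, LLM hidden dim (assumed)
rng = np.random.default_rng(2)
graph_vec = rng.standard_normal(G)
W = rng.standard_normal((G, H)) * 0.02   # learned projection (random here)

# Project the graph vector into the LLM's token-embedding space.
graph_token = graph_vec @ W              # shape (H,)

# Prepend it to the prompt's token embeddings so it participates in
# self-attention exactly like a textual token.
prompt_emb = rng.standard_normal((12, H))              # 12 hypothetical text tokens
fused = np.vstack([graph_token[None, :], prompt_emb])  # (13, H) input sequence
```

In a real implementation the fused sequence would be fed to the LLM via its embedding-level input path rather than through the tokenizer.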

4. Instruction‑Tuned Generation

During inference, a user supplies a natural‑language prompt (e.g., “What is the predicted risk of major depressive disorder for this subject?”). The LLM attends to the graph token, integrates its internal knowledge, and generates a concise answer. Because the model was trained on a wide variety of prompts, it can generalize to unseen question formats and new datasets without additional fine‑tuning.
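One way the prompt might be assembled at inference time; the `<graph>` placeholder and the surrounding wording are assumptions for illustration, not the paper's actual template:

```python
# The placeholder marks where the projected FCN embedding is injected;
# the rest is ordinary instruction text, which is why unseen phrasings
# can still be answered without re-training.
GRAPH_TOKEN = "<graph>"

def build_prompt(question: str) -> str:
    """Wrap a user question around the graph-token placeholder (hypothetical format)."""
    return (
        f"{GRAPH_TOKEN} The preceding token encodes a subject's "
        f"whole-brain functional connectivity network. {question}"
    )

prompt = build_prompt(
    "What is the predicted risk of major depressive disorder for this subject?"
)
```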

What sets this approach apart is the explicit alignment of graph embeddings with the LLM’s semantic space, rather than treating the graph as an auxiliary feature. This enables zero‑shot reasoning over brain networks using the same language interface that powers chatbots, code assistants, and decision‑support tools.

Evaluation & Results

The authors validated FCN‑LLM on a multi‑site FCN repository comprising over 30,000 subjects from diverse cohorts (e.g., ADNI, HCP, UK Biobank). Evaluation focused on three axes:

  • Zero‑shot generalization: The model was tested on completely unseen sites and scanner types. It consistently outperformed baseline GNN classifiers and a frozen LLM with post‑hoc feature concatenation, achieving higher accuracy on diagnostic prediction and lower mean absolute error on continuous phenotypes.
  • Multi‑task competence: Across the 19 instruction tasks, FCN‑LLM demonstrated an average task‑level F1 score of 0.84, surpassing single‑task supervised models that required separate fine‑tuning for each attribute.
  • Interpretability & robustness: Attention visualizations revealed that the LLM’s reasoning focused on graph tokens associated with disease‑relevant subnetworks (e.g., increased connectivity in the default mode network for depression). Ablation studies showed that removing any scale (region, subnetwork, whole‑brain) degraded performance, confirming the value of the hierarchical encoder.

Collectively, these results indicate that FCN‑LLM not only bridges the modality gap but also leverages the LLM’s generalization ability to handle heterogeneous neuroimaging data—a capability that traditional supervised pipelines lack.

Why This Matters for AI Systems and Agents

FCN‑LLM opens a new frontier for AI agents that need to reason about complex biomedical graphs:

  • Unified language interface: Agents can query brain connectivity data using plain English, eliminating the need for custom APIs or feature‑engineering pipelines.
  • Zero‑shot adaptability: In production environments where new imaging sites are added regularly, agents can immediately incorporate fresh data without retraining, reducing operational overhead.
  • Cross‑modal reasoning: By embedding FCNs in the same space as text, agents can combine neuroimaging insights with electronic health records, literature, or patient‑generated data, enabling richer clinical decision support.
  • Scalable orchestration: The framework fits naturally into existing LLM‑centric orchestration platforms, allowing developers to plug FCN‑LLM as a micro‑service that responds to graph‑related prompts. For example, the UBOS orchestration layer can route a “predict cognitive decline” request to FCN‑LLM while handling authentication, logging, and scaling.

For AI practitioners building multimodal agents, FCN‑LLM demonstrates a reproducible recipe: encode domain‑specific graphs, project them into the LLM’s embedding space, and fine‑tune with diverse instruction tasks. This pattern can be replicated for other graph‑heavy domains such as molecular chemistry, social network analysis, or power‑grid monitoring.
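That recipe can be abstracted into a tiny interface. The names below are hypothetical and only illustrate how the three pieces, encoder, projection, and instruction-tuned LLM, compose for any graph-heavy domain:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class GraphInstructionPipeline:
    """Schematic of the encode -> project -> generate pattern (not a real API)."""
    encode: Callable[[object], Sequence[float]]             # domain graph -> embedding
    project: Callable[[Sequence[float]], Sequence[float]]   # embedding -> LLM space
    generate: Callable[[Sequence[float], str], str]         # (graph token, prompt) -> answer

    def answer(self, graph: object, question: str) -> str:
        return self.generate(self.project(self.encode(graph)), question)

# Toy instantiation with stub callables, just to show the composition.
pipe = GraphInstructionPipeline(
    encode=lambda g: [1.0, 2.0],
    project=lambda e: e,
    generate=lambda tok, q: f"answer to: {q}",
)
result = pipe.answer(graph=None, question="predict cognitive decline")
```

Swapping in a molecular or power-grid encoder changes only the `encode` component; the rest of the interface is unchanged.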

What Comes Next

While FCN‑LLM marks a significant step forward, several avenues remain open for exploration:

  • Data efficiency: Current training still relies on a large, labeled multi‑site dataset. Semi‑supervised or self‑supervised pre‑training on unlabeled FCNs could further reduce annotation costs.
  • Temporal dynamics: Resting‑state scans capture only static connectivity. Extending the encoder to handle dynamic functional connectivity (time‑varying graphs) would enable reasoning about neural state transitions.
  • Integration with multimodal LLMs: Combining FCN‑LLM with vision‑language models could allow agents to interpret both structural MRI and functional connectivity simultaneously, offering a more holistic brain view.
  • Regulatory and ethical considerations: Deploying brain‑aware agents in clinical settings raises privacy and bias concerns. Transparent attention maps and provenance tracking, as supported by the UBOS compliance suite, will be essential.
  • Broader applications: Beyond clinical diagnosis, FCN‑LLM could power personalized neurofeedback, cognitive training recommendation systems, or large‑scale epidemiological studies that query population‑level brain patterns via natural language.

Researchers and engineers interested in building on this work can start by reproducing the multi‑scale encoder using open‑source GNN libraries, then experiment with instruction sets tailored to their domain. The modular design of FCN‑LLM makes it compatible with the UBOS platform, which provides managed compute, versioned data pipelines, and experiment tracking.

In summary, FCN‑LLM demonstrates that large language models can be taught to “read” brain connectivity graphs, turning complex neuroimaging data into a conversational asset for AI systems. As the field moves toward ever more integrated multimodal foundations, frameworks like FCN‑LLM will likely become the backbone of next‑generation biomedical agents.

For the full technical details, see the original pre‑print: FCN‑LLM: Empower LLM for Brain Functional Connectivity Network Understanding via Graph-level Multi-task Instruction Tuning.

