Adaptive Learning with a Local LLM + RAG

Privacy-first personalized education using local AI and Retrieval-Augmented Generation.

Local LLM · RAG · Adaptive Learning · Moodle · Data Privacy

📄 Published · Journal of Computer Science 2026

Can AI tutoring be both intelligent and private? In this work we answer yes — by combining a locally deployed Large Language Model with Retrieval-Augmented Generation and Bayesian Knowledge Tracing to deliver a fully on-premise, adaptive educational assistant integrated directly into Moodle.


The Problem in One Picture

Traditional cloud-based LLM tools used in education create a fundamental tension: the very data that should remain most protected — student performance records, questions, and struggles — is the data transmitted to external servers.

⚠️ The privacy paradox: cloud-based tutoring systems improve with more student data, yet gathering that data has traditionally meant exposing it to external servers. Local, on-device AI breaks this cycle entirely.

Beyond privacy, most existing tools apply a one-size-fits-all approach: the same explanation, the same difficulty, the same pace — regardless of what the student already knows. Our system addresses both problems simultaneously.


01

System Architecture at a Glance

The full system is built from three tightly coupled modules. Here is the complete data flow:

flowchart TD
    U([🧑‍🎓 Student]) -->|Query| API

    subgraph LOCAL["🔒 Local Institution Server"]
        direction TB
        API[/"🔐 Secure Communication API\n(TLS-encrypted)"/]

        subgraph CORE["Core Processing"]
            direction LR
            LLM["🧠 Local LLM\nDeepSeek-V3 671B\n(GGUF Q4_K_M)"]
            RAG["📚 RAG Module\nDocument Retrieval\n+ Context Injection"]
        end

        FUSION["⚙️ Information Fusion\nAlign tone · difficulty · domain"]
        BKT["📊 Adaptive Learning\nBayesian Knowledge Tracing"]
        LOGS[("🗄️ Storage & Logs\nInteraction History")]
    end

    KB[("📖 Knowledge Base\nTextbooks · Lecture Notes\n· Technical Manuals")]

    API --> LLM
    API --> RAG
    RAG <-->|"Semantic Search"| KB
    LLM --> FUSION
    RAG --> FUSION
    FUSION --> BKT
    BKT -->|"Contextualized\nPersonalized Output"| U
    BKT --> LOGS
    LOGS -.->|"Proficiency Updates"| BKT

    style LOCAL fill:#0f172a,stroke:#334155,color:#e2e8f0
    style CORE fill:#1e293b,stroke:#475569,color:#e2e8f0
    style LLM fill:#1e3a5f,stroke:#38bdf8,color:#7dd3fc
    style RAG fill:#1e3a5f,stroke:#818cf8,color:#a5b4fc
    style BKT fill:#064e3b,stroke:#4ade80,color:#86efac
    style FUSION fill:#2d1b00,stroke:#f59e0b,color:#fde68a
    style LOGS fill:#1e293b,stroke:#475569,color:#94a3b8
    style KB fill:#1a1a2e,stroke:#6366f1,color:#a5b4fc
💡 The parallel LLM + RAG design is key: the LLM provides fluency and reasoning while RAG grounds responses in verified educational content — together they reduce hallucinations while maintaining natural, coherent explanations.


02

The RAG Pipeline in Detail

Retrieval-Augmented Generation works in two phases: first retrieve, then generate. Here is the step-by-step flow as implemented in our system:

sequenceDiagram
    actor S as 🧑‍🎓 Student
    participant API as Secure API
    participant QE as Query Encoder
    participant IDX as Document Index
    participant KB as Knowledge Base
    participant GEN as LLM Generator
    participant FUSE as Info Fusion

    S->>API: Sends question (encrypted)
    API->>QE: Forward query
    QE->>IDX: Encode & search for relevant chunks
    IDX->>KB: Look up textbooks / lecture notes
    KB-->>IDX: Return top-k passages
    IDX-->>QE: Ranked relevant chunks
    QE->>GEN: Query + retrieved context
    Note over GEN: Generates response grounded<br/>in retrieved knowledge
    GEN-->>FUSE: Raw response + context
    FUSE-->>API: Pedagogically tuned output
    API-->>S: Final personalized answer
πŸ“ Student Query Natural language
β†’
πŸ”’ Query Encode Vector embedding
β†’
πŸ” Retrieval Top-k chunks
β†’
🧠 LLM Generate Grounded response
β†’
βš™οΈ Fuse + Adapt BKT tuning
β†’
βœ… Output Personalized answer
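The two phases can be sketched end to end. This toy version ranks chunks with a bag-of-words cosine similarity as a stand-in for the real vector embeddings, and only assembles the grounded prompt rather than calling the local LLM:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real system would use a
    # sentence-embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Phase 1: rank knowledge-base chunks by similarity to the query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # Phase 2: inject the retrieved passages ahead of the question so
    # the generator stays grounded in the knowledge base.
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Use only the context below.\nContext:\n{ctx}\nQuestion: {query}"
```

The same retrieve-then-generate shape holds regardless of the embedding model or index used in production.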

03

Adaptive Learning: Bayesian Knowledge Tracing

The adaptive core of our system is Bayesian Knowledge Tracing (BKT), a probabilistic model that continuously updates its estimate of what a student knows based on every interaction.

🎲 The Four BKT Parameters

BKT tracks mastery through four core probabilities, updated after every student response:

- Initial Mastery p(L₀): probability the student already knows the skill before any practice begins.
- Learning Rate p(T): probability of transitioning from non-mastery to mastery after a practice attempt.
- Slip p(S): probability of answering incorrectly despite having mastered the skill (careless error).
- Guess p(G): probability of answering correctly without having mastered the skill (lucky guess).
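Given a correct or incorrect response, the mastery estimate is updated by Bayes' rule and then nudged by the learning rate. A minimal sketch (the parameter values are illustrative defaults, not the paper's fitted values):

```python
def bkt_update(p_l: float, correct: bool,
               p_t: float = 0.2, p_s: float = 0.1, p_g: float = 0.2) -> float:
    """One Bayesian Knowledge Tracing step.

    p_l is the current mastery estimate p(L); p_t, p_s, p_g are the
    learning, slip, and guess probabilities (illustrative values only).
    """
    if correct:
        # Bayes: a correct answer is either mastery without a slip,
        # or non-mastery with a lucky guess.
        posterior = p_l * (1 - p_s) / (p_l * (1 - p_s) + (1 - p_l) * p_g)
    else:
        # An incorrect answer is either a careless slip despite mastery,
        # or genuine non-mastery without a lucky guess.
        posterior = p_l * p_s / (p_l * p_s + (1 - p_l) * (1 - p_g))
    # The student may also learn from the attempt itself.
    return posterior + (1 - posterior) * p_t
```

With these defaults, a correct answer raises the estimate from 0.25 to 0.68, while an incorrect one lowers it to about 0.23.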

The BKT-estimated mastery probability then directly controls how the LLM is prompted:

flowchart LR
    SCORE{"BKT Mastery\nEstimate"}

    SCORE -->|"p(L) < 0.4\nLow Mastery"| LOW["🟥 Foundational Mode\n• Retrieve basic concepts via RAG\n• Step-by-step LLM explanations\n• Simple analogies & examples"]

    SCORE -->|"0.4 ≤ p(L) < 0.75\nDeveloping"| MID["🟨 Scaffolded Mode\n• Mix of explanation + challenge\n• Guided problem-solving\n• Conceptual connections"]

    SCORE -->|"p(L) ≥ 0.75\nHigh Mastery"| HIGH["🟩 Advanced Mode\n• Concise, technical responses\n• Complex problems & edge cases\n• Cross-concept synthesis"]

    style SCORE fill:#1e3a5f,stroke:#38bdf8,color:#7dd3fc
    style LOW fill:#450a0a,stroke:#f87171,color:#fca5a5
    style MID fill:#2d1b00,stroke:#f59e0b,color:#fde68a
    style HIGH fill:#052e16,stroke:#4ade80,color:#86efac
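The threshold routing above can be sketched as a small dispatcher. Only the 0.4 and 0.75 cut-offs come from the diagram; the prompt fragments are hypothetical stand-ins for the real system prompts:

```python
def select_mode(p_l: float) -> str:
    # Thresholds follow the routing above: low mastery gets foundational
    # scaffolding, high mastery gets concise advanced prompting.
    if p_l < 0.4:
        return "foundational"
    if p_l < 0.75:
        return "scaffolded"
    return "advanced"

def system_prompt(p_l: float) -> str:
    # Hypothetical prompt fragments; the actual prompts are not given here.
    prompts = {
        "foundational": "Explain step by step with simple analogies.",
        "scaffolded": "Mix explanation with guided problem-solving.",
        "advanced": "Be concise and technical; include edge cases.",
    }
    return prompts[select_mode(p_l)]
```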

04

Adaptive Algorithms Compared

Our system draws on several complementary adaptive techniques. Here is how they relate:

mindmap
  root((Adaptive\nLearning\nCore))
    BKT
      Skill mastery tracking
      Probabilistic updates
      Drives LLM prompting
      Bayesian inference
    IRT
      1PL Rasch Model
      2PL Discrimination
      3PL Guessing
      Question difficulty calibration
    RL
      Q-learning rewards
      Dynamic path optimization
      Real-time feedback loop
      Engagement maximization
| Technique | What it Models | Real-time | Interpretable | Our Role |
| --- | --- | --- | --- | --- |
| BKT | Skill mastery over time | ✓ Yes | ✓ High | Primary scoring & LLM prompt control |
| IRT | Item difficulty & student ability | ~ Partial | ✓ High | Question difficulty calibration |
| RL | Optimal exercise sequencing | ✓ Yes | ✗ Low | Learning path optimization |
| Deep Rec. | Content similarity patterns | ~ Partial | ✗ Low | Supplementary material suggestion |
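For the IRT row, the three model variants in the mindmap share one item response function. A sketch of the 3PL form, of which 2PL (c = 0) and the 1PL Rasch model (c = 0, a = 1) are special cases:

```python
import math

def p_correct_3pl(theta: float, a: float = 1.0,
                  b: float = 0.0, c: float = 0.0) -> float:
    """3PL item response function: probability that a student of
    ability theta answers an item of difficulty b correctly, with
    discrimination a and guessing floor c.

    c=0 gives the 2PL model; c=0 and a=1 gives the 1PL (Rasch) model.
    """
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))
```

At theta = b the Rasch probability is exactly 0.5, while the c floor captures the chance of guessing a multiple-choice item correctly.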

05

Pilot Study Results

We ran a pilot study in an Industrial Maintenance & Operational Safety course for Supply Chain Management students — a technically demanding, interdisciplinary context that is a strong test of adaptability.

+15% success rate (65% → 80%)
−20% response time (45 s → 36 s)
+60% daily interactions (5 → 8 per student)
↑ Satisfaction (moderate → high)

Success Rate — Before vs. After

{
  "type": "bar",
  "data": {
    "labels": ["Baseline", "With System"],
    "datasets": [{
      "label": "Student Success Rate (%)",
      "data": [65, 80],
      "backgroundColor": ["rgba(248,113,113,0.7)", "rgba(74,222,128,0.7)"],
      "borderColor": ["#f87171", "#4ade80"],
      "borderWidth": 2,
      "borderRadius": 8
    }]
  },
  "options": {
    "responsive": true,
    "plugins": {
      "legend": { "display": false },
      "title": {
        "display": true,
        "text": "Quiz & Assignment Success Rate (%)",
        "color": "#e2e8f0",
        "font": { "size": 14 }
      }
    },
    "scales": {
      "y": {
        "beginAtZero": true,
        "max": 100,
        "grid": { "color": "rgba(255,255,255,0.07)" },
        "ticks": { "color": "#94a3b8" }
      },
      "x": {
        "grid": { "display": false },
        "ticks": { "color": "#94a3b8" }
      }
    }
  }
}

Engagement Over the Pilot Period

{
  "type": "line",
  "data": {
    "labels": ["Week 1", "Week 2", "Week 3", "Week 4", "Week 5", "Week 6"],
    "datasets": [
      {
        "label": "Daily Interactions (with system)",
        "data": [5.1, 5.8, 6.4, 7.1, 7.6, 8.0],
        "borderColor": "#38bdf8",
        "backgroundColor": "rgba(56,189,248,0.1)",
        "fill": true,
        "tension": 0.4,
        "pointBackgroundColor": "#38bdf8"
      },
      {
        "label": "Baseline (no system)",
        "data": [5.0, 5.0, 5.1, 4.9, 5.0, 5.0],
        "borderColor": "#f87171",
        "backgroundColor": "rgba(248,113,113,0.05)",
        "fill": true,
        "tension": 0.4,
        "borderDash": [5, 5],
        "pointBackgroundColor": "#f87171"
      }
    ]
  },
  "options": {
    "responsive": true,
    "plugins": {
      "title": {
        "display": true,
        "text": "Average Daily Student Interactions Over Time",
        "color": "#e2e8f0",
        "font": { "size": 14 }
      },
      "legend": {
        "labels": { "color": "#94a3b8" }
      }
    },
    "scales": {
      "y": {
        "min": 3,
        "grid": { "color": "rgba(255,255,255,0.07)" },
        "ticks": { "color": "#94a3b8" }
      },
      "x": {
        "grid": { "color": "rgba(255,255,255,0.04)" },
        "ticks": { "color": "#94a3b8" }
      }
    }
  }
}

Simulated BKT Mastery Progression

{
  "type": "line",
  "data": {
    "labels": ["Q1","Q2","Q3","Q4","Q5","Q6","Q7","Q8","Q9","Q10"],
    "datasets": [
      {
        "label": "High-engagement student",
        "data": [0.25, 0.35, 0.50, 0.61, 0.70, 0.78, 0.83, 0.87, 0.90, 0.92],
        "borderColor": "#4ade80",
        "fill": false, "tension": 0.4,
        "pointBackgroundColor": "#4ade80"
      },
      {
        "label": "Average student",
        "data": [0.22, 0.28, 0.35, 0.42, 0.50, 0.57, 0.63, 0.68, 0.72, 0.75],
        "borderColor": "#38bdf8",
        "fill": false, "tension": 0.4,
        "pointBackgroundColor": "#38bdf8"
      },
      {
        "label": "Struggling student",
        "data": [0.20, 0.22, 0.25, 0.30, 0.33, 0.38, 0.44, 0.50, 0.55, 0.60],
        "borderColor": "#f87171",
        "fill": false, "tension": 0.4,
        "pointBackgroundColor": "#f87171"
      }
    ]
  },
  "options": {
    "responsive": true,
    "plugins": {
      "title": {
        "display": true,
        "text": "BKT Mastery Probability p(L) Across 10 Quiz Questions",
        "color": "#e2e8f0",
        "font": { "size": 14 }
      },
      "legend": { "labels": { "color": "#94a3b8" } }
    },
    "scales": {
      "y": {
        "min": 0, "max": 1,
        "grid": { "color": "rgba(255,255,255,0.07)" },
        "ticks": { "color": "#94a3b8" },
        "title": { "display": true, "text": "p(L) β€” mastery probability", "color": "#64748b" }
      },
      "x": {
        "grid": { "color": "rgba(255,255,255,0.04)" },
        "ticks": { "color": "#94a3b8" }
      }
    }
  }
}

06

Privacy: Why Local Deployment Matters

flowchart LR
    subgraph CLOUD["☁️ Cloud LLM (Traditional)"]
        direction TB
        CS([Student]) -->|"⚠️ Raw data leaves campus"| CAPI[External API]
        CAPI --> CLLM[Cloud LLM]
        CLLM --> CDATA[("Third-party\nData Storage")]
        CDATA -.->|"❓ Unknown retention\n& usage policies"| CDATA
    end

    subgraph LOCAL2["🔒 Our Local System"]
        direction TB
        LS([Student]) -->|"🔐 Encrypted local call"| LAPI[Secure API]
        LAPI --> LLLM[Local LLM]
        LLLM --> LDATA[("On-premise\nStorage")]
        LDATA -.->|"✅ Institution controls\nall data"| LDATA
    end

    style CLOUD fill:#2d0a0a,stroke:#f87171
    style LOCAL2 fill:#042f2e,stroke:#4ade80
| Property | Cloud LLM | Our Local System |
| --- | --- | --- |
| Student data location | ✗ External servers | ✓ On-premise only |
| Internet dependency | ✗ Required | ✓ Optional |
| Curriculum customization | ~ Limited | ✓ Full control |
| GDPR / data compliance | ✗ Complex | ✓ Straightforward |
| Response personalization | ~ Generic | ✓ BKT-driven |
| Offline functionality | ✗ No | ✓ Yes |

07

Moodle Plugin Architecture

We shipped the system as a native Moodle block plugin, so instructors can deploy it to any course with a single install.

flowchart TD
    MOODLE["🎓 Moodle LMS"]

    subgraph PLUGIN["block_llm Plugin"]
        direction TB
        BLOCK["block_llm.php\n(Block UI + Form)"]
        VERSION["version.php\n(Plugin metadata)"]
        LANG["lang/en/block_llm.php\n(i18n strings)"]
        BLOCK --- VERSION
        BLOCK --- LANG
    end

    subgraph BACKEND["Local Backend (API)"]
        ROUTER["API Router\n(TLS secured)"]
        RAG2["RAG Pipeline"]
        LLM2["DeepSeek V3\nllama.cpp"]
        BKT2["BKT Engine"]

        ROUTER --> RAG2
        ROUTER --> LLM2
        LLM2 --> BKT2
        RAG2 --> LLM2
    end

    MOODLE --> PLUGIN
    BLOCK -->|"HTTPS POST\n(optional_param sanitized)"| ROUTER
    BKT2 -->|"JSON Response"| BLOCK

    style PLUGIN fill:#1e293b,stroke:#818cf8
    style BACKEND fill:#0f172a,stroke:#38bdf8
    style MOODLE fill:#1e3a5f,stroke:#7dd3fc,color:#e2e8f0

The plugin’s get_llm_response() method (shown in the paper as a stub) is where the real API call replaces the simulation — connecting over a TLS-secured internal endpoint to the locally running inference server.
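As an illustration of that exchange, the request and response could look like the following. The endpoint, JSON field names, and helper functions are hypothetical sketches, not the plugin's actual schema:

```python
import json

# Hypothetical on-premise endpoint (illustrative, not the real path).
BACKEND_URL = "https://llm.internal.example/api/ask"  # TLS-secured

def build_request(user_id: int, course_id: int, question: str) -> str:
    # Payload the Moodle block would POST to the local backend.
    return json.dumps({
        "user_id": user_id,      # Moodle user, for BKT state lookup
        "course_id": course_id,  # scopes RAG retrieval to course materials
        "question": question,
    })

def parse_response(body: str) -> tuple[str, float]:
    # Assumed response shape: the tuned answer plus the updated
    # mastery estimate for display or logging.
    data = json.loads(body)
    return data["answer"], data["mastery"]
```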


08

Hardware & Deployment Stack

CPU: AMD EPYC 7543P — 32 cores / 64 threads
GPU: 2× NVIDIA A6000 Ada (48 GB VRAM each)
RAM: 256 GB DDR5 ECC
Storage: 2 TB NVMe SSD (RAID 1)
Model: DeepSeek-V3 671B — GGUF Q4_K_M (4-bit quantized)
Inference: llama.cpp + custom RAG pipeline
Network: 1 Gbps internal · TLS-secured API
OS: Ubuntu 22.04 LTS
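A launch command for this stack might look roughly like the following sketch; the binary name, model path, and flag values are illustrative and should be checked against the llama.cpp server documentation:

```shell
# Serve the quantized model on the internal network (paths illustrative).
# -ngl offloads layers to the GPUs; -c sets the context window size.
./llama-server \
  -m models/deepseek-v3-Q4_K_M.gguf \
  --host 127.0.0.1 --port 8080 \
  -ngl 99 -c 4096
```

Keeping the server bound to an internal interface behind the TLS API is what preserves the on-premise guarantee.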
graph LR
    subgraph FUTURE["🚀 Future: True Edge Deployment"]
        SD["🖥️ Student Device\n(on-device inference)"]
        SS["🏫 School Server\n(private on-premise)"]
        SD <-->|"Lightweight API"| SS
    end

    subgraph NOW["🔬 Current: Centralized Test Server"]
        CS2["🖧 Central Server\n(AMD EPYC + 2× A6000)"]
    end

    NOW -.->|"Transition path"| FUTURE

    style FUTURE fill:#042f2e,stroke:#4ade80
    style NOW fill:#1e3a5f,stroke:#38bdf8

09

Limitations & Road Ahead

quadrantChart
    title Limitation Severity vs. Implementation Effort to Resolve
    x-axis Low Effort --> High Effort
    y-axis Low Severity --> High Severity
    quadrant-1 "⚡ Fix First"
    quadrant-2 "🔬 Research Needed"
    quadrant-3 "📋 Monitor"
    quadrant-4 "🛠️ Plan for Scale"
    Bias in personalization: [0.75, 0.80]
    Hardware requirements: [0.65, 0.65]
    RAG data fusion quality: [0.45, 0.60]
    Real-time latency: [0.40, 0.55]
    Centralized server: [0.30, 0.45]
    Model hallucinations: [0.70, 0.70]
⚡ Performance Optimization
Reduce latency through hardware acceleration, parallel processing, and smarter caching strategies for RAG retrieval.

🔗 Enhanced Data Fusion
Develop more coherent integration of retrieved passages with generated text to eliminate disconnected or off-topic responses.

⚖️ Bias Detection & Fairness
Embed fairness-aware algorithms to detect and correct content or scoring bias across diverse learner profiles.

📱 True Edge Deployment
Transition from the centralized testing server to on-device or school-network inference for full data sovereignty.

🌍 Scale & Validation
Expand evaluation to multiple disciplines and larger student cohorts to confirm and generalize these findings.

🌐 Multilingual Support
Leverage open-source multilingual models to extend access to students in non-English-language institutions.


Conclusion

This work demonstrates that privacy and pedagogical effectiveness are not competing goals — they can be achieved together. By running entirely on institutional infrastructure, our system gives students a capable AI tutor without surrendering control of their data. By coupling that LLM with RAG and Bayesian Knowledge Tracing, it delivers responses that are not only factually grounded but genuinely calibrated to each learner’s current knowledge state.

The pilot results are encouraging: a +15% improvement in success rates, a 20% reduction in response time, and a 60% boost in daily engagement — all in a technically demanding course context. Qualitative feedback from both students and instructors reinforces these numbers.

The path forward is clear: move toward true edge deployment, harden bias-mitigation mechanisms, and scale validation across disciplines. The foundations are in place for AI-powered education that is simultaneously more intelligent, more equitable, and more trustworthy.


Published in the Journal of Computer Science, 2026, Vol. 22(4): 1145–1157. DOI: 10.3844/jcssp.2026.1145.1157
