# Adaptive Learning with a Local LLM + RAG
Privacy-first personalized education using local AI and Retrieval-Augmented Generation.
## The Problem in One Picture

Traditional cloud-based LLM tools used in education create a fundamental tension: the very data that should remain most protected (student performance records, questions, and struggles) is exactly the data transmitted to external servers.

**The privacy paradox:** cloud-based tutoring systems improve with more student data, but improving with student data requires exposing that data. Local, on-device AI breaks this cycle entirely.

Beyond privacy, most existing tools apply a one-size-fits-all approach: the same explanation, the same difficulty, the same pace, regardless of what the student already knows. Our system addresses both problems simultaneously.
## System Architecture at a Glance
The full system is built from three tightly coupled modules. Here is the complete data flow:
```mermaid
flowchart TD
    U([Student]) -->|Query| API
    subgraph LOCAL["Local Institution Server"]
        direction TB
        API[/"Secure Communication API\n(TLS-encrypted)"/]
        subgraph CORE["Core Processing"]
            direction LR
            LLM["Local LLM\nDeepSeek-V3 671B\n(GGUF Q4_K_M)"]
            RAG["RAG Module\nDocument Retrieval\n+ Context Injection"]
        end
        FUSION["Information Fusion\nAlign tone, difficulty, domain"]
        BKT["Adaptive Learning\nBayesian Knowledge Tracing"]
        LOGS[("Storage & Logs\nInteraction History")]
    end
    KB[("Knowledge Base\nTextbooks, Lecture Notes,\nTechnical Manuals")]
    API --> LLM
    API --> RAG
    RAG <-->|"Semantic Search"| KB
    LLM --> FUSION
    RAG --> FUSION
    FUSION --> BKT
    BKT -->|"Contextualized\nPersonalized Output"| U
    BKT --> LOGS
    LOGS -.->|"Proficiency Updates"| BKT
    style LOCAL fill:#0f172a,stroke:#334155,color:#e2e8f0
    style CORE fill:#1e293b,stroke:#475569,color:#e2e8f0
    style LLM fill:#1e3a5f,stroke:#38bdf8,color:#7dd3fc
    style RAG fill:#1e3a5f,stroke:#818cf8,color:#a5b4fc
    style BKT fill:#064e3b,stroke:#4ade80,color:#86efac
    style FUSION fill:#2d1b00,stroke:#f59e0b,color:#fde68a
    style LOGS fill:#1e293b,stroke:#475569,color:#94a3b8
    style KB fill:#1a1a2e,stroke:#6366f1,color:#a5b4fc
```
The parallel LLM + RAG design is key: the LLM provides fluency and reasoning while RAG grounds responses in verified educational content. Together they reduce hallucinations while maintaining natural, coherent explanations.
## The RAG Pipeline in Detail
Retrieval-Augmented Generation works in two phases: first retrieve, then generate. Here is the step-by-step flow as implemented in our system:
```mermaid
sequenceDiagram
    actor S as Student
    participant API as Secure API
    participant QE as Query Encoder
    participant IDX as Document Index
    participant KB as Knowledge Base
    participant GEN as LLM Generator
    participant FUSE as Info Fusion
    S->>API: Sends question (encrypted)
    API->>QE: Forward query
    QE->>IDX: Encode & search for relevant chunks
    IDX->>KB: Look up textbooks / lecture notes
    KB-->>IDX: Return top-k passages
    IDX-->>QE: Ranked relevant chunks
    QE->>GEN: Query + retrieved context
    Note over GEN: Generates response grounded<br/>in retrieved knowledge
    GEN-->>FUSE: Raw response + context
    FUSE-->>API: Pedagogically tuned output
    API-->>S: Final personalized answer
```
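The retrieve-then-generate loop above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual code: the toy bag-of-words `embed` stands in for a real sentence encoder, the linear scan stands in for the document index, and every function name here is hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a sentence encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Rank every passage against the query and keep the top-k matches.
    q = embed(query)
    ranked = sorted(corpus, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, passages: list[str]) -> str:
    # Inject the retrieved passages as grounding context for the generator.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Use only the context below to answer.\nContext:\n{context}\nQuestion: {query}"
```

In the deployed pipeline, the prompt produced by `build_prompt` would go to the local DeepSeek-V3 instance; here it simply shows how retrieved passages are injected as context before generation.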
## Adaptive Learning: Bayesian Knowledge Tracing
The adaptive core of our system is Bayesian Knowledge Tracing (BKT), a probabilistic model that continuously updates its estimate of what a student knows based on every interaction.
### The Four BKT Parameters

BKT tracks mastery through four core probabilities, updated after every student response: p(L0), the prior probability that the student already masters the skill; p(T), the probability of learning the skill at each practice opportunity; p(S), the probability of a slip (an incorrect answer despite mastery); and p(G), the probability of a correct guess without mastery.

The BKT-estimated mastery probability then directly controls how the LLM is prompted:
```mermaid
flowchart LR
    SCORE{"BKT Mastery\nEstimate"}
    SCORE -->|"p(L) < 0.4\nLow Mastery"| LOW["Foundational Mode\n• Retrieve basic concepts via RAG\n• Step-by-step LLM explanations\n• Simple analogies & examples"]
    SCORE -->|"0.4 ≤ p(L) < 0.75\nDeveloping"| MID["Scaffolded Mode\n• Mix of explanation + challenge\n• Guided problem-solving\n• Conceptual connections"]
    SCORE -->|"p(L) ≥ 0.75\nHigh Mastery"| HIGH["Advanced Mode\n• Concise, technical responses\n• Complex problems & edge cases\n• Cross-concept synthesis"]
    style SCORE fill:#1e3a5f,stroke:#38bdf8,color:#7dd3fc
    style LOW fill:#450a0a,stroke:#f87171,color:#fca5a5
    style MID fill:#2d1b00,stroke:#f59e0b,color:#fde68a
    style HIGH fill:#052e16,stroke:#4ade80,color:#86efac
```
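This routing rests on the standard BKT recursion: after each answer, Bayes' rule updates the mastery posterior, then the learning probability p(T) is applied. A minimal sketch follows; the default parameter values are illustrative placeholders, not fitted values from the pilot, while the 0.4 and 0.75 thresholds mirror the flowchart.

```python
def bkt_update(p_l: float, correct: bool,
               p_t: float = 0.1, p_s: float = 0.1, p_g: float = 0.2) -> float:
    """One BKT step: Bayes-update the mastery estimate p_l on the observed
    answer, then apply the learning-transition probability p_t.
    Parameter defaults are illustrative, not fitted values."""
    if correct:
        # P(mastered | correct): correct despite no mastery only via guessing.
        posterior = p_l * (1 - p_s) / (p_l * (1 - p_s) + (1 - p_l) * p_g)
    else:
        # P(mastered | incorrect): incorrect despite mastery only via slipping.
        posterior = p_l * p_s / (p_l * p_s + (1 - p_l) * (1 - p_g))
    # Credit the practice opportunity itself with probability p_t.
    return posterior + (1 - posterior) * p_t

def prompt_mode(p_l: float) -> str:
    """Map the mastery estimate to a prompting mode (thresholds from the flowchart)."""
    if p_l < 0.4:
        return "foundational"
    if p_l < 0.75:
        return "scaffolded"
    return "advanced"
```

A correct answer moves p(L) up sharply; an incorrect one pulls it down, softened by the slip and guess terms; either way the p(T) transition nudges the estimate upward to credit practice.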
## Adaptive Algorithms Compared
Our system draws on three complementary adaptive techniques. Here is how they relate:
```mermaid
mindmap
  root((Adaptive\nLearning\nCore))
    BKT
      Skill mastery tracking
      Probabilistic updates
      Drives LLM prompting
      Bayesian inference
    IRT
      1PL Rasch Model
      2PL Discrimination
      3PL Guessing
      Question difficulty calibration
    RL
      Q-learning rewards
      Dynamic path optimization
      Real-time feedback loop
      Engagement maximization
```
| Technique | What it Models | Real-time | Interpretable | Our Role |
|---|---|---|---|---|
| BKT | Skill mastery over time | ✅ Yes | ✅ High | Primary scoring & LLM prompt control |
| IRT | Item difficulty & student ability | ~ Partial | ✅ High | Question difficulty calibration |
| RL | Optimal exercise sequencing | ✅ Yes | ❌ Low | Learning path optimization |
| Deep Rec. | Content similarity patterns | ~ Partial | ❌ Low | Supplementary material suggestion |
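As a point of reference for the table, the 3PL item response model it mentions fits in one line; fixing c = 0 gives the 2PL, and additionally fixing a = 1 gives the 1PL (Rasch) model. The parameter values in the comment are hypothetical.

```python
import math

def irt_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability that a student of ability theta answers an item correctly,
    given discrimination a, difficulty b, and guessing floor c (3PL model)."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

# 2PL: set c = 0.  1PL / Rasch: set a = 1 and c = 0.
# Average-ability student, average-difficulty item (hypothetical values):
p = irt_3pl(theta=0.0, a=1.0, b=0.0, c=0.0)  # 0.5 by symmetry
```

The guessing floor c is what makes the 3PL useful for multiple-choice calibration: even a student far below the item's difficulty retains a nonzero success probability.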
## Pilot Study Results

We ran a pilot study in an Industrial Maintenance & Operational Safety course for Supply Chain Management students: a technically demanding, interdisciplinary context that is a strong test of adaptability.
### Success Rate: Before vs. After

```json
{
  "type": "bar",
  "data": {
    "labels": ["Baseline", "With System"],
    "datasets": [{
      "label": "Student Success Rate (%)",
      "data": [65, 80],
      "backgroundColor": ["rgba(248,113,113,0.7)", "rgba(74,222,128,0.7)"],
      "borderColor": ["#f87171", "#4ade80"],
      "borderWidth": 2,
      "borderRadius": 8
    }]
  },
  "options": {
    "responsive": true,
    "plugins": {
      "legend": { "display": false },
      "title": {
        "display": true,
        "text": "Quiz & Assignment Success Rate (%)",
        "color": "#e2e8f0",
        "font": { "size": 14 }
      }
    },
    "scales": {
      "y": {
        "beginAtZero": true,
        "max": 100,
        "grid": { "color": "rgba(255,255,255,0.07)" },
        "ticks": { "color": "#94a3b8" }
      },
      "x": {
        "grid": { "display": false },
        "ticks": { "color": "#94a3b8" }
      }
    }
  }
}
```
### Engagement Over the Pilot Period

```json
{
  "type": "line",
  "data": {
    "labels": ["Week 1", "Week 2", "Week 3", "Week 4", "Week 5", "Week 6"],
    "datasets": [
      {
        "label": "Daily Interactions (with system)",
        "data": [5.1, 5.8, 6.4, 7.1, 7.6, 8.0],
        "borderColor": "#38bdf8",
        "backgroundColor": "rgba(56,189,248,0.1)",
        "fill": true,
        "tension": 0.4,
        "pointBackgroundColor": "#38bdf8"
      },
      {
        "label": "Baseline (no system)",
        "data": [5.0, 5.0, 5.1, 4.9, 5.0, 5.0],
        "borderColor": "#f87171",
        "backgroundColor": "rgba(248,113,113,0.05)",
        "fill": true,
        "tension": 0.4,
        "borderDash": [5, 5],
        "pointBackgroundColor": "#f87171"
      }
    ]
  },
  "options": {
    "responsive": true,
    "plugins": {
      "title": {
        "display": true,
        "text": "Average Daily Student Interactions Over Time",
        "color": "#e2e8f0",
        "font": { "size": 14 }
      },
      "legend": {
        "labels": { "color": "#94a3b8" }
      }
    },
    "scales": {
      "y": {
        "min": 3,
        "grid": { "color": "rgba(255,255,255,0.07)" },
        "ticks": { "color": "#94a3b8" }
      },
      "x": {
        "grid": { "color": "rgba(255,255,255,0.04)" },
        "ticks": { "color": "#94a3b8" }
      }
    }
  }
}
```
### Simulated BKT Mastery Progression

```json
{
  "type": "line",
  "data": {
    "labels": ["Q1","Q2","Q3","Q4","Q5","Q6","Q7","Q8","Q9","Q10"],
    "datasets": [
      {
        "label": "High-engagement student",
        "data": [0.25, 0.35, 0.50, 0.61, 0.70, 0.78, 0.83, 0.87, 0.90, 0.92],
        "borderColor": "#4ade80",
        "fill": false, "tension": 0.4,
        "pointBackgroundColor": "#4ade80"
      },
      {
        "label": "Average student",
        "data": [0.22, 0.28, 0.35, 0.42, 0.50, 0.57, 0.63, 0.68, 0.72, 0.75],
        "borderColor": "#38bdf8",
        "fill": false, "tension": 0.4,
        "pointBackgroundColor": "#38bdf8"
      },
      {
        "label": "Struggling student",
        "data": [0.20, 0.22, 0.25, 0.30, 0.33, 0.38, 0.44, 0.50, 0.55, 0.60],
        "borderColor": "#f87171",
        "fill": false, "tension": 0.4,
        "pointBackgroundColor": "#f87171"
      }
    ]
  },
  "options": {
    "responsive": true,
    "plugins": {
      "title": {
        "display": true,
        "text": "BKT Mastery Probability p(L) Across 10 Quiz Questions",
        "color": "#e2e8f0",
        "font": { "size": 14 }
      },
      "legend": { "labels": { "color": "#94a3b8" } }
    },
    "scales": {
      "y": {
        "min": 0, "max": 1,
        "grid": { "color": "rgba(255,255,255,0.07)" },
        "ticks": { "color": "#94a3b8" },
        "title": { "display": true, "text": "p(L) mastery probability", "color": "#64748b" }
      },
      "x": {
        "grid": { "color": "rgba(255,255,255,0.04)" },
        "ticks": { "color": "#94a3b8" }
      }
    }
  }
}
```
## Privacy: Why Local Deployment Matters

```mermaid
flowchart LR
    subgraph CLOUD["Cloud LLM (Traditional)"]
        direction TB
        CS([Student]) -->|"Raw data leaves campus"| CAPI[External API]
        CAPI --> CLLM[Cloud LLM]
        CLLM --> CDATA[("Third-party\nData Storage")]
        CDATA -.->|"Unknown retention\n& usage policies"| CDATA
    end
    subgraph LOCAL2["Our Local System"]
        direction TB
        LS([Student]) -->|"Encrypted local call"| LAPI[Secure API]
        LAPI --> LLLM[Local LLM]
        LLLM --> LDATA[("On-premise\nStorage")]
        LDATA -.->|"Institution controls\nall data"| LDATA
    end
    style CLOUD fill:#2d0a0a,stroke:#f87171
    style LOCAL2 fill:#042f2e,stroke:#4ade80
```
| Property | Cloud LLM | Our Local System |
|---|---|---|
| Student data location | ❌ External servers | ✅ On-premise only |
| Internet dependency | ❌ Required | ✅ Optional |
| Curriculum customization | ~ Limited | ✅ Full control |
| GDPR / data compliance | ❌ Complex | ✅ Straightforward |
| Response personalization | ~ Generic | ✅ BKT-driven |
| Offline functionality | ❌ No | ✅ Yes |
## Moodle Plugin Architecture
We shipped the system as a native Moodle block plugin, so instructors can deploy it to any course with a single install.
```mermaid
flowchart TD
    MOODLE["Moodle LMS"]
    subgraph PLUGIN["block_llm Plugin"]
        direction TB
        BLOCK["block_llm.php\n(Block UI + Form)"]
        VERSION["version.php\n(Plugin metadata)"]
        LANG["lang/en/block_llm.php\n(i18n strings)"]
        BLOCK --- VERSION
        BLOCK --- LANG
    end
    subgraph BACKEND["Local Backend (API)"]
        ROUTER["API Router\n(TLS secured)"]
        RAG2["RAG Pipeline"]
        LLM2["DeepSeek V3\nllama.cpp"]
        BKT2["BKT Engine"]
        ROUTER --> RAG2
        ROUTER --> LLM2
        LLM2 --> BKT2
        RAG2 --> LLM2
    end
    MOODLE --> PLUGIN
    BLOCK -->|"HTTPS POST\n(optional_param sanitized)"| ROUTER
    BKT2 -->|"JSON Response"| BLOCK
    style PLUGIN fill:#1e293b,stroke:#818cf8
    style BACKEND fill:#0f172a,stroke:#38bdf8
```
The plugin's `get_llm_response()` method (shown in the paper as a stub) is where the real API call replaces the simulation, connecting over a TLS-secured internal endpoint to the locally running inference server.
## Hardware & Deployment Stack

```mermaid
graph LR
    subgraph FUTURE["Future: True Edge Deployment"]
        SD["Student Device\n(on-device inference)"]
        SS["School Server\n(private on-premise)"]
        SD <-->|"Lightweight API"| SS
    end
    subgraph NOW["Current: Centralized Test Server"]
        CS2["Central Server\n(AMD EPYC + 2× A6000)"]
    end
    NOW -.->|"Transition path"| FUTURE
    style FUTURE fill:#042f2e,stroke:#4ade80
    style NOW fill:#1e3a5f,stroke:#38bdf8
```
## Limitations & Road Ahead

```mermaid
quadrantChart
    title Limitation Severity vs. Implementation Effort to Resolve
    x-axis Low Effort --> High Effort
    y-axis Low Severity --> High Severity
    quadrant-1 "Fix First"
    quadrant-2 "Research Needed"
    quadrant-3 "Monitor"
    quadrant-4 "Plan for Scale"
    Bias in personalization: [0.75, 0.80]
    Hardware requirements: [0.65, 0.65]
    RAG data fusion quality: [0.45, 0.60]
    Real-time latency: [0.40, 0.55]
    Centralized server: [0.30, 0.45]
    Model hallucinations: [0.70, 0.70]
```
- **Performance Optimization:** Reduce latency through hardware acceleration, parallel processing, and smarter caching strategies for RAG retrieval.
- **Enhanced Data Fusion:** Develop more coherent integration of retrieved passages with generated text to eliminate disconnected or off-topic responses.
- **Bias Detection & Fairness:** Embed fairness-aware algorithms to detect and correct content or scoring bias across diverse learner profiles.
- **True Edge Deployment:** Transition from the centralized testing server to on-device or school-network inference for full data sovereignty.
- **Scale & Validation:** Expand evaluation to multiple disciplines and larger student cohorts to confirm and generalize these findings.
- **Multilingual Support:** Leverage open-source multilingual models to extend access to students in non-English-language institutions.
## Conclusion
This work demonstrates that privacy and pedagogical effectiveness are not competing goals: they can be achieved together. By running entirely on institutional infrastructure, our system gives students a capable AI tutor without surrendering control of their data. By coupling that LLM with RAG and Bayesian Knowledge Tracing, it delivers responses that are not only factually grounded but genuinely calibrated to each learner's current knowledge state.

The pilot results are encouraging: a 15-percentage-point improvement in success rates (65% to 80%), a 20% reduction in response time, and a 60% boost in daily engagement, all in a technically demanding course context. Qualitative feedback from both students and instructors reinforces these numbers.
The path forward is clear: move toward true edge deployment, harden bias-mitigation mechanisms, and scale validation across disciplines. The foundations are in place for AI-powered education that is simultaneously more intelligent, more equitable, and more trustworthy.
Published in the Journal of Computer Science, 2026, Vol. 22(4): 1145β1157. DOI: 10.3844/jcssp.2026.1145.1157