The increasing use of artificial intelligence in education makes it harder to ensure that generated learning materials are reliable and trustworthy, particularly in complex subjects such as science, technology, engineering, and mathematics. To address this, Md Motaleb Hossen Manik, Md Zabirul Islam, and Ge Wang from Rensselaer Polytechnic Institute introduce SlideChain, a framework that uses blockchain technology to verify the integrity of semantic information extracted from lecture content. The system records the concepts and relationships that AI models identify in lecture slides and anchors them on a blockchain, creating a permanent, tamper-evident record of the AI’s interpretations. The research is a step towards verifiable and reproducible educational pipelines, offering long-term auditability and supporting trustworthy AI-assisted instructional systems. It also includes the first systematic analysis of semantic disagreement between different AI models interpreting the same educational content, which reveals substantial discrepancies in their understanding.

Blockchain Provenance for Multimedia Learning Materials

The paper’s appendix details the resources needed to reproduce the experiments, which explore using blockchain to establish provenance for slide-level semantic extractions from multimedia learning materials. It covers code, data, environment details, and schema. The appendix is organized into clear sections with directory structure descriptions, and it includes a reproducibility checklist. The codebase, including the Solidity smart contract, analysis scripts, and data processing tools, is publicly available on GitHub.

The appendix defines the JSON schema for provenance records, outlining the structure of the semantic extractions and their metadata. The research includes slide-level provenance JSON files for 1,117 slides, alongside documentation of the hardware and software environment, including Python versions and Hardhat. The public availability of the slide data is not explicitly stated, so clarifying how to access it would be beneficial. Overall, the appendix demonstrates a strong commitment to reproducibility and transparency and could serve as a model for similar research papers. Minor additions, such as pinning specific versions of key Python packages and specifying the Hardhat version, would further improve it.

Blockchain Provenance for Semantic Extraction Accuracy

The research team developed SlideChain, a blockchain-backed provenance framework to establish verifiable integrity for semantic extraction from educational materials, addressing inconsistencies in modern vision-language models (VLMs). A meticulously curated dataset, the SlideChain Slides Dataset, comprising 1,117 medical imaging lecture slides, was used to evaluate four state-of-the-art VLMs. Scientists extracted concepts and relational triples from each slide, constructing detailed provenance records documenting the semantic interpretation process. To ensure tamper-evident auditability, cryptographic hashes of these provenance records were anchored onto a local Ethereum Virtual Machine (EVM)-compatible blockchain, creating a permanent record of the semantic analysis.
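The hashing step described above can be illustrated with a minimal sketch. It assumes SHA-256 over a canonical JSON serialization; the article does not say which hash function or serialization SlideChain uses, and the record fields below (`lecture_id`, `slide_id`, `model`, `concepts`, `triples`) are hypothetical names chosen for illustration, not the paper’s schema.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize deterministically: sorted keys and fixed separators."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def provenance_hash(record: dict) -> str:
    """Hex digest that changes if any field of the record changes.

    SHA-256 is an assumption here, used as a stand-in for whatever
    hash the real system anchors on chain.
    """
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Hypothetical record layout; field names are illustrative only.
record = {
    "lecture_id": "lecture_01",
    "slide_id": 3,
    "model": "example-vlm",
    "concepts": ["attenuation", "x-ray"],
    "triples": [["x-ray", "undergoes", "attenuation"]],
}

digest = provenance_hash(record)

# Key order does not affect the digest because keys are sorted before hashing.
reordered = dict(reversed(list(record.items())))
same_digest = provenance_hash(reordered) == digest
```

Only the fixed-length digest would need to go on chain in a design like this, while the full JSON record stays off chain and can be re-hashed later for comparison.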

This approach enables persistent semantic baselines, allowing long-term tracking and verification of model outputs, which is crucial for STEM education. Experiments revealed significant discrepancies between VLMs analyzing the same slides, with pronounced differences in concept overlap and near-zero agreement in relational triples. The researchers also measured the system’s gas usage, throughput, and scalability, and they report perfect tamper detection and deterministic reproducibility. The system delivers a tamper-evident ledger that preserves multimodal semantic features and establishes a foundation for trustworthy, verifiable educational pipelines.

Blockchain Anchors Semantic Provenance of Lecture Slides

Scientists introduced SlideChain, a blockchain-backed system to ensure the integrity and verifiability of semantic information that vision-language models (VLMs) extract from educational materials. The work centers on a dataset of 1,117 medical imaging lecture slides. Four state-of-the-art VLMs (InternVL3, Qwen2-VL, Qwen3-VL, and LLaVA-OneVision) extracted concepts and relational triples from each slide, and the results were compiled into detailed provenance records. Cryptographic hashes of these records are anchored on a local Ethereum Virtual Machine-compatible blockchain, providing a tamper-evident audit trail and establishing persistent semantic baselines. Experiments revealed pronounced discrepancies between the VLMs, with low overlap in extracted concepts and near-zero agreement in relational triples across many slides.
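One way to quantify this kind of cross-model overlap is Jaccard similarity over the sets each model extracts. The sketch below assumes Jaccard as the metric and exact matching after simple lowercasing; the article does not specify the paper’s metric or normalization, and the model names and extractions are toy data.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def pairwise_agreement(outputs: dict) -> dict:
    """Map each unordered model pair to the Jaccard similarity of their sets."""
    return {
        (m1, m2): jaccard(outputs[m1], outputs[m2])
        for m1, m2 in combinations(sorted(outputs), 2)
    }


# Toy extractions for one slide; not taken from the paper.
concepts = {
    "model_a": {normalize(c) for c in ["X-ray", "Attenuation", "Detector"]},
    "model_b": {normalize(c) for c in ["x-ray", "attenuation", "photon"]},
}
triples = {
    "model_a": {("x-ray", "undergoes", "attenuation")},
    "model_b": {("x-ray", "is reduced by", "attenuation")},
}

concept_scores = pairwise_agreement(concepts)  # {("model_a", "model_b"): 0.5}
triple_scores = pairwise_agreement(triples)    # {("model_a", "model_b"): 0.0}
```

In this toy data the two models share half of their concepts, yet a small change in predicate wording already drives exact-match triple agreement to zero.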

The team conducted the first systematic analysis of semantic disagreement, cross-model similarity, and lecture-level variability in multimodal educational content, uncovering substantial incoherence even within well-structured STEM material. A further evaluation of the blockchain layer measured gas usage, throughput, and scalability, and confirmed perfect tamper detection and deterministic reproducibility. Gas usage per registration was constant, and throughput was approximately one slide registered per second. The authors present these results as evidence that SlideChain is a practical and scalable basis for trustworthy, verifiable educational pipelines that support long-term auditability and integrity.

Blockchain Verifies Understanding of Lecture Slides

This research presents SlideChain, a new framework leveraging blockchain technology to establish verifiable integrity for semantic information extracted from educational materials. The team constructed a system capable of recording and permanently storing the conceptual understanding of lecture slides, as determined by several advanced vision-language models. By anchoring cryptographic hashes of these semantic extractions on a blockchain, SlideChain creates a tamper-evident record, ensuring long-term auditability and reproducibility. The core achievement lies in the first systematic analysis of how these models interpret educational content, revealing significant discrepancies in their understanding, even when presented with the same material.

The data demonstrates low overlap in identified concepts and minimal agreement on the relationships between them, highlighting a critical reliability issue in applying these models to sensitive domains like STEM education. While the current implementation uses a local blockchain, the team demonstrated the system’s scalability and ability to detect alterations to the extracted semantic data. Future work could explore deployment on public blockchains and integration with larger educational platforms to further enhance trust and transparency in AI-assisted learning.
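Detecting alterations comes down to re-hashing a record and comparing the result with the digest that was registered earlier. The sketch below uses an in-memory, write-once dictionary as a stand-in for the on-chain registry; the class, its methods, the slide key format, and the use of SHA-256 are all assumptions for illustration and do not reflect the actual contract interface.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Deterministic SHA-256 digest of a JSON-serializable record (assumed scheme)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class InMemoryRegistry:
    """Hypothetical stand-in for an on-chain registry: key -> digest, write-once."""

    def __init__(self):
        self._hashes = {}

    def register(self, key: str, digest: str) -> None:
        # Refuse to overwrite, mimicking an append-only ledger entry.
        if key in self._hashes:
            raise ValueError(f"{key} is already registered")
        self._hashes[key] = digest

    def verify(self, key: str, record: dict) -> bool:
        # True only if the key exists and the re-computed digest matches.
        return self._hashes.get(key) == record_hash(record)


registry = InMemoryRegistry()
key = "lecture_01/slide_003"  # illustrative key format
original = {"model": "example-vlm", "concepts": ["attenuation", "x-ray"]}
registry.register(key, record_hash(original))

tampered = dict(original, concepts=["attenuation", "scatter"])

intact_ok = registry.verify(key, original)    # True
tampered_ok = registry.verify(key, tampered)  # False
```

Because verification needs only the stored digest, the check works on any later copy of the record without trusting the storage it came from.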

👉 More information
🗞 SlideChain: Semantic Provenance for Lecture Understanding via Blockchain Registration
🧠 ArXiv: https://arxiv.org/abs/2512.21684

Slidechain Enables Semantic Verification of Educational Content with Blockchain Registration

Rohail T.
