Quantum Zeitgeist

Tag: LLM Inference

  • Matterhorn Shows 1.42% Energy Reduction Via Masked Time-To-First-Spike Encoding
    Emerging Technology
    by Rohail T., February 4, 2026
  • Profinfer Achieves 4% Performance Gain with Fine-Grained LLM Inference Profiling
    Artificial Intelligence
    by Rohail T., January 30, 2026
  • Rapid-serve Achieves 4.1x LLM Inference Speedup with Intra-GPU Disaggregation
    Technology News
    by Rohail T., January 22, 2026
  • Tokenpowerbench Achieves LLM Inference Power Consumption Analysis, Attributing over 90% of Energy to Prefill and Decode Stages
    Artificial Intelligence
    by Rohail T., December 4, 2025
  • DSD: Distributed Speculative Decoding Achieves 1.1x Throughput Gain with 9.7% Latency Reduction for Edge-Cloud Large Models
    Artificial Intelligence
    by Rohail T., November 28, 2025
  • Beluga: CXL Architecture Achieves 7.35x Performance Boost and 89.6% Efficiency for LLM KVCache Management
    Emerging Technology
    by Rohail T., November 26, 2025
  • T-sar Achieves 86.2x GEMV Throughput and 24.5x GEMM Speedup for CPU-Only Ternary LLM Inference
    Technology News
    by Rohail T., November 20, 2025
  • AMD MI300X GPU Performance Analysis Demonstrates High Performance for Large Language Models with Hundreds of Billions of Parameters
    Technology News
    by Rohail T., November 8, 2025
  • Researchers Accelerate LLM Inference with LiquidGEMM, Achieving 4.94x Speedup Via 4-bit Quantization
    Artificial Intelligence
    by Dr. Donovan, September 4, 2025
  • Researchers Accelerate Arbitrary Precision Large Language Models, Overcoming Computational Limits with Novel Methods
    Artificial Intelligence
    by Dr. Donovan, August 31, 2025
  • Researchers Develop GreenLLM Framework to Minimise GPU Energy for Large Language Model Inference
    Artificial Intelligence
    by Dr. Donovan, August 25, 2025
  • MIRAGE Remaps Model Parameters to Accelerate Large Language Model Inference
    Artificial Intelligence
    by Dr. Donovan, July 17, 2025
  • Qualcomm AI Accelerator Boosts Large Language Model Efficiency
    Technology News
    by Dr. Donovan, July 3, 2025
  • Large Language Model Inference: Systems, Techniques, and Future Challenges
    Artificial Intelligence
    by Dr. Donovan, July 2, 2025
  • Local LLM Inference on Edge Accelerators: Performance and Efficiency Analysis
    Artificial Intelligence
    by Dr. Donovan, June 16, 2025
  • Hybrid CPU-GPU Scheduling Boosts Large Language Model Inference Speed
    Technology News
    by Dr. Donovan, June 6, 2025
  • Faster On-Device AI: Ghidorah Optimises Large Language Model Inference
    Technology News, Artificial Intelligence
    by Dr. Donovan, June 1, 2025

Disclaimer: All material on this website, including information from or attributed to Quantum Zeitgeist or individual authors, has been obtained from sources believed to be accurate as of the date of publication. However, Quantum Zeitgeist makes no warranty as to the accuracy or completeness of the information and assumes no responsibility for its accuracy, efficacy, or use. Information obtained by Quantum Zeitgeist from third parties has not been reviewed for accuracy.

Copyright 2019 to 2025. The Quantum Zeitgeist website is owned and operated by Hadamard LLC, a Wyoming limited liability company.
