Researchers are increasingly focused on distributed computing paradigms to address the growing demands of latency-sensitive applications and data privacy concerns. George Violettas from SYSGO GmbH and the University of Macedonia, together with Lefteris Mamatas and colleagues at the University of Macedonia, present TriCloudEdge, a novel three-tier cloud continuum designed to seamlessly integrate far-edge devices, intermediate edge nodes, and central cloud services. The architecture distinguishes itself through its scalability and its ability to balance computational load against communication efficiency by employing multiple protocols, offering a significant advance over systems that rely on a single versatile protocol. By distributing computational work across tiers and demonstrating AI model adaptation on resource-constrained devices, the work provides valuable insight into the practical implementation of cloud continuums and aligns with current research efforts to optimise performance across diverse cloud levels.
Emerging multi-protocol communication challenges within the Edge-Cloud AI Continuum require standardized interoperability solutions
Scientists are investigating the rapid evolution of the Internet of Things (IoT) and the increasing demand for real-time intelligent decision-making near data sources, which is driving Artificial Intelligence (AI) deployment closer to where events occur. This has led to the emergence of the Edge-Cloud AI Continuum, a distributed architecture integrating decentralized edge resources with central cloud infrastructure to offer scalable, real-time, low-latency, privacy-preserving, end-to-end AI-capable services.
The continuum improves decentralized services and adjusts workload placement, fully utilizing the potential of each device while allocating AI services as needed. Orchestrating this continuum necessitates integrating diverse communication technologies, such as WebSocket for edge-to-far-edge links, custom protocols for data fragmentation, and standard cloud protocols like Message Queuing Telemetry Transport (MQTT) and HTTP.
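The custom fragmentation mentioned above could take many forms; as an illustrative sketch only (the chunk size and header layout below are assumptions, not the paper's actual wire format), a constrained far-edge device might split a camera frame into sequence-tagged fragments small enough for an MQTT- or WebSocket-class link, with the edge node reassembling them:

```python
# Hypothetical fragmentation scheme for constrained far-edge devices:
# split a payload into fixed-size chunks, each prefixed with a
# (message id, sequence number, total count) header, then reassemble.
import struct

CHUNK_SIZE = 1024  # assumed link-friendly fragment size for an ESP32-class device

def fragment(payload: bytes, msg_id: int) -> list[bytes]:
    """Split a payload into self-describing fragments."""
    chunks = [payload[i:i + CHUNK_SIZE] for i in range(0, len(payload), CHUNK_SIZE)]
    total = len(chunks)
    return [struct.pack("!HHH", msg_id, seq, total) + chunk
            for seq, chunk in enumerate(chunks)]

def reassemble(fragments: list[bytes]) -> bytes:
    """Order fragments by sequence number and concatenate the bodies."""
    parts, total = {}, 0
    for frag in fragments:
        _msg_id, seq, total = struct.unpack("!HHH", frag[:6])
        parts[seq] = frag[6:]
    assert len(parts) == total, "missing fragments"
    return b"".join(parts[i] for i in range(total))
```

Because each fragment carries its own header, fragments can arrive out of order over an unreliable link and still be reassembled correctly.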
This multi-protocol approach introduces resource consumption overhead, particularly on constrained devices, and increases maintenance and development complexity. Researchers focus on the capabilities and exploitation of Edge-AI within the cloud continuum, including the execution of AI models at the edge and the associated challenges.
Recent advances in Edge-AI, such as ultra-low-latency, reduced bandwidth demand, and enhanced security, allow local applications to make decisions and act upon them. A critical advantage of the continuum is data locality, minimizing exposure of sensitive data by processing it directly at the far-edge or edge tiers, increasing resilience against network outages and reducing the attack surface.
This aligns with privacy-preserving design and regulatory compliance requirements. Practical implementations of the continuum are based on economies of scale and the distribution of AI tasks over diverse hardware, adjusted to different needs. For example, far-edge devices can perform initial lightweight AI tasks like face detection, while edge devices handle more complex local inference, such as identifying a detected face against a local database.
The Cloud layer provides abundant computational power and storage for advanced analytics, large-scale model training, and federated learning orchestration. This three-tier approach optimises resource allocation and ensures computational costs are distributed efficiently, maximizing performance while minimizing operational costs.
Although several works describe edge, fog, and cloud architectures, there is a lack of end-to-end implementation and evaluation demonstrating how AI workloads can be distributed among the far-edge, edge, and cloud tiers under real hardware constraints. Existing approaches are often conceptual, simulation-only, or focused on a single tier.
Therefore, a holistic evaluation of a three-tier continuum over heterogeneous constrained devices, utilizing diverse communication protocols and realistic AI tasks, remains an open challenge. This paper provides a comprehensive analysis of the architectural principles, capabilities, and data flow within the Edge-Cloud AI Continuum, presenting two applicable, fully functional reference implementations of the three-layer Cloud Continuum and a detailed comparison between them, as detailed in Section III.
Two comparative architectures were implemented: a Multi-Protocol Architecture (based on WebSocket, HTTP, and MQTT) and a Zenoh-Unified Architecture, compared under identical workloads, as described in Section IV. The main contributions of this work are the design and implementation of the TriCloudEdge three-tier Continuum on real hardware, the implementation and comparison of the two communication architectures under identical AI workloads, and a quantitative evaluation of latency, throughput, and pipeline parallelism, supported by figures and metrics and reported in detail in Section IV.
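As a back-of-the-envelope model of the metrics named in the contributions, end-to-end latency is the sum of the per-tier stage times, while the steady-state throughput of a pipelined system is bounded by its slowest stage. The stage timings below are made-up placeholders, not measured values from the paper:

```python
# Simple pipeline-metrics model: latency adds up across tiers, but when
# the tiers work on different frames concurrently, one result emerges
# every max(stage_ms) milliseconds, so the bottleneck stage sets throughput.
def e2e_latency_ms(stage_ms: list[float]) -> float:
    """End-to-end latency of one item passing through every tier."""
    return sum(stage_ms)

def pipelined_throughput_fps(stage_ms: list[float]) -> float:
    """Steady-state frames per second with all tiers pipelined."""
    return 1000.0 / max(stage_ms)

stages = [120.0, 40.0, 250.0]  # far-edge, edge, cloud (illustrative only)
```

Under this model, adding capacity anywhere except the bottleneck tier improves latency but not throughput, which is why a quantitative evaluation must report both.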
Edge-AI has become a strategic priority for European and global ecosystems, with targeted calls from Horizon Europe, initiatives from the World Economic Forum, and investment through public-private partnerships like the Edge-AI Foundation, aiming to embed AI capabilities directly into edge devices. A recent EU study highlights that deploying large-scale AI solely to the cloud is economically and environmentally unsustainable, necessitating efforts to reduce latency, ensure data privacy, and optimize energy consumption by pushing processing to the edge.
Such deployments rely upon efficient, distributed resource orchestration across a multi-layer cloud continuum, maximizing the utilization of each level's characteristics and capabilities. Such an architecture executes lightweight inference tasks on far-edge devices, more advanced analytics at the edge, and computationally intensive tasks in the central cloud.
The European Union recognizes the strategic significance of Edge-AI by funding end-to-end multi-layer approaches, underlining the need for scalable and low-latency AI services. Other global initiatives, such as the Edge-AI Continuum, similarly argue that a distributed, hybrid architecture is necessary to satisfy the latency, privacy, and energy efficiency needs of current AI applications.
Without the continuum topology, Edge-AI may become unsustainable and fragmented, as isolated edge nodes lack the capacity for model training, secure data aggregation, or federated learning. Hence, the continuum is not only a technical achievement but also a prerequisite for maximizing the potential of Edge-AI.
The idea of edge computing originated with “fog computing”, as an extension of cloud computing to the edge of the network, enabling new applications and services. The term “cloudlets” was introduced as elements between resource-constrained mobile devices and resource-rich clouds, noting that mobile hardware will always be resource-poor compared to centralized systems.
The term “Edge” was described as allowing computation at the edge of the network, on downstream data from cloud services and upstream data from IoT services. ETSI specified Mobile Edge Computing (MEC) as an ICT technology with cloud-computing capabilities at the edge of the mobile network, aiming to reduce latency and ensure efficient network operation.
MEC pushes network control and storage functions to the edge, enabling constrained devices to perform computation-intensive tasks. These notions led to the evolution of the Continuum paradigm, integrating three distinct layers: Far Edge, Edge, and Cloud, capable of supporting modern, decentralized AI applications by balancing processing power and responsiveness.
TriCloudEdge implementation and comparative communication architectures reveal significant performance differences
A three-tier cloud continuum, termed TriCloudEdge, was implemented to investigate the distribution of computational challenges across far-edge devices, intermediate edge nodes, and central cloud services. The far edge utilized ultra-low-cost ESP32-CAM microcontrollers, chosen for their integrated camera and wireless connectivity, costing less than €10 and featuring 0.5 to 1 MB of RAM and 4 to 16 MB of Flash storage.
These devices performed lightweight AI tasks such as immediate object or face detection, processing raw data or low-resolution video frames to ensure privacy by preserving data locality. The intermediate edge employed more capable hardware, such as gateways or edge servers like the ESP32-S3, to receive filtered data from the far edge for further analysis like face identification against a local database.
Two communication architectures were constructed and compared: a multi-protocol approach using WebSocket, MQTT, and HTTP, and a unified architecture leveraging the Zenoh protocol. This comparison assessed trade-offs between resource utilization and communication efficiency across the tiers. Data originating from the far edge, for example a detected face, travelled to the edge for identification, and then to the cloud for comparison against a larger database of images.
The cloud tier, utilising high-capacity resources like Amazon Web Services, enabled large-scale analytics, model adaptation, and federated learning. Performance evaluation focused on latency, throughput, and parallelism, with the system designed to account for resource availability and privacy concerns alongside latency.
The work specifically tested AI model adaptation on the far edge, analysing the computational effort required under varying degrees of parallelism. Characteristics of each tier were defined, with the far edge operating on battery power with intermittent connectivity via LoRa or BLE, the edge utilising stable connections like Wi-Fi or 5G, and the cloud benefiting from high-speed backbone infrastructure. Security measures were also tiered, ranging from basic hardware security on the far edge to full-stack, policy-driven security in the cloud.
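The experiment described above varies the degree of parallelism applied to AI tasks. A minimal sketch of that experimental structure, assuming a thread-pool harness and a stand-in workload (the real experiment ran actual model adaptation on far-edge hardware, not this placeholder), might look like:

```python
# Replay a batch of inference tasks under a configurable number of
# workers, so the same workload can be timed at different degrees of
# parallelism. fake_inference is a placeholder for a real model call.
from concurrent.futures import ThreadPoolExecutor

def fake_inference(frame_id: int) -> int:
    # Stand-in for a lightweight model invocation on one frame;
    # pretend the model reports 1 = face present, 0 = no face.
    return frame_id % 2

def run_batch(frame_ids: list[int], workers: int) -> list[int]:
    """Process the batch with the given degree of parallelism."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_inference, frame_ids))
```

Wrapping `run_batch` in a timer at `workers = 1, 2, 4, ...` yields a throughput-versus-parallelism curve of the kind such an evaluation reports; `pool.map` preserves input order, so results stay comparable across worker counts.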
Distributed computation and protocol trade-offs in a three-tier cloud-edge architecture present unique challenges for real-time applications
TriCloudEdge establishes a scalable three-tier cloud continuum in which far-edge devices, intermediate edge nodes, and central cloud services operate in parallel as a unified solution. At the far edge, ultra-low-cost microcontrollers execute lightweight AI tasks, while intermediate edge devices deliver local intelligence and the cloud tier facilitates large-scale analytics, model adaptation, and global identity management.
The architecture employs multiple protocols, including WebSocket, MQTT, and HTTP, and, in a second variant, the versatile Zenoh protocol, to transfer bidirectional data across tiers, balancing computational demands and latency requirements. Comparative implementations reveal the trade-offs between resource utilisation and communication efficiency inherent in these differing approaches.
The research demonstrates the capacity of TriCloudEdge to distribute computational challenges, addressing both latency and privacy concerns within the system. Tests of AI model adaptation were conducted on the far edge, revealing the computational effort involved under conditions of parallelism. This work offers insight into the practical challenges of implementing a continuum, aligning with recent advances in research addressing issues across different cloud levels.
The system supports locality of information, processing sensitive data directly at the far-edge or edge tiers to minimise data exposure and enhance resilience against network outages. At the far edge, constrained devices perform initial lightweight AI tasks such as face detection. Edge devices, with increased capability, then handle more complex local inference, for example, identifying a detected face against a local database.
The cloud layer provides abundant computational power and storage for advanced analytics, large-scale model training, and federated learning orchestration as needed. This three-tier approach optimises resource allocation, maximising performance while minimising operational costs. Two comparative architectures were implemented: a Multi-Protocol Architecture utilising WebSocket, HTTP, and MQTT, and a Zenoh-Unified Architecture.
These were compared under identical workloads to evaluate performance characteristics. The holistic evaluation of a three-tier continuum over heterogeneous constrained devices, utilising diverse communication protocols and realistic AI tasks, remains a key focus of this study.
Scalable three-tier architecture for latency- and privacy-optimised AI deployment enables efficient and secure model serving
TriCloudEdge establishes a scalable architecture integrating far-edge devices, intermediate edge nodes, and central cloud services as a unified system. This three-tier continuum distributes computational demands, addressing limitations in latency and data privacy inherent in traditional cloud-centric approaches.
The system supports diverse bidirectional data transfer utilising multiple protocols, including WebSocket, MQTT, and HTTP, and offers a comparison with architectures employing a single, versatile protocol like Zenoh. Comparative implementations reveal trade-offs between resource utilisation and communication efficiency when selecting different data transfer protocols.
Results demonstrate the feasibility of deploying AI model adaptation on resource-constrained far-edge devices, alongside an assessment of the computational effort required when leveraging parallelism. This work contributes a practical perspective on implementing cloud continuums, aligning with recent advances in addressing challenges across various cloud levels.
The authors acknowledge that utilising a unified protocol such as Zenoh may reduce the complexity of integration and maintenance compared to multi-protocol systems. Future research could focus on optimising the balance between computational load distribution and communication overhead within the TriCloudEdge architecture, potentially exploring automated protocol selection based on application requirements and network conditions. The demonstrated capacity for far-edge AI model adaptation suggests a path towards increasingly intelligent and autonomous edge devices, though further investigation is needed to assess the scalability and robustness of this approach in real-world deployments.
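The automated protocol selection floated above as future work could start as something as simple as a rule table. This is purely speculative: the thresholds and the properties attributed to each protocol below are illustrative assumptions, not findings from the paper:

```python
# Hypothetical rule-based protocol selector: map coarse application
# requirements to one of the transports used in the two architectures.
def select_protocol(latency_budget_ms: float,
                    constrained_device: bool,
                    needs_pubsub: bool) -> str:
    if constrained_device and needs_pubsub:
        return "MQTT"        # lightweight broker-based pub/sub (assumed fit)
    if latency_budget_ms < 50:
        return "Zenoh"       # unified low-latency pub/sub/query (assumed fit)
    if needs_pubsub:
        return "WebSocket"   # persistent bidirectional stream
    return "HTTP"            # simple request/response fallback
```

A production version would presumably replace these static rules with measurements of current network conditions, which is exactly the open question the future-work suggestion raises.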
👉 More information
🗞 TriCloudEdge: A multi-layer Cloud Continuum
🧠 ArXiv: https://arxiv.org/abs/2602.02121
