Researchers are tackling the challenge of ensuring reliable user experiences in complex microservice architectures. Anatoly A. Krasnovsky from Innopolis University, alongside colleagues, presents a novel approach called Emergence-as-Code (EmaC) to govern journey reliability through a combination of declared intent and operational evidence. This work is significant because current Service Level Objective (SLO)-as-code practices focus on the reliability of individual services and do not adequately address the emergent reliability of end-to-end user journeys. EmaC allows teams to define journey objectives in code, automatically synthesising journey models from live system data and generating control mechanisms that help prevent breaches of reliability targets, with the aim of streamlining releases and improving user satisfaction.

Defining and maintaining end-to-end reliability through compiled journey specifications

Scientists have introduced Emergence-as-Code (EmaC), a novel approach to governing the reliability of complex, cloud-native journeys composed of microservices. Current systems excel at defining desired states for individual services, yet user experience hinges on end-to-end journeys whose reliability emerges from intricate interactions between services, routing, and redundancy.
Consequently, journey objectives, such as achieving a p99 checkout time below 400ms, are often maintained separately from code and become misaligned as systems evolve. This research addresses the challenge of ensuring these journey-level objectives remain consistent and achievable in dynamic environments.

The work proposes treating journey reliability not merely as a measurement, but as a compiled artifact derived from explicit intent and continuously updated evidence. An EmaC specification declares a journey’s objective, its control flow, and governance policies, binding these to atomic service level objectives and telemetry data.
A runtime inference component then synthesises a candidate journey model, leveraging operational data like tracing and traffic configuration, and assigning provenance and confidence levels to each element. This model forms the basis for deriving bounded journey SLOs and budgets, accounting for both optimistic independence and pessimistic shared fate assumptions between services.
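
To make the two correlation assumptions concrete, the following is a minimal sketch, not the authors' compiler. It assumes a journey made of sequential steps, each served by one or more interchangeable replicas. Under optimistic independence a redundant step fails only if every replica fails; under pessimistic shared fate the replicas fail together, so redundancy adds nothing. The function names and the availability figures are hypothetical, and multiplying sequential steps in both cases is a simplifying assumption of this sketch.

```python
from math import prod


def redundancy_independent(avails):
    """Optimistic: replicas fail independently, so the step fails only if all fail."""
    return 1.0 - prod(1.0 - a for a in avails)


def redundancy_shared_fate(avails):
    """Pessimistic: replicas share one failure domain, so redundancy adds nothing."""
    return min(avails)


def journey_bounds(steps):
    """Return (pessimistic, optimistic) availability for a sequential journey.

    `steps` is a list of lists; each inner list holds the availabilities of the
    replicas serving one step. Sequential steps are multiplied in both cases
    (an assumption of this sketch).
    """
    optimistic = prod(redundancy_independent(s) for s in steps)
    pessimistic = prod(redundancy_shared_fate(s) for s in steps)
    return pessimistic, optimistic


# Hypothetical checkout journey: gateway, two payment replicas, then orders.
low, high = journey_bounds([[0.999], [0.99, 0.99], [0.999]])
print(f"journey availability lies between {low:.4f} and {high:.4f}")
```

The gap between the two numbers is the uncertainty that comes from not knowing how correlated the payment replicas really are, which is why a bounded SLO is more honest than a single figure.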

From this accepted model, a compiler generates control-plane artifacts, including burn-rate alerts, rollout gates, and action guards, all managed within a Git-based workflow. This allows for auditable, version-controlled governance of emergent behaviour. The researchers provide an anonymised artifact repository containing a runnable example specification and the generated outputs, enabling reproducibility and further exploration.

This approach moves beyond simply monitoring individual service health to actively governing the reliability of the entire user journey. The core thesis is that journey SLOs should be treated as compiled artifacts, built from explicit intent, current evidence of system topology and routing, and atomic SLOs, all under clearly defined correlation assumptions.

EmaC differs from existing tools like OpenSLO and Sloth by compiling journey intent and an inferred interaction model into derived journey SLOs with quantifiable uncertainty. Unlike traditional dependability models, it targets the cloud-native control plane, emitting actionable signals for progressive delivery and automated action. The system does not rely on perfect inference, allowing for explicit modelling alongside data-driven discovery, and treats discrepancies between declared and inferred failure domains as reviewable deltas.

Runtime Journey Model Synthesis and Failure Domain Refinement

A runtime inference component forms the core of the Emergence-as-Code (EmaC) methodology, consuming operational artifacts such as tracing data and traffic configuration to synthesize a candidate journey model. This model details the effective operator graph, branch probabilities, redundancy sets, and potential failure domains, with each element annotated to indicate its provenance and associated confidence level.
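
One way to picture such an annotated model is sketched below. The source does not describe the authors' schema, so the class names, field names, element kinds, and the 0.8 review threshold are all assumptions made for illustration. Each element records where it came from and how much it is trusted, so that weakly supported inferences can be routed to a human reviewer.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, List


class Provenance(Enum):
    DECLARED = "declared"  # stated by a human in the specification
    INFERRED = "inferred"  # synthesised from traces or traffic configuration


@dataclass(frozen=True)
class Element:
    kind: str          # "edge", "branch", "redundancy_set" or "failure_domain"
    value: Any
    provenance: Provenance
    confidence: float  # 0.0 (a guess) to 1.0 (certain)


@dataclass
class JourneyModel:
    name: str
    elements: List[Element] = field(default_factory=list)

    def needs_review(self, threshold: float = 0.8) -> List[Element]:
        """Inferred elements whose confidence falls below the threshold."""
        return [
            e for e in self.elements
            if e.provenance is Provenance.INFERRED and e.confidence < threshold
        ]


# Hypothetical candidate model for a checkout journey.
model = JourneyModel("checkout", [
    Element("edge", ("gateway", "payments"), Provenance.DECLARED, 1.0),
    Element("branch", (("payments", "fraud-check"), 0.3), Provenance.INFERRED, 0.95),
    Element("redundancy_set", frozenset({"payments-a", "payments-b"}),
            Provenance.INFERRED, 0.6),
])
flagged = model.needs_review()
```

Here only the inferred redundancy set is flagged, because it is the one element that is both machine-derived and below the chosen confidence threshold.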

The system does not mandate perfect inference, allowing for explicit models during initial setup, while discovery processes serve to calibrate and detect drift over time. Failure domains are central to the EmaC approach, being declared within the initial intent specification and refined through evidence gathered during runtime.

Discrepancies between declared and inferred domains are flagged for review, with the more conservative, correlated assignment used for evaluation until the mismatch is resolved. From this accepted model, the EmaC compiler derives computed journey service level objectives (SLOs) alongside confidence bounds, allocating error budgets to services and domains based on the inferred relationships.
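
The conservative handling of mismatches can be sketched as follows. This is an illustrative reading of the rule, not the paper's algorithm: two services are treated as correlated if either the declared or the inferred assignment places them in the same failure domain, and any disagreement between the two views is recorded as a delta for review. The service and domain names are hypothetical.

```python
from itertools import combinations


def reconcile(declared, inferred):
    """Compare declared and inferred failure-domain assignments pair by pair.

    Both arguments map service name -> domain label. Returns (deltas, correlated):
    `deltas` lists the pairs on which the two views disagree, and `correlated`
    is the set of pairs treated as sharing fate under the conservative rule.
    """
    services = sorted(set(declared) & set(inferred))
    deltas, correlated = [], set()
    for a, b in combinations(services, 2):
        shared_declared = declared[a] == declared[b]
        shared_inferred = inferred[a] == inferred[b]
        if shared_declared != shared_inferred:
            deltas.append((a, b))      # reviewable mismatch
        if shared_declared or shared_inferred:
            correlated.add((a, b))     # conservative: assume shared fate
    return deltas, correlated


# Declared: replicas sit in different zones. Inferred: they share a node pool.
declared = {"payments-a": "zone-1", "payments-b": "zone-2", "orders": "zone-3"}
inferred = {"payments-a": "pool-x", "payments-b": "pool-x", "orders": "pool-y"}
deltas, correlated = reconcile(declared, inferred)
```

In this example the two payment replicas are flagged and evaluated as correlated until someone resolves the mismatch, so the journey SLO is not computed from redundancy that may not exist.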

Subsequently, the compiler generates burn-rate alerts and multi-window policies, alongside progressive-delivery gates and action guards, all designed to enforce the defined journey reliability objectives. These control-plane artifacts are structured for review within a Git-based workflow, ensuring auditability and version control.
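
For readers unfamiliar with burn-rate alerting, the sketch below shows the general technique rather than EmaC's generated output. Burn rate is the observed error ratio divided by the error ratio the SLO permits, and a multi-window policy fires only when both a long and a short window exceed the threshold, which filters out brief spikes. The 30-day period, one-hour window, and 2% budget fraction are conventional values from common SRE practice and are assumptions here, as are the function names.

```python
def burn_rate(error_ratio, slo_target):
    """How many times faster than allowed the error budget is being spent."""
    return error_ratio / (1.0 - slo_target)


def threshold_for(budget_fraction, period_hours, window_hours):
    """Burn rate that spends `budget_fraction` of the budget within one window."""
    return budget_fraction * period_hours / window_hours


def should_page(long_errors, short_errors, slo_target, threshold):
    """Multi-window rule: both the long and the short window must be burning fast."""
    return (burn_rate(long_errors, slo_target) >= threshold
            and burn_rate(short_errors, slo_target) >= threshold)


# Spending 2% of a 30-day budget in one hour corresponds to a burn rate near 14.4.
threshold = threshold_for(0.02, period_hours=30 * 24, window_hours=1)

# Hypothetical journey SLO of 99.9% with a sustained 2% error ratio.
sustained = should_page(0.02, 0.016, slo_target=0.999, threshold=threshold)
# Same long window, but the short window has already recovered.
recovered = should_page(0.02, 0.001, slo_target=0.999, threshold=threshold)
```

Requiring the short window as well means the alert clears soon after the problem stops, instead of staying active until the long window drains.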

An anonymized artifact repository provides a runnable example specification and the corresponding generated outputs, facilitating reproducibility and further research into governing emergent system behaviour. The methodology specifically addresses the challenge of maintaining consistent journey SLOs in dynamic microservice environments, where topology, routing, and dependencies evolve rapidly.

Journey Reliability Governed Through Intent Declaration and Operational Inference

Emergence-as-Code establishes a framework for managing journey reliability through intent and evidence. From the accepted journey model, a compiler derives bounded SLOs and budgets, utilising correlation assumptions ranging from optimistic independence to pessimistic shared fate.
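
How a journey budget might be divided among services is illustrated below. The source does not state the allocation rule, so the proportional split used here is purely an assumption. It distributes the journey's error budget according to hypothetical weights, such as how often each service is visited, and relies on the first-order approximation that small error budgets in a sequence add up.

```python
def allocate_budget(journey_slo, weights):
    """Split the journey error budget across services in proportion to weights.

    `weights` maps service name -> relative share, for example the expected
    number of visits per journey. Returns service name -> error budget.
    """
    total_budget = 1.0 - journey_slo
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return {svc: total_budget * w / total_weight for svc, w in weights.items()}


# Hypothetical 99.5% journey; recommendations is called on half of all journeys.
budgets = allocate_budget(
    0.995, {"gateway": 1.0, "payments": 1.0, "recommendations": 0.5}
)
for service, budget in budgets.items():
    print(f"{service}: may fail {budget:.4%} of requests")
```

A pessimistic shared-fate assumption would call for a different split, since correlated services effectively draw on the same budget.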

Control-plane artifacts, including burn-rate alerts, rollout gates, and action guards, are subsequently generated and made reviewable within a Git-based workflow. This approach facilitates a declarative and governable system for managing complex service interactions. The research leverages existing tools like Kubernetes Custom Resources, OpenSLO, OpenTelemetry, and Prometheus for implementation and monitoring.
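
A rollout gate of the kind mentioned above could, in its simplest form, look like the following sketch. The decision rule, the 25% reserve, and all names are hypothetical and are not taken from the paper: the gate simply blocks a release when too little of the journey's error budget remains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    allow: bool
    reason: str


def rollout_gate(budget_total, budget_spent, min_remaining_fraction=0.25):
    """Allow a rollout only while enough of the error budget is still unspent."""
    if budget_total <= 0:
        return GateDecision(False, "no error budget defined")
    remaining = (budget_total - budget_spent) / budget_total
    if remaining < min_remaining_fraction:
        return GateDecision(False, f"only {remaining:.0%} of budget left")
    return GateDecision(True, f"{remaining:.0%} of budget left")


# Hypothetical journey budget of 0.5% of requests over the SLO period.
healthy = rollout_gate(budget_total=0.005, budget_spent=0.001)
exhausted = rollout_gate(budget_total=0.005, budget_spent=0.0045)
```

Returning a reason alongside the decision keeps the gate's outcome legible in a review workflow, where someone may need to see why a release was held.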

An anonymised artifact repository is provided, containing a runnable example specification and the resulting generated outputs for detailed examination. This allows for practical application and validation of the proposed Emergence-as-Code methodology in real-world scenarios.

Operationalising end-to-end reliability through intent and evidence

Emergence-as-Code represents a new approach to defining and governing the reliability of complex service journeys. Current practices often treat reliability as a local property of individual microservices, failing to account for the emergent behaviours arising from their interactions. This work proposes a system where journey reliability is declared through intent and substantiated by operational evidence, enabling computable end-to-end service level objectives.

The core of Emergence-as-Code is a specification that details journey intent, including objectives, control-flow, and permitted actions, linked to telemetry and service level objectives. A runtime component then synthesises a journey model from operational data, such as tracing and traffic configuration, establishing confidence levels and provenance.

A compiler translates this model into actionable governance artefacts, including alerts, rollout gates, and action constraints, all managed within a version control system. An anonymised example specification and generated outputs are publicly available for examination. The authors acknowledge that accurately modelling correlations between microservices remains a challenge, with the system offering both optimistic and pessimistic approaches to dependency assumptions.

Limitations also exist in the scope of the current implementation, which focuses on generating governance artefacts rather than fully automated self-adaptation. Future research may explore extending the system to incorporate closed-loop control and more sophisticated modelling of complex dependencies, ultimately contributing to more self-governing and dependable systems.

👉 More information
🗞 Emergence-as-Code for Self-Governing Reliable Systems
🧠 ArXiv: https://arxiv.org/abs/2602.05458

Code Now Governs System Reliability, Ensuring Journeys Stay on Track for Users

Rohail T.