4. Component Level Architecture

4.1. Introduction to Component Level Architecture

This section describes in detail the Reference Architecture (RA2) for Kubernetes-based cloud infrastructure in terms of the functional capabilities and how they relate to the Reference Model (RM) requirements [1], that is, how the infrastructure profiles are determined, documented, and delivered.

The specifications defined in this section will be detailed with unique identifiers, which will follow the pattern: ra2.<section>.<index>, for example, ra2.ch.001 for the first requirement in the Kubernetes Node section. These specifications will then be used as requirements input for the Reference Implementation based on RA2 specifications (RI2) and any vendor or community implementations.
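The identifier pattern lends itself to a mechanical check. The following Python sketch assumes, based only on the identifiers that appear in this chapter (for example, ra2.ch.001 and ra2.ntw.017), that the section part is a short lowercase token and that the index is exactly three digits. The names SPEC_ID and parse_spec_id are illustrative and are not defined by RA2.

```python
import re

# Assumed shape: "ra2.<section>.<index>", for example "ra2.ch.001".
# The three-digit index is inferred from the examples, not mandated by RA2.
SPEC_ID = re.compile(r"^ra2\.(?P<section>[a-z]+)\.(?P<index>\d{3})$")


def parse_spec_id(spec_id: str) -> tuple[str, int]:
    """Return (section, index), or raise ValueError if the identifier is malformed."""
    match = SPEC_ID.match(spec_id)
    if match is None:
        raise ValueError(f"not an RA2 specification identifier: {spec_id!r}")
    return match.group("section"), int(match.group("index"))
```

A call such as parse_spec_id("ra2.ch.001") splits the identifier into its section token and numeric index, which can be useful when cross-checking requirement traces.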

The Kubernetes Reference Architecture figure below shows the architectural components that are described in the subsequent sections of this chapter.


Figure 4.1 Kubernetes Reference Architecture

4.2. Kubernetes Node

This section describes the configuration that will be applied to the physical or virtual machine and its Operating System. For a Kubernetes Node to be conformant with the Reference Architecture, it must be implemented according to the following specifications:

Table 4.1 Node Specifications




Requirement Trace

Reference Implementation Trace


Huge pages

For the node’s profile to qualify as high-performance, it must be possible to enable Huge pages (2048KiB and 1048576KiB) within the Kubernetes Node OS, exposing schedulable resources hugepages-2Mi and hugepages-1Gi.

infra.com.cfg.004 Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section Virtual Compute

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Introduction


SR-IOV capable NICs

For the node’s profile to qualify as high-performance, the physical machines on which the Kubernetes Nodes run must be equipped with NICs that are SR-IOV-capable.

e.cap.013 Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Performance Optimisation Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


SR-IOV Virtual Functions

For the node’s profile to qualify as high-performance, SR-IOV virtual functions (VFs) must be configured within the Kubernetes Node OS, as the SR-IOV Device Plugin does not manage the creation of these VFs.

e.cap.013 Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Performance Optimisation Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


CPU Simultaneous Multi-Threading (SMT)

If SMT is supported, then it must be enabled in the BIOS on the physical machine on which the Kubernetes Node runs.

infra.hw.cpu.cfg.004 Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section Compute Resources

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


CPU Allocation Ratio - VMs

For Kubernetes nodes running as Virtual Machines, the CPU allocation ratio between the vCPU and the physical CPU core must be 1:1.


CPU Allocation Ratio - Pods

To ensure the CPU allocation ratio between the vCPU and the physical CPU core is 1:1, the sum of the CPU requests and limits by the containers in the Pod specifications must remain less than the allocatable quantity of CPU resources (that is, requests.cpu < allocatable.cpu and limits.cpu < allocatable.cpu).

infra.com.cfg.001 Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section Virtual Compute Profiles

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements



To support IPv4/IPv6 dual-stack networking, the Kubernetes Node OS must support and be allocated routable IPv4 and IPv6 addresses.


Physical CPU Quantity

The physical machines on which the Kubernetes nodes run must be equipped with at least two (2) physical sockets, each with at least 20 CPU cores.

infra.hw.cpu.cfg.001 and infra.hw.cpu.cfg.002 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 8, section Telco Edge Cloud: Infrastructure Profiles

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Physical Storage

The physical machines on which the Kubernetes nodes run should be equipped with solid-state drives (SSDs).

infra.hw.stg.ssd.cfg.002 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section Storage Configurations

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Local Filesystem Storage Quantity

The Kubernetes nodes must be equipped with local filesystem capacity of at least 320 GB for unpacking and executing containers.


Extra filesystem storage should be provisioned to cater for any overheads required by the Operating System and any required OS processes, such as the container runtime, Kubernetes agents, and so on.

e.cap.003 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Resource Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Virtual Node CPU Quantity

If using VMs, the Kubernetes nodes must be equipped with at least 16 vCPUs.


Extra CPU capacity should be provisioned to cater for any overheads required by the Operating System and any required OS processes, such as the container runtime, Kubernetes agents, and so on.

e.cap.001 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Resource Capabilities


Kubernetes Node RAM Quantity

The Kubernetes nodes must be equipped with at least 32 GB of RAM.


Extra RAM capacity should be provisioned to cater for any overheads required by the Operating System and any required OS processes, such as the container runtime, Kubernetes agents, and so on.

e.cap.002 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Resource Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Physical NIC Quantity

The physical machines on which the Kubernetes nodes run must be equipped with at least four (4) Network Interface Card (NIC) ports.

infra.hw.nic.cfg.001 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section NIC configurations

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Physical NIC Speed - Basic Profile

The speed of the NIC ports housed in the physical machines on which the Kubernetes Nodes run for workloads matching the Basic Profile must be at least 10 Gbps.

infra.hw.nic.cfg.001 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section NIC configurations

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Physical NIC Speed - High Performance Profile

The speed of the NIC ports housed in the physical machines on which the Kubernetes nodes run for workloads matching the high-performance profile must be at least 25 Gbps.

infra.hw.nic.cfg.001 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section NIC configurations

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 3, section Infrastructure Requirements


Physical PCIe slots

The physical machines on which the Kubernetes nodes run must be equipped with at least eight (8) Gen3.0 PCIe slots, each with at least eight (8) lanes.


Immutable infrastructure

Whether physical or virtual machines are used, the Kubernetes node must not be changed after it is instantiated. New changes to the Kubernetes node must be implemented as new node instances. This covers any changes from the BIOS, through the Operating System, to running processes and all associated configurations.

gen.cnt.02 from Reference Architecture (RA1) for OpenStack based cloud infrastructure [56] Chapter 2, section General Recommendations

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure



Node Feature Discovery [57] must be used to advertise the detailed software and hardware capabilities of each node in the Kubernetes Cluster.


Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


AF_XDP Zero Copy capable netdevs

AF_XDP Zero Copy capable netdevs (dependent on AF_XDP Zero Copy NIC driver) must be available in a compliant Kubernetes worker node if optional AF_XDP is used.

e.cap.025 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed infrastructure capabilities



For Kubernetes nodes belonging to the rt-tsn (ref. Reference Model for Cloud Infrastructure (RM) [1] Chapter 2) flavour, Real-Time versions and/or configurations must be used in the BIOS, kernel, and OS services.

e.cap.026 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed infrastructure capabilities
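The CPU Allocation Ratio - Pods specification in Table 4.1 reduces to two inequalities: requests.cpu < allocatable.cpu and limits.cpu < allocatable.cpu. The following Python sketch shows one way to evaluate them for the containers scheduled on a node. It assumes that each container is given as a dictionary that mirrors the resources stanza of a Pod specification, and that CPU quantities use either whole or fractional cores ("2", "0.5") or the millicore suffix ("500m"). The function names are illustrative only.

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("500m", "2", "0.5") to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def fits_allocatable(containers: list[dict], allocatable_cpu: str) -> bool:
    """True if summed CPU requests and summed CPU limits are both below allocatable."""
    allocatable = cpu_to_millicores(allocatable_cpu)
    requests = sum(
        cpu_to_millicores(c["resources"]["requests"]["cpu"]) for c in containers
    )
    limits = sum(
        cpu_to_millicores(c["resources"]["limits"]["cpu"]) for c in containers
    )
    # Both comparisons are strict, matching the "<" used in the specification.
    return requests < allocatable and limits < allocatable
```

Converting to integer millicores before summing avoids floating-point rounding when fractional quantities are mixed with whole cores.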

4.3. Node Operating System

For a Host OS to be compliant with this Reference Architecture, it must meet the following requirements:

Table 4.2 Operating System requirements




Requirement Trace

Reference Implementation Trace


Linux Distribution

A deb- or rpm-compatible distribution of Linux must be used for the control plane nodes. It can also be used for worker nodes.




Linux kernel version

A version of the Linux kernel that is compatible with container runtimes and kubeadm. kubeadm has been chosen as the baseline because it focuses on installing and managing the lifecycle of Kubernetes and nothing else, so it is easily integrated into higher-level tooling for the full lifecycle management of the infrastructure, cluster add-ons, and so on.




Windows server

Windows Server can be used for worker nodes, subject to its limitations.




Disposable OS

In order to support gen.cnt.02 in Kubernetes Architecture Requirements (immutable infrastructure), the Host OS must be disposable, meaning that the configuration of the Host OS (and associated infrastructure such as VM or bare metal server) must be consistent. For example, the system software and the configuration of that software must be identical, apart from those areas of configuration that must be different, such as IP addresses and hostnames.




Automated deployment

This approach to configuration management supports lcm.gen.01 (automated deployments).



Table 4.3 lists the kernel versions that comply with this Reference Architecture specification.

Table 4.3 Operating System versions

OS Family

Kernel Version(s)




The overlay filesystem snapshotter, used by default by containerd, uses features that were finalized in the 4.x kernel series.


>= 4.18

If using optional AF_XDP (see ra2.ch.019).



If using optional Real-Time (see ra2.ch.020).


1809 (10.0.17763)

For worker nodes only.
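Table 4.3 gives a minimum Linux kernel version of 4.18 when the optional AF_XDP capability is used. The following Python sketch shows how such a minimum might be checked against a kernel release string. It assumes that the release string begins with major.minor, as in "5.15.0-91-generic"; the function name kernel_at_least and the sample strings are illustrative.

```python
import re


def kernel_at_least(release: str, minimum: tuple[int, int]) -> bool:
    """Compare the major.minor of a kernel release string with a minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # Compare as integer tuples so that 4.9 sorts below 4.18.
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

On a running node, the release string could be obtained with platform.release() from the Python standard library and passed in as kernel_at_least(platform.release(), (4, 18)).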

4.4. Kubernetes

For the Kubernetes components to be conformant with the Reference Architecture, they must be implemented according to the following specifications:

Table 4.4 Kubernetes Specifications




Requirement Trace

Reference Implementation Trace


Kubernetes conformance

The Kubernetes distribution, product, or installer used in the implementation must be listed in the Kubernetes Distributions and Platforms document [58] and marked (X) as conformant for the Kubernetes version defined in Required component versions.

gen.cnt.03 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Highly available etcd

An implementation must consist of either three, five or seven nodes running the etcd service (can be colocated on the control plane nodes, or can run on separate nodes, but not on worker nodes).

gen.rsl.02 in Kubernetes Architecture Requirements, gen.avl.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Highly available control plane

An implementation must consist of at least one control plane node per availability zone or fault domain to ensure the high availability and resilience of the Kubernetes control plane services.


Control plane services

A control plane node must run at least the following Kubernetes control plane services: kube-apiserver, kube-scheduler and kube-controller-manager.

gen.rsl.02 in Kubernetes Architecture Requirements, gen.avl.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Highly available worker nodes

An implementation must consist of at least one worker node per availability zone or fault domain to ensure the high availability and resilience of workloads managed by Kubernetes.

gen.rsl.01 in Kubernetes Architecture Requirements, gen.avl.01 in Kubernetes Architecture Requirements, kcm.gen.02 in Kubernetes Architecture Requirements


Kubernetes API Version

An implementation must use a Kubernetes version as per the subcomponent versions table in Required component versions. In alignment with the Kubernetes Supported versions [59], the difference between the Kubernetes release of the control plane nodes and the Kubernetes release of the worker nodes must be at most 3 releases (that is, an n-3 skew).


NUMA support

When hosting workloads matching the high-performance profile, the TopologyManager and CPUManager feature gates must be enabled and configured in the kubelet: --feature-gates="…,TopologyManager=true,CPUManager=true" --topology-manager-policy=single-numa-node --cpu-manager-policy=static


The TopologyManager feature is enabled by default in Kubernetes v1.18 and later, and the CPUManager feature is enabled by default in Kubernetes v1.10 and later.

e.cap.007 in Cloud Infrastructure Software Profile Capabilities, infra.com.cfg.002 in Reference Model for Cloud Infrastructure (RM) [1], e.cap.013 in Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Performance Optimisation Capabilities


DevicePlugins feature gate

When hosting workloads matching the high-performance profile, the DevicePlugins feature gate must be enabled: --feature-gates="…,DevicePlugins=true,…"


The DevicePlugins feature is enabled by default in Kubernetes v1.10 or later.

Various, e.g. e.cap.013 in Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Performance Optimisation Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


System resource reservations

To avoid resource starvation issues on the nodes, the implementation of the architecture must reserve compute resources for system daemons and Kubernetes system daemons such as kubelet, container runtime, and so on. Use the following kubelet flag: --reserved-cpus=[a-z], using two of a-z to reserve 2 SMT threads.

i.cap.014 in Cloud Infrastructure Software Profile Capabilities


CPU pinning

When hosting workloads matching the high-performance profile, in order to support CPU pinning, the kubelet must be started with the --cpu-manager-policy=static option.


Only containers in Guaranteed pods (where CPU resource requests and limits are identical) that are configured with positive-integer CPU requests will take advantage of this. All other pods will run on CPUs in the remaining shared pool.

infra.com.cfg.003 in Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section



To support IPv6 and IPv4, the IPv6DualStack feature gate must be enabled on various components (requires Kubernetes v1.16 or later):

kube-apiserver: --feature-gates="IPv6DualStack=true"

kube-controller-manager: --feature-gates="IPv6DualStack=true" --cluster-cidr=<IPv4 CIDR>,<IPv6 CIDR> --service-cluster-ip-range=<IPv4 CIDR>,<IPv6 CIDR> --node-cidr-mask-size-ipv4 | --node-cidr-mask-size-ipv6 (defaults to /24 for IPv4 and /64 for IPv6)

kubelet: --feature-gates="IPv6DualStack=true"

kube-proxy: --cluster-cidr=<IPv4 CIDR>,<IPv6 CIDR> --feature-gates="IPv6DualStack=true"


The IPv6DualStack feature is enabled by default in Kubernetes v1.21 or later.

inf.ntw.04 in Kubernetes Architecture Requirements


Anuket profile labels

To clearly identify which worker nodes are compliant with the different profiles defined by Anuket, the worker nodes must be labeled according to the following pattern: an anuket.io/profile/basic label must be set to true on the worker node if it can fulfill the requirements of the basic profile and an anuket.io/profile/network-intensive label must be set to true on the worker node if it can fulfill the requirements of the high-performance profile. The requirements for both profiles can be found in Architecture Requirements.


Kubernetes APIs

Kubernetes Alpha APIs [60] are recommended only for testing. Therefore, all Alpha APIs must be disabled, except for those required by RA2 Ch4 Specifications (currently NFD).


Kubernetes APIs

Backward compatibility of all supported GA APIs of Kubernetes must be supported.


Security groups

Kubernetes must support the NetworkPolicy feature.


Publishing Services (ServiceTypes)

Kubernetes must support LoadBalancer Service (ServiceTypes) [61].


Publishing Services (ServiceTypes)

Kubernetes must support Ingress [62].


Publishing Services (ServiceTypes)

Kubernetes should support NodePort Service (ServiceTypes) [61].

inf.ntw.17 in Kubernetes Architecture Requirements


Publishing Services (ServiceTypes)

Kubernetes should support ExternalName Service (ServiceTypes) [61].


Kubernetes APIs

Kubernetes Beta APIs must be disabled, except for existing APIs as of Kubernetes 1.24 and only when a stable GA of the same version doesn’t exist, or for APIs listed in RA2 Ch6 list of Mandatory API Groups.

int.api.04 in Kubernetes Architecture Requirements


TLS Certificate management for workloads

Cert-manager [35] should be supported and integrated with a PKI certificate provider for workloads to request/renew TLS certificates.

int.api.04 in Kubernetes Architecture Requirements
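The CPU pinning specification in Table 4.4 states that only containers in Guaranteed pods with positive-integer CPU requests benefit from the static CPU manager policy. The following Python sketch illustrates that rule under stated assumptions: containers are plain dictionaries that mirror the Pod specification, the Guaranteed class is taken to require both CPU and memory limits to equal the requests for every container (as Kubernetes defines it), and quantities are compared as literal strings, so "1" and "1000m" would be treated as different. The function names are illustrative.

```python
from decimal import Decimal


def cpu_cores(quantity: str) -> Decimal:
    """Convert a CPU quantity such as "2" or "1500m" to cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def is_guaranteed(containers: list[dict]) -> bool:
    """True if every container sets CPU and memory limits equal to its requests."""
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            if name not in limits:
                return False
            # An omitted request defaults to the limit.
            if requests.get(name, limits[name]) != limits[name]:
                return False
    return True


def pinned_containers(containers: list[dict]) -> list[str]:
    """Names of containers that would be eligible for exclusive CPUs."""
    if not is_guaranteed(containers):
        return []
    pinned = []
    for container in containers:
        cores = cpu_cores(container["resources"]["limits"]["cpu"])
        # Only whole, positive CPU counts qualify; fractional ones stay shared.
        if cores >= 1 and cores == cores.to_integral_value():
            pinned.append(container["name"])
    return pinned
```

In a Guaranteed pod with one container requesting 2 CPUs and a sidecar requesting 500m, only the first container would be returned; the sidecar would run in the shared pool.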


4.5. Container runtimes

Table 4.5 Container runtime specifications




Requirement Trace

Reference Implementation Trace


Conformance with the Open Container Initiative (OCI) 1.0 runtime specification

The container runtime must be implemented as per the OCI 1.0 [63] specification.

gen.ost.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Kubernetes Container Runtime Interface (CRI)

The Kubernetes container runtime must be implemented as per the Kubernetes Container Runtime Interface (CRI) [64].

gen.ost.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure

4.6. Networking solutions

For the networking solutions to be conformant with the Reference Architecture, they must be implemented according to the following specifications:

Table 4.6 Networking Solution Specifications




Requirement Trace

Reference Implementation Trace


Centralized network administration

The networking solution deployed within the implementation must be administered through the Kubernetes API using native Kubernetes API resources and objects, or Custom Resources.

inf.ntw.03 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Default Pod Network - CNI

The networking solution deployed within the implementation must use a CNI-conformant Network Plugin for the Default Pod Network, as the alternative (kubenet) does not support cross-node networking or Network Policies.

gen.ost.01 in Kubernetes Architecture Requirements, inf.ntw.08 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Multiple connection points

The networking solution deployed within the implementation must support the capability to connect at least 5 connection points to each Pod, which are additional to the default connection point managed by the default Pod network CNI plugin.

e.cap.004 in Cloud Infrastructure Software Profile Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Multiple connection points presentation

The networking solution deployed within the implementation must ensure that all additional non-default connection points are requested by Pods using standard Kubernetes resource scheduling mechanisms, such as annotations, or container resource requests and limits.

inf.ntw.03 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure



The networking solution deployed within the implementation may use a multiplexer/meta-plugin.

inf.ntw.06 in Kubernetes Architecture Requirements, inf.ntw.07 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Multiplexer/meta-plugin CNI conformance

If used, the selected multiplexer/meta-plugin must integrate with the Kubernetes control plane via CNI.

gen.ost.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Multiplexer/meta-plugin CNI Plugins

If used, the selected multiplexer/meta-plugin must support the use of multiple CNI-conformant Network Plugins.

gen.ost.01 in Kubernetes Architecture Requirements, inf.ntw.06 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


SR-IOV device plugin for high performance

When hosting workloads that match the high-performance profile and require SR-IOV acceleration, a Device Plugin for SR-IOV must be used to configure the SR-IOV devices and advertise them to the kubelet.

e.cap.013 in Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed Performance Optimisation Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Multiple connection points with multiplexer / meta-plugin

When a multiplexer/meta-plugin is used, the additional non-default connection points must be managed by a CNI-conformant Network Plugin.

gen.ost.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


User plane networking

When hosting workloads that match the high-performance profile, CNI network plugins that support the use of DPDK, VPP, and/or SR-IOV must be deployed as part of the networking solution.

infra.net.acc.cfg.001 in Reference Model for Cloud Infrastructure (RM) [1], Chapter 5, section Virtual Networking Profiles

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


NATless connectivity

When hosting workloads that require source and destination IP addresses to be preserved in the traffic headers, a NATless CNI plugin that exposes the pod IP directly to the external networks (e.g. Calico, MACVLAN or IPVLAN CNI plugins) must be used.

inf.ntw.14 in Kubernetes Architecture Requirements


Device Plugins

When hosting workloads matching the high-performance profile that require the use of FPGA, SR-IOV, or other acceleration hardware, a Device Plugin for that FPGA or acceleration hardware must be used.

e.cap.016 and e.cap.013 in Reference Model for Cloud Infrastructure (RM) [1], Chapter 4, section Exposed Performance Optimisation Capabilities

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Dual-stack CNI

The networking solution deployed within the implementation must use a CNI-conformant network plugin that is able to support dual-stack IPv4/IPv6 networking.

inf.ntw.04 in Kubernetes Architecture Requirements


Security groups

The networking solution deployed within the implementation must support network policies.

infra.net.cfg.004 Reference Model for Cloud Infrastructure (RM) [1] Chapter 5, section Virtual Networking Profiles


IPAM plugin for multiplexer

When a multiplexer/meta-plugin is used, a CNI-conformant IPAM network plugin must be installed to allocate IP addresses for secondary network interfaces across all nodes of the cluster.

inf.ntw.10 in Kubernetes Architecture Requirements


Kubernetes Network Custom Resource Definition De-Facto Standard-compliant multiplexer/meta-plugin

When a multiplexer/meta-plugin is used, the multiplexer/meta-plugin must implement version 1.2 of the Kubernetes Network Custom Resource Definition De-facto Standard [65].

gen.ost.01 in Kubernetes Architecture Requirements

Reference Implementation based on RA2 specifications (RI2) [55] Chapter 4, section Installation on Bare Metal Infrastructure


Kubernetes Load Balancer

The networking solution deployed within the implementation must include an L4 (TCP/UDP, except QUIC) Load Balancer to steer inbound traffic across the primary interfaces of multiple CNF pods.

inf.ntw.15 in Kubernetes Architecture Requirements


Kubernetes Load Balancer - API

The Load Balancer solution deployed per ra2.ntw.017 must support the Service type LoadBalancer API.

inf.ntw.15 in Kubernetes Architecture Requirements


Kubernetes Load Balancer - API

The Load Balancer solution deployed per ra2.ntw.017 may additionally support the Gateway API.

inf.ntw.15 in Kubernetes Architecture Requirements


Kubernetes Load Balancer - Advertisements

The Load Balancer solution deployed per ra2.ntw.017 must be capable of advertising the IPs of Services to external networks.

inf.ntw.15 in Kubernetes Architecture Requirements


Kubernetes Load Balancer - Active/active Multipath

The Load Balancer solution deployed per ra2.ntw.017 must support multi-path advertisements in an active/active design, allowing the same service IP to be advertised by multiple cluster nodes.

inf.ntw.15 in Kubernetes Architecture Requirements


Kubernetes Load Balancer - High Availability

The networking solution deployed per ra2.ntw.017 must be capable of fast failover. Upon node or pod failure, it must redirect traffic (i.e., advertisements/routes must be updated) in less than 5 seconds.

inf.ntw.15 in Kubernetes Architecture Requirements


Time Sensitive Networking

Timing accuracy with PTP Hardware Clock and synchronization with SyncE.

e.cap.027 from Reference Model for Cloud Infrastructure (RM) [1] Chapter 4, section Exposed infrastructure capabilities
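Table 4.6 requires that additional, non-default connection points are requested through standard mechanisms such as annotations. The following Python sketch shows how the requested attachments might be read from a Pod's annotations. It assumes the k8s.v1.cni.cncf.io/networks annotation key used by the Kubernetes Network Custom Resource Definition De-facto Standard [65], in either its comma-separated short form or its JSON list form. The function name and the network names used with it are illustrative only.

```python
import json

# Annotation key assumed from the Network CRD De-facto Standard.
NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def additional_networks(annotations: dict[str, str]) -> list[str]:
    """Return the network attachment names requested by a Pod's annotations."""
    value = annotations.get(NETWORKS_ANNOTATION, "").strip()
    if not value:
        return []
    if value.startswith("["):
        # JSON form: a list of objects, each with at least a "name" key.
        return [entry["name"] for entry in json.loads(value)]
    # Short form: comma-separated "[<namespace>/]<name>[@<interface>]" items.
    names = []
    for item in value.split(","):
        item = item.strip().split("@", 1)[0]
        names.append(item.split("/", 1)[-1])
    return names
```

The length of the returned list gives the number of additional connection points a Pod requests, which can be compared with the number that the networking solution is required to support.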

4.7. Storage components

For the storage solutions to be conformant with the Reference Architecture, they must be implemented according to the following specifications:

Table 4.7 Storage solution specifications




Requirement Trace

Reference Implementation Trace


Ephemeral storage

An implementation must support ephemeral storage, for the unpacked container images to be stored and executed from, as a directory in the filesystem on the worker node on which the container is running. See the Container runtimes section above for more information on how this meets the requirement for ephemeral storage for containers.


Kubernetes Volumes

An implementation may attach additional storage to containers using Kubernetes Volumes.


Kubernetes Volumes

An implementation may use Volume Plugins (see ra2.stg.005 below) to allow the use of a storage protocol (such as iSCSI and NFS) or management APIs (such as Cinder and EBS) for the attaching and mounting of storage into a Pod.


Persistent Volumes

An implementation may support Kubernetes Persistent Volumes (PV) to provide persistent storage for Pods. Persistent Volumes exist independent of the lifecycle of containers and/or pods.

inf.stg.01 in Kubernetes Architecture Requirements


Storage Volume Types

An implementation must support the following Volume types: emptyDir, ConfigMap, Secret, and PersistentVolumeClaim. Other Volume plugins may be supported to allow for the use of a range of backend storage systems.


Container Storage Interface (CSI)

An implementation may support the Container Storage Interface (CSI), an out-of-tree plugin. To support CSI, the feature gates CSIDriverRegistry and CSINodeInfo must be enabled. The implementation must use a CSI driver (full list of CSI drivers [66]). An implementation may support ephemeral storage through a CSI-compatible volume plugin. In this case, the CSIInlineVolume feature gate must be enabled. An implementation may support Persistent Volumes through a CSI-compatible volume plugin. In this case, the CSIPersistentVolume feature gate must be enabled.


Storage Classes

An implementation should use Kubernetes Storage Classes to support automation and the separation of concerns between providers of a service and consumers of the service.


This Reference Architecture does not include any specifications for object storage, as this is neither a native Kubernetes object, nor something that is required by CSI drivers. Object storage is an application-level requirement that would ordinarily be provided by a highly scalable service offering, rather than being something an individual Kubernetes cluster could offer.

Todo: specifications/commentary to support inf.stg.04 (SDS) and inf.stg.05 (high performance and horizontally scalable storage). Also sec.gen.06 (storage resource isolation), sec.gen.10 (CIS - if applicable) and sec.zon.03 (data encryption at rest).

4.8. Service meshes

Application service meshes are not in the scope of the architecture. The service mesh is a dedicated infrastructure layer for handling service-to-service communication. Using a service mesh is recommended to secure service-to-service communications within a cluster and to reduce the attack surface. The benefits of the service mesh framework are described in Using Transport Layer Security and service mesh. In addition to securing communications, the use of a service mesh extends Kubernetes capabilities regarding observability and reliability.

Network service mesh specifications are handled in Networking solutions.

4.9. Kubernetes Application package managers

For the application package managers to be conformant with the Reference Architecture, they must be implemented according to the following specifications:

Table 4.8 Kubernetes Application Package Managers Specifications




Requirement Trace

Reference Implementation Trace


API-based package management

A package manager must use the Kubernetes APIs to manage application artifacts. Cluster-side components such as Tiller must not be required.

int.api.02 in Kubernetes Architecture Requirements


Helm version 3

All workloads must be packaged using Helm (version 3) charts.

Helm version 3 has been chosen as the Application packaging mechanism to ensure compliance with the ONAP ASD NF descriptor specification [67] and ETSI SOL-001 rel. 4 MCIOP specification [68].

4.10. Kubernetes workloads

For the Kubernetes workloads to be conformant with the Reference Architecture, they must be implemented according to the following specifications:

Table 4.9 Kubernetes Workload specifications




Requirement Trace

Reference Implementation Trace


Consumption of additional, non-default connection points

Any additional non-default connection points must be requested through the use of workload annotations or resource requests and limits within the container spec passed to the Kubernetes API Server.

int.api.01 in Kubernetes Architecture Requirements



Host Volumes

Workloads must not use hostPath volumes [69], as Pods with identical configuration (such as those created from a PodTemplate) may behave differently on different nodes due to different files on the nodes.

kcm.gen.02 in Kubernetes Architecture Requirements



Infrastructure dependency

Workloads must not rely on the availability of the control plane nodes for the successful execution of their functionality. Loss of the control plane nodes may affect non-functional behaviours, such as healing and scaling, but components that are already running will continue to run without issue.




Device plugins

Workload descriptors must use the resources advertised by the device plugins to indicate their need for an FPGA, SR-IOV, or other acceleration device.




Node Feature Discovery (NFD)

If the workload requires special hardware or software features from the worker node, these requirements must be described in the workload descriptors using the labels advertised by Node Feature Discovery [57].




Published Helm chart

Helm charts of the CNF must be published in a Helm registry and must not be used from local copies.

CNF Testsuite, Rationale, Test if the Helm chart is published: helm_chart_published [70]



Valid Helm chart

Helm charts of the CNF must be valid and should pass the helm lint validation.

CNF Testsuite, Rationale, Test if the Helm chart is valid: helm_chart_valid [71]



Rolling update

Rolling updates of the CNF must be possible using Kubernetes deployments.

CNF Testsuite, Rationale, To test if the CNF can perform a rolling update: rolling_update [72]



Rolling downgrade

Rolling downgrades of the CNF must be possible using Kubernetes deployments.

CNF Testsuite, Rationale, To check if a CNF version can be downgraded through a rolling_downgrade: rolling_downgrade [73]



CNI compatibility

The CNF must use CNI compatible networking plugins.

CNF Testsuite, Rationale, To check if the CNF is compatible with different CNIs: cni_compatibility [74]



Kubernetes API stability

The CNF must not use any Kubernetes alpha APIs, except for those required by the specifications in this chapter (for example, NFD).

CNF Testsuite, Rationale, To check if the CNF is compatible with different CNIs: cni_compatibility [74]



CNF resiliency (node drain)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even in the event of a node drain and consequent rescheduling.

CNF Testsuite, Rationale, Test if the CNF crashes when node drain occurs: node_drain [75]



CNF resiliency (network latency)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if network latency of up to 2000 ms occurs.

CNF Testsuite, Rationale, Test if the CNF crashes when network latency occurs: pod_network_latency [76]



CNF resiliency (pod delete)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if a pod delete occurs.

CNF Testsuite, Rationale, Test if the CNF crashes when disk fill occurs: disk_fill [77]



CNF resiliency (pod memory hog)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if a pod memory hog occurs.

CNF Testsuite, Rationale, Test if the CNF crashes when pod memory hog occurs: pod_memory_hog [78]



CNF resiliency (pod I/O stress)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if pod I/O stress occurs.

CNF Testsuite, Rationale, Test if the CNF crashes when pod io stress occurs: pod_io_stress [79]



CNF resiliency (pod network corruption)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if pod network corruption occurs.

CNF Testsuite, Rationale, Test if the CNF crashes when pod network corruption occurs: pod_network_corruption [80]



CNF resiliency (pod network duplication)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if a pod network duplication occurs.

CNF Testsuite, Rationale, Test if the CNF crashes when pod network duplication occurs: pod_network_duplication [81]



CNF resiliency (pod DNS error)

The CNF must not lose data. It must continue to run and its readiness probe outcome must be Success, even if a pod DNS error occurs.



CNF local storage

The CNF must not use local storage.

CNF Testsuite, Rationale, To test if the CNF uses local storage: no_local_volume_configuration [82]



Liveness probe

All Pods of the CNF must have livenessProbe defined.

CNF Testsuite, Rationale, To test if there is a liveness entry in the Helm chart: liveness [83]



Readiness probe

All Pods of the CNF must have readinessProbe defined.

CNF Testsuite, Rationale, To test if there is a readiness entry in the Helm chart: readiness [84]



No access to container daemon sockets

The CNF must not have any of the container daemon sockets (for example, /var/run/docker.sock, /var/run/containerd.sock or /var/run/crio.sock) mounted.



No automatic service account mapping

Non-specified service accounts must not be automatically mapped. To prevent this, the automountServiceAccountToken: false flag must be set in all Pods of the CNF.

CNF Testsuite, Rationale, To check if there is automatic mapping of service accounts: service_account_mapping [85]



No host network access

The host network must not be attached to any of the Pods of the CNF. The hostNetwork attribute of the Pod specifications must either be set to false or be left unspecified.

CNF Testsuite, Rationale, To check if there is a host network attached to a pod: host_network [86]



Host process namespace separation

The Pods of the CNF must not share the host process ID namespace or the host IPC namespace. The Pod manifests must not have the hostPID or the hostIPC attribute set to true.

CNF Testsuite, Rationale, To check if containers are running with hostPID or hostIPC privileges: host_pid_ipc_privileges [87]



Resource limits

All containers and namespaces of the CNF must have defined resource limits for at least the CPU and memory resources.

CNF Testsuite, Rationale, To check if containers have resource limits defined: resource_policies [88]



Read only filesystem

All the containers of the CNF must have a read-only root filesystem. The readOnlyRootFilesystem attribute in the securityContext of each container should be set to true.

CNF Testsuite, Rationale, To check if containers have immutable file systems: immutable_file_systems [89]



Container image tags

All container images referenced in the Pod manifests must be referenced by a version tag that points to a specific version of the image. The latest tag must not be used.

Kubernetes documentation: Images [90]



No hardcoded IP addresses

The CNF must not have any hardcoded IP addresses in its Pod specifications.

CNF Testsuite, Rationale, To test if there are any (non-declarative) hardcoded IP addresses or subnet masks in the K8s runtime configuration: hardcoded_ip_addresses_in_k8s_runtime_configuration [91]



No node ports

The service declarations of the CNF must not contain a nodePort definition.




Immutable config maps

ConfigMaps used by the CNF must be immutable.




Horizontal scaling

If the CNF supports scaling, increasing and decreasing its capacity must be implemented using horizontal scaling. If horizontal scaling is supported, automatic scaling must be possible using the Kubernetes Horizontal Pod Autoscaler (HPA) [94] feature.




CNF image size

The individual container images of the CNF should not be larger than 5 GB.

CNF Testsuite, Rationale, To check if the CNF has a reasonable image size: reasonable_image_size [95]



CNF startup time

The startup time of the Pods of a CNF should not exceed 60 seconds, where the startup time is the time between the starting of the Pod and the readiness probe outcome registering Success.

CNF Testsuite, Rationale, To check if the CNF have a reasonable startup time: reasonable_startup_time [96]



No privileged mode

Pods of the CNF must not run in privileged mode.

CNF Testsuite, Rationale, To check if there are any privileged containers: privileged_containers [97]



No root user

Pods of the CNF must not run as a root user.

CNF Testsuite, Rationale, To check if any containers are running as a root user (checks the user outside the container that is running dockerd): non_root_user [98]



No privilege escalation

None of the containers of the CNF should allow privilege escalation.

CNF Testsuite, Rationale, To check if any containers allow for privilege escalation: privilege_escalation [99]



Non-root user

All the Pods of the CNF must be able to execute with a non-root user having a non-root group. Both the runAsUser and the runAsGroup attributes must be set to a value greater than 999.

CNF Testsuite, Rationale, To check if containers are running with non-root user with non-root membership: non_root_containers [100]




Labels

The Pods of the CNF should define at least the following labels: app.kubernetes.io/name, app.kubernetes.io/version, and app.kubernetes.io/part-of.

Kubernetes documentation: Recommended Labels [101]



Log output

The Pods of the CNF must direct their logs to stdout or stderr. This enables the treatment of the logs as event streams.

The Twelve Factor App: Logs [102]



Host ports

The Pods of the CNF should not use the host ports. Using the host ports ties the CNF to a specific node, thereby making the CNF less portable and scalable.

CNF Testsuite, Rationale, To test if there are host ports used in the service configuration: hostport_not_used [103]



SELinux options

If SELinux is used in the Pods of the CNF, the options used to escalate privileges should not be allowed. The options spec.securityContext.seLinuxOptions.type, spec.containers[*].securityContext.seLinuxOptions.type, spec.initContainers[*].securityContext.seLinuxOptions.type, and spec.ephemeralContainers[*].securityContext.seLinuxOptions.type must either be unset altogether or set to one of the following allowed values: container_t, container_init_t, or container_kvm_t.
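
Several of the workload specifications in Table 4.9 can be checked mechanically against a Pod manifest. The following is a minimal Python sketch of such a check, not part of the specification and not a replacement for the CNF Testsuite. It assumes that the manifest has already been parsed into a dictionary; the names check_pod, image_tag, and EXAMPLE_POD, as well as the example image reference, are hypothetical and are introduced only for illustration. It covers only a subset of the rows (host namespaces, service account mapping, hostPath volumes, image tags, privilege settings, user and group IDs, resource limits, host ports, and probes).

```python
def image_tag(image):
    """Return the tag of an image reference, or "" when no tag is present."""
    last_segment = image.rsplit("/", 1)[-1]       # drop registry host[:port] and path
    name_and_tag = last_segment.split("@", 1)[0]  # drop a digest suffix, if any
    return name_and_tag.split(":", 1)[1] if ":" in name_and_tag else ""


def check_pod(pod):
    """Return a list of findings; an empty list means none of the checks failed."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext", {})
    findings = []

    # Pod-level rules: host namespaces, service account token, hostPath volumes.
    if spec.get("hostNetwork"):
        findings.append("pod: hostNetwork must be false or unset")
    for attr in ("hostPID", "hostIPC"):
        if spec.get(attr):
            findings.append(f"pod: {attr} must not be true")
    if spec.get("automountServiceAccountToken") is not False:
        findings.append("pod: automountServiceAccountToken must be false")
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            findings.append(f"pod: volume {volume.get('name')!r} uses hostPath")

    main_containers = spec.get("containers", [])
    for c in main_containers + spec.get("initContainers", []):
        name = c.get("name", "<unnamed>")
        sc = c.get("securityContext", {})
        # A digest without a tag is reported too, as the rule asks for a version tag.
        if image_tag(c.get("image", "")) in ("", "latest"):
            findings.append(f"{name}: image needs a concrete version tag")
        if sc.get("privileged"):
            findings.append(f"{name}: privileged mode is not allowed")
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation must be false")
        if sc.get("readOnlyRootFilesystem") is not True:
            findings.append(f"{name}: readOnlyRootFilesystem should be true")
        for attr in ("runAsUser", "runAsGroup"):
            # Container-level settings take precedence over Pod-level ones.
            value = sc.get(attr, pod_sc.get(attr))
            if not isinstance(value, int) or value <= 999:
                findings.append(f"{name}: {attr} must be greater than 999")
        limits = c.get("resources", {}).get("limits", {})
        for resource in ("cpu", "memory"):
            if resource not in limits:
                findings.append(f"{name}: no {resource} limit defined")
        if any("hostPort" in port for port in c.get("ports", [])):
            findings.append(f"{name}: hostPort should not be used")

    # Probes are only checked on the regular containers of the Pod.
    for c in main_containers:
        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in c:
                findings.append(f"{c.get('name', '<unnamed>')}: {probe} is missing")
    return findings


# Hypothetical manifest that satisfies every check implemented above.
EXAMPLE_POD = {
    "spec": {
        "automountServiceAccountToken": False,
        "securityContext": {"runAsUser": 1000, "runAsGroup": 1000},
        "containers": [{
            "name": "app",
            "image": "registry.example.com:5000/cnf/app:1.2.3",
            "securityContext": {
                "allowPrivilegeEscalation": False,
                "readOnlyRootFilesystem": True,
            },
            "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
            "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
            "readinessProbe": {"httpGet": {"path": "/ready", "port": 8080}},
        }],
    }
}

if __name__ == "__main__":
    print(check_pod(EXAMPLE_POD))
```

For EXAMPLE_POD the function returns an empty list. Setting hostNetwork to true or changing the image tag to latest adds the corresponding finding. Rules that depend on runtime behaviour, such as the resiliency rows, cannot be verified by a static check of this kind.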


4.11. Additional required components

This chapter should list any additional components needed to provide the services defined in the chapter Infrastructure Services (for example, Prometheus).

4.12. Platform service components

The architecture may support additional platform services. This chapter defines the requirements that the platform service components must meet when the corresponding platform service is supported.

Table 4.10 Platform service components requirements


Platform service category


RM reference


Data stores/databases

The platform may support any open source datastore or database technology

Reference Model [1] Chapter 5.1.5


Streaming and messaging

The platform may support any Streaming and messaging technology

Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If an external load balancer is used it must be exposed via the LoadBalancer property of the Kubernetes Service [104]

Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support workload resource scaling

pas.lb.001 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support resource resiliency

pas.lb.002 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support scaling and resiliency in the local environment

pas.lb.003 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support OSI Layer 3/4 load balancing

pas.lb.004 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support round-robin load balancing

pas.lb.005 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must create event logs with the appropriate severity levels (catastrophic, critical, and so on)

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support monitoring of endpoints

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support Direct Server Return (DSR)

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support stateful TCP load balancing

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support UDP load balancing

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Load balancer and service proxy

If a load balancer is supported it must support load balancing and the correct handling of fragmented packets

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Service mesh

If a service mesh is supported, it should support the Service Mesh Interface [105]

Reference Model [1] Chapter 5.1.5



Monitoring

The platform may support any open source monitoring technology

Reference Model [1] Chapter 5.1.5



Logging

The platform may support any open source logging technology

Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must support log management from multiple distributed sources

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must manage log rotation at configurable periods

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must manage log rotation at configurable log file status (%full)

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must manage archival and retention of logs for configurable periods by different log types

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must ensure log file integrity (no changes, particularly changes that may affect the completeness, consistency, and accuracy, including event times, of the log file content) for different log types

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must monitor log rotation and log archival processes

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must monitor the logging status of all the log sources

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must ensure that the clock of each logging host is synchronized to a common time source

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must support the reconfiguring of logging as needed, based on policy changes, technology changes, and other factors

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must support the documenting and reporting of anomalies in log settings, configurations, and processes

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must support the correlating of entries from multiple logs that relate to the same event

pas.lb.006 in Reference Model [1] Chapter 5.1.5



Logging

If a logging framework is supported it must support the correlating of multiple log entries from a single source or multiple sources, based on logged values (for example, event types, timestamps, and IP addresses)

pas.lb.006 in Reference Model [1] Chapter 5.1.5


Application definition and image build

Kubernetes Application package managers must follow the specifications defined in Chapter 4.9 Kubernetes Application package managers

Reference Model [1] Chapter 5.1.5



CI/CD

The platform may support any open source CI/CD technology

Reference Model [1] Chapter 5.1.5


Ingress/egress controllers

The platform may support any open source Ingress/egress controllers technology

Reference Model [1] Chapter 5.1.5


Ingress/egress controllers

If an egress controller is supported it must provide fixed and consistent source IP addresses for any given egress traffic.


Ingress/egress controllers

If an egress controller is supported it must support several source IP addresses on egress control.


Ingress/egress controllers

If an egress controller is supported it must provide a way to preserve the client IP address on egress control


Ingress/egress controllers

If both an ingress controller and an egress controller are supported, they must support symmetric IP/VIP for ingress and egress


Ingress/egress controllers

If an egress controller is supported it must provide capabilities to route and isolate egress traffic based on traffic type (for example, OAM or signaling), for example by connecting each traffic type to a separate VRF


Ingress/egress controllers

If an egress controller is supported it must support VLAN tagging for egress traffic


Ingress/egress controllers

If an egress controller is supported it must support the separation of overlapping destination addresses.


Network service

The platform may support any open source network service technology

Reference Model [1] Chapter 5.1.5


Coordination and service discovery

The platform may support any open source coordination and service discovery technology

Reference Model [1] Chapter 5.1.5


Automation and configuration

The platform may support any open source automation and configuration technology

Reference Model [1] Chapter 5.1.5


Key management

The platform may support any open source key management technology

Reference Model [1] Chapter 5.1.5



Tracing

The platform may support any open source tracing technology

Reference Model [1] Chapter 5.1.5
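
The load balancer row of Table 4.10 that refers to the Kubernetes Service, together with the "No node ports" row of Table 4.9, can be illustrated with a small static check. The following Python sketch is illustrative only. It assumes that the Service manifest has already been parsed into a dictionary and that "the LoadBalancer property" corresponds to spec.type: LoadBalancer in the Service API. The function name check_service and its behind_external_lb parameter are hypothetical.

```python
def check_service(service, behind_external_lb=False):
    """Return a list of findings for a Service manifest given as a dictionary."""
    spec = service.get("spec", {})
    findings = []

    # A Service fronted by an external load balancer is expected to declare
    # spec.type: LoadBalancer (assumption: this is the "LoadBalancer property").
    if behind_external_lb and spec.get("type") != "LoadBalancer":
        findings.append("service must use type LoadBalancer")

    # The service declaration must not contain an explicit nodePort definition.
    for port in spec.get("ports", []):
        if "nodePort" in port:
            findings.append(f"port {port.get('port')}: nodePort must not be defined")
    return findings


if __name__ == "__main__":
    example = {"spec": {"type": "LoadBalancer", "ports": [{"port": 443}]}}
    print(check_service(example, behind_external_lb=True))
```

The check only inspects what the manifest declares. It does not examine ports that the cluster may allocate on its own for a Service.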