1. Introduction

1.1. Overview

The objective of this Reference Architecture (RA) is to develop a usable Kubernetes-based platform for the Telco industry. The RA will be based on the standard Kubernetes platform wherever possible. This Reference Architecture for Kubernetes will describe the high-level system components and their interactions, taking the goals and requirements from the Cloud Infrastructure Reference Model [1] (RM) and mapping them to Kubernetes (and related) components. This document needs to be sufficiently detailed and robust such that it can be used to guide the production deployment of Kubernetes within an operator, whilst being flexible enough to evolve with and remain aligned with the wider Kubernetes ecosystem outside of Telco.

To set this in context, it makes sense to start with the high-level definition and understanding of Kubernetes. Kubernetes [2] is a “portable, extensible, open-source platform for managing containerised workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available” [3]. Kubernetes is developed as an open source project in the kubernetes [4] repository of GitHub.

To assist with the goal of creating a reference architecture that will support Telco workloads, but at the same time leverage the work that already has been completed in the Kubernetes community, RA2 will take an “RA2 Razor” approach to build the foundation. This can be explained along the lines of “if something is useful for non-Telco workloads, we will not include it only for Telco workloads”. For example, start the Reference Architecture from a vanilla Kubernetes (say, v1.16) feature set, then provide clear evidence that a functional requirement cannot be met by that system (say, multi-NIC support), only then the RA would add the least invasive, Kubernetes-community aligned extension (say, Multus) to fill the gap. If there are still gaps that cannot be filled by standard Kubernetes community technologies or extensions then the RA will concisely document the requirement in the Introduction to Gaps, Innovation, and Development chapter of this document and approach the relevant project maintainers with a request to add this functionality into the feature set.

The Kubernetes Reference Architecture will be used to determine a Kubernetes Reference Implementation. The Kubernetes Reference Implementation would then also be used to test and validate the supportability and compatibility with Kubernetes-based Network Function workloads, and lifecycle management of Kubernetes clusters, of interest to the Anuket community. The intention is to expand as much of the existing test frameworks to be used for the verification and conformance testing of Kubernetes-based workloads, and Kubernetes cluster lifecycle management.

1.1.1. Required component versions

Component

Required version(s)

Kubernetes

1.29

1.1.2. Principles

1.1.2.1. Architectural principles

This Reference Architecture conforms with the Anuket principles:

  1. Open-source preference: for building Cloud Infrastructure solutions, components and tools, using open-source technology.

  2. Open APIs: to enable interoperability, component substitution, and minimise integration efforts.

  3. Separation of concerns: to promote lifecycle independence of different architectural layers and modules (e.g., disaggregation of software from hardware).

  4. Automated lifecycle management: to minimise the end-to-end lifecycle costs, maintenance downtime (target zero downtime), and errors resulting from manual processes.

  5. Automated scalability: of workloads to minimise costs and operational impacts.

  6. Automated closed loop assurance: for fault resolution, simplification, and cost reduction of cloud operations.

  7. Cloud nativeness: to optimise the utilisation of resources and enable operational efficiencies.

  8. Security compliance: to ensure the architecture follows the industry best security practices and is at all levels compliant to relevant security regulations.

  9. Resilience and Availability: to withstand Single Point of Failure.

1.1.2.2. Cloud Native Principles

For the purposes of this document, the CNCF TOC’s (Technical Oversight Committee) definition of Cloud Native applies:

CNCF Cloud Native Definition v1.0 Approved by TOC: 2018-06-11

“Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.

The Cloud Native Computing Foundation seeks to drive adoption of this paradigm by fostering and sustaining an ecosystem of open source, vendor-neutral projects. We democratize state-of-the-art patterns to make these innovations accessible for everyone.”

The CNCF TUG (Telecom User Group), formed in June 2019, published a set of Cloud Native Principles suited to the requirements of the Telecom community [5]. There are many similarities with the CNCF principles, briefly that infrastructure needs to be:

  • scalable

  • dynamic environments

  • service meshes

  • microservices

  • immutable infrastructure

  • declarative APIs

  • loosely coupled

  • resilient

  • manageable

  • observable

  • robust automation

  • high-impact changes frequently and predictably

1.1.2.3. Exceptions

Anuket specifications define certain policies and general principles and strive to .. add general principles from common coalesce the industry towards conformant Cloud Infrastructure technologies and configurations. With the currently available technology options, incompatibilities, performance, and operator constraints (including costs), these policies and principles may not always be achievable and, thus, require an exception process. These policies describe how to handle non-conforming technologies. .. add policies:anuket project policies for managing non-conforming technologies from common In general, non-conformance with policies is handled through a set of exceptions. .. add gov/chapters/chapter09:exception types

The following sub-sections list the exceptions to the principles of Anuket specifications and shall be updated whenever technology choices, versions and requirements change. The Exceptions have an associated period of validity and this period shall include time for transitioning.

1.1.2.3.1. Technology Exceptions

The list of Technology Exceptions will be updated or removed when alternative technologies, aligned with the principles of Anuket specifications, develop and mature.

Table 1.1 Technology Exceptions

Ref

Name

Description

Valid Until

Rationale

Implication

ra2.exc.tec.001

SR-IOV

This exception allows workloads to use SR-IOV over PCI-PassThrough technology.

TBD

Emulation of virtual devices for each virtual machine creates an I/O bottleneck resulting in poor performance and limits the number of virtual machines a physical server can support. SR-IOV implements virtual devices in hardware, and by avoiding the use of a switch, near maximal performance can be achieved. For containerisation the downsides of creating dependencies on hardware is reduced as Kubernetes nodes are either physical, or if virtual have no need to “live migrate” as a VNF VM might.

1.1.3. Approach

The approach taken in this Reference Architecture is to start with a basic Kubernetes architecture, based on the community distribution, and then add detail and additional features/extensions as is required to meet the requirements of the Reference Model and the functional and non-functional requirements of common cloud native network functions.

This document starts with a description of interfaces and capabilities requirements (the “what”) before providing guidance on “how” those elements are deployed, through specifications. The details of how the elements will be used together are documented in full detail in the Reference Implementation.

1.2. Scope

The scope of this particular Reference Architecture can be described as follows (the capabilities themselves will be listed and described in subsequent chapters):

  • Kubernetes platform capabilities required to conform to the Reference Model requirements

  • Support for CNFs that consist wholly of containers

  • Support for CNFs that consist partly of containers and partly of VMs, both of which will be orchestrated by Kubernetes

  • Kubernetes Cluster lifecycle management: including Cluster creation/upgrade/scaling/deletion, and node customisation due to workload requirements.

The following items are considered out of scope:

  • Kubernetes-based Application / CNF Management: this is an application layer capability that is out of scope of Anuket.

Kubernetes Reference Architecture scope

Figure 1.1 Kubernetes Reference Architecture scope

1.3. Definitions

Table 1.2 Definitions

Term

Description

Abstraction

Process of removing concrete, fine-grained or lower-level details or attributes or common properties in the study of systems to focus attention on topics of greater importance or general concepts. It can be the result of decoupling.

Anuket

A LFN open-source project developing open reference infrastructure models, architectures, tools, and programs.

CaaS Containers as a Service

A Platform suitable to host and run Containerised workloads, such as Kubernetes. Instances of CaaS Platforms are known as CaaS Clusters.

CaaS Manager

A management plane function that manages the lifecycle (instantiation, scaling, healing, etc.) of one or more CaaS instances, including communication with VIM for master/node lifecycle management.

Cloud Infrastructure

A generic term covering NFVI, IaaS and CaaS capabilities - essentially the infrastructure on which a Workload can be executed. NFVI, IaaS and CaaS layers can be built on top of each other. In case of CaaS some cloud infrastructure features (e.g.: HW management or multitenancy) are implemented by using an underlying IaaS* layer.

Cloud Infrastructure Hardware Profile

Defines the behaviour, capabilities, configuration, and metrics provided by a cloud infrastructure hardware layer resources available for the workloads.

Cloud Infrastructure Profile

The combination of the Cloud Infrastructure Software Profile and the Cloud Infrastructure Hardware Profile that defines the capabilities and configuration of the Cloud Infrastructure resources available for the workloads.

Cloud Infrastructure Software Profile

Defines the behaviour, capabilities and metrics provided by a Cloud Infrastructure Software Layer on resources available for the workloads.

Cloud Native Network Function (CNF)

A cloud native network function (CNF) is a cloud native application that implements network functionality. A CNF consists of one or more microservices. All layers of a CNF are developed using Cloud Native Principles including immutable infrastructure, declarative APIs, and a “repeatable deployment process”. This definition is derived from the Cloud Native Thinking for Telecommunications Whitepaper, which also includes further detail and examples.

Compute Node

An abstract definition of a server. A compute node can refer to a set of hardware and software that support the VMs or Containers running on it.

Container

A lightweight and portable executable image that contains software and all its dependencies. OCI defines Container as “An environment for executing processes with configurable isolation and resource limitations. For example, namespaces, resource limits, and mounts are all part of the container environment.” A Container provides operating-system-level virtualisation by abstracting the “user space”. One big difference between Containers and VMs is that unlike VMs, where each VM is self-contained with all the operating systems components are within the VM package, containers “share” the host system’s kernel with other containers.

Container Engine

Software components used to create, destroy, and manage containers on top of an operating system.

Container Image

Stored instance of a container that holds a set of software needed to run an application.

Container Runtime

The software that is responsible for running containers. It reads the configuration files for a Container from a directory structure, uses that information to create a container, launches a process inside the container, and performs other lifecycle actions.

Core (physical)

An independent computer processing unit that can independently execute CPU instructions and is integrated with other cores on a multiprocessor (chip, integrated circuit die). Please note that the multiprocessor chip is also referred to as a CPU that is placed in a socket of a computer motherboard.

CPU Type

A classification of CPUs by features needed for the execution of computer programs; for example, instruction sets, cache size, number of cores.

Decoupling, Loose Coupling

Loosely coupled system is one in which each of its components has, or makes use of, little or no knowledge of the implementation details of other separate components. Loose coupling is the opposite of tight coupling

Encapsulation

Restriction of direct access to some of an object’s components.

External Network

External networks provide network connectivity for a cloud infrastructure tenant to resources outside of the tenant space.

Fluentd

An open-source data collector for unified logging layer, which allows data collection and consumption for better use and understanding of data. Fluentd is a CNCF graduated project.

Functest

An open-source project part of Anuket LFN project. It addresses functional testing with a collection of state-of-the-art virtual infrastructure test suites, including automatic VNF testing.

Hardware resources

Compute, storage and network hardware resources on which the cloud infrastructure platform software, virtual machines and containers run on.

Host Profile

Is another term for a Cloud Infrastructure Hardware Profile.

Huge pages

Physical memory is partitioned and accessed using the basic page unit (in Linux default size of 4 KB). Huge pages, typically 2MB and 1GB size, allows large amounts of memory to be utilised with reduced overhead. In an NFV environment, huge pages are critical to support large memory pool allocation for data packet buffers. This results in fewer Translation Lookaside Buffers (TLB) lookups, which reduces the virtual to physical pages’ address translations. Without huge pages enabled high TLB miss rates would occur thereby degrading performance.

Hypervisor

A software that abstracts and isolates workloads with their own operating systems from the underlying physical resources. Also known as a virtual machine monitor (VMM).

Instance

Is a virtual compute resource, in a known state such as running or suspended, that can be used like a physical server. It can be used to specify VM Instance or Container Instance.

Kibana

An open-source data visualisation system.

Kubernetes

An open-source system for automating deployment, scaling, and management of containerised applications.

Kubernetes Cluster

A set of machines, called nodes (either workers or control plane), that run containerised applications managed by Kubernetes.

Kubernetes Control Plane

The container orchestration layer that exposes the API and interfaces to define, deploy, and manage the lifecycle of containers.

Kubernetes Node

A node is a worker machine in Kubernetes. A worker node may be a VM or physical host, depending on the cluster. It has local daemons or services necessary to run Pods and is managed by the control plane.

Kubernetes Service

An abstract way to expose an application running on a set of Pods as a Kubernetes network service.

Monitoring (Capability)

Monitoring capabilities are used for the passive observation of workload-specific traffic traversing the Cloud Infrastructure. Note, as with all capabilities, Monitoring may be unavailable or intentionally disabled for security reasons in a given cloud infrastructure instance.

Multi-tenancy

Feature where physical, virtual or service resources are allocated in such a way that multiple tenants and their computations and data are isolated from and inaccessible by each other.

Network Function (NF)

Functional block or application that has well-defined external interfaces and well-defined functional behaviour. Within NFV, a Network Function is implemented in a form of Virtualised NF (VNF) or a Cloud Native NF (CNF).

NFV Orchestrator (NFVO)

Manages the VNF lifecycle and Cloud Infrastructure resources (supported by the VIM) to ensure an optimised allocation of the necessary resources and connectivity.

Network Function Virtualisation (NFV)

The concept of separating network functions from the hardware they run on by using a virtual hardware abstraction layer.

Network Function Virtualisation Infrastructure (NFVI)

The totality of all hardware and software components used to build the environment in which a set of virtual applications (VAs) are deployed; also referred to as cloud infrastructure. The NFVI can span across many locations, e.g., places where data centres or edge nodes are operated. The network providing connectivity between these locations is regarded to be part of the cloud infrastructure. NFVI and VNF are the top-level conceptual entities in the scope of Network Function Virtualisation. All other components are sub-entities of these two main entities.

Network Service (NS)

Composition of Network Function(s) and/or Network Service(s), defined by its functional and behavioural specification, including the service lifecycle.

Open Network Automation Platform (ONAP)

An LFN project developing a comprehensive platform for orchestration, management, and automation of network and edge computing services for network operators, cloud providers, and enterprises.

ONAP OpenLab

ONAP community lab.

Open Platform for NFV (OPNFV)

A collaborative project under the Linux Foundation. OPNFV is now part of the LFN Anuket project. It aims to implement, test, and deploy tools for conformance and performance of NFV infrastructure.

OPNFV Verification Program (OVP)

An open-source, community-led compliance and verification program aiming to demonstrate the readiness and availability of commercial NFV products and services using OPNFV and ONAP components.

Platform

A cloud capabilities type in which the cloud service user can deploy, manage and run customer-created or customer-acquired applications using one or more programming languages and one or more execution environments supported by the cloud service provider. Adapted from ITU-T Y.3500. This includes the physical infrastructure, Operating Systems, virtualisation/containerisation software and other orchestration, security, monitoring/logging and life-cycle management software.

Pod

The smallest and simplest Kubernetes object. A Pod represents a set of running containers on a cluster. A Pod is typically set up to run a single primary container. It can also run optional sidecar containers that add supplementary features like logging.

Prometheus

An open-source monitoring and alerting system.

Quota

An imposed upper limit on specific types of resources, usually used to prevent excessive resource consumption by a given consumer (tenant, VM, container).

Resource pool

A logical grouping of cloud infrastructure hardware and software resources. A resource pool can be based on a certain resource type (for example, compute, storage and network) or a combination of resource types. A Cloud Infrastructure resource can be part of none, one or more resource pools.

Simultaneous Multithreading (SMT)

Simultaneous multithreading (SMT) is a technique for improving the overall efficiency of superscalar CPUs with hardware multithreading. SMT permits multiple independent threads of execution on a single core to better utilise the resources provided by modern processor architectures.

Tenant

Cloud service users sharing access to a set of physical and virtual resources, ITU-T Y.3500. Tenants represent an independently manageable logical pool of compute, storage and network resources abstracted from physical hardware.

Tenant Instance

Refers to an Instance owned by or dedicated for use by a single Tenant.

Tenant (Internal) Networks

Virtual networks that are internal to Tenant Instances.

User

Natural person, or entity acting on their behalf, associated with a cloud service customer that uses cloud services. Examples of such entities include devices and applications.

Virtual CPU (vCPU)

Represents a portion of the host’s computing resources allocated to a virtualised resource, for example, to a virtual machine or a container. One or more vCPUs can be assigned to a virtualised resource.

Virtualised Infrastructure Manager (VIM)

Responsible for controlling and managing the Network Function Virtualisation Infrastructure (NFVI) compute, storage and network resources.

Virtual Machine (VM)

Virtualised computation environment that behaves like a physical computer/server. A VM consists of all of the components (processor (CPU), memory, storage, interfaces/ports, etc.) of a physical computer/server. It is created using sizing information or Compute Flavour.

Virtualised Network Function (VNF)

A software implementation of a Network Function, capable of running on the Cloud Infrastructure. VNFs are built from one or more VNF Components (VNFC) and, in most cases, the VNFC is hosted on a single VM or Container.

Workload

An application (for example VNF, or CNF) that performs certain task(s) for the users. In the Cloud Infrastructure, these applications run on top of compute resources such as VMs or Containers.

1.4. Abbreviations

Term

Description

API

Application Programming Interface

BGP VPN

Border gateway Protocol Virtual Private network

CaaS

Containers as a Service

CI/CD

Continuous Integration/Continuous Deployment

CNF

Containerised Network Function

CNTT

Cloud iNfrastructure Telco Task Force

CPU

Central Processing Unit

DNS

Domain Name System

DPDK

Data Plane Development Kit

DHCP

Dynamic Host Configuration Protocol

ECMP

Equal Cost Multi-Path routing

ETSI

European Telecommunications Standards Institute

FPGA

Field Programmable Gate Array

MB/GB/TB

MegaByte/GigaByte/TeraByte

GPU

Graphics Processing Unit

GRE

Generic Routing Encapsulation

GSM

Global System for Mobile Communications (originally Groupe Spécial Mobile)

GSMA

GSM Association

GSLB

Global Service Load Balancer

GUI

Graphical User Interface

HA

High Availability

HDD

Hard Disk Drive

HTTP

Hypertext Transfer Protocol

HW

Hardware

IaaC (also IaC)

Infrastructure as a Code

IaaS

Infrastructure as a Service

ICMP

Internet Control Message Protocol

IMS

IP Multimedia Sub System

IO

Input/Output

IOPS

Input/Output per Second

IPMI

Intelligent Platform Management Interface

KVM

Kernel-based Virtual Machine

LCM

Life Cycle Management

LDAP

Lightweight Directory Access Protocol

LFN

Linux Foundation Networking

LMA

Logging, Monitoring and Analytics

LVM

Logical Volume Management

MANO

Management And Orchestration

MLAG

Multi-chassis Link Aggregation Group

NAT

Network Address Translation

NFS

Network File System

NFV

Network Function Virtualisation

NFVI

Network Function Virtualisation Infrastructure

NIC

Network Interface Card

NPU

Numeric Processing Unit

NTP

Network Time Protocol

NUMA

Non-Uniform Memory Access

OAI

Open Air Interface

OS

Operating System

OSTK

OpenStack

OPNFV

Open Platform for NFV

OVS

Open vSwitch

OWASP

Open Web Application Security Project

PCIe

Peripheral Component Interconnect Express

PCI-PT

PCIe Passthrough

PXE

Preboot Execution Environment

QoS

Quality of Service

RA

Reference Architecture

RA-2

Reference Architecture 2 (i.e., Reference Architecture for Kubernetes-based Cloud Infrastructure)

RBAC

Role-based Access Control

RBD

RADOS Block Device

REST

Representational state transfer

RI

Reference Implementation

RM

Reference Model

SAST

Static Application Security Testing

SDN

Software Defined Networking

SFC

Service Function Chaining

SG

Security Group

SLA

Service Level Agreement

SMP

Symmetric Multiprocessing

SMT

Simultaneous Multithreading

SNAT

Source Network Address Translation

SNMP

Simple Network Management Protocol

SR-IOV

Single Root Input Output Virtualisation

SSD

Solid State Drive

SSL

Secure Sockets Layer

SUT

System Under Test

TCP

Transmission Control Protocol

TLS

Transport Layer Security

ToR

Top of Rack

TPM

Trusted Platform Module

UDP

User Data Protocol

VIM

Virtualised Infrastructure Manager

VLAN

Virtual LAN

VM

Virtual Machine

VNF

Virtual Network Function

VRRP

Virtual Router Redundancy Protocol

VTEP

VXLAN Tunnel End Point

VXLAN

Virtual Extensible LAN

WAN

Wide Area Network

ZTA

Zero Trust Architecture

1.5. References

1

Reference model for cloud infrastructure (rm). GSMA PRD NG.126 v3.0, 2022.

2

Kubernetes documentation. URL: https://kubernetes.io/docs/home/.

3

What is kubernetes. URL: https://kubernetes.io/docs/concepts/overview/.

4

Kubernetes github repository. URL: https://github.com/kubernetes/kubernetes.

5

Expanded cloud native principles. URL: https://networking.cloud-native-principles.org/cloud-native-principles.

6

Scott O. Bradner. Key words for use in rfcs to indicate requirement levels. RFC 2119, March 1997. URL: https://www.rfc-editor.org/info/rfc2119, doi:10.17487/RFC2119.

7

Cis password policy guide. URL: https://www.cisecurity.org/insights/white-papers/cis-password-policy-guide.

8

Cis controls list. URL: https://www.cisecurity.org/controls/cis-controls-list.

9

Cve - common vulnerabilities and exposures. URL: https://cve.mitre.org/.

10

Sbom - software bill of materials. URL: https://ntia.gov/page/software-bill-materials.

11

Cloud security alliance. URL: https://cloudsecurityalliance.org/.

12

Owasp cheat sheet series (ocss). URL: https://github.com/OWASP/CheatSheetSeries/.

13

Owasp top ten security risks. URL: https://owasp.org/www-project-top-ten/.

14

Owasp software maturity model (samm). URL: https://owaspsamm.org/.

15

Owasp web security testing guide. URL: https://github.com/OWASP/wstg/.

16

Iso/iec 27001. URL: https://www.iso.org/obp/ui/#iso:std:iso-iec:27001:ed-2:v1:en.

17

Iso/iec 27032. URL: https://www.iso.org/obp/ui/#iso:std:iso-iec:27032:ed-1:v1:en.

18

Cncf kubernetes conformance. URL: https://github.com/cncf/k8s-conformance.

19

Open container initiative (oci) runtime spec. URL: https://github.com/opencontainers/runtime-spec.

20

Kubernetes docs - pod. URL: https://kubernetes.io/docs/concepts/workloads/pods.

21

Kubernetes docs - replicaset. URL: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset.

22

Kubernetes docs - deployment. URL: https://kubernetes.io/docs/concepts/workloads/controllers/deployment.

23

Kubernetes docs - daemonset. URL: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset.

24

Kubernetes docs - job. URL: https://kubernetes.io/docs/concepts/workloads/controllers/job.

25

Kubernetes docs - cronjob. URL: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs.

26

Kubernetes docs - statefulset. URL: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset.

27

Kubernetes docs - cpu manager. URL: https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies.

28

Kubernetes docs - huge pages. URL: https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages.

29

Kubernetes docs - memory manager. URL: https://kubernetes.io/docs/tasks/administer-cluster/memory-manager.

30

Kubernetes docs - topology manager. URL: https://kubernetes.io/docs/tasks/administer-cluster/topology-manager.

31

Kubernetes node feature discovery. URL: https://kubernetes-sigs.github.io/node-feature-discovery/.

32

Kubernetes docs - device plugin framework. URL: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.

33

Kubernetes sr-iov network device plugin. URL: https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin.

34

Github: multus-cni. URL: https://github.com/k8snetworkplumbingwg/multus-cni.

35

Cert-manager. URL: https://cert-manager.io/.

36

Kubernetes docs - network plugins. URL: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins.

37

Kubernetes network custom resource definition. URL: https://github.com/k8snetworkplumbingwg/multi-net-spec.

38

Tungsten fabric. URL: https://tungsten.io/.

39

Af_xdp device plugin. URL: https://github.com/intel/afxdp-plugins-for-kubernetes.

40

Cloud native data plane. URL: https://cndp.io/.

41

Dpdk af_xdp poll mode driver. URL: https://doc.dpdk.org/guides/nics/af_xdp.html.

42

Kubernetes docs - ingress. URL: https://kubernetes.io/docs/concepts/services-networking/ingress.

43

Kubernetes docs - service. URL: https://kubernetes.io/docs/concepts/services-networking/service.

44

Kubernetes docs - endpointslices. URL: https://kubernetes.io/docs/concepts/services-networking/endpoint-slices.

45

Kubernetes docs - network policies. URL: https://kubernetes.io/docs/concepts/services-networking/network-policies.

46

Container network interface. URL: https://github.com/containernetworking/cni.

47

Kubernetes network special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-network.

48

Kubernetes network plumbing working group. URL: https://github.com/k8snetworkplumbingwg/community.

49

Helm documentation. URL: https://helm.sh/docs.

50

Kubernetes docs - custom resources. URL: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources.

51

Kubernetes docs - custom resource definitions. URL: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.

52

Kubernetes docs - api server aggregation. URL: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation.

53

Kubernetes docs - custom controllers. URL: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#custom-controllers.

54

Kubernetes docs - operator pattern. URL: https://kubernetes.io/docs/concepts/extend-kubernetes/operator.

55

Operator hub. URL: https://operatorhub.io.

56

Reference implementation based on ra2 specifications (ri2). URL: https://cntt.readthedocs.io/projects/ri2/.

57

Reference architecture (ra1) for openstack based cloud infrastructure. URL: https://cntt.readthedocs.io/projects/ra1/.

58

Node feature discovery. URL: https://kubernetes-sigs.github.io/node-feature-discovery/stable/get-started/index.html.

59

Kubernetes distributions and platforms document. URL: https://docs.google.com/spreadsheets/d/1uF9BoDzzisHSQemXHIKegMhuythuq_GL3N1mlUUK2h0/.

60

Kubernetes supported versions. URL: https://kubernetes.io/releases/version-skew-policy/#supported-versions.

61

Kubernetes alpha api. URL: https://kubernetes.io/docs/reference/using-api/#api-versioning.

62

Kubernetes publishing services (servicetypes). URL: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types.

63

Kubernetes ingress. URL: https://kubernetes.io/docs/concepts/services-networking/ingress/.

64

Open container initiative runtime specification, version 1.0.0. URL: https://github.com/opencontainers/runtime-spec/blob/v1.0.0/spec.md.

65

Kubernetes: introducing container runtime interface (cri) in kubernetes. URL: https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/.

66

Kubernetes network custom resource definition de-facto standard. URL: https://github.com/k8snetworkplumbingwg/multi-net-spec/tree/master/v1.2.

67

Kubernetes csi drivers. URL: https://kubernetes-csi.github.io/docs/drivers.html.

68

Application service descriptor (asd) and packaging proposals for cnf. URL: https://wiki.onap.org/display/DW/Application+Service+Descriptor+%28ASD%29+and+packaging+Proposals+for+CNF.

69

Etsi gs nfv-sol 001: nfv descriptors based on tosca specification. ETSI GS NFV-SOL 001, 2022. URL: https://www.etsi.org/deliver/etsi_gs/NFV-SOL/001_099/001/04.02.01_60/gs_NFV-SOL001v040201p.pdf.

70

Kubernetes types of volumes, hostpath. URL: https://kubernetes.io/docs/concepts/storage/volumes/#hostpath.

71

Cnf testsuite, rationale, test if the helm chart is published: helm_chart_published. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-helm-chart-is-published-helm_chart_published.

72

Cnf testsuite, rationale, test if the helm chart is valid: helm_chart_valid. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-helm-chart-is-valid-helm_chart_valid.

73

Cnf testsuite, rationale, to test if the cnf can perform a rolling update: rolling_update. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-test-if-the-cnf-can-perform-a-rolling-update-rolling_update.

74

Cnf testsuite, rationale, to check if a cnf version can be downgraded through a rolling_downgrade: rolling_downgrade. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-a-cnf-version-can-be-downgraded-through-a-rolling_downgrade-rolling_downgrade.

75

Cnf testsuite, rationale, to check if the cnf is compatible with different cnis: cni_compatibility. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-the-cnf-is-compatible-with-different-cnis-cni_compatibility.

76

Cnf testsuite, rationale, test if the cnf crashes when node drain occurs: node_drain. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-node-drain-occurs-node_drain.

77

Cnf testsuite, rationale, test if the cnf crashes when network latency occurs: pod_network_latency. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-network-latency-occurs-pod_network_latency.

78

Cnf testsuite, rationale, test if the cnf crashes when disk fill occurs: disk_fill. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-disk-fill-occurs.

79

Cnf testsuite, rationale, test if the cnf crashes when pod memory hog occurs: pod_memory_hog. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-pod-memory-hog-occurs-pod_memory_hog.

80

Cnf testsuite, rationale, test if the cnf crashes when pod io stress occurs: pod_io_stress. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-pod-io-stress-occurs-pod_io_stress.

81

Cnf testsuite, rationale, test if the cnf crashes when pod network corruption occurs: pod_network_corruption. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-pod-network-corruption-occurs-pod_network_corruption.

82

Cnf testsuite, rationale, test if the cnf crashes when pod network duplication occurs: pod_network_duplication. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#test-if-the-cnf-crashes-when-pod-network-duplication-occurs-pod_network_duplication.

83

Cnf testsuite, rationale, to test if the cnf uses local storage: no_local_volume_configuration. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-test-if-the-cnf-uses-local-storage-no_local_volume_configuration.

84

Cnf testsuite, rationale, to test if there is a liveness entry in the helm chart: liveness. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-test-if-there-is-a-liveness-entry-in-the-helm-chart-liveness.

85

Cnf testsuite, rationale, to test if there is a readiness entry in the helm chart: readiness. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-test-if-there-is-a-readiness-entry-in-the-helm-chart-readiness.

86

Cnf testsuite, rationale, to check if there is automatic mapping of service accounts: service_account_mapping. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-there-is-automatic-mapping-of-service-accounts-service_account_mapping.

87

Cnf testsuite, rationale, to check if there is a host network attached to a pod: host_network. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-there-is-a-host-network-attached-to-a-pod-host_network.

88

Cnf testsuite, rationale, to check if containers are running with hostpid or hostipc privileges: host_pid_ipc_privileges. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-containers-are-running-with-hostpid-or-hostipc-privileges-host_pid_ipc_privileges.

89

Cnf testsuite, rationale, to check if containers have resource limits defined: resource_policies. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-containers-have-resource-limits-defined-resource_policies.

90

Cnf testsuite, rationale, to check if containers have immutable file systems: immutable_file_systems. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-containers-have-immutable-file-systems-immutable_file_systems.

91

Kubernetes documentation: images. URL: https://kubernetes.io/docs/concepts/containers/images/.

92

Cnf testsuite, rationale, to test if there are any (non-declarative) hardcoded ip addresses or subnet masks in the k8s runtime configuration: hardcoded_ip_addresses_in_k8s_runtime_configuration. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-test-if-there-are-any-non-declarative-hardcoded-ip-addresses-or-subnet-masks-in-the-k8s-runtime-configuration-hardcoded_ip_addresses_in_k8s_runtime_configuration.

93

Kubernetes documentation: service. URL: https://kubernetes.io/docs/concepts/services-networking/service/.

94

Kubernetes documentation: immutable configmaps. URL: https://kubernetes.io/docs/concepts/configuration/configmap/#configmap-immutable.

95

Kubernetes documentation: horizontal pod autoscaling. URL: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/.

96

Cnf testsuite, rationale, to check if the cnf has a reasonable image size: reasonable_image_size. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-the-cnf-has-a-reasonable-image-size-reasonable_image_size.

97

Cnf testsuite, rationale, to check if the cnf have a reasonable startup time: reasonable_startup_time. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-the-cnf-have-a-reasonable-startup-time-reasonable_startup_time.

98

Cnf testsuite, rationale, to check if there are any privileged containers: privileged_containers. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-there-are-any-privileged-containers-privileged_containers.

99

Cnf testsuite, rationale, to check if any containers are running as a root user (checks the user outside the container that is running dockerd): non_root_user. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-any-containers-are-running-as-a-root-user-checks-the-user-outside-the-container-that-is-running-dockerd-non_root_user.

100

Cnf testsuite, rationale, to check if any containers allow for privilege escalation: privilege_escalation. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-any-containers-allow-for-privilege-escalation-privilege_escalation.

101

Cnf testsuite, rationale, to check if containers are running with non-root user with non-root membership: non_root_containers. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-check-if-containers-are-running-with-non-root-user-with-non-root-membership-non_root_containers.

102

Kubernetes documentation: recommended labels. URL: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/.

103

The twelve factor app: logs. URL: https://12factor.net/logs.

104

Cnf testsuite, rationale, to test if there are host ports used in the service configuration: hostport_not_used. URL: https://github.com/cnti-testcatalog/testsuite/blob/main/RATIONALE.md#to-test-if-there-are-host-ports-used-in-the-service-configuration-hostport_not_used.

105

Kubernetes documentation: loadbalancer property of the kubernetes service. URL: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer.

106

Service mesh interface (smi). URL: https://smi-spec.io/.

107

Kubernetes documentation ports and protocols. URL: https://kubernetes.io/docs/reference/networking/ports-and-protocols/.

108

Building secure microservices-based applications using service-mesh architecture. NIST Special Publication 800-204A, 2020. URL: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-204a.pdf.

109

Attribute-based access control for microservices-based applications using a service mesh. NIST Special Publication 800-204B, 2020. URL: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-204B.pdf.

110

Arstechnica: tesla cloud resources are hacked to run cryptocurrency-mining malware. URL: https://arstechnica.com/information-technology/2018/02/tesla-cloud-resources-are-hacked-to-run-cryptocurrency-mining-malware/.

111

Kubernetes testing special interest group. URL: https://github.com/kubernetes/community/blob/master/sig-testing/charter.md.

112

Kubernetes end-to-end testing. URL: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests.md.

113

Kubernetes feature gates. URL: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates.

114

Kubernetes kep-3136. URL: https://github.com/kubernetes/enhancements/blob/master/keps/sig-architecture/3136-beta-apis-off-by-default/README.md.

115

Kubernetes api. URL: https://kubernetes.io/docs/reference/using-api/.

116

Kubernetes api reference. URL: https://kubernetes.io/docs/reference/kubernetes-api/.

117

Kubernetes api groups. URL: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.29/.

118

Kubernetes api machinery special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-api-machinery.

119

Kubernetes feature crossnamespacepodaffinity. URL: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#namespace-selector.

120

Kubernetes feature storageversionapi. URL: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#storageversion-v1alpha1-internal-apiserver-k8s-io.

121

Kubernetes apps special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-apps.

122

Kubernetes feature daemonsetupdatesurge. URL: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#rollingupdatedaemonset-v1-apps.

123

Kubernetes feature indexedjob. URL: https://kubernetes.io/docs/concepts/workloads/controllers/job/.

124

Kubernetes feature statefulset. URL: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/.

125

Kubernetes feature suspendjob. URL: https://kubernetes.io/docs/concepts/workloads/controllers/job/#suspending-a-job.

126

Kubernetes feature tainteviction. URL: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions.

127

Kubernetes feature ttlafterfinished. URL: https://kubernetes.io/docs/concepts/workloads/controllers/ttlafterfinished/.

128

Kubernetes auth special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-auth.

129

Kubernetes feature boundserviceaccounttokenvolume. URL: https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/1205-bound-service-account-tokens/README.md.

130

Kubernetes cluster lifecycle special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-cluster-lifecycle.

131

Kubernetes instrumentation special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-instrumentation.

132

Kubernetes network special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-network.

133

Kubernetes feature ipv6dualstack. URL: https://kubernetes.io/docs/concepts/services-networking/dual-stack/.

134

Kubernetes extend service ip ranges. URL: https://kubernetes.io/docs/tasks/network/extend-service-ip-ranges/.

135

Kubernetes node special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-node.

136

Kubernetes feature probeterminationgraceperiod. URL: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#probe-level-terminationgraceperiodseconds.

137

Kubernetes feature downwardapihugepages. URL: https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information.

138

Kubernetes feature podreadinessgate. URL: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate.

139

Kubernetes feature sysctls. URL: https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/.

140

Kubernetes scheduling special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-scheduling.

141

Kubernetes feature localstoragecapacityisolation. URL: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/.

142

Kubernetes storage special interest group. URL: https://github.com/kubernetes/community/tree/master/sig-storage.

143

Alibaba cloud blog: what can we learn from twitter's move to kubernetes. URL: https://www.alibabacloud.com/blog/what-can-we-learn-from-twitters-move-to-kubernetes_595156.

144

Youtube: kubernetes failure stories, or: how to crash your cluster - henning jacobs. URL: https://www.youtube.com/watch?v=LpFApeaGv7A.

145

Cncf blog: demystifying kubernetes as a service – how alibaba cloud manages 10,000s of kubernetes clusters. URL: https://www.cncf.io/blog/2019/12/12/demystifying-kubernetes-as-a-service-how-does-alibaba-cloud-manage-10000s-of-kubernetes-clusters/.

146

Google docs: kep: multinetwork podnetwork object. URL: https://docs.google.com/document/d/17LhyXsEgjNQ0NWtvqvtgJwVqdJWreizsgAZHWflgP-A/edit.

147

Kubernetes docs: user namespaces. URL: https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/.

148

Kep-127: support user namespaces in stateless pods. URL: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/127-user-namespaces.

149

Wikipedia: linux namespaces. URL: https://en.wikipedia.org/wiki/Linux_namespaces.

1.6. Conventions

The key words “MUST”, “MUST NOT”, “required”, “SHALL”, SHALL NOT”, “SHOULD”, “SHOULD NOT”, “recommended”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 [6].