1. Introduction

1.1. Overview

This Reference Architecture is focused on OpenStack as the Virtualised Infrastructure Manager (VIM), chosen based on the criteria laid out in the Cloud Infrastructure Reference Model [1] (referred to as the “Reference Model” or “RM” in this document). OpenStack [2] has the advantage of being a mature and widely accepted open-source technology: it has a strong ecosystem of supporting vendors, its community is managed by the OpenInfra Foundation, and, most importantly, it is widely deployed by the global operator community for both internal infrastructure and external-facing products and services. This means that resources with the right skill sets to support a Cloud Infrastructure (or Network Function Virtualisation Infrastructure, NFVI [3]) are readily available. Another reason to choose OpenStack is its large, active community of vendors and operators: any code or component changes needed to support the Common Telco Cloud Infrastructure requirements can be proposed and validated through the existing project communities’ well-established processes.

1.1.1. Vision

This Reference Architecture specifies an OpenStack-based Cloud Infrastructure for hosting NFV workloads, primarily VNFs (Virtual Network Functions). Operators can use this document to deploy Anuket-conformant infrastructure; hereafter, “conformant” denotes that a resource can satisfy the tests conducted to verify conformance with this Reference Architecture.

1.2. Use Cases

Several NFV use cases are documented in OpenStack. For more examples and details refer to the OpenStack Use cases [4].

Examples include:

  • Overlay networks: The overlay functionality design includes OpenStack Networking in Open vSwitch [5] GRE tunnel mode. In this case, the layer-3 external routers are paired with VRRP, and the switches are paired using an implementation of MLAG, to ensure that connectivity with the upstream routing infrastructure is not lost.

  • Performance tuning: Network-level tuning for this workload is minimal. Quality of Service (QoS) applies to these workloads with a middle-ground Class Selector, depending on existing policies: higher than a best-effort queue, but lower than an Expedited Forwarding or Assured Forwarding queue. Since this type of application generates larger packets with longer-lived connections, you can optimise bandwidth utilisation for long-duration TCP. Normal bandwidth planning applies here with regard to benchmarking a session’s usage multiplied by the expected number of concurrent sessions, with overhead.

  • Network functions: software components that support the exchange of information (data, voice, multimedia) over a system’s network. Some of these workloads tend to consist of a large number of small, short-lived packets, such as DNS queries or SNMP traps. These messages need to arrive quickly and therefore cannot tolerate packet loss. Network function workloads have requirements that may affect configurations, including at the hypervisor level. For an application that generates 10 TCP sessions per user, with an average aggregate bandwidth of 512 kilobits per second per user and an expected count of ten thousand (10,000) concurrent users, the expected bandwidth plan is approximately 4.88 gigabits per second. The supporting network for this type of configuration needs to have low latency and an evenly distributed load across the topology. These types of workload benefit from having services local to the consumers of the service. Thus, use a multi-site approach and deploy many copies of the application, as close as possible to consumers, to handle the load. Since these applications function independently, they do not warrant running overlays to interconnect tenant networks. Overlays also have the drawback of performing poorly with rapid flow setup and may incur too much overhead with large quantities of small packets; therefore, they are not recommended. QoS is desirable for some workloads to ensure delivery. DNS has a major impact on the load times of other services and needs to be reliable and provide rapid responses. Configure rules in upstream devices to apply a higher Class Selector to DNS to ensure faster delivery or a better spot in queuing algorithms.
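The bandwidth figure quoted in the network functions example above can be reproduced with a short calculation. This is an illustrative sketch only: it assumes 512 kbit/s of aggregate traffic per concurrent user and binary (1024-based) unit prefixes, which is the reading under which the arithmetic matches the approximately 4.88 Gbit/s figure; the function name is hypothetical.

```python
# Illustrative only. Assumptions (not stated verbatim in the text):
# 512 kbit/s of aggregate traffic per concurrent user, and binary
# (1024-based) prefixes, under which the arithmetic reproduces the
# ~4.88 Gbit/s figure quoted above.

def bandwidth_plan_gbps(users: int, kbit_per_user: float) -> float:
    """Aggregate bandwidth plan in Gbit/s, using 1024-based prefixes."""
    total_kbit = users * kbit_per_user
    return total_kbit / 1024 / 1024  # kbit -> Mbit -> Gbit

plan = bandwidth_plan_gbps(users=10_000, kbit_per_user=512)
print(f"{plan:.2f} Gbit/s")  # prints "4.88 Gbit/s"
```

The same helper can be reused for other workload profiles by substituting the measured per-user rate and the expected concurrency.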

1.3. OpenStack Reference Release

This Reference Architecture document conforms to the OpenStack Wallaby [6] release. While many features and capabilities are conformant across several OpenStack releases, this document refers to features, capabilities, and APIs that are part of the OpenStack Wallaby release. For ease, this Reference Architecture document version can be referred to as “RA-1 OSTK Wallaby”.

1.4. Principles

1.4.1. Architectural principles

This Reference Architecture for OpenStack based Cloud Infrastructure must obey the following set of architectural principles:

  1. Open-source preference: for building Cloud Infrastructure solutions, components and tools, using open-source technology.

  2. Open APIs: to enable interoperability, component substitution, and minimise integration efforts.

  3. Separation of concerns: to promote lifecycle independence of different architectural layers and modules (e.g., disaggregation of software from hardware).

  4. Automated lifecycle management: to minimise the end-to-end lifecycle costs, maintenance downtime (target zero downtime), and errors resulting from manual processes.

  5. Automated scalability: of workloads to minimise costs and operational impacts.

  6. Automated closed loop assurance: for fault resolution, simplification, and cost reduction of cloud operations.

  7. Cloud nativeness: to optimise the utilisation of resources and enable operational efficiencies.

  8. Security compliance: to ensure that the architecture follows industry security best practices and is compliant at all levels with the relevant security regulations.

  9. Resilience and Availability: to withstand single points of failure.

1.4.2. OpenStack specific principles

OpenStack considers the following Four Opens essential for success:

  • Open Source

  • Open Design

  • Open Development

  • Open Community

This OpenStack Reference Architecture is organised around the three major Cloud Infrastructure resource types as core services of compute, storage and networking, and a set of shared services of identity management, image management, graphical user interface, orchestration engine, etc.

1.5. Document Organisation

Chapter 2 defines the Reference Architecture requirements and, where appropriate, provides references to where these requirements are addressed in this document. The intent of this document is to address all of the mandatory (“MUST”) requirements and the most useful of the other, optional (“SHOULD”) requirements. Chapters 3 and 4 cover the Cloud Infrastructure resources and the core OpenStack services, while the APIs are covered in Chapter 5. Chapter 6 covers the implementation and enforcement of security capabilities and controls. Life cycle management of the Cloud Infrastructure and VIM is covered in Chapter 7, with an emphasis on Logging, Monitoring and Analytics (LMA), configuration management, and some other operational items. Please note that Chapter 7 is not a replacement for the implementation, configuration, and operational documentation that accompanies the different OpenStack distributions. Chapter 8 addresses conformance: it provides an automated validation mechanism to test the conformance of a deployed cloud infrastructure to this Reference Architecture. Finally, Chapter 9 identifies certain gaps that currently exist and plans on how to address them (for example, resource autoscaling).

1.6. Terminology

Abstraction: process of removing concrete, fine-grained or lower-level details or attributes or common properties in the study of systems to focus attention on topics of greater importance or general concepts. It can be the result of decoupling.

Anuket: a LFN open-source project developing open reference infrastructure models, architectures, tools, and programs.

Cloud Infrastructure: a generic term covering NFVI, IaaS and CaaS capabilities - essentially the infrastructure on which a Workload can be executed. NFVI, IaaS and CaaS layers can be built on top of each other. In the case of CaaS, some cloud infrastructure features (e.g., HW management or multi-tenancy) are implemented by using an underlying IaaS layer.

Cloud Infrastructure Hardware Profile: defines the behaviour, capabilities, configuration, and metrics provided by the cloud infrastructure hardware layer resources available for the workloads.

Cloud Infrastructure Profile: the combination of the Cloud Infrastructure Software Profile and the Cloud Infrastructure Hardware Profile that defines the capabilities and configuration of the Cloud Infrastructure resources available for the workloads.

Cloud Infrastructure Software Profile: defines the behaviour, capabilities and metrics provided by a Cloud Infrastructure Software Layer on resources available for the workloads.

Cloud Native Network Function (CNF): a cloud native network function (CNF) is a cloud native application that implements network functionality. A CNF consists of one or more microservices. All layers of a CNF are developed using Cloud Native Principles including immutable infrastructure, declarative APIs, and a “repeatable deployment process”. This definition is derived from the Cloud Native Thinking for Telecommunications Whitepaper, which also includes further detail and examples.

Compute Node: an abstract definition of a server. A compute node can refer to a set of hardware and software that support the VMs or Containers running on it.

Container: a lightweight and portable executable image that contains software and all of its dependencies. OCI defines Container as “An environment for executing processes with configurable isolation and resource limitations. For example, namespaces, resource limits, and mounts are all part of the container environment.” A Container provides operating-system-level virtualisation by abstracting the “user space”. One big difference between Containers and VMs is that, unlike VMs, where each VM is self-contained with all operating system components within the VM package, containers “share” the host system’s kernel with other containers.

Container Image: stored instance of a container that holds a set of software needed to run an application.

Core (physical): an independent computer processing unit that can independently execute CPU instructions and is integrated with other cores on a multiprocessor (chip, integrated circuit die). Please note that the multiprocessor chip is also referred to as a CPU that is placed in a socket of a computer motherboard.

CPU Type: a classification of CPUs by features needed for the execution of computer programs; for example, instruction sets, cache size, number of cores.

Decoupling, Loose Coupling: a loosely coupled system is one in which each of its components has, or makes use of, little or no knowledge of the implementation details of other separate components. Loose coupling is the opposite of tight coupling.

Encapsulation: restricting of direct access to some of an object’s components.

External Network: external networks provide network connectivity for a cloud infrastructure tenant to resources outside of the tenant space.

Fluentd: an open-source data collector for unified logging layer, which allows data collection and consumption for better use and understanding of data. Fluentd is a CNCF graduated project.

Functest: an open-source project, part of the LFN Anuket project. It addresses functional testing with a collection of state-of-the-art virtual infrastructure test suites, including automatic VNF testing.

Hardware resources: compute, storage, and network hardware resources on which the cloud infrastructure platform software, virtual machines, and containers run.

Host Profile: is another term for a Cloud Infrastructure Hardware Profile.

Huge pages: physical memory is partitioned and accessed using a basic page unit (in Linux, the default size is 4 KB). Huge pages, typically of 2 MB or 1 GB size, allow large amounts of memory to be utilised with reduced overhead. In an NFV environment, huge pages are critical to support large memory pool allocation for data packet buffers. Using huge pages means that fewer pages, and hence fewer Translation Lookaside Buffer (TLB) entries, are needed to map the same memory, which reduces the number of virtual-to-physical page address translations. Without huge pages enabled, high TLB miss rates would occur, thereby degrading performance.
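The TLB argument in the huge pages entry can be illustrated with a rough calculation of how many page mappings are needed to cover a packet-buffer pool at different page sizes. The 4 GiB pool size and the helper name are assumptions for illustration, not values taken from this document.

```python
# Rough sketch of the TLB argument above: how many page mappings are
# needed to cover a packet-buffer pool at different page sizes. The
# 4 GiB pool size is an assumed, illustrative figure.

def pages_needed(buffer_bytes: int, page_bytes: int) -> int:
    """Ceiling division: a partial page still needs a full mapping."""
    return -(-buffer_bytes // page_bytes)

GiB = 1024 ** 3
pool = 4 * GiB  # hypothetical packet-buffer pool

for label, size in [("4 KB", 4 * 1024), ("2 MB", 2 * 1024 ** 2), ("1 GB", GiB)]:
    print(f"{label} pages: {pages_needed(pool, size)} mappings")
# -> 1048576, 2048 and 4 mappings, respectively
```

Fewer mappings mean a larger fraction of the pool stays covered by the limited number of TLB entries, which is why huge pages lower the miss rate.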

Hypervisor: a software that abstracts and isolates workloads with their own operating systems from the underlying physical resources. Also known as a virtual machine monitor (VMM).

Instance: is a virtual compute resource, in a known state such as running or suspended, that can be used like a physical server. It can be used to specify VM Instance or Container Instance.

Kibana: an open-source data visualisation system.

Kubernetes: an open-source system for automating deployment, scaling, and management of containerised applications.

Monitoring (Capability): monitoring capabilities are used for the passive observation of workload-specific traffic traversing the Cloud Infrastructure. Note, as with all capabilities, Monitoring may be unavailable or intentionally disabled for security reasons in a given cloud infrastructure instance.

Multi-tenancy: feature where physical, virtual or service resources are allocated in such a way that multiple tenants and their computations and data are isolated from and inaccessible by each other.

Network Function (NF): functional block or application that has well-defined external interfaces and well-defined functional behaviour. Within NFV, a Network Function is implemented in a form of Virtualised NF (VNF) or a Cloud Native NF (CNF).

NFV Orchestrator (NFVO): manages the VNF lifecycle and Cloud Infrastructure resources (supported by the VIM) to ensure an optimised allocation of the necessary resources and connectivity.

Network Function Virtualisation (NFV): the concept of separating network functions from the hardware they run on by using a virtual hardware abstraction layer.

Network Function Virtualisation Infrastructure (NFVI): the totality of all hardware and software components used to build the environment in which a set of virtual applications (VAs) are deployed; also referred to as cloud infrastructure. The NFVI can span across many locations, e.g., places where data centres or edge nodes are operated. The network providing connectivity between these locations is regarded to be part of the cloud infrastructure. NFVI and VNF are the top-level conceptual entities in the scope of Network Function Virtualisation. All other components are sub-entities of these two main entities.

Network Service (NS): composition of Network Function(s) and/or Network Service(s), defined by its functional and behavioural specification, including the service lifecycle.

Open Network Automation Platform (ONAP): a LFN project developing a comprehensive platform for orchestration, management, and automation of network and edge computing services for network operators, cloud providers, and enterprises.

ONAP OpenLab: ONAP community lab.

Open Platform for NFV (OPNFV): a collaborative project under the Linux Foundation. OPNFV is now part of the LFN Anuket project. It aims to implement, test, and deploy tools for conformance and performance of NFV infrastructure.

OPNFV Verification Program (OVP): an open-source, community-led compliance and verification program aiming to demonstrate the readiness and availability of commercial NFV products and services using OPNFV and ONAP components.

Platform: a cloud capabilities type in which the cloud service user can deploy, manage and run customer-created or customer-acquired applications using one or more programming languages and one or more execution environments supported by the cloud service provider. Adapted from ITU-T Y.3500. This includes the physical infrastructure, Operating Systems, virtualisation/containerisation software and other orchestration, security, monitoring/logging and life-cycle management software.

Prometheus: an open-source monitoring and alerting system.

Quota: an imposed upper limit on specific types of resources, usually used to prevent excessive resource consumption by a given consumer (tenant, VM, container).

Resource pool: a logical grouping of cloud infrastructure hardware and software resources. A resource pool can be based on a certain resource type (for example, compute, storage and network) or a combination of resource types. A Cloud Infrastructure resource can be part of none, one or more resource pools.

Simultaneous Multithreading (SMT): simultaneous multithreading (SMT) is a technique for improving the overall efficiency of superscalar CPUs with hardware multithreading. SMT permits multiple independent threads of execution on a single core to better utilise the resources provided by modern processor architectures.

Shaker: a distributed data-plane testing tool built for OpenStack.

Software Defined Storage (SDS): an architecture which consists of the storage software that is independent from the underlying storage hardware. The storage access software provides data request interfaces (APIs) and the SDS controller software provides storage access services and networking.

Tenant: cloud service users sharing access to a set of physical and virtual resources, ITU-T Y.3500. Tenants represent an independently manageable logical pool of compute, storage and network resources abstracted from physical hardware.

Tenant Instance: refers to an Instance owned by or dedicated for use by a single Tenant.

Tenant (Internal) Networks: virtual networks that are internal to Tenant Instances.

User: natural person, or entity acting on their behalf, associated with a cloud service customer that uses cloud services. Examples of such entities include devices and applications.

Virtual CPU (vCPU): represents a portion of the host’s computing resources allocated to a virtualised resource, for example, to a virtual machine or a container. One or more vCPUs can be assigned to a virtualised resource.
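As an illustration of how the host's pool of schedulable logical processors (onto which vCPUs are placed) relates to the physical topology of sockets, cores, and SMT threads, a minimal sketch follows; the host figures used are hypothetical.

```python
# Illustrative sketch of how the host's pool of logical processors
# (onto which vCPUs are scheduled) is commonly derived. The host
# figures below are hypothetical, not a recommendation.

def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Logical processors exposed by the host, SMT threads included."""
    return sockets * cores_per_socket * threads_per_core

# e.g., a dual-socket host with 24 physical cores per socket and SMT-2:
print(logical_cpus(sockets=2, cores_per_socket=24, threads_per_core=2))  # prints 96
```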

Virtualised Infrastructure Manager (VIM): responsible for controlling and managing the Network Function Virtualisation Infrastructure (NFVI) compute, storage and network resources.

Virtual Machine (VM): virtualised computation environment that behaves like a physical computer/server. A VM consists of all of the components (processor (CPU), memory, storage, interfaces/ports, etc.) of a physical computer/server. It is created using sizing information or Compute Flavour.

Virtualised Network Function (VNF): a software implementation of a Network Function, capable of running on the Cloud Infrastructure. VNFs are built from one or more VNF Components (VNFC) and, in most cases, the VNFC is hosted on a single VM or Container.

Virtual Compute resource (a.k.a. virtualisation container): partition of a compute node that provides an isolated virtualised computation environment.

Virtual Storage resource: virtualised non-volatile storage allocated to a virtualised computation environment hosting a VNFC.

Virtual Networking resource: routes information among the network interfaces of a virtual compute resource and physical network interfaces, providing the necessary connectivity.

VMTP: a data path performance measurement tool built specifically for OpenStack clouds.

Workload: an application (for example VNF, or CNF) that performs certain task(s) for the users. In the Cloud Infrastructure, these applications run on top of compute resources such as VMs or Containers.

1.7. Abbreviations




API: Application Programming Interface

BGP VPN: Border Gateway Protocol Virtual Private Network

CI/CD: Continuous Integration/Continuous Deployment

CNTT: Cloud iNfrastructure Task Force

CPU: Central Processing Unit

DNS: Domain Name System

DPDK: Data Plane Development Kit

DHCP: Dynamic Host Configuration Protocol

ECMP: Equal Cost Multi-Path routing

ETSI: European Telecommunications Standards Institute

FPGA: Field Programmable Gate Array

GPU: Graphics Processing Unit

GRE: Generic Routing Encapsulation

GSM: Global System for Mobile Communications (originally Groupe Spécial Mobile)

GSMA: GSM Association

GSLB: Global Service Load Balancer

GUI: Graphical User Interface

HA: High Availability

HDD: Hard Disk Drive

HTTP: HyperText Transfer Protocol

IaaC (also IaC): Infrastructure as a Code

IaaS: Infrastructure as a Service

ICMP: Internet Control Message Protocol

IMS: IP Multimedia Sub System

IOPS: Input/Output operations Per Second

IPMI: Intelligent Platform Management Interface

KVM: Kernel-based Virtual Machine

LCM: LifeCycle Management

LDAP: Lightweight Directory Access Protocol

LFN: Linux Foundation Networking

LMA: Logging, Monitoring and Analytics

LVM: Logical Volume Management

MANO: Management ANd Orchestration

MLAG: Multi-chassis Link Aggregation Group

NAT: Network Address Translation

NFS: Network File System

NFV: Network Function Virtualisation

NFVI: Network Function Virtualisation Infrastructure

NIC: Network Interface Card

NPU: Numeric Processing Unit

NTP: Network Time Protocol

NUMA: Non-Uniform Memory Access

OAI: Open Air Interface

OS: Operating System

OPNFV: Open Platform for NFV

OVS: Open vSwitch

OWASP: Open Web Application Security Project

PCIe: Peripheral Component Interconnect Express

PCI-PT: PCIe PassThrough

PXE: Preboot Execution Environment

QoS: Quality of Service

RA: Reference Architecture

RA-1: Reference Architecture 1 (i.e., Reference Architecture for OpenStack-based Cloud Infrastructure)

RBAC: Role-based Access Control

RBD: RADOS Block Device

REST: Representational State Transfer

RI: Reference Implementation

RM: Reference Model

SAST: Static Application Security Testing

SDN: Software Defined Networking

SFC: Service Function Chaining

SG: Security Group

SLA: Service Level Agreement

SMP: Symmetric MultiProcessing

SMT: Simultaneous MultiThreading

SNAT: Source Network Address Translation

SNMP: Simple Network Management Protocol

SR-IOV: Single Root Input/Output Virtualisation

SSD: Solid State Drive

SSL: Secure Sockets Layer

SUT: System Under Test

TCP: Transmission Control Protocol

TLS: Transport Layer Security

ToR: Top of Rack

TPM: Trusted Platform Module

UDP: User Datagram Protocol

VIM: Virtualised Infrastructure Manager

VLAN: Virtual LAN

VM: Virtual Machine

VNF: Virtual Network Function

VRRP: Virtual Router Redundancy Protocol

VTEP: VXLAN Tunnel End Point

VXLAN: Virtual Extensible LAN

WAN: Wide Area Network

ZTA: Zero Trust Architecture

1.8. Conventions

The key words “MUST”, “MUST NOT”, “required”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “recommended”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 [7].