Commit fb38cd83 authored by Brian Kocoloski

add facility overview

parent 1b8f17f6
......
@@ -5,8 +5,3 @@ weight: 4
description: >
Docs for operating a Merge testbed facility
---
-{{% pageinfo %}}
-This is a placeholder page for facility operations
-{{% /pageinfo %}}
---
title: "Installation"
linkTitle: "Installation"
-weight: 1
+weight: 3
description: >
-  Docs for installing a Merge testbed facility
+  How to install Merge on your facility
---
## Foreword
......
---
title: "Modeling"
linkTitle: "Modeling"
weight: 2
description: >
How to model your equipment as a Merge facility
---
---
title: "Networks"
linkTitle: "Networks"
-weight: 3
+weight: 5
description: >
A detailed overview of how testbed facility networks work
---
......
---
title: "Operation"
linkTitle: "Operation"
-weight: 2
+weight: 4
description: >
-  Docs for operating a Merge testbed facility
+  How to operate your Merge facility
---
{{% pageinfo %}}
......
---
title: "Overview"
linkTitle: "Overview"
weight: 1
description: >
So you want to operate a Merge testbed facility?
---
You've come to the right place! This page gives an overview of the core components and network
architecture of a typical Merge facility. Once you understand the basic design of a facility, head
on over to these pages for detailed instructions on:
- [How to model your equipment as a Merge facility](../model)
- [How to install Merge on your facility](../install)
- [How to operate your Merge facility](../operate)
- [How facility networks work](../network)
## High Level Design
A testbed facility houses the equipment that materializes experiments. A facility typically
consists of at least the following assets (sketched schematically in the code example after this list):
- A set of testbed nodes that host experiment nodes. Testbed nodes either operate as
*hypervisors* (supporting many materialization nodes through virtual machine technology) or as
*bare-metal* machines (supporting one materialization node)
- A set of physical networks that provide a collection of different services:
- a "management" network through which an operator can connect to and control facility assets
- an "infrastructure" network which provides services to experiment materializations such as DHCP,
DNS, node imaging, and mass storage
- an "experiment" network on which virtual networks for experiment materializations are embedded
- Infrastructure servers that host the aforementioned testbed services (DHCP, DNS, etc.) and serve
them over the infrastructure network
- Storage servers that provide network attached mass storage for experiment use
- An operations server that functions as the central point of command and control for a human operator
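
To make the division of roles concrete, here is a minimal sketch of these asset classes as
hypothetical Go types. It is illustrative only: the type names, node names, and fields are invented
for this overview and are not the Merge facility-model API (see [Modeling](../model) for how
facilities are actually described).

```go
// Illustrative sketch only: hypothetical types mirroring the asset classes
// listed above. This is not the Merge facility-model API.
package main

import "fmt"

// NodeMode captures the two ways a testbed node can host experiment nodes.
type NodeMode string

const (
	Hypervisor NodeMode = "hypervisor" // many experiment nodes as virtual machines
	BareMetal  NodeMode = "bare-metal" // a single experiment node on raw hardware
)

// NetworkRole distinguishes the three physical networks in a facility.
type NetworkRole string

const (
	Management     NetworkRole = "mgmt"     // operator command and control
	Infrastructure NetworkRole = "infranet" // DHCP, DNS, imaging, mass storage
	Experiment     NetworkRole = "xpnet"    // substrate for embedded experiment links
)

// Facility groups the minimum set of assets described in this overview.
type Facility struct {
	TestbedNodes   map[string]NodeMode // node name -> current mode
	Networks       []NetworkRole
	InfraServers   []string // hosts for per-materialization services
	StorageServers []string // network-attached mass storage
	OpsServer      string   // central point of command and control
}

func main() {
	// Hypothetical node and server names.
	f := Facility{
		TestbedNodes:   map[string]NodeMode{"x0": Hypervisor, "x1": BareMetal},
		Networks:       []NetworkRole{Management, Infrastructure, Experiment},
		InfraServers:   []string{"ifr0"},
		StorageServers: []string{"stor0"},
		OpsServer:      "ops",
	}
	fmt.Printf("facility: %d testbed nodes, %d networks\n", len(f.TestbedNodes), len(f.Networks))
}
```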
## Canonical Facility Architecture
The following image shows a generalized view of how a Merge testbed facility is typically configured:
![](/img/facility/overview.png)
### Testbed Nodes & Hypervisors
These machines host experiment nodes for materializations as described by user [experiment
models](/docs/experimentation/model-ref). Experiment nodes can be deployed either as virtual
machines on top of the Merge hypervisor stack or directly on a bare-metal server. Each individual
testbed node in the facility can alternate between bare-metal and hypervisor operation throughout
its lifetime; any transition between modes is fully automated by Merge software.
### Network Emulators
These are special-purpose machines that precisely control the performance characteristics of network
links in user materializations. When a user requests [precise link
emulation](/docs/experimentation/emulation) in a materialization, the embedding for that link is
done such that the link traverses a network emulation server in the facility. Currently, emulation
servers run a modified version of the [Fastclick]( https://github.com/tbarbette/fastclick) system to
control capacity, delay, and loss rates on emulated links. Because link emulation can be a
resource-intensive workload, a recommended architecture includes a number of dedicated emulation
servers that do not host any user experiment nodes or other testbed services.
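
As a rough illustration of what an emulation server enforces on each emulated link, the sketch
below models the three parameters named above and works out a simple transfer-time estimate. The
types, field names, and numbers are hypothetical; this is not the Fastclick configuration format or
a Merge API.

```go
// Illustrative only: a hypothetical description of the per-link parameters an
// emulation server applies. Not the Fastclick or Merge emulation interface.
package main

import (
	"fmt"
	"time"
)

// LinkEmulation captures the three properties controlled on an emulated link.
type LinkEmulation struct {
	CapacityMbps int           // maximum throughput in megabits per second
	Delay        time.Duration // added one-way latency
	LossPercent  float64       // random packet loss rate, 0-100
}

// transferTime estimates how long a bulk transfer of the given size takes over
// the emulated link: serialization time at the capped capacity plus one-way delay.
func (l LinkEmulation) transferTime(megabytes float64) time.Duration {
	seconds := megabytes * 8 / float64(l.CapacityMbps)
	return time.Duration(seconds*float64(time.Second)) + l.Delay
}

func main() {
	// Example: a user requests a 10 Mbps link with 50 ms of delay and 1% loss.
	// The embedding routes the link through an emulation server that applies
	// these values to all traffic traversing it.
	link := LinkEmulation{CapacityMbps: 10, Delay: 50 * time.Millisecond, LossPercent: 1.0}
	fmt.Printf("emulated link: %d Mbps, %v delay, %.1f%% loss\n",
		link.CapacityMbps, link.Delay, link.LossPercent)
	fmt.Printf("moving 100 MB over this link takes about %v\n", link.transferTime(100))
}
```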
### Infrapod Servers
Infrapod servers host "infrapods", which are per-materialization
["pods"](https://docs.podman.io/en/latest/markdown/podman-pod.1.html) that provide services to the
materialization, including DHCP, DNS, and an experiment VPN access point for secure external
access. A separate infrapod is allocated for each materialization hosted by the facility, which
means that these services are confined to the namespace of a single materialization.
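
To make the per-materialization scoping concrete, the sketch below tracks one infrapod per
materialization along with the services it carries. The types, identifiers, and `Allocate` helper
are invented for illustration; Merge actually realizes infrapods as podman pods, as noted above.

```go
// Illustrative only: hypothetical bookkeeping showing the one-infrapod-per-
// materialization relationship. Merge realizes infrapods as podman pods.
package main

import "fmt"

// Infrapod bundles the services scoped to a single materialization.
type Infrapod struct {
	Materialization string   // hypothetical materialization identifier
	Services        []string // DHCP, DNS, VPN access point, ...
}

// InfrapodServer hosts one infrapod for each materialization it serves.
type InfrapodServer struct {
	pods map[string]Infrapod
}

// Allocate returns the infrapod for a materialization, creating it on first use.
func (s *InfrapodServer) Allocate(mzid string) Infrapod {
	if pod, ok := s.pods[mzid]; ok {
		return pod
	}
	pod := Infrapod{Materialization: mzid, Services: []string{"dhcp", "dns", "vpn"}}
	s.pods[mzid] = pod
	return pod
}

func main() {
	srv := &InfrapodServer{pods: map[string]Infrapod{}}
	pod := srv.Allocate("demo-materialization") // hypothetical identifier
	fmt.Printf("infrapod for %s runs: %v\n", pod.Materialization, pod.Services)
}
```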
### Storage Servers
Storage servers host network-attached mass storage for materializations.
### Networks
The canonical facility architecture includes a distinct infrastructure network ("Infranet") and
experiment network ("xpnet"). The infranet and xpnet serve conceptually distinct purposes and are
often deployed in entirely disjoint network fabrics, though it is possible for these networks to
share switches.
The infranet carries testbed services such as DHCP, DNS, node imaging, and mass storage. These
services support efficient testbed operation but are not designed to transit user experiment
traffic, which is instead carried by the xpnet. The xpnet is the underlying substrate over which
experiment network links are embedded to provide the topological connectivity and link performance
characteristics requested by the user.
{{% alert title="Tip" color="info" %}}
There is no required network architecture for the infranet or xpnet. A facility operator only needs
to capture the physical cabling and the capabilities of each switch (e.g., whether it supports
VXLAN) in the [facility model](../model). As long as the model accurately reflects the physical
capabilities and connectivity, Merge will figure out how to embed links that provide infranet and
xpnet connectivity to each materialization.
With that said, we recommend that infrapod servers and storage servers have high-bandwidth
connectivity to the infranet, as node imaging and mass storage are bandwidth-intensive services that
can easily become bottlenecked by a low-bandwidth link to the infranet fabric. The image above thus
depicts these servers as being connected directly to an infranet "spine" switch. A sketch of the
kind of information the facility model records follows this note.
{{% /alert %}}
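
The sketch below illustrates the kind of information the tip refers to: which switches exist,
whether each supports VXLAN, and how servers are cabled to them. All type, switch, and device names
are invented for this example; the actual facility model format is covered in [Modeling](../model).

```go
// Illustrative only: hypothetical types showing the kind of information a
// facility model records about switch capabilities and physical cabling.
// This is not the Merge facility-modeling API.
package main

import "fmt"

// Switch records a fabric switch and a capability Merge cares about when
// deciding how to embed links (e.g., VXLAN support).
type Switch struct {
	Name          string
	SupportsVXLAN bool
}

// Cable records a physical connection between a device port and a switch port.
type Cable struct {
	Device, DevicePort     string
	SwitchName, SwitchPort string
	Gbps                   int
}

func main() {
	fabric := []Switch{
		{Name: "xpnet-spine", SupportsVXLAN: true},
		{Name: "infranet-spine", SupportsVXLAN: false},
	}
	cabling := []Cable{
		// Infrapod and storage servers cabled directly to the infranet spine
		// with high-bandwidth links, as recommended above.
		{Device: "ifr0", DevicePort: "eth1", SwitchName: "infranet-spine", SwitchPort: "swp1", Gbps: 100},
		{Device: "stor0", DevicePort: "eth1", SwitchName: "infranet-spine", SwitchPort: "swp2", Gbps: 100},
		// A testbed node cabled to the experiment fabric.
		{Device: "x0", DevicePort: "eth2", SwitchName: "xpnet-spine", SwitchPort: "swp1", Gbps: 25},
	}
	for _, sw := range fabric {
		fmt.Printf("switch %s vxlan=%v\n", sw.Name, sw.SupportsVXLAN)
	}
	for _, c := range cabling {
		fmt.Printf("%s/%s <-> %s/%s @ %d Gbps\n", c.Device, c.DevicePort, c.SwitchName, c.SwitchPort, c.Gbps)
	}
}
```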
## End-to-end Example