# Teleport Kubernetes Discussion

## Summary
We have been running Teleport using the original proof-of-concept deployment. The design criteria for that deployment were to get it running as quickly as possible and to keep it as simple as possible (one VM) until we outgrew the MVP. We have now outgrown this setup, and it's time to move to something new. There are some decisions to make, so it's time to gather feedback and opinions on how we should do things moving forward.
## More Detail
We started by routing read-only Rails consoles through the Teleport proxy (bastion). Then we added the database consoles and have now migrated most database console users to it. Before we start routing Kubernetes API access through it, we need it to be highly available.
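For context on what routing Kubernetes API access through Teleport involves, the Teleport agent side enables the `kubernetes_service` section in `teleport.yaml`. The sketch below is illustrative only; the cluster name and file paths are assumptions, not our actual config:

```yaml
# teleport.yaml fragment (sketch -- names and paths are assumptions)
kubernetes_service:
  enabled: "yes"
  # The cluster name users would see in `tsh kube ls`
  kube_cluster_name: regional-cluster
  # Kubeconfig granting Teleport access to the target API server
  kubeconfig_file: /etc/teleport/kubeconfig
```

Users would then authenticate through the proxy (`tsh login`, `tsh kube login regional-cluster`) rather than hitting the API server directly, which is why the proxy itself becomes a hard dependency and needs HA first.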
While it would probably be easiest to just create a namespace for it in the regional cluster, we would have to be extremely careful to avoid breaking our own access to manage that cluster. Even with caution, it's usually ill-advised to run a management tool inside the service that it manages (our ops instance exists for this reason). However, there may still be an argument for doing it this way, and we should discuss it.
## Decisions
- Do we run the service in Kubernetes, or on multiple VMs? (Consensus seems to be K8s.)
- Do we build a small "bastion only" cluster for this, with tighter security (similar to how we lock down `sudo` access on our existing bastions)? Or do we run the app as just another namespace in our regional cluster?
- Where do we put the Terraform and Helm code for this? The `config-mgmt` monorepo? Or a separate bastion repo on ops with higher security and isolation?
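Whichever cluster and repo we pick, the Helm side would look roughly the same. A minimal values sketch for Teleport's `teleport-cluster` chart is below; the cluster name and replica count are placeholders, and the exact values keys should be checked against the chart version we pin:

```yaml
# values.yaml for the teleport-cluster Helm chart (sketch; values are assumptions)
clusterName: teleport.example.com
# More than one replica is the point of this exercise: it requires a shared
# backend instead of the single-VM local state we have today
highAvailability:
  replicaCount: 2
```

The chart lives in Teleport's own repo (`helm repo add teleport https://charts.releases.teleport.dev`), so the main code we would own is this values file plus the Terraform for the cluster and load balancer.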
## Architecture Options

### Separate bastion cluster
```mermaid
stateDiagram-v2
    direction LR
    User --> Balancer
    Balancer --> Teleport1
    Balancer --> Teleport2
    state BastionCluster {
        Teleport1 --> Console1
        Teleport1 --> Console2
        Teleport2 --> Console1
        Teleport2 --> Console2
        Teleport1 --> KubeAPI
        Teleport2 --> KubeAPI
    }
    state RegionalCluster {
        state KubeAPI {
            [*]
        }
        state ConsoleNamespace {
            Console1
            Console2
        }
    }
    Teleport1 --> Postgres
    Teleport2 --> Postgres
```
### Teleport in regional cluster
```mermaid
stateDiagram-v2
    direction LR
    User --> Balancer
    Balancer --> Teleport1
    Balancer --> Teleport2
    state RegionalCluster {
        state KubeAPI {
            Teleport1 --> [*]
            Teleport2 --> [*]
        }
        state BastionNamespace {
            Teleport1 --> Console1
            Teleport1 --> Console2
            Teleport2 --> Console1
            Teleport2 --> Console2
        }
        state ConsoleNamespace {
            Console1
            Console2
        }
    }
    Teleport1 --> Postgres
    Teleport2 --> Postgres
```
### Teleport in zonal clusters
```mermaid
stateDiagram-v2
    direction LR
    User --> Balancer
    Balancer --> Teleport1
    Balancer --> Teleport2
    state ZonalCluster1 {
        state KubeAPI1 {
            Teleport1 --> [*]
        }
        state BastionNamespace1 {
            Teleport1 --> Console1
        }
        state ConsoleNamespace1 {
            Console1
        }
    }
    state ZonalCluster2 {
        state KubeAPI2 {
            Teleport2 --> [*]
        }
        state BastionNamespace2 {
            Teleport2 --> Console2
        }
        state ConsoleNamespace2 {
            Console2
        }
    }
    Teleport1 --> Postgres
    Teleport2 --> Postgres
```
### Teleport in Ops Cluster
```mermaid
stateDiagram-v2
    direction LR
    User --> Balancer
    Balancer --> Teleport1
    Balancer --> Teleport2
    state OpsCluster {
        state BastionNamespace1 {
            Teleport1 --> Agent1
            Teleport1 --> Agent2
            Teleport2 --> Agent1
            Teleport2 --> Agent2
        }
    }
    state ZonalCluster1 {
        Agent1 --> KubeAPI1
        Agent1 --> Console1
        state KubeAPI1 {
            [*]
        }
        state ConsoleNamespace1 {
            Console1
        }
    }
    state ZonalCluster2 {
        Agent2 --> KubeAPI2
        Agent2 --> Console2
        state KubeAPI2 {
            [*]
        }
        state ConsoleNamespace2 {
            Console2
        }
    }
    Agent1 --> Postgres
    Agent2 --> Postgres
```
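In this option, Agent1 and Agent2 would be Teleport's `teleport-kube-agent` chart deployed inside each zonal cluster, dialing back to the proxy over a reverse tunnel. A minimal values sketch follows; the proxy address, token, and cluster name are placeholders, not real values:

```yaml
# values.yaml for the teleport-kube-agent chart (sketch; placeholder values)
proxyAddr: teleport.example.com:443
# Join token minted by the Teleport auth service; placeholder only
authToken: "<join-token>"
roles: kube
kubeClusterName: zonal-cluster-1
```

One point in this option's favor: because the agents dial out to the proxy, the zonal API servers need no inbound exposure to the Ops cluster, which keeps the blast radius of a bastion compromise somewhat smaller.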