Amazon EKS Anywhere
EKS Anywhere provides a means of managing Kubernetes clusters using the same operational excellence and practices that Amazon Web Services uses for its Amazon Elastic Kubernetes Service (Amazon EKS). Based on EKS Distro, EKS Anywhere adds methods for deploying, using, and managing Kubernetes clusters that run in your own data centers. Its goal is to include full lifecycle management of multiple Kubernetes clusters that are capable of operating completely independently of any AWS services.
The tenets of the EKS Anywhere project are:
- Simple: Make using a Kubernetes distribution simple and boring (reliable and secure).
- Opinionated Modularity: Provide opinionated defaults about the best components to include with Kubernetes, but give customers the ability to swap them out.
- Open: Provide open source tooling backed, validated, and maintained by Amazon.
- Ubiquitous: Enable customers and partners to integrate a Kubernetes distribution in the most common tooling.
- Stand Alone: Provided for use anywhere without AWS dependencies.
- Better with AWS: Enable AWS customers to easily adopt additional AWS services.
1 - Overview
Provides an overview of EKS Anywhere
EKS Anywhere uses the eksctl executable to create a Kubernetes cluster in your environment.
Currently it allows you to create and delete clusters in a vSphere environment.
You can run cluster create and delete commands from an Ubuntu or Mac administrative machine.
To create a cluster, you need to create a specification file that includes all of your vSphere details and information about your EKS Anywhere cluster.
Running the eksctl anywhere create cluster command from your admin machine creates the workload cluster in vSphere.
It does this by first creating a temporary bootstrap cluster to direct the workload cluster creation.
Once the workload cluster is created, the cluster management resources are moved to your workload cluster and the local bootstrap cluster is deleted.
Once your workload cluster is created, a KUBECONFIG file is stored on your admin machine with RBAC admin permissions for the workload cluster.
You’ll be able to use that file with kubectl to set up and deploy workloads.
For a detailed description, see Cluster creation workflow.
Here’s a diagram that explains the process visually.
[Figure: EKS Anywhere Create Cluster]
Next steps:
2 - Getting started
The Getting started section includes information on starting to set up your own EKS Anywhere local or production environment.
EKS Anywhere can be deployed as a simple, unsupported local environment or as a production-quality environment that can become a supported on-premises Kubernetes platform.
This section lists the different ways to set up and run EKS Anywhere.
When you install EKS Anywhere, choose an installation type based on: ease of maintenance, security, control, available resources, and expertise required to operate and manage a cluster.
Install EKS Anywhere
To create an EKS Anywhere cluster you’ll need to download the command line tool that is used to create and manage a cluster.
You can install it using the installation guide.
Local environment
If you just want to try out EKS Anywhere, there is a single-system method for installing and running EKS Anywhere using Docker.
See EKS Anywhere local environment.
Production environment
When evaluating a solution for a production environment, consider deploying EKS Anywhere on Bare Metal or vSphere.
2.1 - Install EKS Anywhere
EKS Anywhere will create and manage Kubernetes clusters on multiple providers.
Currently we support creating development clusters locally using Docker and production clusters using Bare Metal or VMware vSphere.
Creating an EKS Anywhere cluster begins with setting up an Administrative machine where you will run Docker and add some binaries.
From there, you create the cluster for your chosen provider.
See Create cluster workflow for an overview of the cluster creation process.
To create an EKS Anywhere cluster you will need eksctl and the eksctl-anywhere plugin.
This will let you create a cluster in multiple providers for local development or production workloads.
Administrative machine prerequisites
Via Homebrew (macOS and Linux)
Warning
EKS Anywhere only works on computers with x86 and amd64 processor architectures.
It currently will not work on computers with Apple Silicon or Arm-based processors.
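The warning above can be turned into a quick pre-install check. This is a minimal sketch, not part of EKS Anywhere: the function name is_supported_arch is hypothetical, and it assumes that `uname -m` reports x86_64 or amd64 on supported machines and arm64 or aarch64 on Apple Silicon and other Arm systems.

```shell
# Hypothetical helper (not part of EKS Anywhere): succeed only for the
# 64-bit x86 names that `uname -m` commonly reports.
is_supported_arch() {
  case "$1" in
    x86_64|amd64) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the current machine before installing anything.
if is_supported_arch "$(uname -m)"; then
  echo "architecture OK: $(uname -m)"
else
  echo "unsupported architecture: $(uname -m)" >&2
fi
```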
You can install eksctl and eksctl-anywhere with Homebrew.
This package will also install kubectl and the aws-iam-authenticator, which will be helpful to test EKS Anywhere clusters.
brew install aws/tap/eks-anywhere
Manually (macOS and Linux)
Install the latest release of eksctl.
The EKS Anywhere plugin requires eksctl version 0.66.0 or newer.
curl "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" \
--silent --location \
| tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin/
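Once eksctl is installed, the 0.66.0 minimum can be checked with a small version comparison. The sketch below is an illustration only: version_ge is a hypothetical helper, it handles purely numeric dotted versions, and the commented usage line assumes that `eksctl version` prints a bare version string such as 0.66.0.

```shell
# version_ge A B: succeed when dotted numeric version A is >= B.
# The body runs in a subshell, so `exit` only sets the function's status.
version_ge() (
  a=$1 b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}; y=${b%%.*}          # leading field of each version
    [ "${x:-0}" -gt "${y:-0}" ] && exit 0
    [ "${x:-0}" -lt "${y:-0}" ] && exit 1
    # Drop the field just compared; an exhausted version counts as 0.
    case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
    case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  exit 0
)

# Assumed usage (requires eksctl on the PATH):
#   version_ge "$(eksctl version)" 0.66.0 || echo "eksctl is too old" >&2
```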
Install the eksctl-anywhere plugin.
export EKSA_RELEASE="0.10.0" OS="$(uname -s | tr A-Z a-z)" RELEASE_NUMBER=14
curl "https://anywhere-assets.eks.amazonaws.com/releases/eks-a/${RELEASE_NUMBER}/artifacts/eks-a/v${EKSA_RELEASE}/${OS}/amd64/eksctl-anywhere-v${EKSA_RELEASE}-${OS}-amd64.tar.gz" \
--silent --location \
| tar xz ./eksctl-anywhere
sudo mv ./eksctl-anywhere /usr/local/bin/
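To make it clearer how the three variables in the command above combine into the download location, here is a small sketch that only builds and prints the URL. The function names eksa_os and eksa_download_url are hypothetical; the URL layout is copied from the curl command above and is not otherwise verified.

```shell
# eksa_os: lower-case kernel name, as in the OS variable above
# (for example "linux" or "darwin").
eksa_os() {
  uname -s | tr 'A-Z' 'a-z'
}

# eksa_download_url EKSA_RELEASE OS RELEASE_NUMBER: print the artifact URL.
eksa_download_url() {
  printf 'https://anywhere-assets.eks.amazonaws.com/releases/eks-a/%s/artifacts/eks-a/v%s/%s/amd64/eksctl-anywhere-v%s-%s-amd64.tar.gz\n' \
    "$3" "$1" "$2" "$1" "$2"
}

# Print the URL for this machine using the release values shown above.
eksa_download_url "0.10.0" "$(eksa_os)" 14
```

Printing the URL first makes it easy to confirm the release values before piping the download into tar.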
Upgrade eksctl-anywhere
If you installed eksctl-anywhere via Homebrew, you can upgrade the binary with:
brew update
brew upgrade eks-anywhere
If you installed eksctl-anywhere manually, follow the installation steps to download the latest release.
You can verify your installed version with:
eksctl anywhere version
Deploy a cluster
Once you have the tools installed you can deploy a local cluster or production cluster in the next steps.
2.2 - Create local cluster
EKS Anywhere Docker provider deployments
EKS Anywhere supports a Docker provider for development and testing use cases only.
This allows you to try EKS Anywhere on your local system before deploying to a supported provider.
To install the EKS Anywhere binaries and see system requirements, please follow the installation guide.
Steps
- Generate a cluster config
CLUSTER_NAME=dev-cluster
eksctl anywhere generate clusterconfig $CLUSTER_NAME \
--provider docker > $CLUSTER_NAME.yaml
The command above creates a file named $CLUSTER_NAME.yaml (dev-cluster.yaml in this example) with the contents below in the path where it is executed.
The configuration specification is divided into two sections:
- Cluster
- DockerDatacenterConfig
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev-cluster
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
  datacenterRef:
    kind: DockerDatacenterConfig
    name: dev-cluster
  externalEtcdConfiguration:
    count: 1
  kubernetesVersion: "1.21"
  managementCluster:
    name: dev-cluster
  workerNodeGroupConfigurations:
  - count: 1
    name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: DockerDatacenterConfig
metadata:
  name: dev-cluster
spec: {}
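As a quick sanity check that a generated file contains the two sections described above, you could list the kind of each YAML document in it. This is a rough sketch: config_kinds is a hypothetical helper, and it relies on the file being indented as generated, so that only top-level kind: lines start in column one.

```shell
# config_kinds FILE: print the `kind:` of each top-level YAML document.
# Nested fields such as datacenterRef.kind are indented and therefore skipped.
config_kinds() {
  awk '/^kind: / { print $2 }' "$1"
}

# Assumed usage with the file generated in the previous step:
#   config_kinds "$CLUSTER_NAME.yaml"
# which, for the Docker example, should list Cluster and DockerDatacenterConfig.
```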
- Apart from the base configuration, you can add additional optional configuration to enable supported features:
- Create Cluster: Create your cluster either with or without curated packages:
- Cluster creation without curated packages installation
eksctl anywhere create cluster -f $CLUSTER_NAME.yaml
Example command output
Performing setup and validations
✅ validation succeeded {"validation": "docker Provider setup is valid"}
Creating new bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific setup
Creating new workload cluster
Installing networking on workload cluster
Installing cluster-api providers on workload cluster
Moving cluster management from bootstrap to workload cluster
Installing EKS-A custom components (CRD and controller) on workload cluster
Creating EKS-A CRDs instances on workload cluster
Installing AddonManager and GitOps Toolkit on workload cluster
GitOps field not specified, bootstrap flux skipped
Deleting bootstrap cluster
🎉 Cluster created!
- Cluster creation with optional curated packages
Note
- It is optional to install curated packages as part of the cluster creation.
- The eksctl anywhere version should be later than v0.9.0.
- If including curated packages during cluster creation, please set the environment variable:
export CURATED_PACKAGES_SUPPORT=true
- Post-creation installation and detailed package configurations can be found here.
- Discover curated-packages to install
eksctl anywhere list packages --source registry --kube-version 1.21
Example command output
Package Version(s)
------- ----------
harbor 2.5.0-4324383d8c5383bded5f7378efb98b4d50af827b
- Generate a curated-packages config
The example shows how to install the harbor package from the curated package list.
eksctl anywhere generate package harbor --source registry --kube-version 1.21 > packages.yaml
- Create a cluster
# Create a cluster with curated packages installation
eksctl anywhere create cluster -f $CLUSTER_NAME.yaml --install-packages packages.yaml
Example command output
Performing setup and validations
✅ validation succeeded {"validation": "docker Provider setup is valid"}
Creating new bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific setup
Creating new workload cluster
Installing networking on workload cluster
Installing cluster-api providers on workload cluster
Moving cluster management from bootstrap to workload cluster
Installing EKS-A custom components (CRD and controller) on workload cluster
Creating EKS-A CRDs instances on workload cluster
Installing AddonManager and GitOps Toolkit on workload cluster
GitOps field not specified, bootstrap flux skipped
Deleting bootstrap cluster
🎉 Cluster created!
----------------------------------------------------------------------------------------------------------------
The EKS Anywhere package controller and the EKS Anywhere Curated Packages
(referred to as “features”) are provided as “preview features” subject to the AWS Service Terms,
(including Section 2 (Betas and Previews)) of the same. During the EKS Anywhere Curated Packages Public Preview,
the AWS Service Terms are extended to provide customers access to these features free of charge.
These features will be subject to a service charge and fee structure at ”General Availability“ of the features.
----------------------------------------------------------------------------------------------------------------
Installing curated packages controller on workload cluster
package.packages.eks.amazonaws.com/my-harbor created
- Use the cluster
Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl get ns
Example command output
NAME STATUS AGE
capd-system Active 21m
capi-kubeadm-bootstrap-system Active 21m
capi-kubeadm-control-plane-system Active 21m
capi-system Active 21m
capi-webhook-system Active 21m
cert-manager Active 22m
default Active 23m
eksa-system Active 20m
kube-node-lease Active 23m
kube-public Active 23m
kube-system Active 23m
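The KUBECONFIG path used above follows a fixed pattern based on the cluster name. The sketch below wraps that pattern in a function so that it can be reused for other clusters; kubeconfig_path is a hypothetical helper, and the directory layout is taken from the export command above.

```shell
# kubeconfig_path CLUSTER_NAME [BASE_DIR]: print the kubeconfig location
# that cluster creation writes under BASE_DIR (default: current directory).
kubeconfig_path() {
  printf '%s/%s/%s-eks-a-cluster.kubeconfig\n' "${2:-$PWD}" "$1" "$1"
}

# Assumed usage:
#   export KUBECONFIG="$(kubeconfig_path "$CLUSTER_NAME")"
#   [ -f "$KUBECONFIG" ] || echo "kubeconfig not found: $KUBECONFIG" >&2
```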
You can now use the cluster like you would any Kubernetes cluster.
Deploy the test application with:
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
Verify the test application in the deploy test application section.
Next steps:
- See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.
- See the Package management section for more information on post-creation curated packages installation.
2.3 - Create production cluster
EKS Anywhere allows you to provision and manage Amazon EKS on your own infrastructure.
To get started with different production-quality EKS Anywhere providers, choose from the providers below:
2.3.1 - Create Bare Metal production cluster
EKS Anywhere supports a Bare Metal provider for production grade EKS Anywhere deployments.
EKS Anywhere allows you to provision and manage Kubernetes clusters based on Amazon EKS software on your own infrastructure.
This document walks you through setting up EKS Anywhere as a self-managed cluster.
It does not yet support the concept of a separate management cluster for managing one or more workload clusters.
Prerequisite checklist
EKS Anywhere needs:
Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.
Steps
The following steps are needed to create a self-managed Bare Metal EKS Anywhere cluster.
Create the cluster
Follow these steps to create an EKS Anywhere cluster.
- Set an environment variable for your cluster name
- Generate a cluster config file for your Bare Metal provider (using tinkerbell as the provider type).
eksctl anywhere generate clusterconfig $CLUSTER_NAME --provider tinkerbell > eksa-mgmt-cluster.yaml
- Modify the cluster config (eksa-mgmt-cluster.yaml) by referring to the Bare Metal configuration reference documentation.
- Set License Environment Variable
If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):
export EKSA_LICENSE='my-license-here'
After you have created your eksa-mgmt-cluster.yaml and set your credential environment variables, you will be ready to create the cluster.
- Create the cluster, using the hardware.csv file you made in Bare Metal preparation, either with or without curated packages:
- Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
- Check the cluster nodes:
To check that cluster creation completed, list the machines to see the control plane and worker nodes:
Example command output:
NAMESPACE NAME PROVIDERID PHASE VERSION
eksa-system mgmt-b2xyz tinkerbell:/xxxxx Running v1.21.2-eks-1-21-5
eksa-system mgmt-md-8-6xr-rnr tinkerbell:/xxxxx Running v1.21.2-eks-1-21-5
...
- Check the cluster:
You can now use the cluster as you would any Kubernetes cluster.
To try it out, run the test application with:
export CLUSTER_NAME=mgmt
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
Verify the test application in Deploy test workload.
Next steps:
- See the Cluster management section for more information on common operational tasks like deleting the cluster.
- See the Package management section for more information on post-creation curated packages installation.
2.3.2 - Create vSphere production cluster
EKS Anywhere supports a VMware vSphere provider for production grade EKS Anywhere deployments.
This document walks you through setting up EKS Anywhere on vSphere in a way that:
- Deploys an initial cluster on your vSphere environment. That cluster can be used as a self-managed cluster (to run workloads) or a management cluster (to create and manage other clusters)
- Deploys zero or more workload clusters from the management cluster
If your initial cluster is a management cluster, it is intended to stay in place so you can use it later to modify, upgrade, and delete workload clusters.
Using a management cluster makes it faster to provision and delete workload clusters.
Also it lets you keep vSphere credentials for a set of clusters in one place: on the management cluster.
The alternative is to simply use your initial cluster to run workloads.
Important
Creating an EKS Anywhere management cluster is the recommended model.
Separating management features into a separate, persistent management cluster
provides a cleaner model for managing the lifecycle of workload clusters (to create, upgrade, and delete clusters), while workload clusters run user applications.
This approach also reduces provider permissions for workload clusters.
Prerequisite Checklist
EKS Anywhere needs to:
Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.
Steps
The following steps are divided into two sections:
- Create an initial cluster (used as a management or self-managed cluster)
- Create zero or more workload clusters from the management cluster
Create an initial cluster
Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a self-managed cluster (for running workloads itself).
- Generate an initial cluster config (named mgmt for this example):
CLUSTER_NAME=mgmt
eksctl anywhere generate clusterconfig $CLUSTER_NAME \
--provider vsphere > eksa-mgmt-cluster.yaml
- Modify the initial cluster config (eksa-mgmt-cluster.yaml) as follows:
- Refer to vsphere configuration for information on configuring this cluster config for a vSphere provider.
- Add Optional configuration settings as needed.
- Create at least two control plane nodes, three worker nodes, and three etcd nodes for a production cluster, to provide high availability and rolling upgrades.
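The node-count guidance above can be expressed as a simple check to run against the counts you plan to put in the cluster config. This is only a sketch: check_ha_counts is a hypothetical helper that takes the three counts as arguments and does not read them from the YAML file.

```shell
# check_ha_counts CONTROL_PLANE WORKERS ETCD: succeed when the counts meet
# the production guidance of at least 2 control plane, 3 worker, 3 etcd nodes.
check_ha_counts() {
  [ "$1" -ge 2 ] && [ "$2" -ge 3 ] && [ "$3" -ge 3 ]
}

# Example: the single-node counts from a default development config fail.
if check_ha_counts 1 1 1; then
  echo "counts meet the production guidance"
else
  echo "counts are below the production guidance" >&2
fi
```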
- Set Credential Environment Variables
Before you create the initial cluster, you will need to set and export these environment variables for your vSphere user name and password.
Make sure you use single quotes around the values so that your shell does not interpret the values:
export EKSA_VSPHERE_USERNAME='billy'
export EKSA_VSPHERE_PASSWORD='t0p$ecret'
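The following sketch illustrates why the single quotes matter and how you might confirm that both variables are set before creating the cluster. The helper require_env is hypothetical, and the variable ecret exists only to show what the shell would do with the example password inside double quotes.

```shell
# require_env NAME...: fail if any named variable is unset or empty.
# Uses eval for the indirect lookup, so pass only trusted, literal names.
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}

# Quoting demo: inside double quotes the shell expands $ecret (empty here),
# silently truncating the password; single quotes keep it literal.
ecret=''
literal='t0p$ecret'
expanded="t0p$ecret"
echo "single quotes: $literal"
echo "double quotes: $expanded"

# Assumed usage before running the create command:
#   require_env EKSA_VSPHERE_USERNAME EKSA_VSPHERE_PASSWORD || exit 1
```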
- Set License Environment Variable
If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):
export EKSA_LICENSE='my-license-here'
After you have created your eksa-mgmt-cluster.yaml and set your credential environment variables, you will be ready to create the cluster.
- Create initial cluster: Create your initial cluster either with or without curated packages:
- Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
- Check the cluster nodes:
To check that cluster creation completed, list the machines to see the control plane, etcd, and worker nodes:
Example command output
NAMESPACE NAME PROVIDERID PHASE VERSION
eksa-system mgmt-b2xyz vsphere:/xxxxx Running v1.21.2-eks-1-21-5
eksa-system mgmt-etcd-r9b42 vsphere:/xxxxx Running
eksa-system mgmt-md-8-6xr-rnr vsphere:/xxxxx Running v1.21.2-eks-1-21-5
...
The etcd machine doesn’t show the Kubernetes version because it doesn’t run the kubelet service.
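If you want to script this check, one option is to confirm that every listed machine reports the Running phase. The sketch below is an assumption-laden illustration: all_machines_running is a hypothetical helper, it expects the five-column layout shown in the example output (PHASE in the fourth column), and the commented usage assumes the listing comes from `kubectl get machines -A`.

```shell
# all_machines_running: read a machine listing on stdin and succeed only if
# at least one machine is listed and every PHASE (column 4) is "Running".
all_machines_running() {
  awk 'NR > 1 { n++; if ($4 != "Running") bad = 1 }
       END { exit (bad || n == 0) ? 1 : 0 }'
}

# Assumed usage:
#   kubectl get machines -A | all_machines_running && echo "all machines Running"
```

Because the etcd row has no VERSION value, the check deliberately looks only at the PHASE column.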
- Check the initial cluster’s custom resource:
To ensure you are looking at the initial cluster, display its Cluster resource to see that the name of its management cluster is itself:
kubectl get clusters mgmt -o yaml
Example command output
...
kubernetesVersion: "1.21"
managementCluster:
name: mgmt
workerNodeGroupConfigurations:
...
Note
The initial cluster is now ready to deploy workload clusters.
However, if you just want to use it to run workloads, you can deploy pod workloads directly on the initial cluster without deploying a separate workload cluster and skip the section on running separate workload clusters.
To make sure the cluster is ready to run workloads, run the test application in the Deploy test workload section.
Create separate workload clusters
Follow these steps if you want to use your initial cluster to create and manage separate workload clusters.
- Generate a workload cluster config:
CLUSTER_NAME=w01
eksctl anywhere generate clusterconfig $CLUSTER_NAME \
--provider vsphere > eksa-w01-cluster.yaml
Refer to the initial config described earlier for the required and optional settings.
The main differences are that you must have a new cluster name and cannot use the same vSphere resources.
- Create a workload cluster
To create a new workload cluster from your management cluster run this command, identifying:
- The workload cluster YAML file
- The initial cluster’s credentials (this causes the workload cluster to be managed from the management cluster)
# Create a cluster without curated packages installation
eksctl anywhere create cluster \
-f eksa-w01-cluster.yaml \
--kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.
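Because the command depends on two files, it can help to confirm that both exist before starting a long-running create. The sketch below only prints the command it would run; workload_create_cmd is a hypothetical helper, and the flags are the ones shown in the command above.

```shell
# workload_create_cmd CONFIG_FILE MGMT_KUBECONFIG: verify both files exist,
# then print (not run) the create command for the workload cluster.
workload_create_cmd() {
  [ -f "$1" ] || { echo "cluster config not found: $1" >&2; return 1; }
  [ -f "$2" ] || { echo "management kubeconfig not found: $2" >&2; return 1; }
  echo "eksctl anywhere create cluster -f $1 --kubeconfig $2"
}

# Assumed usage:
#   workload_create_cmd eksa-w01-cluster.yaml mgmt/mgmt-eks-a-cluster.kubeconfig
```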
- Check the workload cluster:
You can now use the workload cluster as you would any Kubernetes cluster.
Change your credentials to point to the new workload cluster (for example, w01), then run the test application with:
export CLUSTER_NAME=w01
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
Verify the test application in the deploy test application section.
- Add more workload clusters:
To add more workload clusters, go through the same steps for creating the initial workload cluster, copying the config file to a new name (such as eksa-w02-cluster.yaml), modifying resource names, and running the create cluster command again.
Next steps:
- See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.
- See the Package management section for more information on post-creation curated packages installation.
3 - Concepts
The Concepts section will describe the components and overall architecture of EKS Anywhere.
Most of the content of this section will cover how EKS Anywhere deploys, upgrades and otherwise manages Kubernetes clusters. It will point to Kubernetes documentation for specifics on how Kubernetes itself works.
3.1 - Compare EKS Anywhere and EKS
Comparing Amazon EKS Anywhere features to Amazon EKS
Amazon EKS Anywhere is a new deployment option for Amazon EKS that enables you to easily create and operate Kubernetes clusters on-premises.
EKS Anywhere provides an installable software package for creating and operating Kubernetes clusters on-premises and automation tooling for cluster lifecycle support.
To learn more, see EKS Anywhere.
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service that makes it easy for you to run Kubernetes on the AWS cloud.
Amazon EKS is certified Kubernetes conformant, so existing applications that run on upstream Kubernetes are compatible with Amazon EKS.
To learn more about Amazon EKS, see Amazon Elastic Kubernetes Service.
Comparing Amazon EKS Anywhere to Amazon EKS
Feature | Amazon EKS Anywhere | Amazon EKS
Control plane | |
K8s control plane management | Managed by customer | Managed by AWS
K8s control plane location | Customer’s datacenter | AWS cloud
Cluster updates | Manual CLI updates for control plane. Flux supported rolling updates for data plane | Managed in-place updates for control plane and managed rolling updates for data plane.
Compute | |
Compute options | VMware vSphere, Bare Metal servers | Amazon EC2, AWS Fargate
Supported node operating systems | Bottlerocket, Ubuntu | Amazon Linux 2, Windows Server, Bottlerocket, Ubuntu
Physical hardware (servers, network equipment, storage, etc.) | Managed by customer | Managed by AWS
Serverless | Not supported | Amazon EKS on AWS Fargate
Management | |
Command line interface (CLI) | eksctl (OSS command line tool) | eksctl (OSS command line tool)
Console view for Kubernetes objects | Optional EKS console connection using EKS Connector (public preview) | Native EKS console connection
Infrastructure-as-code | Cluster manifest, Kubernetes controllers, 3rd-party solutions | AWS CloudFormation, 3rd-party solutions
Logging and monitoring | 3rd-party solutions | CloudWatch, CloudTrail, 3rd-party solutions
GitOps | Flux controller | Flux controller
Functions and tooling | |
Networking and Security | Cilium CNI and network policy supported | Amazon VPC CNI supported. Calico supported for network policy. Other compatible 3rd-party CNI plugins available.
Load balancer | | Elastic Load Balancing including Application Load Balancer (ALB), and Network Load Balancer (NLB)
Service mesh | Community or 3rd-party solutions | AWS App Mesh, community, or 3rd-party solutions
Community tools and Helm | Works with compatible community tooling and helm charts. | Works with compatible community tooling and helm charts.
Pricing and support | |
Control plane pricing | Free to download, paid support subscription option | Hourly pricing per cluster
AWS Support | Additional annual subscription (per cluster) for AWS support | Basic support included. Included in paid AWS support plans (developer, business, and enterprise)
3.2 - Cluster creation workflow
Explanation of the process of creating an EKS Anywhere cluster
The EKS Anywhere cluster creation process makes it easy not only to bring up a cluster initially, but also to update configuration settings and to upgrade Kubernetes versions going forward.
The EKS Anywhere cluster versions match the Kubernetes distribution versions that are used in the Amazon EKS cloud service.
Each EKS Anywhere cluster is built from a cluster specification file, with the structure of the configuration file based on the target provider for the cluster.
Currently, VMware vSphere is the recommended provider for supported EKS Anywhere clusters.
So, vSphere is the example provider we step through here.
This document provides an in-depth description of the process of creating an EKS Anywhere cluster.
It starts by describing the components to put in place before creating the cluster.
Then it shows you what happens at each step of the process.
After that, the document describes the attributes of the resulting cluster.
Before cluster creation
Some assets need to be in place before you can create an EKS Anywhere cluster.
You need to have an Administrative machine that includes the tools required to create the cluster.
Next, you need to get the software tools and artifacts used to build the cluster.
Then you also need to prepare the provider, in this case a vCenter environment, on which to create the resulting cluster.
Administrative machine
The Administrative machine is needed to provide:
- A place to run the commands to create and manage the workload cluster.
- A Docker container runtime to run a temporary, local bootstrap cluster that creates the resulting workload cluster.
- A place to hold the kubeconfig file needed to perform administrative actions using kubectl. (The kubeconfig file is stored in the root of the folder created during cluster creation.)
The Administrative machine can be any computer (such as your local laptop) with a supported operating system that meets the requirements.
It must also have Internet access to the places where the command line tools and EKS Anywhere artifacts are made available.
Likewise, the Administrative machine must be able to reach and have access to the provider (vSphere).
See the Install EKS Anywhere guide for Administrative machine requirements.
EKS Anywhere software
To obtain EKS Anywhere software, you need Internet access to the repositories holding that software.
EKS Anywhere does not currently support the use of private registries and repositories for the software that it needs to draw on during cluster creation.
EKS Anywhere software is divided into two types of components: the CLI for managing clusters, and the cluster components and controllers used to run workloads and configure clusters.
The software you need to obtain includes:
The sites to which the administrative machine and the target workload environment need access are listed in the Requirements section.
If you are operating behind a firewall that limits access to the Internet, you can configure EKS Anywhere to identify the location of the proxy service you choose to connect to the Internet.
For more information on the software used in EKS Distro, which includes the Kubernetes release and related software in EKS Anywhere, see the EKS Distro Releases GitHub page.
For information on the Ubuntu and Bottlerocket operating systems used with EKS Anywhere, see the EKS Anywhere Artifacts page.
Providers
EKS Anywhere uses an infrastructure provider model for creating, upgrading, and managing Kubernetes clusters that leverages the Kubernetes Cluster API project.
The first supported EKS Anywhere provider, VMware vSphere, is implemented based on the Kubernetes Cluster API Provider vsphere (CAPV) specifications.
Like Cluster API, EKS Anywhere runs a kind cluster on the local Administrative machine to act as a bootstrap cluster.
However, instead of using CAPI directly with the clusterctl command to manage the workload cluster, you use the eksctl anywhere command, which abstracts that process for you, including calling clusterctl under the covers.
As for other providers, the EKS Anywhere project documents the Cluster API Provider Docker (CAPD), but doesn’t support it for production use.
Expect other providers to be supported for EKS Anywhere in the future.
If you are interested in EKS Anywhere supporting a different provider, feel free to create an issue on GitHub for consideration.
With your Administrative machine in place, to prepare the vSphere provider for EKS Anywhere you need to make sure your vSphere environment meets the EKS Anywhere requirements and that permissions are set up properly.
If you don’t want to use the default OVA images, you can import the OVAs representing the operating systems and Kubernetes releases you want.
Creating a cluster
With the provider (vSphere) prepared and the Administrative machine set up to run Docker and the required binaries, you can create an EKS Anywhere cluster.
This section steps through an example of an EKS Anywhere cluster being created on a vSphere provider.
Once you understand this process, you can use the following documentation to create your own cluster:
Starting the process
To start, the eksctl anywhere command is used to generate a cluster config file, which you can then modify and use to create the cluster.
The following diagram illustrates what happens when you start the cluster creation process:

1. Generate an EKS Anywhere config file
When you run eksctl anywhere generate clusterconfig, the two pieces of information you provide are the name of the cluster ($CLUSTER_NAME) and the type of provider (-p vsphere, in this example).
Then you can direct the yaml cluster config output into a file (> $CLUSTER_NAME.yaml). For example:
eksctl anywhere generate clusterconfig $CLUSTER_NAME -p vsphere > $CLUSTER_NAME.yaml
The provider is important because the type of cluster config created is based on the provider.
The docker provider is the only other (although unsupported for production use) provider documented with EKS Anywhere.
The result of this command is a config file template that you need to modify for the specific instance of your provider.
2. Modify the EKS Anywhere config file
Using the generated cluster config file, make modifications to suit your situation.
Details about this config file are contained in the vSphere Config.
There are several things to consider when modifying the cluster config file:
Pay particular attention to which settings are optional and which are required.
Also, not all properties can be upgraded, so it is important to get those settings right at cluster creation.
See supported cluster properties, related to GitOps and eksctl anywhere upgrade methods of cluster upgrades, for information on which properties can be modified after initial cluster creation.
3. Launch the cluster creation
Once you have modified the cluster configuration file, use eksctl anywhere create cluster -f $CLUSTER_NAME.yaml as described in the production environment section to start the cluster creation process.
To see details on the cluster creation process, you can increase the verbosity (-v=9 provides maximum verbosity).
4. Authenticate and create bootstrap cluster
After authenticating to vSphere and validating the assets there, the cluster creation process starts off creating a temporary Kubernetes bootstrap cluster on the Administrative machine.
If you are watching the output of eksctl
anywhere create cluster for those steps, you should see something similar to the following:
To begin, the cluster creation process runs a series of govc
commands to check on the vSphere environment.
First, it checks that the vSphere environment is available:
Performing setup and validations
✅ Connected to server
Using the URL and credentials provided in the cluster spec files, it authenticates to the vSphere provider:
✅ Authenticated to vSphere
It validates the datacenter exists:
✅ Datacenter validated
It validates that the datacenter network exists:
✅ Network validated
It validates that the identified datastore (to store your EKS Anywhere cluster) exists, that the folder holding your EKS Anywhere cluster VMs exists, and that the resource pools containing compute resources exist.
If you have multiple VSphereMachineConfig
objects in your config file, you will see these validations repeated:
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
It validates the virtual machine templates to be used for the control plane and worker nodes (such as ubuntu-2004-kube-v1.20.7
):
✅ Control plane and Workload templates validated
If all those validations passed, you will see this message:
✅ Vsphere Provider setup is valid
Next, the process runs the kind
command to build a single-node Kubernetes bootstrap cluster on the Administrative machine.
This includes pulling the kind node image, preparing the node, writing the configuration, starting the control-plane, installing CNI, and installing the StorageClass. You will see:
Creating new bootstrap cluster
After this point the bootstrap cluster is installed, but not yet fully configured.
Continuing cluster creation
If all goes well, the cluster should be created from the eksctl anywhere create cluster
command and the config file you provided without any further actions from you.
The following diagram illustrates the activities that occur next:

1. Add CAPI management
Cluster API (CAPI) management is added to the bootstrap cluster to direct the creation of the workload cluster.
2. Set up cluster
Configure the control plane and worker nodes.
3. Add Cilium networking
Add Cilium as the CNI plugin to use for networking between the cluster services and pods.
4. Add storage
Add the default storage class to the cluster.
5. Add CAPI to workload cluster
Add the CAPI service to the workload cluster in preparation for it to take over management of the cluster after the cluster creation is completed and the bootstrap cluster is deleted.
The bootstrap cluster can then begin moving the CAPI objects over to the workload cluster, so it can take over the management of itself.
The following text continues to follow along with the output from eksctl anywhere create cluster as just described.
Installs the CAPI service on the bootstrap node:
Installing cluster-api providers on bootstrap cluster
Performs provider-specific setup for core components.
For the default configuration, you should see these providers: etcdadm-bootstrap, etcdadm-controller, control-plane-kubeadm, and infrastructure-vsphere. This step also sets up cert-manager.
The CAPI controller-manager is also configured:
Provider specific setup
With the bootstrap cluster running and configured on the Administrative machine, the creation of the workload cluster begins.
It uses kubectl
to apply a workload cluster configuration.
Then it waits for etcd, the control plane, and the worker nodes to be ready:
Creating new workload cluster
Once etcd, the control plane, and the worker nodes are ready, it applies the networking configuration to the workload cluster:
Installing networking on workload cluster
Next, the default storage class is installed on the workload cluster:
Installing storage class on workload cluster
After that, the CAPI providers are configured on the workload cluster, in preparation for the workload cluster to take over responsibilities for running the components needed to manage itself:
Installing cluster-api providers on workload cluster
With CAPI running on the workload cluster, CAPI objects for the workload cluster are moved from the bootstrap cluster to the workload cluster’s CAPI service (done internally with the clusterctl
command):
Moving cluster management from bootstrap to workload cluster
At this point, the cluster creation process will add Kubernetes CRDs and other addons that are specific to EKS Anywhere.
That configuration is applied directly to the cluster:
Installing EKS-A custom components (CRD and controller) on workload cluster
Creating EKS-A CRDs instances on workload cluster
Installing AddonManager and GitOps Toolkit on workload cluster
If you did not specify GitOps support, starting the flux service is skipped:
GitOps field not specified, bootstrap flux skipped
The cluster configuration is saved:
Writing cluster config file
With the workload cluster up, and the CAPI service running on the workload cluster, the bootstrap cluster is no longer needed and is deleted:

Deleting bootstrap cluster
Cluster creation is complete:
🎉 Cluster created!
At this point, the workload cluster is ready to use, both to run workloads and to accept requests to change, update, or upgrade the cluster itself.
You can continue to use eksctl
anywhere to manage your cluster, with EKS Anywhere handling the fact that CAPI management is now being fulfilled from the workload cluster instead of the bootstrap cluster.
After cluster creation
With the EKS Anywhere cluster up and running, you might be interested to know how your cluster is set up and what it is composed of.
The following sections describe different aspects of an EKS Anywhere cluster on a vSphere provider and what you should know about them going forward.
See Add integrations
for information on example third-party tools for adding features to EKS Anywhere.
Networking
Networking features of your EKS Anywhere cluster start with how virtual machines in the EKS-A cluster in vSphere are set up.
The current state of networking on the vSphere node level includes the following:
- DHCP: EKS Anywhere requires that a DHCP server be available to the control plane and worker nodes in vSphere for them to obtain their IP addresses.
There is currently no support for static IP addresses or multi-network clusters.
All control plane and worker nodes are on the same network.
- CAPI endpoint: A static IP address should have been assigned to the control plane configuration endpoint, to provide access to the Cluster API.
It should have been set up to not conflict with any other node IP addresses in the cluster.
This is a specific requirement of CAPI, not EKS Anywhere.
- Proxy server: If a proxy server
was identified to the EKS Anywhere workload cluster, that server should have inbound access from the cluster nodes and outbound access to the internet.
Networking for the cluster itself has the following attributes:
- Cilium CNI: The Cilium
Kubernetes CNI is used to provide networking between components of the control plane and data plane components.
No other CNI plugins, including Cilium Enterprise, are supported at this time.
- Pod/Service IP ranges: Separate IP address blocks were assigned from the configuration file
during cluster creation for the Pods network and Services network managed by Cilium.
Refer to the clusterNetwork section of your configuration file to see how the cidrBlocks for pods and services were set.
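When reviewing the cidrBlocks values, one sanity check you might run is that the Pods and Services blocks do not overlap. The sketch below assumes bash and IPv4 blocks; the function names are hypothetical and the CIDR values in the usage note are illustrative, not defaults taken from this document.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first address of a CIDR block as an integer.
cidr_start() {
  local size=$(( 1 << (32 - ${1#*/}) ))
  echo $(( $(ip_to_int "${1%/*}") & ~(size - 1) ))
}

# Print the last address of a CIDR block as an integer.
cidr_end() {
  local size=$(( 1 << (32 - ${1#*/}) ))
  echo $(( $(cidr_start "$1") + size - 1 ))
}

# Return 0 if the two CIDR blocks share at least one address.
cidrs_overlap() {
  [ "$(cidr_start "$1")" -le "$(cidr_end "$2")" ] &&
    [ "$(cidr_start "$2")" -le "$(cidr_end "$1")" ]
}
```

For instance, cidrs_overlap 192.168.0.0/16 10.96.0.0/12 returns non-zero because the two example blocks are disjoint.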
Networking setups for accessing cluster resources on your running EKS Anywhere cluster include the following documented features:
- Load balancers: You can add external load balancers to your EKS Anywhere cluster. The EKS Anywhere project documents how to configure KubeVip
and MetalLB
.
- Ingress controller: You can add a Kubernetes ingress controller to EKS Anywhere.
The project documents the use of the Emissary-ingress
ingress controller.
Operating systems
The Ubuntu or Mac operating system representing the Administrative machine can continue to use the binaries to manage the EKS Anywhere cluster.
You may need to update those binaries
(kubectl
, eksctl anywhere
, and others) from time to time.
In the workload cluster itself, the operating system on each node is provided from either Bottlerocket or Ubuntu OVAs.
Note that it is not recommended that you add software or change the configuration of these systems once they are running in the cluster.
In fact, Bottlerocket contains limited writeable areas and does not include a software package management system.
If you need to modify an operating system, you can rebuild an Ubuntu OVA
to use with EKS Anywhere.
In other words, all operating system changes should be done before the OVA is added to your EKS Anywhere cluster.
Authentication
Supported authentication types are listed in the AuthN / AuthZ
section of the EKS Anywhere FAQ.
Storage
The amount of storage assigned to each virtual machine is 25GiB, by default.
It could be different in your case if you had changed the diskGiB
field in the EKS Anywhere config.
As for application storage, EKS Anywhere configures a default storage class and supports adding compatible Container Storage Interface (CSI) drivers to a running workload cluster.
See Kubernetes Storage
for details.
3.3 - EKS Anywhere curated packages
All information you may need for EKS Anywhere curated packages
Note
The EKS Anywhere package controller and the EKS Anywhere Curated Packages (referred to as “features”) are provided as “preview features” subject to the AWS Service Terms (including Section 2 “Betas and Previews”) of the same. During the EKS Anywhere Curated Packages Public Preview, the AWS Service Terms are extended to provide customers access to these features free of charge. These features will be subject to a service charge and fee structure at “General Availability” of the features.
Overview
Amazon EKS Anywhere Curated Packages are Amazon-curated software packages that extend the core functionalities of Kubernetes on your EKS Anywhere clusters. If you operate EKS Anywhere clusters on-premises, you probably install additional software to ensure the security and reliability of your clusters. However, you may be spending a lot of effort researching the right software, tracking updates, and testing them for compatibility. Now with the EKS Anywhere Curated Packages, you can rely on Amazon to provide trusted, up-to-date, and compatible software that is supported by Amazon, reducing the need for multiple vendor support agreements.
- Amazon-built: All container images of the packages are built from source code by Amazon, including the open source (OSS) packages. OSS package images are built from the open source upstream.
- Amazon-scanned: Amazon continuously scans the container images including the OSS package images for security vulnerabilities and provides remediation.
- Amazon-signed: Amazon signs the package bundle manifest (a Kubernetes manifest) for the list of curated packages. The manifest is signed with AWS Key Management Service (AWS KMS) managed private keys. The curated packages are installed and managed by a package controller on the clusters. Amazon provides validation of signatures through an admission control webhook in the package controller and the public keys distributed in the bundle manifest file.
- Amazon-tested: Amazon tests the compatibility of all curated packages including the OSS packages with each new version of EKS Anywhere.
- Amazon-supported: All curated packages including the curated OSS packages are supported under the EKS Anywhere Support Subscription.
The main components of EKS Anywhere Curated Packages are the package controller
, the package build artifacts
and the command line interface
. The package controller will run in a pod in an EKS Anywhere cluster. The package controller will manage the lifecycle of all curated packages.
Curated packages
Please check out curated package list
for the complete list of EKS Anywhere curated packages.
Workshop
Please check out workshop
for curated packages.
FAQ
-
Can I install software not from the curated package list?
Yes. You can install any optional software of your choice. Be aware you cannot use EKS Anywhere tooling to install or update your self-managed software. Amazon does not provide testing, security patching, software updates, or customer support for your self-managed software.
-
Can I install software that’s on the curated package list but not sourced from EKS Anywhere repository?
If, for example, you deploy a Harbor image that is not built and signed by Amazon, Amazon will not provide testing or customer support for your self-built images.
3.3.1 - EKS Anywhere curated package controller
Overview
The package controller will install, upgrade, configure and remove packages from the cluster. The package controller will watch the packages and packagebundle custom resources for the packages to run and their configuration values.
Package release information is stored in a package bundle manifest. The package controller will continually monitor and download new package bundles. When a new package bundle is downloaded, it will show up as update available and users can use the CLI to activate the bundle to upgrade the installed packages.
Any changes to a package custom resource will trigger an install, upgrade, configuration, or removal of that package. The package controller will use ECR or a private registry to get all resources, including the bundle, Helm charts, and container images.
Installation
Please check out create local cluster
and create production cluster
for how to install package controller at the cluster creation time.
Please check out package management
for how to install package controller after cluster creation and manage curated packages.
3.3.2 - EKS Anywhere curated package build artifacts
There are three types of build artifacts for packages: the container images, the Helm charts, and the package bundle manifests. The container images, Helm charts, and bundle manifests for all of the packages will be built and stored in the EKS Anywhere public ECR repository. Each package may have multiple versions specified in the package bundle. The bundle will reference the Helm chart tag in the ECR repository. The Helm chart will reference the container images for the package.
3.3.3 - EKS Anywhere curated package CLI
Overview
The Curated Packages CLI provides the user experience required to manage curated packages.
Through the CLI, a user is able to discover, create, delete, and upgrade curated packages on a cluster.
These functionalities can be achieved during and after an EKS Anywhere cluster is created.
The CLI provides both imperative and declarative mechanisms to manage curated packages. These
packages will be included as part of a packagebundle
that will be provided by the EKS Anywhere team.
Whenever a user requests a package creation through the CLI (eksctl anywhere create packages
), a custom resource is created on the cluster
indicating the existence of a new package that needs to be installed. When a user executes a delete operation (eksctl anywhere delete package
),
the custom resource will be removed from the cluster indicating the need for uninstalling a package.
An upgrade through the CLI (eksctl anywhere upgrade packages
) upgrades all packages to the latest release.
Installation
Please check out Install EKS Anywhere
to install the eksctl anywhere
CLI on your machine.
Also check out Create local cluster
and Create production cluster
for how to use the CLI during and after cluster creation.
Check out EKS Anywhere curated package management
for how to use the CLI after a cluster is created and manage curated packages.
4 - Tasks
Common actions and set-up you may need for EKS-A
4.1 - Workload management
Common tasks for managing workloads.
4.1.1 - Deploy test workload
How to deploy a workload to check that your cluster is working properly
We’ve created a simple test application for you to verify your cluster is working properly.
You can deploy it with the following command:
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
To see the new pod running in your cluster, type:
kubectl get pods -l app=hello-eks-a
Example output:
NAME READY STATUS RESTARTS AGE
hello-eks-a-745bfcd586-6zx6b 1/1 Running 0 22m
To check the logs of the container to make sure it started successfully, type:
kubectl logs -l app=hello-eks-a
There is also a default web page being served from the container.
You can forward the deployment port to your local machine with
kubectl port-forward deploy/hello-eks-a 8000:80
Now you should be able to open your browser or use curl
to http://localhost:8000
to view the example application page.
Example output:
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
Thank you for using
███████╗██╗ ██╗███████╗
██╔════╝██║ ██╔╝██╔════╝
█████╗ █████╔╝ ███████╗
██╔══╝ ██╔═██╗ ╚════██║
███████╗██║ ██╗███████║
╚══════╝╚═╝ ╚═╝╚══════╝
█████╗ ███╗ ██╗██╗ ██╗██╗ ██╗██╗ ██╗███████╗██████╗ ███████╗
██╔══██╗████╗ ██║╚██╗ ██╔╝██║ ██║██║ ██║██╔════╝██╔══██╗██╔════╝
███████║██╔██╗ ██║ ╚████╔╝ ██║ █╗ ██║███████║█████╗ ██████╔╝█████╗
██╔══██║██║╚██╗██║ ╚██╔╝ ██║███╗██║██╔══██║██╔══╝ ██╔══██╗██╔══╝
██║ ██║██║ ╚████║ ██║ ╚███╔███╔╝██║ ██║███████╗██║ ██║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝ ╚═╝ ╚══╝╚══╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚══════╝
You have successfully deployed the hello-eks-a pod hello-eks-a-c5b9bc9d8-qp6bg
For more information check out
https://anywhere.eks.amazonaws.com
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
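The port-forward and curl check above can be combined into one function that cleans up after itself. This is a minimal sketch assuming bash, kubectl, and curl are available on the administrative machine; the function name smoke_test_hello is hypothetical, and the match string is taken from the example output above.

```shell
# Hypothetical helper: forward the hello-eks-a port, fetch the page, and clean up.
# Arguments: local port (default 8000), seconds to wait for the forward (default 2).
smoke_test_hello() {
  local port="${1:-8000}" wait_secs="${2:-2}" pf_pid rc
  kubectl port-forward deploy/hello-eks-a "${port}:80" >/dev/null 2>&1 &
  pf_pid=$!
  sleep "$wait_secs"
  # The example page contains this phrase when the pod is serving.
  if curl -s "http://localhost:${port}" | grep -q "successfully deployed the hello-eks-a pod"; then
    rc=0
  else
    rc=1
  fi
  # Stop the background port-forward regardless of the result.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$rc"
}
```

Running smoke_test_hello && echo "workload reachable" gives a quick pass or fail signal without leaving the port-forward running.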
If you would like to expose your applications with an external load balancer or an ingress controller, you can follow the steps in Adding an external load balancer
.
4.1.2 - Add an external load balancer
How to deploy a load balancer controller to expose a workload running in EKS Anywhere
A production-quality Kubernetes cluster requires planning and preparation for various networking features.
The purpose of this document is to walk you through getting set up with a recommended Kubernetes Load Balancer for EKS Anywhere.
Load Balancing is essential in order to maximize availability and scalability. It enables efficient distribution of incoming network traffic among multiple backend services.
4.1.2.1 - RECOMMENDED: Kube-Vip for Service-type Load Balancer
How to set up kube-vip for Service-type Load Balancer (Recommended)
We recommend using Kube-Vip cloud controller to expose your services as service-type Load Balancer.
Detailed information about Kube-Vip can be found here
.
There are two ways Kube-Vip can manage virtual IP addresses on your network.
Please see the following guides for ARP or BGP mode depending on your on-prem networking preferences.
Setting up Kube-Vip for Service-type Load Balancer
Kube-Vip Service-type Load Balancer can be set up in either ARP mode or BGP mode
4.1.2.1.1 - Kube-Vip ARP Mode
How to set up kube-vip for Service-type Load Balancer in ARP mode
In ARP mode, kube-vip will perform leader election and assign the Virtual IP to the leader.
This node will inherit the VIP and become the load-balancing leader within the cluster.
Setting up Kube-Vip for Service-type Load Balancer in ARP mode
-
Enable strict ARP in kube-proxy as it’s required for kube-vip
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl apply -f - -n kube-system
-
Create a configMap to specify the IP range for the load balancer.
You can use either a CIDR block or an IP range. Run only one of the two options below, because both create a configMap named kubevip.
# Option 1: CIDR block
CIDR=192.168.0.0/24 # Use your CIDR range here
kubectl create configmap --namespace kube-system kubevip --from-literal cidr-global=${CIDR}
# Option 2: IP range (use instead of option 1)
IP_START=192.168.0.0 # Use the starting IP in your range
IP_END=192.168.0.255 # Use the ending IP in your range
kubectl create configmap --namespace kube-system kubevip --from-literal range-global=${IP_START}-${IP_END}
-
Deploy kubevip-cloud-provider
kubectl apply -f https://kube-vip.io/manifests/controller.yaml
-
Create ClusterRoles and RoleBindings for kube-vip Daemonset
kubectl apply -f https://kube-vip.io/manifests/rbac.yaml
-
Create the kube-vip Daemonset
An example manifest has been included at the end of this document which you can use in place of this step.
alias kube-vip="docker run --network host --rm plndr/kube-vip:v0.3.5"
kube-vip manifest daemonset --services --inCluster --arp --interface eth0 | kubectl apply -f -
-
Deploy the Hello EKS Anywhere
test application.
kubectl apply -f https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml
-
Expose the hello-eks-a service
kubectl expose deployment hello-eks-a --port=80 --type=LoadBalancer --name=hello-eks-a-lb
-
Describe the service to get the IP.
The external IP will be the one in the CIDR range specified in step 2
EXTERNAL_IP=$(kubectl get svc hello-eks-a-lb -o jsonpath='{.spec.loadBalancerIP}')
-
Ensure the load balancer is working by curl’ing the IP you got in step 8
You should see something like this in the output
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡
Thank you for using
███████╗██╗ ██╗███████╗
██╔════╝██║ ██╔╝██╔════╝
█████╗ █████╔╝ ███████╗
██╔══╝ ██╔═██╗ ╚════██║
███████╗██║ ██╗███████║
╚══════╝╚═╝ ╚═╝╚══════╝
█████╗ ███╗ ██╗██╗ ██╗██╗ ██╗██╗ ██╗███████╗██████╗ ███████╗
██╔══██╗████╗ ██║╚██╗ ██╔╝██║ ██║██║ ██║██╔════╝██╔══██╗██╔════╝
███████║██╔██╗ ██║ ╚████╔╝ ██║ █╗ ██║███████║█████╗ ██████╔╝█████╗
██╔══██║██║╚██╗██║ ╚██╔╝ ██║███╗██║██╔══██║██╔══╝ ██╔══██╗██╔══╝
██║ ██║██║ ╚████║ ██║ ╚███╔███╔╝██║ ██║███████╗██║ ██║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝ ╚═╝ ╚══╝╚══╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚══════╝
You have successfully deployed the hello-eks-a pod hello-eks-a-c5b9bc9d8-fx2fr
For more information check out
https://anywhere.eks.amazonaws.com
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡
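The address may take a few seconds to be assigned after the service is exposed, so steps 8 and 9 can be wrapped in a polling loop. The sketch below is one possible approach, not part of kube-vip or EKS Anywhere: the function name wait_for_lb_ip is hypothetical, and it reads the address from the Service status field (status.loadBalancer.ingress), which is where Kubernetes reports an assigned load balancer address.

```shell
# Poll a Service until its load balancer reports an IP, then print it.
# Arguments: service name, max attempts (default 30), seconds between attempts (default 2).
wait_for_lb_ip() {
  local svc="$1" attempts="${2:-30}" delay="${3:-2}" ip="" i=0
  while [ "$i" -lt "$attempts" ]; do
    ip=$(kubectl get svc "$svc" -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null) || true
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for an external IP on $svc" >&2
  return 1
}
```

A typical use would be EXTERNAL_IP=$(wait_for_lb_ip hello-eks-a-lb) && curl "http://${EXTERNAL_IP}".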
Here is an example manifest for kube-vip from step 5. Also available here
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  name: kube-vip-ds
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube-vip-ds
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: kube-vip-ds
    spec:
      containers:
      - args:
        - manager
        env:
        - name: vip_arp
          value: "true"
        - name: vip_interface
          value: eth0
        - name: port
          value: "6443"
        - name: vip_cidr
          value: "32"
        - name: svc_enable
          value: "true"
        - name: vip_startleader
          value: "false"
        - name: vip_addpeerstolb
          value: "true"
        - name: vip_localpeer
          value: ip-172-20-40-207:172.20.40.207:10000
        - name: vip_address
        image: plndr/kube-vip:v0.3.5
        imagePullPolicy: Always
        name: kube-vip
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
            - SYS_TIME
      hostNetwork: true
      serviceAccountName: kube-vip
  updateStrategy: {}
status:
  currentNumberScheduled: 0
  desiredNumberScheduled: 0
  numberMisscheduled: 0
  numberReady: 0
4.1.2.1.2 - Kube-Vip BGP Mode
How to set up kube-vip for Service-type Load Balancer in BGP mode
In BGP mode, kube-vip will assign the Virtual IP to all running Pods.
All nodes, therefore, will advertise the VIP address.
Prerequisites
- BGP-capable network switch connected to EKS-A cluster
- Vendor-specific BGP configuration on switch
Required BGP settings on network vendor equipment are described in BGP Configuration on Network Switch Side
section below.
Setting up Kube-Vip for Service-type Load Balancer in BGP mode
-
Create a configMap to specify the IP range for the load balancer.
You can use either a CIDR block or an IP range. Run only one of the two options below, because both create a configMap named kubevip.
# Option 1: CIDR block
CIDR=192.168.0.0/24 # Use your CIDR range here
kubectl create configmap --namespace kube-system kubevip --from-literal cidr-global=${CIDR}
# Option 2: IP range (use instead of option 1)
IP_START=192.168.0.0 # Use the starting IP in your range
IP_END=192.168.0.255 # Use the ending IP in your range
kubectl create configmap --namespace kube-system kubevip --from-literal range-global=${IP_START}-${IP_END}
-
Deploy kubevip-cloud-provider
kubectl apply -f https://kube-vip.io/manifests/controller.yaml
-
Create ClusterRoles and RoleBindings for kube-vip Daemonset
kubectl apply -f https://kube-vip.io/manifests/rbac.yaml
-
Create the kube-vip Daemonset
alias kube-vip="docker run --network host --rm plndr/kube-vip:latest"
kube-vip manifest daemonset \
--interface lo \
--localAS <AS#> \
--sourceIF <src interface> \
--services \
--inCluster \
--bgp \
--bgppeers <bgp-peer1>:<peerAS>::<bgp-multihop-true-false>,<bgp-peer2>:<peerAS>::<bgp-multihop-true-false> | kubectl apply -f -
Explanation of the options provided above to kube-vip for manifest generation:
--interface — This interface needs to be set to the loopback in order to suppress ARP responses from worker nodes that get the LoadBalancer VIP assigned
--localAS — Local Autonomous System ID
--sourceIF — source interface on the worker node which will be used to communicate BGP with the switch
--services — Service Type LoadBalancer (not Control Plane)
--inCluster — Defaults to looking inside the Pod for the token
--bgp — Enables BGP peering from kube-vip
--bgppeers — Comma separated list of BGP peers in the format <address:AS:password:multihop>
Below is an example Daemonset creation command.
kube-vip manifest daemonset \
--interface $INTERFACE \
--localAS 65200 \
--sourceIF eth0 \
--services \
--inCluster \
--bgp \
--bgppeers 10.69.20.2:65000::false,10.69.20.3:65000::false
Below is the manifest generated with these example values.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  name: kube-vip-ds
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube-vip-ds
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: kube-vip-ds
    spec:
      containers:
      - args:
        - manager
        env:
        - name: vip_arp
          value: "false"
        - name: vip_interface
          value: lo
        - name: port
          value: "6443"
        - name: vip_cidr
          value: "32"
        - name: svc_enable
          value: "true"
        - name: cp_enable
          value: "false"
        - name: vip_startleader
          value: "false"
        - name: vip_addpeerstolb
          value: "true"
        - name: vip_localpeer
          value: docker-desktop:192.168.65.3:10000
        - name: bgp_enable
          value: "true"
        - name: bgp_routerid
        - name: bgp_source_if
          value: eth0
        - name: bgp_as
          value: "65200"
        - name: bgp_peeraddress
        - name: bgp_peerpass
        - name: bgp_peeras
          value: "65000"
        - name: bgp_peers
          value: 10.69.20.2:65000::false,10.69.20.3:65000::false
        - name: bgp_routerinterface
          value: eth0
        - name: vip_address
        image: ghcr.io/kube-vip/kube-vip:v0.3.7
        imagePullPolicy: Always
        name: kube-vip
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
            - SYS_TIME
      hostNetwork: true
      serviceAccountName: kube-vip
  updateStrategy: {}
status:
  currentNumberScheduled: 0
  desiredNumberScheduled: 0
  numberMisscheduled: 0
  numberReady: 0
-
Manually add the following to the manifest file as shown in the example above
- name: bgp_routerinterface
value: eth0
-
Deploy the Hello EKS Anywhere
test application.
kubectl apply -f https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml
-
Expose the hello-eks-a service
kubectl expose deployment hello-eks-a --port=80 --type=LoadBalancer --name=hello-eks-a-lb
-
Describe the service to get the IP. The external IP will be the one in the CIDR range specified in step 1
EXTERNAL_IP=$(kubectl get svc hello-eks-a-lb -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
-
Ensure the load balancer is working by curl’ing the IP you got in step 8
You should see something like this in the output
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
Thank you for using
███████╗██╗ ██╗███████╗
██╔════╝██║ ██╔╝██╔════╝
█████╗ █████╔╝ ███████╗
██╔══╝ ██╔═██╗ ╚════██║
███████╗██║ ██╗███████║
╚══════╝╚═╝ ╚═╝╚══════╝
█████╗ ███╗ ██╗██╗ ██╗██╗ ██╗██╗ ██╗███████╗██████╗ ███████╗
██╔══██╗████╗ ██║╚██╗ ██╔╝██║ ██║██║ ██║██╔════╝██╔══██╗██╔════╝
███████║██╔██╗ ██║ ╚████╔╝ ██║ █╗ ██║███████║█████╗ ██████╔╝█████╗
██╔══██║██║╚██╗██║ ╚██╔╝ ██║███╗██║██╔══██║██╔══╝ ██╔══██╗██╔══╝
██║ ██║██║ ╚████║ ██║ ╚███╔███╔╝██║ ██║███████╗██║ ██║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝ ╚═╝ ╚══╝╚══╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚══════╝
You have successfully deployed the hello-eks-a pod hello-eks-a-c5b9bc9d8-fx2fr
For more information check out
https://anywhere.eks.amazonaws.com
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
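The --bgppeers value described in step 4 is a comma-separated list of address:AS:password:multihop entries, which is easy to mistype by hand. The following sketch assembles it from a list of peer addresses; the function name build_bgp_peers is hypothetical, and it assumes all peers share one AS and use no password, as in the example command.

```shell
# Build a --bgppeers value from a shared peer AS, a multihop flag, and peer addresses.
# The password field is left empty, matching the example command in step 4.
build_bgp_peers() {
  local peer_as="$1" multihop="$2" out="" addr
  shift 2
  for addr in "$@"; do
    # Prefix a comma only when an earlier entry already exists.
    out="${out:+$out,}${addr}:${peer_as}::${multihop}"
  done
  echo "$out"
}
```

With the example values, build_bgp_peers 65000 false 10.69.20.2 10.69.20.3 produces the same string that appears in the bgp_peers entry of the generated manifest.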
BGP Configuration on Network Switch Side
BGP configuration will vary depending upon network vendor equipment and local network environment. Listed below are the basic conceptual configuration steps for BGP operation. Included with each step is a sample configuration from a Cisco Switch (Cisco Nexus 9000) running in NX-OS mode. You will need to find similar steps in your network vendor equipment’s manual for BGP configuration on your specific switch.
-
Configure BGP local AS, router ID, and timers
router bgp 65000
router-id 10.69.5.1
timers bgp 15 45
log-neighbor-changes
-
Configure BGP neighbors
BGP neighbors can be configured individually or as a subnet
a. Individual BGP neighbors
Determine the IP addresses of each of the EKS-A nodes via the VMware console or DHCP server allocation.
In the example below, node IP addresses are 10.69.20.165, 10.69.20.167, and 10.69.20.170.
Note that remote-as is the AS used as the bgp_as value in the generated example manifest above.
neighbor 10.69.20.165
remote-as 65200
address-family ipv4 unicast
soft-reconfiguration inbound always
neighbor 10.69.20.167
remote-as 65200
address-family ipv4 unicast
soft-reconfiguration inbound always
neighbor 10.69.20.170
remote-as 65200
address-family ipv4 unicast
soft-reconfiguration inbound always
b. Subnet-based BGP neighbors
Determine the subnet address and netmask of the EKS-A nodes.
In this example the EKS-A nodes are on 10.69.20.0/24 subnet.
Note that remote-as is the AS used as the bgp_as value in the generated example manifest above.
neighbor 10.69.20.0/24
remote-as 65200
address-family ipv4 unicast
soft-reconfiguration inbound always
-
Verify BGP neighbors are established with each node
switch% show ip bgp summary
information for VRF default, address family IPv4 Unicast
BGP router identifier 10.69.5.1, local AS number 65000
BGP table version is 181, IPv4 Unicast config peers 7, capable peers 7
32 network entries and 63 paths using 11528 bytes of memory
BGP attribute entries [16/2752], BGP AS path entries [6/48]
BGP community entries [0/0], BGP clusterlist entries [0/0]
3 received paths for inbound soft reconfiguration
3 identical, 0 modified, 0 filtered received paths using 0 bytes
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.69.20.165 4 65200 34283 34276 181 0 0 5d20h 1
10.69.20.167 4 65200 34543 34531 181 0 0 5d20h 1
10.69.20.170 4 65200 34542 34530 181 0 0 5d20h 1
-
Verify routes learned from EKS-A cluster match the external IP address assigned by kube-vip LoadBalancer configuration
In the example below, 10.35.10.13 is the external kube-vip LoadBalancer IP
switch% show ip bgp neighbors 10.69.20.165 received-routes
Peer 10.69.20.165 routes for address family IPv4 Unicast:
BGP table version is 181, Local Router ID is 10.69.5.1
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup, 2 - best2
Network Next Hop Metric LocPrf Weight Path
*>e10.35.10.13/32 10.69.20.165 0 65200 i
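When there are many nodes, the individual neighbor stanzas from step 2a can be generated rather than typed. The sketch below simply mirrors the Cisco NX-OS sample shown above; the function name print_bgp_neighbors is hypothetical, the indentation is cosmetic, and the output should be reviewed against your switch vendor's syntax before use.

```shell
# Print one neighbor stanza per node IP, following the NX-OS sample in step 2a.
# Arguments: remote AS (the bgp_as value used by kube-vip), then node IP addresses.
print_bgp_neighbors() {
  local remote_as="$1" ip
  shift
  for ip in "$@"; do
    printf 'neighbor %s\n' "$ip"
    printf '  remote-as %s\n' "$remote_as"
    printf '  address-family ipv4 unicast\n'
    printf '    soft-reconfiguration inbound always\n'
  done
}
```

For the example nodes, print_bgp_neighbors 65200 10.69.20.165 10.69.20.167 10.69.20.170 prints the three stanzas listed in step 2a.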
4.1.2.2 - Alternative: MetalLB Service-type Load Balancer
How to set up MetalLB for Service-type Load Balancer
The purpose of this document is to walk you through getting set up with MetalLB Kubernetes Load Balancer for your cluster.
This is suggested as an alternative if your networking requirements do not allow you to use Kube-Vip
.
MetalLB is a native Kubernetes load balancing solution for bare-metal Kubernetes clusters.
Detailed information about MetalLB can be found here
.
Prerequisites
You will need Helm installed on your system as this is the easiest way to deploy MetalLB.
Helm can be installed from here
.
MetalLB installation is described here.
Steps
-
Enable strict ARP as it’s required for MetalLB
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl apply -f - -n kube-system
-
Add the Helm repo for MetalLB
helm repo add metallb https://metallb.github.io/metallb
-
Create an override file to specify the LB IP range
<LB-IP-range> can be a CIDR block like 198.18.210.0/24 or a range like 198.18.210.0-198.18.210.10
cat << 'EOF' >> values.yaml
configInline:
  address-pools:
    - name: default
      protocol: layer2
      addresses:
        - <LB-IP-range>
EOF
-
Install MetalLB on your cluster
helm install metallb metallb/metallb -f values.yaml
-
Deploy the Hello EKS Anywhere
test application.
kubectl apply -f https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml
-
Expose the hello-eks-a deployment
kubectl expose deployment hello-eks-a --port=80 --type=LoadBalancer --name=hello-eks-a-lb
-
Get the load balancer external IP
EXTERNAL_IP=$(kubectl get svc hello-eks-a-lb -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
-
Hit the external IP
curl ${EXTERNAL_IP}
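The address may not be assigned the instant the Service is created, so the last two steps can be wrapped in a small polling helper. The following is a minimal sketch, not part of the original procedure: the function name `wait_for_lb_ip`, the retry count, and the two-second delay are assumptions, and it assumes the assigned address is published in the Service's `status.loadBalancer.ingress` field.

```shell
# Hypothetical helper: poll a Service until a load balancer IP is assigned.
# Usage: wait_for_lb_ip <service-name> [attempts]
wait_for_lb_ip() {
  local svc="$1"
  local attempts="${2:-30}"
  local i=0
  local ip
  while [ "$i" -lt "$attempts" ]; do
    # Empty output means no address has been assigned yet.
    ip=$(kubectl get svc "$svc" -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "no external IP assigned to ${svc}" >&2
  return 1
}

# Example (requires access to the cluster):
#   EXTERNAL_IP=$(wait_for_lb_ip hello-eks-a-lb) && curl "http://${EXTERNAL_IP}"
```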
4.1.3 - Add an ingress controller
How to deploy an ingress controller for simple host or URL-based HTTP routing to workloads running in EKS-A
A production-quality Kubernetes cluster requires planning and preparation for various networking features.
The purpose of this document is to walk you through getting set up with a recommended Kubernetes Ingress Controller for EKS Anywhere.
An Ingress Controller is essential for defining the routing rules that decide how external users access services running in a Kubernetes cluster. It enables efficient distribution of incoming network traffic among multiple backend services.
Current Recommendation: Emissary-ingress
We currently recommend using the Emissary-ingress Kubernetes Ingress Controller by Ambassador. Emissary-ingress allows you to route and secure traffic to your cluster with an open source, Kubernetes-native API Gateway. Detailed information about Emissary-ingress can be found here.
Setting up Emissary-ingress for Ingress Controller
-
Deploy the Hello EKS Anywhere
test application.
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
-
Set up the kube-vip Service-type Load Balancer in your cluster by following the instructions here.
Alternatively, you can set up the MetalLB Load Balancer by following the instructions here.
-
Install Ambassador CRDs and ClusterRoles and RoleBindings
kubectl apply -f "https://www.getambassador.io/yaml/ambassador/ambassador-crds.yaml"
kubectl apply -f "https://www.getambassador.io/yaml/ambassador/ambassador-rbac.yaml"
-
Create Ambassador Service with Type LoadBalancer.
kubectl apply -f - <<EOF
---
apiVersion: v1
kind: Service
metadata:
  name: ambassador
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - port: 80
    targetPort: 8080
  selector:
    service: ambassador
EOF
-
Create a Mapping on your cluster. This Mapping tells Emissary-ingress to route all traffic inbound to the /backend/ path to the hello-eks-a Service.
kubectl apply -f - <<EOF
---
apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: hello-backend
spec:
  prefix: /backend/
  service: hello-eks-a
EOF
-
Store the Emissary-ingress load balancer IP address to a local environment variable. You will use this variable to test accessing your service.
export EMISSARY_LB_ENDPOINT=$(kubectl get svc ambassador -o "go-template={{range .status.loadBalancer.ingress}}{{or .ip .hostname}}{{end}}")
-
Test the configuration by accessing the service through the Emissary-ingress load balancer.
curl -Lk http://$EMISSARY_LB_ENDPOINT/backend/
NOTE: URL base path will need to match what is specified in the prefix exactly, including the trailing ‘/’
You should see something like this in the output
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
Thank you for using
███████╗██╗ ██╗███████╗
██╔════╝██║ ██╔╝██╔════╝
█████╗ █████╔╝ ███████╗
██╔══╝ ██╔═██╗ ╚════██║
███████╗██║ ██╗███████║
╚══════╝╚═╝ ╚═╝╚══════╝
█████╗ ███╗ ██╗██╗ ██╗██╗ ██╗██╗ ██╗███████╗██████╗ ███████╗
██╔══██╗████╗ ██║╚██╗ ██╔╝██║ ██║██║ ██║██╔════╝██╔══██╗██╔════╝
███████║██╔██╗ ██║ ╚████╔╝ ██║ █╗ ██║███████║█████╗ ██████╔╝█████╗
██╔══██║██║╚██╗██║ ╚██╔╝ ██║███╗██║██╔══██║██╔══╝ ██╔══██╗██╔══╝
██║ ██║██║ ╚████║ ██║ ╚███╔███╔╝██║ ██║███████╗██║ ██║███████╗
╚═╝ ╚═╝╚═╝ ╚═══╝ ╚═╝ ╚══╝╚══╝ ╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚══════╝
You have successfully deployed the hello-eks-a pod hello-eks-a-c5b9bc9d8-fx2fr
For more information check out
https://anywhere.eks.amazonaws.com
⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
4.1.4 - Secure connectivity with CNI and Network Policy
How to validate the setup of Cilium CNI and deploy network policies to secure workload connectivity.
EKS Anywhere uses Cilium
for pod networking and security.
Cilium is installed by default as a Kubernetes CNI plugin and so is already running in your EKS Anywhere cluster.
This section provides information about:
-
Understanding Cilium components and requirements
-
Validating your Cilium networking setup.
-
Using Cilium to secure workload connectivity with Kubernetes Network Policy.
Cilium Components
The primary Cilium Agent runs as a DaemonSet on each Kubernetes node. Each cluster also includes a Cilium Operator Deployment to handle certain cluster-wide operations. For EKS Anywhere, Cilium is configured to use the Kubernetes API server as the identity store, so no etcd cluster connectivity is required.
In a properly working environment, each Kubernetes node should have a Cilium Agent pod (cilium-WXYZ
) in “Running” and ready (1/1) state.
By default there will be two
Cilium Operator pods (cilium-operator-123456-WXYZ
) in “Running” and ready (1/1) state on different Kubernetes nodes for high-availability.
Run the following command to ensure all Cilium-related pods are in a healthy state.
kubectl get pods -n kube-system | grep cilium
Example output for this command in a 3 node environment is:
kube-system cilium-fsjmd 1/1 Running 0 4m
kube-system cilium-nqpkv 1/1 Running 0 4m
kube-system cilium-operator-58ff67b8cd-jd7rf 1/1 Running 0 4m
kube-system cilium-operator-58ff67b8cd-kn6ss 1/1 Running 0 4m
kube-system cilium-zz4mt 1/1 Running 0 4m
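The health check described above can also be scripted. The following is a minimal sketch under stated assumptions: the helper name `unhealthy_pods` is hypothetical, and it only parses the tabular text that `kubectl get pods` prints, with or without a leading namespace column. It lists every pod whose READY column is not n/n or whose STATUS is not Running, and exits non-zero if it finds any.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# names of pods that are not fully ready or not in the Running state.
unhealthy_pods() {
  awk '
    {
      # Locate the READY column (for example 1/1); header lines have none.
      for (i = 2; i < NF; i++) {
        if ($i ~ /^[0-9]+\/[0-9]+$/) {
          split($i, ready, "/")
          if (ready[1] != ready[2] || $(i + 1) != "Running") {
            print $(i - 1)   # the pod name precedes the READY column
            bad = 1
          }
          break
        }
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example (requires access to the cluster):
#   kubectl get pods -n kube-system | grep cilium | unhealthy_pods
```

No output and a zero exit status mean that every listed pod is Running and ready. The same helper can be pointed at the cilium-test namespace used by the connectivity check later in this section.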
Network Connectivity Requirements
To provide pod connectivity within an on-premises environment, the Cilium agent implements an overlay network using the GENEVE tunneling protocol. As a result,
UDP port 6081 connectivity MUST be allowed by any firewall running between Kubernetes nodes running the Cilium agent.
Allowing ICMP Ping (type = 8, code = 0) as well as TCP port 4240 is also recommended in order for Cilium Agents to validate node-to-node connectivity as
part of internal status reporting.
Validating Connectivity
Cilium includes a connectivity check YAML that can be deployed into a test namespace in order to validate proper installation and connectivity within a Kubernetes cluster. If the connectivity check passes, all pods created by the YAML manifest will reach “Running” and ready (1/1) state. We recommend running this test only once you have multiple worker nodes in your environment to ensure you are validating cross-node connectivity.
It is important that this test is run in a dedicated namespace, with no existing network policy. For example:
kubectl create ns cilium-test
kubectl apply -n cilium-test -f https://docs.isovalent.com/v1.10/public/connectivity-check-eksa.yaml
Once all pods have started, simply checking the status of pods in this namespace will indicate whether the tests have passed:
kubectl get pods -n cilium-test
Successful test output will show all pods in a “Running” and ready (1/1) state:
NAME READY STATUS RESTARTS AGE
echo-a-d576c5f8b-zlfsk 1/1 Running 0 59s
echo-b-787dc99778-sxlcc 1/1 Running 0 59s
echo-b-host-675cd8cfff-qvvv8 1/1 Running 0 59s
host-to-b-multi-node-clusterip-6fd884bcf7-pvj5d 1/1 Running 0 58s
host-to-b-multi-node-headless-79f7df47b9-8mzbp 1/1 Running 0 58s
pod-to-a-57695cc7ff-6tqpv 1/1 Running 0 59s
pod-to-a-allowed-cnp-7b6d5ff99f-4rhrs 1/1 Running 0 59s
pod-to-a-denied-cnp-6887b57579-zbs2t 1/1 Running 0 59s
pod-to-b-intra-node-hostport-7d656d7bb9-6zjrl 1/1 Running 0 57s
pod-to-b-intra-node-nodeport-569d7c647-76gn5 1/1 Running 0 58s
pod-to-b-multi-node-clusterip-fdf45bbbc-8l4zz 1/1 Running 0 59s
pod-to-b-multi-node-headless-64b6cbdd49-9hcqg 1/1 Running 0 59s
pod-to-b-multi-node-hostport-57fc8854f5-9d8m8 1/1 Running 0 58s
pod-to-b-multi-node-nodeport-54446bdbb9-5xhfd 1/1 Running 0 58s
pod-to-external-1111-56548587dc-rmj9f 1/1 Running 0 59s
pod-to-external-fqdn-allow-google-cnp-5ff4986c89-z4h9j 1/1 Running 0 59s
Afterward, simply delete the namespace to clean up the connectivity test:
kubectl delete ns cilium-test
Kubernetes Network Policy
By default, all Kubernetes workloads within a cluster can talk to any other workloads in the cluster, as well as any workloads outside the cluster. To enable a stronger security posture, Cilium implements the Kubernetes Network Policy specification to provide identity-aware firewalling / segmentation of Kubernetes workloads.
Network policies are defined as Kubernetes YAML specifications that are applied to a particular namespace to describe which connections should be allowed to or from a given set of pods. These network policies are “identity-aware” in that they describe workloads within the cluster using Kubernetes metadata like namespace and labels, rather than by IP Address.
Basic network policies are validated as part of the above Cilium connectivity check test.
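To make the idea of an identity-aware policy concrete, the following sketch prints a standard Kubernetes NetworkPolicy that allows ingress to one set of pods only from another set of pods in the same namespace. It is an illustration rather than a recommended policy: the helper name `allow_from_role_policy`, the `app` and `role` label keys, and the namespace and label values in the example are all assumptions.

```shell
# Hypothetical helper: print a NetworkPolicy that allows ingress to pods
# labeled app=<app> only from pods labeled role=<role> in namespace <ns>.
# The label keys "app" and "role" are illustrative assumptions.
allow_from_role_policy() {
  local ns="$1" app="$2" role="$3"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-${role}-to-${app}
  namespace: ${ns}
spec:
  podSelector:
    matchLabels:
      app: ${app}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: ${role}
EOF
}

# Example (requires access to the cluster):
#   allow_from_role_policy demo backend frontend | kubectl apply -f -
```

Because the policy selects pods by label rather than by IP address, it keeps applying as pods are rescheduled and receive new addresses.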
For next steps on leveraging Network Policy, we encourage you to explore:
Additional Cilium Features
Many advanced features of Cilium are not yet enabled as part of EKS Anywhere, including: Hubble observability, DNS-aware and HTTP-Aware Network Policy, Multi-cluster Routing, Transparent Encryption, and Advanced Load-balancing.
Please contact the EKS Anywhere team if you are interested in leveraging these advanced features along with EKS Anywhere.
4.2 - Cluster management
Common tasks for managing clusters.
4.2.1 - Cluster management overview
Overview of tools and interfaces for managing EKS Anywhere clusters
This page describes the tools and interfaces available to an administrator after an EKS Anywhere cluster is up and running.
It also describes which administrative actions are done:
- Directly in Kubernetes itself (such as adding nodes with
kubectl
)
- Through the EKS Anywhere API (such as deleting a cluster with
eksctl
).
- Through tools which interface with the Kubernetes API (such as managing a cluster with
terraform
)
Note that direct changes to OVAs before nodes are deployed are not yet supported.
However, we are working on a solution for that issue.
4.2.2 - Scale cluster
How to scale your cluster
4.2.2.1 - Scale Bare Metal cluster
How to scale your Bare Metal cluster
Before you can scale up nodes on a Bare Metal cluster, you must ensure you have enough available hardware for the scale up operation to function.
For scale down operation, you can skip directly to the scale commands.
To check if you have enough available hardware for scale up, you can use the kubectl
command below to check whether there are hardware objects with the selector labels corresponding to the controlplane/worker node group and without the ownerName
label.
kubectl get hardware -n eksa-system --show-labels
For example, if you want to scale a worker node group with selector label type=worker-group-1
, then you must have an additional hardware object in your cluster with the label type=worker-group-1
that doesn’t have the ownerName
label.
In the command shown below, eksa-worker2
matches the selector label and it doesn’t have the ownerName
label. Thus, it can be used to scale up worker-group-1
by 1.
kubectl get hardware -n eksa-system --show-labels
NAME STATE LABELS
eksa-controlplane type=controlplane,v1alpha1.tinkerbell.org/ownerName=abhnvp-control-plane-template-1656427179688-9rm5f,v1alpha1.tinkerbell.org/ownerNamespace=eksa-system
eksa-worker1 type=worker-group-1,v1alpha1.tinkerbell.org/ownerName=abhnvp-md-0-1656427179689-9fqnx,v1alpha1.tinkerbell.org/ownerNamespace=eksa-system
eksa-worker2 type=worker-group-1
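The same check can be done with a small filter instead of reading the labels by eye. This is a minimal sketch: the helper name `available_hardware` is hypothetical, and it assumes the labels are the last whitespace-separated column of the output, as in the listing above.

```shell
# Hypothetical helper: read `kubectl get hardware --show-labels` output on
# stdin and print hardware that carries the given selector label and has no
# ownerName label, meaning it is not yet assigned to a machine.
# Usage: available_hardware <label=value>
available_hardware() {
  local selector="$1"
  awk -v sel="$selector" '
    NR == 1 { next }                      # skip the header line
    {
      labels = "," $NF ","                # labels are the last column
      if (index(labels, "," sel ",") && !index(labels, "/ownerName=")) {
        print $1
      }
    }
  '
}

# Example (requires access to the cluster):
#   kubectl get hardware -n eksa-system --show-labels | available_hardware type=worker-group-1
```

The number of names printed is the number of nodes by which that group can be scaled up.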
If you don’t have any available hardware that matches this requirement in the cluster, you can set up a new hardware CSV
and then run the following command to push the additional hardware to your cluster
eksctl anywhere generate hardware -z <hardware.csv> | kubectl apply -f -
Once you verify you have the additional hardware available, you are ready to scale your cluster.
To scale a worker node group:
kubectl scale machinedeployments -n eksa-system <workerNodeGroupName> --replicas <num replicas>
To scale control plane nodes:
kubectl scale kubeadmcontrolplane -n eksa-system <controlPlaneName> --replicas <num replicas>
4.2.2.2 - Scale vSphere cluster
How to scale your vSphere cluster
When you are scaling your vSphere EKS Anywhere cluster, consider the number of nodes you need for your control plane and for your data plane.
Each plane can be scaled horizontally (add more nodes) or vertically (provide nodes with more resources).
In each case you can scale the cluster manually, semi-automatically, or automatically.
See the Kubernetes Components
documentation to learn the differences between the control plane and the data plane (worker nodes).
Manual cluster scaling
Horizontally scaling the cluster is done by increasing the number for the control plane or worker node groups under the Cluster specification.
NOTE: If etcd is running on your control plane (the default configuration) you should scale your control plane in odd numbers (3, 5, 7…).
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: test-cluster
spec:
  controlPlaneConfiguration:
    count: 1 # increase this number to horizontally scale your control plane
  ...
  workerNodeGroupConfigurations:
  - count: 1 # increase this number to horizontally scale your data plane
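As background for the odd-number note above: etcd needs a majority of its members (a quorum) to be available, so adding one member to an odd-sized cluster raises the quorum without raising the number of failures the cluster can survive. The arithmetic can be sketched as follows; the helper name `etcd_fault_tolerance` is hypothetical.

```shell
# Hypothetical helper: print the quorum size and the number of member
# failures an etcd cluster with the given member count can tolerate.
etcd_fault_tolerance() {
  local members="$1"
  local quorum=$(( members / 2 + 1 ))     # majority of the members
  echo "members=${members} quorum=${quorum} tolerated_failures=$(( members - quorum ))"
}

for n in 1 2 3 4 5; do
  etcd_fault_tolerance "$n"
done
```

In the output, 3 and 4 members both tolerate a single failure, while 5 members tolerate two.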
Vertically scaling your cluster is done by updating the machine config spec for your infrastructure provider.
NOTE: Not all providers can be vertically scaled (e.g. bare metal).
For a vSphere cluster, an example is:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: test-machine
  namespace: default
spec:
  diskGiB: 25 # increase this number to make the VM disk larger
  numCPUs: 2 # increase this number to add vCPUs to your VM
  memoryMiB: 8192 # increase this number to add memory to your VM
Once you have made configuration updates you can apply the changes to your cluster.
If you are adding or removing nodes, only the nodes being added or removed are affected.
If you are vertically scaling your nodes, then all nodes will be replaced one at a time.
eksctl anywhere upgrade cluster -f cluster.yaml
Semi-automatic scaling
Scaling your cluster in a semi-automatic way still requires changing your cluster manifest configuration.
In a semi-automatic mode you change your cluster spec and then have automation make the cluster changes.
You can do this by storing your cluster config manifest in git and then having a CI/CD system deploy your changes.
Or you can use a GitOps controller to apply the changes.
To read more about making changes with the integrated Flux GitOps controller you can read how to Manage a cluster with GitOps
.
Automatic scaling
Automatic cluster scaling is designed for worker nodes and it is not advised to automatically scale your control plane.
Typically, autoscaling is done with a controller such as the Kubernetes Cluster Autoscaler
.
This raises some concerns in an on-prem environment.
Automatic scaling does not work with some providers such as Docker or bare metal.
An EKS Anywhere cluster is currently not intended to be used with the Kubernetes Cluster Autoscaler, so that the autoscaler does not interfere with built-in controllers or cause unexpected machine thrashing.
In future versions of EKS Anywhere we will be adding support for autoscaling for specific providers.
4.2.3 - Etcd Backup and Restore
NOTE: External etcd topology is supported for vSphere clusters, but not yet for Bare Metal clusters.
This page contains steps for backing up a cluster by taking an etcd snapshot, and restoring the cluster from a snapshot. These steps are for an EKS Anywhere cluster provisioned using the external etcd topology (selected by default) and Ubuntu OVAs.
Use case
EKS-Anywhere clusters use etcd as the backing store. Taking a snapshot of etcd backs up the entire cluster data. This can later be used to restore a cluster back to an earlier state if required. Etcd backups can be taken prior to cluster upgrade, so if the upgrade doesn’t go as planned you can restore from the backup.
Backup
Etcd offers a built-in snapshot mechanism. You can take a snapshot using the etcdctl snapshot save
command by following the steps given below.
- Login to any one of the etcd VMs
ssh -i $PRIV_KEY ec2-user@$ETCD_VM_IP
- Run the etcdctl command to take a snapshot with the following steps
sudo su
source /etc/etcd/etcdctl.env
etcdctl snapshot save snapshot.db
chown ec2-user snapshot.db
- Exit the VM. Copy the snapshot from the VM to your local/admin setup where you can save snapshots in a secure place. Before running scp, make sure you don’t already have a snapshot file saved by the same name locally.
scp -i $PRIV_KEY ec2-user@$ETCD_VM_IP:/home/ec2-user/snapshot.db .
NOTE: This snapshot file contains all information stored in the cluster, so make sure you save it securely (encrypt it).
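One way to avoid overwriting an earlier snapshot when copying is to give each local copy a unique name. The following is a minimal sketch, not part of the original procedure: the helper name `snapshot_target` and the naming scheme (cluster name plus UTC timestamp) are assumptions.

```shell
# Hypothetical helper: build a local file name that embeds the cluster name
# and a UTC timestamp, and refuse to continue if that file already exists.
# Usage: snapshot_target <cluster-name> [directory]
snapshot_target() {
  local cluster="$1"
  local dir="${2:-.}"
  local target
  target="${dir}/${cluster}-etcd-$(date -u +%Y%m%dT%H%M%SZ).db"
  if [ -e "$target" ]; then
    echo "refusing to overwrite ${target}" >&2
    return 1
  fi
  printf '%s\n' "$target"
}

# Example (run from the admin machine):
#   scp -i $PRIV_KEY ec2-user@$ETCD_VM_IP:/home/ec2-user/snapshot.db "$(snapshot_target workload-cluster-1)"
```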
Restore
Restoring etcd is a 2-part process. The first part is restoring etcd using the snapshot, creating a new data-dir for etcd. The second part is replacing the current etcd data-dir with the one generated after restore. During etcd data-dir replacement, we cannot have any kube-apiserver instances running in the cluster. So we will first stop all instances of kube-apiserver and other controlplane components using the following steps for every controlplane VM:
Pausing Etcdadm controller reconcile
During restore, it is required to pause the Etcdadm controller reconcile for the target cluster (whether it is management or workload cluster). To do that, you need to add a cluster.x-k8s.io/paused
annotation to the target cluster’s etcdadmclusters
resource. For example,
kubectl annotate etcdadmclusters workload-cluster-1-etcd cluster.x-k8s.io/paused=true -n eksa-system --kubeconfig mgmt-cluster.kubeconfig
Stopping the controlplane components
- Login to a controlplane VM
ssh -i $PRIV_KEY ec2-user@$CONTROLPLANE_VM_IP
- Stop controlplane components by moving the static pod manifests under a temp directory:
sudo su
mkdir temp-manifests
mv /etc/kubernetes/manifests/*.yaml temp-manifests
- Repeat these steps for all other controlplane VMs
After this you can restore etcd from a saved snapshot using the etcdctl snapshot restore
command following the steps given below.
Restoring from the snapshot
- The snapshot file should be made available in every etcd VM of the cluster. You can copy it to each etcd VM using this command:
scp -i $PRIV_KEY snapshot.db ec2-user@$ETCD_VM_IP:/home/ec2-user
- To run the etcdctl snapshot restore command, you need to provide the following configuration parameters:
- name: This is the name of the etcd member. The value of this parameter should match the value used while starting the member. This can be obtained by running:
export ETCD_NAME=$(cat /etc/etcd/etcd.env | grep ETCD_NAME | awk -F'=' '{print $2}')
- initial-advertise-peer-urls: This is the advertise peer URL with which this etcd member was configured. It should be the exact value with which this etcd member was started. This can be obtained by running:
export ETCD_INITIAL_ADVERTISE_PEER_URLS=$(cat /etc/etcd/etcd.env | grep ETCD_INITIAL_ADVERTISE_PEER_URLS | awk -F'=' '{print $2}')
- initial-cluster: This should be a comma-separated mapping of etcd member name and its peer URL. For this, get the
ETCD_NAME
and ETCD_INITIAL_ADVERTISE_PEER_URLS
values for each member and join them. And then use this exact value for all etcd VMs. For example, for a 3 member etcd cluster this is what the value would look like (The command below cannot be run directly without substituting the required variables and is meant to be an example)
export ETCD_INITIAL_CLUSTER=${ETCD_NAME_1}=${ETCD_INITIAL_ADVERTISE_PEER_URLS_1},${ETCD_NAME_2}=${ETCD_INITIAL_ADVERTISE_PEER_URLS_2},${ETCD_NAME_3}=${ETCD_INITIAL_ADVERTISE_PEER_URLS_3}
- initial-cluster-token: Set this to a unique value and use the same value for all etcd members of the cluster. It can be any value such as
etcd-cluster-1
as long as it hasn’t been used before.
- Gather the required env vars for the restore command
cat <<EOF >> restore.env
export ETCD_NAME=$(cat /etc/etcd/etcd.env | grep ETCD_NAME | awk -F'=' '{print $2}')
export ETCD_INITIAL_ADVERTISE_PEER_URLS=$(cat /etc/etcd/etcd.env | grep ETCD_INITIAL_ADVERTISE_PEER_URLS | awk -F'=' '{print $2}')
EOF
cat /etc/etcd/etcdctl.env >> restore.env
- Make sure you form the correct
ETCD_INITIAL_CLUSTER
value using all etcd members, and set it as an env var in the restore.env file created in the above step.
- Once you have obtained all the right values, run the following commands to restore etcd replacing the required values:
sudo su
source restore.env
etcdctl snapshot restore snapshot.db --name=${ETCD_NAME} --initial-cluster=${ETCD_INITIAL_CLUSTER} --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS}
- This creates a new data-dir for the restored contents under a new directory
${ETCD_NAME}.etcd
. To start using it, restart etcd with the new data-dir using the following steps:
systemctl stop etcd.service
mv /var/lib/etcd/member /var/lib/etcd/member.bak
mv ${ETCD_NAME}.etcd/member /var/lib/etcd/
- Perform this directory swap on all etcd VMs, and then start etcd again on those VMs
systemctl start etcd.service
NOTE: Until the etcd process is started on all VMs, it might appear stuck on the VMs where it was started first, but this should be temporary.
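The ETCD_INITIAL_CLUSTER value described in the steps above is a comma-separated list of name=peer-URL pairs, one per etcd member. A minimal sketch for assembling it is shown below. The helper name `join_initial_cluster` is hypothetical, and the member names and peer URLs in the example are placeholders that must be replaced with the ETCD_NAME and ETCD_INITIAL_ADVERTISE_PEER_URLS values gathered from each etcd VM.

```shell
# Hypothetical helper: join "name=peer-url" pairs into the comma-separated
# value expected by the --initial-cluster flag.
join_initial_cluster() {
  local pair
  for pair in "$@"; do
    case "$pair" in
      ?*=?*) ;;                                   # looks like name=url
      *) echo "expected name=peer-url, got: ${pair}" >&2; return 1 ;;
    esac
  done
  local IFS=','
  printf '%s\n' "$*"                              # "$*" joins on the first IFS character
}

# Example with placeholder values:
#   echo "export ETCD_INITIAL_CLUSTER=$(join_initial_cluster \
#     etcd-1=https://10.0.0.1:2380 etcd-2=https://10.0.0.2:2380 etcd-3=https://10.0.0.3:2380)" >> restore.env
```

Use the exact same resulting value on every etcd VM.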
Starting the controlplane components
- Login to a controlplane VM
ssh -i $PRIV_KEY ec2-user@$CONTROLPLANE_VM_IP
- Start the controlplane components by moving back the static pod manifests from under the temp directory to the /etc/kubernetes/manifests directory:
mv temp-manifests/*.yaml /etc/kubernetes/manifests
- Repeat these steps for all other controlplane VMs
- It may take a few minutes for the kube-apiserver and the other components to get restarted. After this you should be able to access all objects present in the cluster at the time the backup was taken.
Resuming Etcdadm controller reconcile
Resume Etcdadm controller reconcile for the target cluster by removing the cluster.x-k8s.io/paused
annotation in the target cluster’s etcdadmclusters
resource. For example,
kubectl annotate etcdadmclusters workload-cluster-1-etcd cluster.x-k8s.io/paused- -n eksa-system --kubeconfig mgmt-cluster.kubeconfig
4.2.4 - Verify cluster
How to verify an EKS Anywhere cluster is running properly
To verify that a cluster control plane is up and running, use the kubectl
command to show that the control plane pods are all running.
kubectl get po -A -l control-plane=controller-manager
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-57b99f579f-sd85g 2/2 Running 0 47m
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-79cdf98fb8-ll498 2/2 Running 0 47m
capi-system capi-controller-manager-59f4547955-2ks8t 2/2 Running 0 47m
capi-webhook-system capi-controller-manager-bb4dc9878-2j8mg 2/2 Running 0 47m
capi-webhook-system capi-kubeadm-bootstrap-controller-manager-6b4cb6f656-qfppd 2/2 Running 0 47m
capi-webhook-system capi-kubeadm-control-plane-controller-manager-bf7878ffc-rgsm8 2/2 Running 0 47m
capi-webhook-system capv-controller-manager-5668dbcd5-v5szb 2/2 Running 0 47m
capv-system capv-controller-manager-584886b7bd-f66hs 2/2 Running 0 47m
You may also check the status of the cluster control plane resource directly.
This can be especially useful to verify clusters with multiple control plane nodes after an upgrade.
kubectl get kubeadmcontrolplanes.controlplane.cluster.x-k8s.io
NAME INITIALIZED API SERVER AVAILABLE VERSION REPLICAS READY UPDATED UNAVAILABLE
supportbundletestcluster true true v1.20.7-eks-1-20-6 1 1 1
To verify that the expected number of cluster worker nodes are up and running, use the kubectl
command to show that nodes are Ready
.
This will confirm that the expected number of worker nodes are present.
Worker nodes are named using the cluster name followed by the worker node group name (example: my-cluster-md-0)
kubectl get nodes
NAME STATUS ROLES AGE VERSION
supportbundletestcluster-md-0-55bb5ccd-mrcf9 Ready <none> 4m v1.20.7-eks-1-20-6
supportbundletestcluster-md-0-55bb5ccd-zrh97 Ready <none> 4m v1.20.7-eks-1-20-6
supportbundletestcluster-mdrwf Ready control-plane,master 5m v1.20.7-eks-1-20-6
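To compare the number of Ready worker nodes against the count in your cluster specification without counting rows by hand, the node listing can be piped through a small filter. This is a minimal sketch: the helper name `ready_worker_count` is hypothetical, and it assumes worker nodes show `<none>` in the ROLES column, as in the output above.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print
# the number of nodes that are Ready and have no role (worker nodes).
ready_worker_count() {
  awk 'NR > 1 && $2 == "Ready" && $3 == "<none>" { n++ } END { print n + 0 }'
}

# Example (requires access to the cluster):
#   kubectl get nodes | ready_worker_count
```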
To test a workload in your cluster, you can try deploying the hello-eks-anywhere test application.
4.2.5 - Add cluster integrations
How to add integrations to an EKS Anywhere cluster
EKS Anywhere offers AWS support for certain third-party vendor components,
namely Ubuntu LTS, Cilium, and Flux.
It also provides flexibility for you to integrate with your choice of tools in other areas.
Below is a list of example third-party tools for your consideration.
For a full list of partner integration options, please visit Amazon EKS Anywhere Partner page
.
Note
The solutions listed on this page have not been tested by AWS and are not covered by the EKS Anywhere Support Subscription.
4.2.6 - Reboot nodes
How to properly reboot a node in an EKS Anywhere cluster
If you need to reboot a node in your cluster for maintenance or any other reason, performing the following steps will help prevent possible disruption of services on those nodes:
Warning
Rebooting a cluster node as described here is good for all nodes, but is critically important when rebooting a Bottlerocket node running the boots
service on a Bare Metal cluster.
If the node goes down while running the boots
service, the Bottlerocket node will not be able to boot again until the boots
service is restored on another machine. This is because Bottlerocket must get its address from a DHCP service.
-
Cordon the node so no further workloads are scheduled to run on it:
kubectl cordon <node-name>
-
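Drain the node of all current workloads. DaemonSet-managed pods (such as the Cilium agent) cannot be evicted, so tell kubectl to skip them:
kubectl drain <node-name> --ignore-daemonsets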
-
Shut down. Using the appropriate method for your provider, shut down the node.
-
Perform system maintenance or any other task you need to do on the node, then boot the node back up.
-
Uncordon the node so that it can begin receiving workloads again.
kubectl uncordon <node-name>
4.2.7 - Connect cluster to console
Connect a cluster to the EKS console
The AWS EKS Connector lets you connect your EKS Anywhere cluster to the AWS EKS console, where you can see your EKS Anywhere cluster, its configuration, workloads, and their status.
EKS Connector is a software agent that can be deployed on your EKS Anywhere cluster, enabling the cluster to register with the EKS console.
Visit AWS EKS Connector
for details.
4.2.8 - License cluster
How to license your cluster.
If you are licensing an existing cluster, apply the following secret to your cluster (replacing my-license-here
with your license):
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: eksa-license
  namespace: eksa-system
stringData:
  license: "my-license-here"
type: Opaque
EOF
4.2.9 - Upgrade cluster
How to perform a cluster version upgrade
NOTE: Cluster upgrade is supported for vSphere clusters, but is not yet available for Bare Metal clusters
EKS Anywhere provides the command upgrade
, which allows you to upgrade
various aspects of your EKS Anywhere cluster.
When you run eksctl anywhere upgrade cluster -f ./cluster.yaml
, EKS Anywhere runs a set of preflight checks to ensure your cluster is ready to be upgraded.
EKS Anywhere then performs the upgrade, modifying your cluster to match the updated specification.
The upgrade command also upgrades core components of EKS Anywhere and lets the user enjoy the latest features, bug fixes and security patches.
Minor Version Upgrades
Kubernetes has minor releases three times per year
and EKS Distro follows a similar cadence.
EKS Anywhere will add support for new EKS Distro releases as they are released, and you are advised to upgrade your cluster when possible.
Cluster upgrades are not handled automatically and require administrator action to modify the cluster specification and perform an upgrade.
You are advised to upgrade your clusters in development environments first and verify your workloads and controllers are compatible with the new version.
Cluster upgrades are performed in place using a rolling process (similar to Kubernetes Deployments).
Upgrades can only happen one minor version at a time (e.g. 1.20
-> 1.21
).
Control plane components will be upgraded before worker nodes.
A new VM is created with the new version and then an old VM is removed.
This happens one at a time until all the control plane components have been upgraded.
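The one-minor-version-at-a-time rule can be checked before editing the cluster specification. The following is a minimal sketch and not how EKS Anywhere itself validates the upgrade: the helper name `is_single_minor_step` is hypothetical, and it expects versions in the major.minor form used by the kubernetesVersion field.

```shell
# Hypothetical helper: succeed only if <target> is exactly one minor version
# above <current> within the same major version, for example 1.20 -> 1.21.
# Usage: is_single_minor_step <current> <target>
is_single_minor_step() {
  local cur_major="${1%%.*}" cur_minor="${1#*.}"
  local new_major="${2%%.*}" new_minor="${2#*.}"
  [ "$cur_major" = "$new_major" ] && [ "$new_minor" -eq $(( cur_minor + 1 )) ]
}

if is_single_minor_step 1.20 1.21; then
  echo "1.20 -> 1.21 is a single minor version step"
fi
if ! is_single_minor_step 1.20 1.22; then
  echo "1.20 -> 1.22 skips a minor version; upgrade to 1.21 first"
fi
```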
Core component upgrades
EKS Anywhere upgrade
also supports upgrading the following core components:
- Core CAPI
- CAPI providers
- Cilium CNI plugin
- Cert-manager
- Etcdadm CAPI provider
- EKS Anywhere controllers and CRDs
- GitOps controllers (Flux) - this is an optional component, will be upgraded only if specified
The latest versions of these core EKS Anywhere components are embedded into a bundles manifest that the CLI uses to fetch the latest versions
and image builds needed for each component upgrade.
The command detects both component version changes and new builds of the same versioned component.
If there is a new Kubernetes version that is going to get rolled out, the core components get upgraded before the Kubernetes
version.
Irrespective of a Kubernetes version change, the upgrade command will always upgrade the internal EKS
Anywhere components mentioned above to their latest available versions. All upgrade changes are backwards compatible.
Check upgrade components
Before you perform an upgrade, check the current and new versions of components that are ready to upgrade by typing:
Management Cluster
eksctl anywhere upgrade plan cluster -f mgmt-cluster.yaml
Workload Cluster
eksctl anywhere upgrade plan cluster -f workload-cluster.yaml --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
The output should appear similar to the following:
Worker node group name not specified. Defaulting name to md-0.
Warning: The recommended number of control plane nodes is 3 or 5
Worker node group name not specified. Defaulting name to md-0.
Checking new release availability...
NAME CURRENT VERSION NEXT VERSION
EKS-A v0.0.0-dev+build.1000+9886ba8 v0.0.0-dev+build.1105+46598cb
cluster-api v1.0.2+e8c48f5 v1.0.2+1274316
kubeadm v1.0.2+92c6d7e v1.0.2+aa1a03a
vsphere v1.0.1+efb002c v1.0.1+ef26ac1
kubadm v1.0.2+f002eae v1.0.2+f443dcf
etcdadm-bootstrap v1.0.2-rc3+54dcc82 v1.0.0-rc3+df07114
etcdadm-controller v1.0.2-rc3+a817792 v1.0.0-rc3+a310516
To format the output in JSON, add -o json
to the end of the command line.
To perform a cluster upgrade you can modify your cluster specification kubernetesVersion
field to the desired version.
As an example, to upgrade a cluster from version 1.20 to 1.21, you would change your spec as follows:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev
spec:
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "198.18.99.49"
    machineGroupRef:
      kind: VSphereMachineConfig
      name: dev
      ...
  kubernetesVersion: "1.21"
  ...
NOTE: If you have a custom machine image for your nodes, you may also need to update your VSphereMachineConfig
with a new template.
Then run the upgrade command:
Management Cluster
eksctl anywhere upgrade cluster -f mgmt-cluster.yaml
Workload Cluster
eksctl anywhere upgrade cluster -f workload-cluster.yaml --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
This will upgrade the cluster specification (if specified), upgrade the core components to the latest available versions and apply the changes using the provisioner controllers.
Example output:
✅ control plane ready
✅ worker nodes ready
✅ nodes ready
✅ cluster CRDs ready
✅ cluster object present on workload cluster
✅ upgrade cluster kubernetes version increment
✅ validate immutable fields
🎉 all cluster upgrade preflight validations passed
Performing provider setup and validations
Pausing EKS-A cluster controller reconcile
Pausing Flux kustomization
GitOps field not specified, pause flux kustomization skipped
Creating bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Moving cluster management from workload to bootstrap cluster
Upgrading workload cluster
Moving cluster management from bootstrap to workload cluster
Applying new EKS-A cluster resource; resuming reconcile
Resuming EKS-A controller reconciliation
Updating Git Repo with new EKS-A cluster spec
GitOps field not specified, update git repo skipped
Forcing reconcile Git repo with latest commit
GitOps not configured, force reconcile flux git repo skipped
Resuming Flux kustomization
GitOps field not specified, resume flux kustomization skipped
Upgradeable Cluster Attributes
EKS Anywhere upgrade supports upgrading more than just the kubernetesVersion, allowing you to upgrade a number of fields simultaneously with the same procedure.
Upgradeable Attributes
Cluster:
kubernetesVersion
controlPlaneConfiguration.count
controlPlaneConfiguration.machineGroupRef.name
workerNodeGroupConfigurations.count
workerNodeGroupConfigurations.machineGroupRef.name
etcdConfiguration.externalConfiguration.machineGroupRef.name
identityProviderRefs (only for kind:OIDCConfig; kind:AWSIamConfig is immutable)
VSphereMachineConfig:
datastore
diskGiB
folder
memoryMiB
numCPUs
resourcePool
template
users
OIDCConfig:
clientID
groupsClaim
groupsPrefix
issuerUrl
requiredClaims.claim
requiredClaims.value
usernameClaim
usernamePrefix
EKS Anywhere upgrade also supports adding more worker node groups post-creation.
To add more worker node groups, modify your cluster config file to define the additional group(s).
Example:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev
spec:
  controlPlaneConfiguration:
    ...
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
    name: md-0
  - count: 2
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
    name: md-1
  ...
Worker node groups can use the same machineGroupRef as previous groups, or you can define a new machine configuration for your new group.
Troubleshooting
Attempting to upgrade a cluster by more than one minor release at a time results in the following error:
✅ validate immutable fields
❌ validation failed {"validation": "Upgrade preflight validations", "error": "validation failed with 1 errors: WARNING: version difference between upgrade version (1.21) and server version (1.19) do not meet the supported version increment of +1", "remediation": ""}
Error: failed to upgrade cluster: validations failed
For more errors, see the troubleshooting section.
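The version increment rule behind this error can be checked locally before running an upgrade. The following is a minimal sketch, not part of the EKS Anywhere CLI: the helper names minor_of and check_increment are hypothetical, and it assumes versions are written as MAJOR.MINOR and that an increment of 0 or +1 is acceptable.

```shell
#!/bin/sh
# Sketch: check that a target Kubernetes version is at most one minor
# release ahead of the current one. Helper names are hypothetical.

# Print the minor component of a "MAJOR.MINOR" version string.
minor_of() {
  printf '%s\n' "$1" | cut -d. -f2
}

# Return 0 when moving from version $1 to version $2 stays within the
# same major version and raises the minor version by 0 or 1.
check_increment() {
  current_major=${1%%.*}
  target_major=${2%%.*}
  [ "$current_major" = "$target_major" ] || return 1
  diff=$(( $(minor_of "$2") - $(minor_of "$1") ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}

# Example values taken from the error message above.
if check_increment 1.19 1.21; then
  echo "1.19 -> 1.21: within the supported increment"
else
  echo "1.19 -> 1.21: upgrade one minor version at a time"
fi
```

With the values from the error above, the check fails, which suggests upgrading to 1.20 first and then to 1.21.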
4.2.10 - Multus CNI plugin configuration
EKS Anywhere configuration for Multus CNI plugin
NOTE: Currently, Multus support is only available with the EKS Anywhere Bare Metal provider.
The vSphere provider does not have multi-network support for cluster machines.
Once multiple network support is added to EKS Anywhere vSphere clusters, Multus CNI can be supported.
Multus CNI is a container network interface plugin for Kubernetes that enables attaching multiple network interfaces to pods.
In Kubernetes, each pod has only one network interface by default, other than local loopback.
With Multus, you can create multi-homed pods that have multiple interfaces.
Multus acts as a ‘meta’ plugin that can call other CNI plugins to configure additional interfaces.
Pre-Requisites
Given that Multus CNI is used to create pods with multiple network interfaces, the cluster machines that these pods run on need to have multiple network interfaces attached and configured.
The interfaces on multi-homed pods need to map to these interfaces on the machines.
For Bare Metal clusters using the Tinkerbell provider, the cluster machines need to have multiple network interfaces cabled in and appropriate network configuration put in place during machine provisioning.
Overview of Multus setup
The following diagrams show the result of two applications (app1 and app2) running in pods that use the Multus plugin to communicate over two network interfaces (eth0 and net1) from within the pods.
The Multus plugin uses two network interfaces on the worker node (eth0 and eth1) to provide communications outside of the node.

Follow the procedure below to set up Multus as illustrated in the previous diagrams.
Deploying Multus using a Daemonset will spin up pods that install a Multus binary and configure Multus for usage in every node in the cluster.
Here are the steps for doing that.
-
Clone the Multus CNI repo:
git clone https://github.com/k8snetworkplumbingwg/multus-cni.git && cd multus-cni
-
Apply Multus daemonset to your EKS Anywhere cluster:
kubectl apply -f ./deployments/multus-daemonset-thick-plugin.yml
-
Verify that you have Multus pods running:
kubectl get pods --all-namespaces | grep -i multus
-
Check that Multus is running:
kubectl get pods -A | grep multus
Output:
kube-system kube-multus-ds-bmfjs 1/1 Running 0 3d1h
kube-system kube-multus-ds-fk2sk 1/1 Running 0 3d1h
Create Network Attachment Definition
You need to create a Network Attachment Definition for the CNI you wish to use as the plugin for the additional interface.
You can verify that your intended CNI plugin is supported by ensuring that the binary corresponding to that CNI plugin is present in the node’s /opt/cni/bin directory.
Below is an example of a Network Attachment Definition yaml:
cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ipvlan-conf
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "ipvlan",
      "master": "eth1",
      "mode": "l3",
      "ipam": {
        "type": "host-local",
        "subnet": "198.17.0.0/24",
        "rangeStart": "198.17.0.200",
        "rangeEnd": "198.17.0.216",
        "routes": [
          { "dst": "0.0.0.0/0" }
        ],
        "gateway": "198.17.0.1"
      }
    }'
EOF
Note that eth1 is used as the master parameter.
This master parameter should match the interface name on the hosts in your cluster.
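One way to confirm the interface name is to look it up on a cluster machine before creating the Network Attachment Definition. The sketch below assumes a Linux host, where interfaces appear under /sys/class/net; the has_iface helper and the SYSFS_NET override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: check whether a network interface exists on this Linux host.

# Return 0 when an interface named $1 is listed under /sys/class/net.
# SYSFS_NET is a hypothetical override, useful only for testing.
has_iface() {
  [ -e "${SYSFS_NET:-/sys/class/net}/$1" ]
}

MASTER_IFACE=eth1   # the value used for "master" in the definition above
if has_iface "$MASTER_IFACE"; then
  echo "interface $MASTER_IFACE found"
else
  echo "interface $MASTER_IFACE not found; adjust the master parameter"
fi
```

Run this on a cluster machine rather than on the admin machine, since the name must match the interface on the hosts where the pods are scheduled.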
Verify the configuration
Type the following to verify the configuration you created:
kubectl get network-attachment-definitions
kubectl describe network-attachment-definitions ipvlan-conf
Deploy sample applications with network attachment
-
Create sample application 1 (app1) with the network annotation created in the previous steps:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app1
  annotations:
    k8s.v1.cni.cncf.io/networks: ipvlan-conf
spec:
  containers:
  - name: app1
    command: ["/bin/sh", "-c", "trap : TERM INT; sleep infinity & wait"]
    image: alpine
EOF
-
Create sample application 2 (app2) with the network annotation created in the previous steps:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app2
  annotations:
    k8s.v1.cni.cncf.io/networks: ipvlan-conf
spec:
  containers:
  - name: app2
    command: ["/bin/sh", "-c", "trap : TERM INT; sleep infinity & wait"]
    image: alpine
EOF
-
Verify that the additional interfaces were created on these application pods using the defined network attachment:
kubectl exec -it app1 -- ip a
Output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: net1@if3: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 00:50:56:9a:84:3b brd ff:ff:ff:ff:ff:ff
inet 198.17.0.200/24 brd 198.17.0.255 scope global net1
valid_lft forever preferred_lft forever
inet6 fe80::50:5600:19a:843b/64 scope link
valid_lft forever preferred_lft forever
31: eth0@if32: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
link/ether 0a:9e:a0:b4:21:05 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.218/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::89e:a0ff:feb4:2105/64 scope link
valid_lft forever preferred_lft forever
kubectl exec -it app2 -- ip a
Output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: net1@if3: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 00:50:56:9a:84:3b brd ff:ff:ff:ff:ff:ff
inet 198.17.0.201/24 brd 198.17.0.255 scope global net1
valid_lft forever preferred_lft forever
inet6 fe80::50:5600:29a:843b/64 scope link
valid_lft forever preferred_lft forever
33: eth0@if34: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
link/ether b2:42:0a:67:c0:48 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.210/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::b042:aff:fe67:c048/64 scope link
valid_lft forever preferred_lft forever
Note that both pods got the new interface net1. Also, the additional network interface on each pod got assigned an IP address out of the range specified by the Network Attachment Definition.
-
Test the network connectivity across these pods for Multus interfaces:
kubectl exec -it app1 -- ping -I net1 198.17.0.201
Output:
PING 198.17.0.201 (198.17.0.201): 56 data bytes
64 bytes from 198.17.0.201: seq=0 ttl=64 time=0.074 ms
64 bytes from 198.17.0.201: seq=1 ttl=64 time=0.077 ms
64 bytes from 198.17.0.201: seq=2 ttl=64 time=0.078 ms
64 bytes from 198.17.0.201: seq=3 ttl=64 time=0.077 ms
kubectl exec -it app2 -- ping -I net1 198.17.0.200
Output:
PING 198.17.0.200 (198.17.0.200): 56 data bytes
64 bytes from 198.17.0.200: seq=0 ttl=64 time=0.074 ms
64 bytes from 198.17.0.200: seq=1 ttl=64 time=0.077 ms
64 bytes from 198.17.0.200: seq=2 ttl=64 time=0.078 ms
64 bytes from 198.17.0.200: seq=3 ttl=64 time=0.077 ms
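The addresses assigned to net1 in the steps above should fall between the rangeStart and rangeEnd values of the Network Attachment Definition. As a minimal sketch of that check, the hypothetical helpers ip_to_int and in_range below compare dotted-quad IPv4 addresses numerically; they are illustrative and not part of Multus or EKS Anywhere.

```shell
#!/bin/sh
# Sketch: check that an IPv4 address lies inside an inclusive range.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 when address $3 lies within the inclusive range $1..$2.
in_range() {
  start=$(ip_to_int "$1")
  end=$(ip_to_int "$2")
  addr=$(ip_to_int "$3")
  [ "$addr" -ge "$start" ] && [ "$addr" -le "$end" ]
}

# Range from the example definition; addresses from the pod output above.
for ip in 198.17.0.200 198.17.0.201; do
  if in_range 198.17.0.200 198.17.0.216 "$ip"; then
    echo "$ip is inside the configured range"
  else
    echo "$ip is outside the configured range"
  fi
done
```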
4.2.11 - Authenticate cluster with AWS IAM Authenticator
Configure AWS IAM Authenticator to authenticate user access to the cluster
AWS IAM Authenticator Support (optional)
EKS Anywhere supports configuring AWS IAM Authenticator as an authentication provider for clusters.
When you create a cluster with IAM Authenticator enabled, EKS Anywhere:
- Installs the aws-iam-authenticator server as a DaemonSet on the workload cluster.
- Configures the Kubernetes API Server to communicate with IAM Authenticator using a token authentication webhook.
- Creates the necessary ConfigMaps based on user options.
Note
Enabling IAM Authenticator needs to be done during cluster creation.
Create IAM Authenticator enabled cluster
Generate your cluster configuration and add the necessary IAM Authenticator configuration. For a full spec reference, check AWSIamConfig.
Create an EKS Anywhere cluster as follows:
CLUSTER_NAME=my-cluster-name
eksctl anywhere create cluster -f ${CLUSTER_NAME}.yaml
Example AWSIamConfig configuration
This example uses a region in the default aws partition and EKSConfigMap as backendMode. Also, the IAM ARNs are mapped to the Kubernetes system:masters group.
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  # IAM Authenticator
  identityProviderRefs:
    - kind: AWSIamConfig
      name: aws-iam-auth-config
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: AWSIamConfig
metadata:
  name: aws-iam-auth-config
spec:
  awsRegion: us-west-1
  backendMode:
    - EKSConfigMap
  mapRoles:
    - roleARN: arn:aws:iam::XXXXXXXXXXXX:role/myRole
      username: myKubernetesUsername
      groups:
      - system:masters
  mapUsers:
    - userARN: arn:aws:iam::XXXXXXXXXXXX:user/myUser
      username: myKubernetesUsername
      groups:
      - system:masters
  partition: aws
Note
When using backend mode CRD, the mapRoles and mapUsers are not required. For more details on configuring CRD mode, refer to CRD.
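Since the partition value and the ARNs in mapRoles and mapUsers need to agree, a quick sanity check on each ARN can catch copy-and-paste mistakes before cluster creation. This is a loose sketch only: is_iam_arn_in_partition is a hypothetical helper, the account ID is a placeholder, and the pattern does not enforce the full IAM naming rules.

```shell
#!/bin/sh
# Sketch: check that a string looks like an IAM role or user ARN in a
# given partition. IAM ARNs have the form
#   arn:<partition>:iam::<account-id>:role/<name>
#   arn:<partition>:iam::<account-id>:user/<name>

# Return 0 when $1 looks like an IAM role/user ARN in partition $2.
is_iam_arn_in_partition() {
  case "$1" in
    arn:"$2":iam::*:role/?*|arn:"$2":iam::*:user/?*) return 0 ;;
    *) return 1 ;;
  esac
}

PARTITION=aws   # the "partition" value from the AWSIamConfig above
for arn in \
  "arn:aws:iam::123456789012:role/myRole" \
  "arn:aws-cn:iam::123456789012:user/myUser"; do
  if is_iam_arn_in_partition "$arn" "$PARTITION"; then
    echo "ok: $arn"
  else
    echo "mismatch: $arn is not an IAM role/user ARN in partition $PARTITION"
  fi
done
```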
Authenticating with IAM Authenticator
After your cluster is created you may now use the mapped IAM ARNs to authenticate to the cluster.
EKS Anywhere generates a KUBECONFIG file in your local directory that uses the aws-iam-authenticator client to authenticate with the cluster. The file can be found at
${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-aws.kubeconfig
Steps
-
Ensure the IAM role/user ARN mapped in the cluster is configured on the local machine from which you are trying to access the cluster.
-
Install the aws-iam-authenticator client binary on the local machine.
- We recommend installing the binary referenced in the latest release manifest of the Kubernetes version used when creating the cluster.
- The commands below can be used to fetch the installation URI for clusters created with Kubernetes version 1.21 and OS linux.
CLUSTER_NAME=my-cluster-name
KUBERNETES_VERSION=1.21
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
EKS_D_MANIFEST_URL=$(kubectl get bundles $CLUSTER_NAME -o jsonpath="{.spec.versionsBundles[?(@.kubeVersion==\"$KUBERNETES_VERSION\")].eksD.manifestUrl}")
OS=linux
curl -fsSL $EKS_D_MANIFEST_URL | yq e '.status.components[] | select(.name=="aws-iam-authenticator") | .assets[] | select(.os == '"\"$OS\""' and .type == "Archive") | .archive.uri' -
-
Export the generated IAM Authenticator based KUBECONFIG file.
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-aws.kubeconfig
-
Run kubectl
commands to check cluster access. Example,
Modify IAM Authenticator mappings
EKS Anywhere supports modifying IAM ARNs that are mapped on the cluster. The mappings can be modified by either running the upgrade cluster command or using GitOps.
upgrade command
The mapRoles and mapUsers lists in AWSIamConfig can be modified when running the upgrade cluster command from EKS Anywhere.
As an example, let’s add another IAM user to the above example configuration.
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: AWSIamConfig
metadata:
  name: aws-iam-auth-config
spec:
  ...
  mapUsers:
    - userARN: arn:aws:iam::XXXXXXXXXXXX:user/myUser
      username: myKubernetesUsername
      groups:
      - system:masters
    - userARN: arn:aws:iam::XXXXXXXXXXXX:user/anotherUser
      username: anotherKubernetesUsername
  partition: aws
Then run the upgrade command:
CLUSTER_NAME=my-cluster-name
eksctl anywhere upgrade cluster -f ${CLUSTER_NAME}.yaml
EKS Anywhere now updates the role mappings for IAM authenticator in the cluster and a new user gains access to the cluster.
GitOps
If the cluster was created with GitOps configured, then the mapRoles and mapUsers lists in AWSIamConfig can be modified by the GitOps controller. For GitOps configuration details, refer to Manage Cluster with GitOps.
- Clone your git repo and modify the cluster specification.
The default path for the cluster file is:
clusters/$CLUSTER_NAME/eksa-system/eksa-cluster.yaml
- Modify the AWSIamConfig object and add to the mapRoles and mapUsers object lists.
- Commit the file to your git repository
git add eksa-cluster.yaml
git commit -m 'Adding IAM Authenticator access ARNs'
git push origin main
The EKS Anywhere GitOps Controller now updates the role mappings for IAM Authenticator in the cluster and the added users gain access to the cluster.
4.2.12 - Manage cluster with GitOps
Use Flux to manage clusters with GitOps
NOTE: GitOps support is available for vSphere clusters, but is not yet available for Bare Metal clusters
GitOps Support (optional)
EKS Anywhere supports a GitOps workflow for the management of your cluster.
When you create a cluster with GitOps enabled, EKS Anywhere will automatically commit your cluster configuration to the provided GitHub repository and install a GitOps toolkit on your cluster which watches that committed configuration file.
You can then manage the scale of the cluster by making changes to the version controlled cluster configuration file and committing the changes.
Once a change has been detected by the GitOps controller running in your cluster, the scale of the cluster will be adjusted to match the committed configuration file.
If you’d like to learn more about GitOps, and the associated best practices, check out this introduction from Weaveworks
.
NOTE: Installing a GitOps controller needs to be done during cluster creation.
In the event that GitOps installation fails, EKS Anywhere cluster creation will continue.
Supported Cluster Properties
Currently, you can manage a subset of cluster properties with GitOps:
Management Cluster
Cluster:
workerNodeGroupConfigurations.count
workerNodeGroupConfigurations.machineGroupRef.name
WorkerNodes VSphereMachineConfig:
datastore
diskGiB
folder
memoryMiB
numCPUs
resourcePool
template
users
Workload Cluster
Cluster:
kubernetesVersion
controlPlaneConfiguration.count
controlPlaneConfiguration.machineGroupRef.name
workerNodeGroupConfigurations.count
workerNodeGroupConfigurations.machineGroupRef.name
identityProviderRefs (only for kind:OIDCConfig; kind:AWSIamConfig is immutable)
ControlPlane / Etcd / WorkerNodes VSphereMachineConfig:
datastore
diskGiB
folder
memoryMiB
numCPUs
resourcePool
template
users
OIDCConfig:
clientID
groupsClaim
groupsPrefix
issuerUrl
requiredClaims.claim
requiredClaims.value
usernameClaim
usernamePrefix
Any other changes to the cluster configuration in the git repository will be ignored.
If an immutable field has been changed in a Git repository, there are two ways to find the error message:
- If a notification webhook is set up, check the error message in the notification channel.
- Check the Flux Kustomization Controller log:
kubectl logs -f -n flux-system kustomize-controller-******
for an error message containing text similar to Invalid value: 1: field is immutable
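When the controller log is long, filtering it for the immutable-field text makes the rejection easier to spot. The sketch below defines a hypothetical immutable_errors filter that reads log lines on standard input; the sample lines are made up for illustration and real log lines will look different.

```shell
#!/bin/sh
# Sketch: keep only log lines that report an immutable-field rejection.

# Read log lines on stdin and print those mentioning an immutable field.
immutable_errors() {
  grep -i 'field is immutable'
}

# Illustrative input only; not real controller output.
printf '%s\n' \
  'reconciliation finished' \
  'Invalid value: 1: field is immutable' | immutable_errors
```

Assuming the default Flux installation, where the controller runs as a Deployment named kustomize-controller in the flux-system namespace, the filter could be fed with `kubectl logs -n flux-system deployment/kustomize-controller | immutable_errors`.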
Getting Started with EKS Anywhere GitOps with GitHub
In order to use GitOps to manage cluster scaling, you need a couple of things:
Create a GitHub Personal Access Token
Create a Personal Access Token (PAT) to access your provided GitHub repository.
It must be scoped for all repo permissions.
NOTE: GitOps configuration only works with hosted github.com and will not work on self-hosted GitHub Enterprise instances.
This PAT should have at least the following permissions:

NOTE: The PAT must belong to the owner of the repository or, if using an organization as the owner, the creator of the PAT must have repo permission in that organization.
You need to set your PAT as the environment variable $EKSA_GITHUB_TOKEN to use it during cluster creation:
export EKSA_GITHUB_TOKEN=ghp_MyValidPersonalAccessTokenWithRepoPermissions
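Because the token is read from the environment, a short preflight check can catch a missing or empty variable before cluster creation starts. This is a minimal sketch; require_env is a hypothetical helper and only checks that the variable is exported and non-empty, not that the token is valid.

```shell
#!/bin/sh
# Sketch: verify that a required environment variable is exported.

# Return 0 when the environment variable named $1 is exported and non-empty.
require_env() {
  [ -n "$(printenv "$1")" ]
}

if require_env EKSA_GITHUB_TOKEN; then
  echo "EKSA_GITHUB_TOKEN is set"
else
  echo "EKSA_GITHUB_TOKEN is not set; export it before creating the cluster" >&2
fi
```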
Create GitOps configuration repo
If you have an existing repo, you can set that as your repository name in the configuration.
If you specify a repo in your FluxConfig that does not exist, EKS Anywhere will create it for you.
If you would like to create a new repo yourself, you can click here to create one.
If your repository contains multiple cluster specification files, store them in sub-folders and specify the configuration path in your cluster specification.
In order to accommodate the management cluster feature, the CLI will now structure the repo directory following a new convention:
clusters
└── management-cluster
    ├── flux-system
    │   └── ...
    ├── management-cluster
    │   └── eksa-system
    │       └── eksa-cluster.yaml
    ├── workload-cluster-1
    │   └── eksa-system
    │       └── eksa-cluster.yaml
    └── workload-cluster-2
        └── eksa-system
            └── eksa-cluster.yaml
By default, Flux kustomization reconciles at the management cluster’s root level (./clusters/management-cluster), so both the management cluster and all the workload clusters it manages are synced.
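To locate a given cluster's specification file in a repository that follows this layout, the path can be built from the management cluster name and the cluster name. The cluster_spec_path helper below is a hypothetical sketch that assumes the default layout shown in the tree, with no custom configuration path set.

```shell
#!/bin/sh
# Sketch: build the path of a cluster spec file inside the GitOps repo,
# assuming the default directory convention shown in the tree above.

# $1 = management cluster name, $2 = cluster name (management or workload).
cluster_spec_path() {
  printf 'clusters/%s/%s/eksa-system/eksa-cluster.yaml\n' "$1" "$2"
}

cluster_spec_path management-cluster workload-cluster-1
# prints clusters/management-cluster/workload-cluster-1/eksa-system/eksa-cluster.yaml
```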
Example GitOps cluster configuration for GitHub
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mynewgitopscluster
spec:
  ... # collapsed cluster spec fields
  # Below added for gitops support
  gitOpsRef:
    kind: FluxConfig
    name: my-cluster-name
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-cluster-name
spec:
  github:
    personal: true
    repository: mygithubrepository
    owner: mygithubusername
Create a GitOps enabled cluster
Generate your cluster configuration and add the GitOps configuration.
For a full spec reference, see the Cluster Spec reference.
NOTE: After your cluster has been created the cluster configuration will automatically be committed to your git repo.
-
Create an EKS Anywhere cluster with GitOps enabled.
CLUSTER_NAME=gitops
eksctl anywhere create cluster -f ${CLUSTER_NAME}.yaml
Test GitOps controller
After your cluster has been created, you can test the GitOps controller by modifying the cluster specification.
-
Clone your git repo and modify the cluster specification.
The default path for the cluster file is:
clusters/$CLUSTER_NAME/eksa-system/eksa-cluster.yaml
-
Modify the workerNodeGroupConfigurations[0].count field with your desired changes.
-
Commit the file to your git repository
git add eksa-cluster.yaml
git commit -m 'Scaling nodes for test'
git push origin main
-
The flux controller will automatically make the required changes.
If you updated your node count, you can use this command to see the current node state.
Getting Started with EKS Anywhere GitOps with any Git source
You can configure EKS Anywhere to use a generic git repository as the source of truth for GitOps by providing a FluxConfig with a git configuration.
EKS Anywhere requires a valid SSH Known Hosts file and SSH Private key in order to connect to your repository and bootstrap Flux.
Create a Git repository for use by EKS Anywhere and Flux
When using the git provider, EKS Anywhere requires that the configuration repository be pre-initialized.
You may re-use an existing repo or use the same repo for multiple management clusters.
Create the repository through your git provider and initialize it with a README.md documenting the purpose of the repository.
Create a Private Key for use by EKS Anywhere and Flux
EKS Anywhere requires a private key to authenticate to your git repository, push the cluster configuration, and configure Flux for ongoing management and monitoring of that configuration.
The private key should have permissions to read and write from the repository in question.
It is recommended that you create a new private key for use exclusively by EKS Anywhere.
You can use ssh-keygen to generate a new key.
ssh-keygen -t ecdsa -C "my_email@example.com"
Please consult the documentation for your git provider to determine how to add your corresponding public key; for example, if using GitHub Enterprise, you can find the documentation for adding a public key to your GitHub account here.
Add your private key to your SSH agent on your management machine
When using a generic git provider, EKS Anywhere requires that your management machine has a running SSH agent and the private key be added to that SSH agent.
You can start an SSH agent and add your private key by executing the following in your current session:
eval "$(ssh-agent -s)" && ssh-add $EKSA_GIT_PRIVATE_KEY
Create an SSH Known Hosts file for use by EKS Anywhere and Flux
EKS Anywhere needs an SSH known hosts file to verify the identity of the remote git host.
A path to a valid known hosts file must be provided to the EKS Anywhere command line via the environment variable EKSA_GIT_KNOWN_HOSTS.
For example, if you have a known hosts file at /home/myUser/.ssh/known_hosts that you want EKS Anywhere to use, set the environment variable EKSA_GIT_KNOWN_HOSTS to the path to that file, /home/myUser/.ssh/known_hosts.
export EKSA_GIT_KNOWN_HOSTS=/home/myUser/.ssh/known_hosts
While you can use your pre-existing SSH known hosts file, it is recommended that you generate a new known hosts file for use by EKS Anywhere that contains only the known-hosts entries required for your git host and key type.
For example, if you wanted to generate a known hosts file for a git server located at example.com with key type ecdsa, you can use the OpenSSH utility ssh-keyscan:
ssh-keyscan -t ecdsa example.com >> my_eksa_known_hosts
This will generate a known hosts file which contains only the entry necessary to verify the identity of example.com when using an ecdsa based private key file.
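Before pointing EKSA_GIT_KNOWN_HOSTS at the generated file, it can help to confirm that the file really contains an entry for your git host and key type. The sketch below is illustrative: has_known_host is a hypothetical helper, it assumes plain (non-hashed) host names in the usual "hosts keytype key" line format, and the fallback file name is the one used in the ssh-keyscan example.

```shell
#!/bin/sh
# Sketch: check a known-hosts file for an entry matching a host and key type.

# Return 0 when file $1 has an entry for host $2 whose key type starts
# with $3 (for example "ecdsa"). Hashed host names are not handled.
has_known_host() {
  [ -r "$1" ] || return 1
  awk -v host="$2" -v kt="$3" '
    {
      n = split($1, names, ",")      # field 1 may list several host names
      for (i = 1; i <= n; i++)
        if (names[i] == host && index($2, kt) == 1) found = 1
    }
    END { exit found ? 0 : 1 }
  ' "$1"
}

KNOWN_HOSTS=${EKSA_GIT_KNOWN_HOSTS:-my_eksa_known_hosts}
if has_known_host "$KNOWN_HOSTS" example.com ecdsa; then
  echo "found an ecdsa entry for example.com in $KNOWN_HOSTS"
else
  echo "no ecdsa entry for example.com in $KNOWN_HOSTS"
fi
```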
Example FluxConfig cluster configuration for a generic git provider
For a full spec reference, see the Cluster Spec reference.
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mynewgitopscluster
spec:
  ... # collapsed cluster spec fields
  # Below added for gitops support
  gitOpsRef:
    kind: FluxConfig
    name: my-cluster-name
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-cluster-name
spec:
  git:
    repositoryUrl: ssh://git@provider.com/myAccount/myClusterGitopsRepo.git
    sshKeyAlgorithm: ecdsa
4.2.13 - Manage cluster with Terraform
Use Terraform to manage EKS Anywhere Clusters
NOTE: Support for using Terraform to manage and modify an EKS Anywhere cluster is available for vSphere clusters, but not yet for Bare Metal clusters.
This guide explains how you can use Terraform to manage and modify an EKS Anywhere cluster.
The guide is meant for illustrative purposes and is not a definitive approach to building production systems with Terraform and EKS Anywhere.
At its heart, EKS Anywhere is a set of Kubernetes CRDs, which define an EKS Anywhere cluster,
and a controller, which moves the cluster state to match these definitions.
These CRDs, and the EKS-A controller, live on the management cluster or
on a self-managed cluster.
We can manage a subset of the fields in the EKS Anywhere CRDs with any tool that can interact with the Kubernetes API, like kubectl or, in this case, the Terraform Kubernetes provider.
In this guide, we’ll show you how to import your EKS Anywhere cluster into Terraform state and
how to scale your EKS Anywhere worker nodes using the Terraform Kubernetes provider.
Prerequisites
- An existing EKS Anywhere cluster
- The latest version of Terraform
- The latest version of tfk8s, a tool for converting Kubernetes manifest files to Terraform HCL
Guide
- Create an EKS-A management cluster, or a self-managed stand-alone cluster.
-
Set up the Terraform Kubernetes provider
Make sure your KUBECONFIG environment variable is set
export KUBECONFIG=/path/to/my/kubeconfig.kubeconfig
Set an environment variable with your cluster name:
export MY_EKSA_CLUSTER="myClusterName"
cat << EOF > ./provider.tf
provider "kubernetes" {
  config_path = "${KUBECONFIG}"
}
EOF
-
Get tfk8s and use it to convert your EKS Anywhere cluster Kubernetes manifest into Terraform HCL:
- Install tfk8s
- Convert the manifest into Terraform HCL:
kubectl get cluster ${MY_EKSA_CLUSTER} -o yaml | tfk8s --strip -o ${MY_EKSA_CLUSTER}.tf
-
Configure the Terraform cluster resource definition generated in step 2
- Set metadata.generation as a computed field. Add the following to your cluster resource configuration:
computed_fields = ["metadata.generation"]
field_manager {
  force_conflicts = true
}
- Add the namespace default to the metadata of the cluster
- Remove the generation field from the metadata of the cluster
- Your Terraform cluster resource should look similar to this:
computed_fields = ["metadata.generation"]
field_manager {
  force_conflicts = true
}
manifest = {
  "apiVersion" = "anywhere.eks.amazonaws.com/v1alpha1"
  "kind" = "Cluster"
  "metadata" = {
    "name" = "MyClusterName"
    "namespace" = "default"
  }
-
Import your EKS Anywhere cluster into terraform state:
terraform init
terraform import kubernetes_manifest.cluster_${MY_EKSA_CLUSTER} "apiVersion=anywhere.eks.amazonaws.com/v1alpha1,kind=Cluster,namespace=default,name=${MY_EKSA_CLUSTER}"
After you import your cluster, you will need to run terraform apply one time to ensure that the manifest field of your cluster resource is in-sync.
This will not change the state of your cluster, but is a required step after the initial import.
The manifest field stores the contents of the associated Kubernetes manifest, while the object field stores the actual state of the resource.
-
Modify Your Cluster using Terraform
- Modify the count value of one of your workerNodeGroupConfigurations, or another mutable field, in the configuration stored in the ${MY_EKSA_CLUSTER}.tf file.
- Check the expected diff between your cluster state and the modified local state via terraform plan
You should see in the output that the worker node group configuration count field (or whichever field you chose to modify) will be modified by Terraform.
-
Now, actually change your cluster to match the local configuration:
-
Observe the change to your cluster. For example:
Appendix
Terraform K8s Provider https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs
tfk8s https://github.com/jrhouston/tfk8s
4.2.14 - Delete cluster
How to delete an EKS Anywhere cluster
NOTE: EKS Anywhere Bare Metal clusters do not yet support separate workload and management clusters. Use the instructions for Deleting a management cluster to delete a Bare Metal cluster.
Deleting a workload cluster
Follow these steps to delete your EKS Anywhere cluster that is managed by a separate management cluster.
To delete a workload cluster, you will need:
- name of your workload cluster
- kubeconfig of your workload cluster
- kubeconfig of your management cluster
Run the following commands to delete the cluster:
-
Set up CLUSTER_NAME and KUBECONFIG environment variables:
export CLUSTER_NAME=eksa-w01-cluster
export KUBECONFIG=${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
export MANAGEMENT_KUBECONFIG=<path-to-management-cluster-kubeconfig>
-
Run the delete command:
Deleting a management cluster
Follow these steps to delete your management cluster.
To delete a cluster you will need:
- cluster name or cluster configuration
- kubeconfig of your cluster
Run the following commands to delete the cluster:
-
Set up CLUSTER_NAME and KUBECONFIG environment variables:
export CLUSTER_NAME=mgmt
export KUBECONFIG=${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
-
Run the delete command:
-
If you are running the delete command from the directory which has the cluster folder with ${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.yaml:
eksctl anywhere delete cluster ${CLUSTER_NAME}
-
Otherwise, use this command to manually specify the clusterconfig file path:
export CONFIG_FILE=<path-to-config-file>
eksctl anywhere delete cluster -f ${CONFIG_FILE}
Example output:
Performing provider setup and validations
Creating management cluster
Installing cluster-api providers on management cluster
Moving cluster management from workload cluster
Deleting workload cluster
Clean up Git Repo
GitOps field not specified, clean up git repo skipped
🎉 Cluster deleted!
For vSphere, this will delete all of the VMs that were created in your provider.
For Bare Metal, the servers will be powered off if BMC information has been provided.
If your workloads created external resources such as external DNS entries or load balancer endpoints you may need to delete those resources manually.
4.3 - Cluster troubleshooting
Troubleshooting your EKS Anywhere Cluster
4.3.1 - Troubleshooting
Troubleshooting EKS Anywhere clusters
This guide covers EKS Anywhere troubleshooting. It is divided into the following sections:
You may want to search this document for a fragment of the error you are seeing.
General troubleshooting
Increase eksctl anywhere output
If you’re having trouble running eksctl anywhere, you may get more verbose output with the -v 6 option. The highest level of verbosity is -v 9, and the default level of logging is equivalent to -v 0.
Cannot run docker commands
The EKS Anywhere binary requires access to run docker commands without using sudo.
If you’re using a Linux distribution, you will need to be using Docker 20.x.x and your user needs to be part of the docker group.
To add your user to the docker group, you can use:
sudo usermod -a -G docker $USER
Now you need to log out and back in to get the new group permissions.
Minimum requirements for docker version have not been met
Error: failed to validate docker: minimum requirements for docker version have not been met. Install Docker version 20.x.x or above
Ensure you are running Docker 20.x.x or above, for example:
% docker --version
Docker version 20.10.6, build 370c289
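The version check above can be scripted. The following is a minimal sketch, not part of EKS Anywhere: the helper name docker_version_ok is hypothetical, and it only compares the major version number against 20, as the error message requires.

```shell
# docker_version_ok VERSION [MIN_MAJOR]
# Hypothetical helper: succeeds when the major component of VERSION
# is at least MIN_MAJOR (default 20, taken from the error message above).
docker_version_ok() {
  version=$1
  min_major=${2:-20}
  major=${version%%.*}          # keep everything before the first dot
  case $major in
    ''|*[!0-9]*) return 2 ;;    # not a number, so no decision can be made
  esac
  [ "$major" -ge "$min_major" ]
}

# Example usage against the local daemon (requires Docker to be running):
#   docker_version_ok "$(docker version --format '{{.Server.Version}}')" \
#     && echo "Docker version OK" \
#     || echo "Docker is older than 20.x.x"
```

The function takes the version as an argument so that it can be checked against any string, including one copied from docker --version output.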
Minimum requirements for docker version have not been met on Mac OS
Error: EKS Anywhere does not support Docker desktop versions between 4.3.0 and 4.4.1 on macOS
Error: EKS Anywhere requires Docker desktop to be configured to use CGroups v1. Please set `deprecatedCgroupv1:true` in your `~/Library/Group\\ Containers/group.com.docker/settings.json` file
Ensure you are running Docker Desktop 4.4.2 or newer and have set "deprecatedCgroupv1": true in your settings.json file:
% defaults read /Applications/Docker.app/Contents/Info.plist CFBundleShortVersionString
4.42
% docker info --format '{{json .CgroupVersion}}'
"1"
ECR access denied
Error: failed to create cluster: unable to initialize executables: failed to setup eks-a dependencies: Error response from daemon: pull access denied for public.ecr.aws/***/cli-tools, repository does not exist or may require 'docker login': denied: Your authorization token has expired. Reauthenticate and try again.
All images needed for EKS Anywhere are public and do not need authentication. Old cached credentials could trigger this error.
Remove cached credentials by running:
docker logout public.ecr.aws
error unmarshaling JSON: while decoding JSON: json: unknown field “spec”
Error: loading config file "cluster.yaml": error unmarshaling JSON: while decoding JSON: json: unknown field "spec"
Use eksctl anywhere create cluster -f cluster.yaml instead of eksctl create cluster -f cluster.yaml to create an EKS Anywhere cluster.
Error: old cluster config file exists under my-cluster, please use a different clusterName to proceed
Error: old cluster config file exists under my-cluster, please use a different clusterName to proceed
The my-cluster directory already exists in the current directory.
Either use a different cluster name or move the directory.
failed to create cluster: node(s) already exist for a cluster with the name
Performing provider setup and validations
Creating new bootstrap cluster
Error create bootstrapcluster {"error": "error creating bootstrap cluster: error executing create cluster: ERROR: failed to create cluster: node(s) already exist for a cluster with the name \"cluster-name\"\n, try rerunning with --force-cleanup to force delete previously created bootstrap cluster"}
Failed to create cluster {"error": "error creating bootstrap cluster: error executing create cluster: ERROR: failed to create cluster: node(s) already exist for a cluster with the name \"cluster-name\"\n, try rerunning with --force-cleanup to force delete previously created bootstrap cluster"}
A bootstrap cluster already exists with the same name. If you are sure the cluster is not being used, you may use the --force-cleanup option to eksctl anywhere to delete the cluster, or you may delete the cluster with kind delete cluster --name <cluster-name>. If you do not have kind installed, you may use docker stop to stop the docker container running the KinD cluster.
Memory or disk resource problem
There are various disk and memory issues that can cause problems.
Make sure docker is configured with enough memory.
Make sure the system wide Docker memory configuration provides enough RAM for the bootstrap cluster.
Make sure you do not have unneeded KinD clusters running. List them with kind get clusters.
You may want to delete unneeded clusters with kind delete cluster --name <cluster-name>.
If you do not have kind installed, you may install it from https://kind.sigs.k8s.io/ or use docker ps to see the KinD clusters and docker stop to stop the cluster.
Make sure you do not have any unneeded Docker containers running. List them with docker ps.
Terminate any unneeded Docker containers.
Make sure Docker isn’t out of disk resources.
If you don’t have any other Docker containers running, you may want to run docker system prune to clean up disk space.
You may want to restart Docker.
To restart Docker on Ubuntu, run sudo systemctl restart docker.
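When kind is not installed, the docker ps and docker stop steps described above can be combined. This is a sketch under one assumption: that kind labels its node containers with io.x-k8s.kind.cluster=<cluster-name>. The function names are hypothetical; check the output of docker ps before stopping anything.

```shell
# kind_label_filter CLUSTER_NAME
# Prints a docker filter expression matching the containers of one KinD cluster.
# Assumption: kind labels node containers with io.x-k8s.kind.cluster=<name>.
kind_label_filter() {
  printf 'label=io.x-k8s.kind.cluster=%s\n' "$1"
}

# stop_kind_cluster CLUSTER_NAME
# Stops every running container that belongs to the named KinD cluster.
stop_kind_cluster() {
  ids=$(docker ps -q --filter "$(kind_label_filter "$1")")
  if [ -z "$ids" ]; then
    echo "no running containers found for KinD cluster $1"
    return 0
  fi
  # $ids is intentionally unquoted so each container ID becomes its own argument.
  docker stop $ids
}

# Example usage:
#   stop_kind_cluster "${CLUSTER_NAME}"
```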
Waiting for cert-manager to be available… Error: timed out waiting for the condition
Failed to create cluster {"error": "error initializing capi resources in cluster: error executing init: Fetching providers\nInstalling cert-manager Version=\"v1.1.0\"\nWaiting for cert-manager to be available...\nError: timed out waiting for the condition\n"}
This is likely a Memory or disk resource problem.
You can also try using techniques from Generic cluster unavailable.
The connection to the server localhost:8080 was refused
Performing provider setup and validations
Creating new bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Error initializing capi in bootstrap cluster {"error": "error waiting for capi-kubeadm-control-plane-controller-manager in namespace capi-kubeadm-control-plane-system: error executing wait: The connection to the server localhost:8080 was refused - did you specify the right host or port?\n"}
Failed to create cluster {"error": "error waiting for capi-kubeadm-control-plane-controller-manager in namespace capi-kubeadm-control-plane-system: error executing wait: The connection to the server localhost:8080 was refused - did you specify the right host or port?\n"}
This is likely a Memory or disk resource problem.
Generic cluster unavailable
Troubleshoot more by inspecting bootstrap cluster or workload cluster (depending on the stage of failure) using kubectl commands.
kubectl get pods -A --kubeconfig=<kubeconfig>
kubectl get nodes -A --kubeconfig=<kubeconfig>
kubectl logs <podname> -n <namespace> --kubeconfig=<kubeconfig>
....
CAPV troubleshooting guide: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/master/docs/troubleshooting.md#debugging-issues
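To narrow down which pods to inspect, the pod listing above can be filtered. The sketch below uses a hypothetical helper, unhealthy_pods, and assumes the default column layout of kubectl get pods -A --no-headers (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# unhealthy_pods
# Reads `kubectl get pods -A --no-headers` output on stdin and prints
# namespace/name for every pod whose STATUS is neither Running nor Completed.
# Note: a pod that is Running but not yet Ready (for example 0/1) is not flagged.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 }'
}

# Example usage:
#   kubectl get pods -A --no-headers --kubeconfig=<kubeconfig> | unhealthy_pods
```

Each reported pod can then be examined with kubectl describe pod and kubectl logs in its namespace.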
Bootstrap cluster fails to come up
If your bootstrap cluster has problems, you may get detailed logs by looking at the files created under the ${CLUSTER_NAME}/logs folder. The capv-controller-manager log file will surface issues with vSphere-specific configuration, while the capi-controller-manager log file might surface other generic issues with the cluster configuration passed in.
You may also access the logs from your bootstrap cluster directly as below:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
kubectl logs -f -n capv-system -l control-plane="controller-manager" -c manager
It also might be useful to start a shell session on the docker container running the bootstrap cluster by running docker ps and then docker exec -it <container-id> bash on the kind container.
Bootstrap cluster fails to come up
Error: creating bootstrap cluster: executing create cluster: ERROR: failed to create cluster: node(s) already exist for a cluster with the name \"cluster-name\", try rerunning with --force-cleanup to force delete previously created bootstrap cluster
Cluster creation fails because a cluster of the same name already exists.
Try running the eksctl anywhere create cluster command again, adding the --force-cleanup option.
If that doesn’t work, you can manually delete the old cluster:
kind delete cluster --name cluster-name
Creating new workload cluster hangs or fails
Cluster creation appears to be hung waiting for the Control Plane to be ready.
If the CLI is hung on this message for over 30 mins, something likely failed during the OS provisioning:
Waiting for Control Plane to be ready
Or if cluster creation times out on this step and fails with the following messages:
Support bundle archive created {"path": "support-bundle-2022-06-28T00_41_24.tar.gz"}
Analyzing support bundle {"bundle": "CLUSTER_NAME/generated/bootstrap-cluster-2022-06-28T00:41:24Z-bundle.yaml", "archive": "support-bundle-2022-06-28T00_41_24.tar.gz"}
Analysis output generated {"path": "CLUSTER_NAME/generated/bootstrap-cluster-2022-06-28T00:43:40Z-analysis.yaml"}
collecting workload cluster diagnostics
Error: waiting for workload cluster control plane to be ready: executing wait: error: timed out waiting for the condition on clusters/CLUSTER_NAME
In either of those cases, the following steps can help you determine the problem:
-
Export the kind cluster’s kubeconfig file:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
-
If you have provided BMC information:
-
Check all of the machines that the EKS Anywhere CLI has picked up from the pool of hardware in the CSV file:
-
Check if those nodes are powered on. If any of those nodes are not powered on after a while, the BMC credentials may be invalid. You can verify this by checking the logs:
kubectl get bmt -n eksa-system
kubectl get bmt <bmt-name> -n eksa-system -o yaml
Validate that the BMC credentials are correct if a connection error is observed on the bmt resource. Note that “IPMI over LAN” must be enabled in the BMC configuration for the bmt resource to communicate successfully.
-
If the machine is powered on but you see linuxkit is not running, then Tinkerbell failed to serve the node via iPXE. In this case, you would want to:
-
Check the boots service logs from the machine where you are running the CLI to see if it received and/or responded to the request:
-
Confirm no other DHCP service responded to the request and check for any errors in the BMC console. Other DHCP servers on the network can result in race conditions and should be avoided by configuring the other server to block all MAC addresses and exclude all IP addresses used by EKS Anywhere.
-
If you see Welcome to LinuxKit, press Enter in the BMC console to access the LinuxKit terminal. Run the following commands to check if the tink-worker container is running.
docker ps -a
docker logs <container-id>
-
If the machine has already started provisioning the OS and it is in an irrecoverable state, get the workflow of the provisioning/provisioned machine using:
kubectl get workflows -n eksa-system
kubectl describe workflow/<workflow-name> -n eksa-system
Check all the actions and their status to determine whether every action has executed successfully. If the stream-image action has failed, it’s likely due to a timeout or network related issue. You can also provide your own image_url by specifying osImageURL under datacenter spec.
vSphere troubleshooting
EKSA_VSPHERE_USERNAME is not set or is empty
❌ Validation failed {"validation": "vsphere Provider setup is valid", "error": "failed setup and validations: EKSA_VSPHERE_USERNAME is not set or is empty", "remediation": ""}
Two environment variables need to be set and exported in your environment to create clusters successfully.
Be sure to use single quotes around your user name and password to avoid shell manipulation of these values.
export EKSA_VSPHERE_USERNAME='<vSphere-username>'
export EKSA_VSPHERE_PASSWORD='<vSphere-password>'
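Before running the create command, you can confirm that both variables are really exported and non-empty. This is a small sketch with a hypothetical helper name, require_env; it is not part of the EKS Anywhere CLI.

```shell
# require_env NAME...
# Succeeds only if every named variable is set, non-empty, and exported.
require_env() {
  missing=0
  for name in "$@"; do
    # `env` lists only exported variables; the trailing "." rejects empty values.
    if ! env | grep -q "^${name}=."; then
      echo "$name is not set, is empty, or is not exported" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage:
#   require_env EKSA_VSPHERE_USERNAME EKSA_VSPHERE_PASSWORD \
#     && eksctl anywhere create cluster -f cluster.yaml
```

A variable that was assigned without export passes a plain shell test but is invisible to eksctl anywhere, which is why the check goes through env.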
vSphere authentication failed
❌ Validation failed {"validation": "vsphere Provider setup is valid", "error": "error validating vCenter setup: vSphere authentication failed: govc: ServerFaultCode: Cannot complete login due to an incorrect user name or password.\n", "remediation": ""}
Error: failed to create cluster: validations failed
Two environment variables need to be set and exported in your environment to create clusters successfully.
Be sure to use single quotes around your user name and password to avoid shell manipulation of these values.
export EKSA_VSPHERE_USERNAME='<vSphere-username>'
export EKSA_VSPHERE_PASSWORD='<vSphere-password>'
Issues detected with selected template
Issues detected with selected template. Details: - -1:-1:VALUE_ILLEGAL: No supported hardware versions among [vmx-15]; supported: [vmx-04, vmx-07, vmx-08, vmx-09, vmx-10, vmx-11, vmx-12, vmx-13].
Our upstream dependency on CAPV makes it a requirement that you use vSphere 6.7 update 3 or newer.
Make sure your ESXi hosts are also up to date.
Timed out waiting for the condition on deployments/capv-controller-manager
Failed to create cluster {"error": "error initializing capi in bootstrap cluster: error waiting for capv-controller-manager in namespace capv-system: error executing wait: error: timed out waiting for the condition on deployments/capv-controller-manager\n"}
Debug this problem using techniques from Generic cluster unavailable
.
Timed out waiting for the condition on clusters/
Failed to create cluster {"error": "error waiting for workload cluster control plane to be ready: error executing wait: error: timed out waiting for the condition on clusters/test-cluster\n"}
This can be an issue with the number of control plane and worker node replicas defined in your cluster yaml file.
Try to start off with a smaller number (3 or 5 is recommended for control plane) in order to bring up the cluster.
This error can also occur because your vCenter server is using self-signed certificates and you have insecure set to true in the generated cluster yaml.
To check if this is the case, run the commands below:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
kubectl get machines
If all the machines are in the Provisioning phase, this is most likely the issue.
To resolve the issue, set insecure to false and thumbprint to the TLS thumbprint of your vCenter server in the cluster yaml and try again.
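One way to read the thumbprint is with openssl. The sketch below assumes that the expected thumbprint is the SHA-1 fingerprint of the vCenter server certificate and that vCenter listens on port 443; both function names are hypothetical. Verify the value through a trusted channel before putting it in the cluster yaml.

```shell
# extract_fingerprint
# Reads `openssl x509 -fingerprint` output on stdin, for example
#   SHA1 Fingerprint=AB:CD:EF:...
# and prints only the colon-separated hex digest.
extract_fingerprint() {
  sed -n 's/^.*Fingerprint=//p'
}

# vcenter_thumbprint HOST
# Fetches the server certificate from HOST:443 and prints its SHA-1 fingerprint.
# Assumption: the thumbprint field expects the SHA-1 fingerprint.
vcenter_thumbprint() {
  openssl s_client -connect "$1:443" </dev/null 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha1 \
    | extract_fingerprint
}

# Example usage:
#   vcenter_thumbprint <vCenter-FQDN>
```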
"msg"="discovered IP address"
This log message can also appear with the control plane address as its value, either in the ${CLUSTER_NAME}/logs/capv-controller-manager.log file or in the capv-controller-manager pod log, which can be extracted with the following commands:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
kubectl logs -f -n capv-system -l control-plane="controller-manager" -c manager
Make sure you are choosing an IP address in your network range that does not conflict with other VMs.
https://anywhere.eks.amazonaws.com/docs/reference/clusterspec/vsphere/#controlplaneconfigurationendpointhost-required
Workload VM is created on vSphere but can not power on
A similar issue is the VM does power on but does not show any logs on the console and does not have any IPs assigned.
This issue can occur if the resourcePool that the VM uses does not have enough CPU or memory resources to run a VM.
To resolve this issue, increase the CPU and/or memory reservations or limits for the resourcePool.
Workload VMs start but Kubernetes not working properly
If the workload VMs start, but Kubernetes does not start or is not working properly, you may want to log onto the VMs and check the logs there.
If Kubernetes is at least partially working, you may use kubectl to get the IPs of the nodes:
kubectl get nodes -o=custom-columns="NAME:.metadata.name,IP:.status.addresses[2].address"
If Kubernetes is not working at all, you can get the IPs of the VMs from vCenter or using govc.
When you get the external IP you can ssh into the nodes using the private ssh key associated with the public ssh key you provided in your cluster configuration:
ssh -i <ssh-private-key> <ssh-username>@<external-IP>
create command stuck on Creating new workload cluster
There can be a few reasons why the create command is stuck on Creating new workload cluster for over 30 minutes.
First, check the vSphere UI to see if any workload VMs have been created.
If any VMs are created, check to see if they have any IPv4 IPs assigned to them.
If there are no IPv4 IPs assigned to them, this is most likely because you don’t have a DHCP server configured for the network configured in the cluster config yaml.
Ensure that you have DHCP running and run the create command again.
If there are any IPv4 IPs assigned, check if one of the VMs has the controlPlane IP specified in Cluster.spec.controlPlaneConfiguration.endpoint.host in the clusterconfig yaml.
If this IP is not present on any control plane VM, make sure the network has access to the following endpoints:
- public.ecr.aws
- anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
- distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
- d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
- api.github.com (only if GitOps is enabled)
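Reachability of the endpoints listed above can be checked with curl from a machine on the same network as the cluster nodes. This is a sketch with hypothetical function names; it only tests whether an HTTPS connection can be established on port 443, not whether every required path can be downloaded.

```shell
# eksa_endpoints
# Prints the endpoints listed above, one per line.
# api.github.com is only needed when GitOps is enabled.
eksa_endpoints() {
  printf '%s\n' \
    public.ecr.aws \
    anywhere-assets.eks.amazonaws.com \
    distro.eks.amazonaws.com \
    d2glxqk2uabbnd.cloudfront.net \
    api.github.com
}

# check_endpoints
# Tries an HTTPS connection to each endpoint and reports the result.
# Without --fail, curl exits 0 on any HTTP response (even 403 or 404),
# so "ok" means the host was reachable and TLS succeeded.
check_endpoints() {
  eksa_endpoints | while read -r host; do
    if curl --silent --output /dev/null --connect-timeout 5 "https://$host"; then
      echo "ok    $host"
    else
      echo "FAIL  $host"
    fi
  done
}

# Example usage:
#   check_endpoints
```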
If the IPv4 IPs are assigned to the VM and you have the workload kubeconfig under <cluster-name>/<cluster-name>-eks-a-cluster.kubeconfig, you can use it to check the vsphere-cloud-controller-manager logs.
kubectl logs -n kube-system vsphere-cloud-controller-manager-<xxxxx> --kubeconfig <cluster-name>/<cluster-name>-eks-a-cluster.kubeconfig
If you see this message in the logs, it means your cluster nodes do not have access to vSphere, which is required for the cluster to get to a ready state.
Failed to connect to <vSphere-FQDN>: connection refused
In this case, you need to enable inbound traffic from your cluster nodes on your vCenter’s management network.
If VMs are created, but they do not get a network connection and DHCP is not configured for your vSphere deployment, you may need to create your own DHCP server.
If no VMs are created, check the capi-controller-manager, capv-controller-manager and capi-kubeadm-control-plane-controller-manager logs using the commands mentioned in the Generic cluster unavailable section.
Cluster Deletion Fails
If cluster deletion fails, you may need to manually delete the VMs associated with the cluster.
The VMs should be named with the cluster name.
You can power off and delete from disk using the vCenter web user interface.
You may also use govc:
govc find -type VirtualMachine --name '<cluster-name>*'
This will give you a list of virtual machines that should be associated with your cluster.
For each of the VMs you want to delete run:
VM_NAME=vm-to-destroy
govc vm.power -off -force $VM_NAME
govc object.destroy $VM_NAME
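The per-VM commands above can be applied to the whole list returned by govc find. The following sketch uses a hypothetical helper, destroy_vms, with a DRY_RUN switch so that you can review the commands first; it assumes govc is already configured for your vCenter.

```shell
# destroy_vms
# Reads VM inventory paths (one per line) on stdin, powers each VM off
# and deletes it from disk. Set DRY_RUN=1 to print the govc commands
# instead of running them.
destroy_vms() {
  while IFS= read -r vm; do
    [ -n "$vm" ] || continue              # skip blank lines
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "govc vm.power -off -force $vm"
      echo "govc object.destroy $vm"
    else
      # </dev/null keeps govc from consuming the list being read by the loop.
      govc vm.power -off -force "$vm" </dev/null
      govc object.destroy "$vm" </dev/null
    fi
  done
}

# Example usage (review the output, then rerun without DRY_RUN=1):
#   govc find -type VirtualMachine --name '<cluster-name>*' | DRY_RUN=1 destroy_vms
```

Deleting VMs is irreversible, so check that the name pattern matches only the VMs of the failed cluster.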
Troubleshooting GitOps integration
Failed cluster creation can sometimes leave behind cluster configuration files committed to your GitHub.com repository.
Make sure to delete these configuration files before you re-try eksctl anywhere create cluster.
If these configuration files are not deleted, GitOps installation will fail but cluster creation will continue.
They’ll generally be located under the directory clusters/$CLUSTER_NAME if you used the default path in your flux gitops config.
Delete the entire directory named $CLUSTER_NAME.
Failed cluster creation can sometimes leave behind a completely empty GitHub.com repository.
This can cause the GitOps installation to fail if you re-try the creation of a cluster which uses this repository.
If a cluster creation failure leaves behind an empty GitHub.com repository, manually delete that repository before attempting cluster creation again.
Changes not syncing to cluster
Please remember that the only fields currently supported for GitOps are:
Cluster
Cluster.workerNodeGroupConfigurations.count
Cluster.workerNodeGroupConfigurations.machineGroupRef.name
Worker Nodes
VsphereMachineConfig.diskGiB
VsphereMachineConfig.numCPUs
VsphereMachineConfig.memoryMiB
VsphereMachineConfig.template
VsphereMachineConfig.datastore
VsphereMachineConfig.folder
VsphereMachineConfig.resourcePool
If you’ve changed these fields and they’re not syncing to the cluster as you’d expect, check the logs of the pod in the source-controller deployment in the flux-system namespace.
If flux is having a problem connecting to your GitHub repository, the problem will be logged here.
$ kubectl get pods -n flux-system
NAME READY STATUS RESTARTS AGE
helm-controller-7d644b8547-k8wfs 1/1 Running 0 4h15m
kustomize-controller-7cf5875f54-hs2bt 1/1 Running 0 4h15m
notification-controller-776f7d68f4-v22kp 1/1 Running 0 4h15m
source-controller-7c4555748d-7c7zb 1/1 Running 0 4h15m
$ kubectl logs source-controller-7c4555748d-7c7zb -n flux-system
A well-behaved flux pod will simply log the ongoing reconciliation process, like so:
{"level":"info","ts":"2021-07-01T19:58:51.076Z","logger":"controller.gitrepository","msg":"Reconciliation finished in 902.725344ms, next run in 1m0s","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"info","ts":"2021-07-01T19:59:52.012Z","logger":"controller.gitrepository","msg":"Reconciliation finished in 935.016754ms, next run in 1m0s","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"info","ts":"2021-07-01T20:00:52.982Z","logger":"controller.gitrepository","msg":"Reconciliation finished in 970.03174ms, next run in 1m0s","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
If there are issues connecting to GitHub, you’ll instead see exceptions in the source-controller log stream.
For example, if the deploy key used by flux has been deleted, you’d see something like this:
{"level":"error","ts":"2021-07-01T20:04:56.335Z","logger":"controller.gitrepository","msg":"Reconciler error","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system","error":"unable to clone 'ssh://git@github.com/youruser/gitops-vsphere-test', error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain"}
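Because the source-controller writes one JSON object per line, error entries like the one above can be separated from routine reconciliation messages with a simple filter. The helper name flux_errors is hypothetical, and the match relies on the "level":"error" field appearing exactly as in the sample output.

```shell
# flux_errors
# Reads source-controller log lines on stdin and prints only the
# error-level entries. Exits non-zero when no error entry is found.
flux_errors() {
  grep '"level":"error"'
}

# Example usage:
#   kubectl logs deploy/source-controller -n flux-system | flux_errors
```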
Other ways to troubleshoot GitOps integration
If you’re still having problems after deleting any empty EKS Anywhere created GitHub repositories and looking at the source-controller logs, you can look for additional issues by checking the deployments in the flux-system and eksa-system namespaces to ensure they’re running and their log streams are free from exceptions.
$ kubectl get deployments -n flux-system
NAME READY UP-TO-DATE AVAILABLE AGE
helm-controller 1/1 1 1 4h13m
kustomize-controller 1/1 1 1 4h13m
notification-controller 1/1 1 1 4h13m
source-controller 1/1 1 1 4h13m
$ kubectl get deployments -n eksa-system
NAME READY UP-TO-DATE AVAILABLE AGE
eksa-controller-manager 1/1 1 1 4h13m
4.3.2 - Generating a Support Bundle
Using the Support Bundle with your EKS Anywhere Cluster
This guide covers the use of the EKS Anywhere Support Bundle for troubleshooting and support.
This allows you to gather cluster information, save it to your administrative machine, and perform analysis of the results.
EKS Anywhere leverages troubleshoot.sh to collect and analyze Kubernetes cluster logs, cluster resource information, and other relevant debugging information.
EKS Anywhere has two Support Bundle commands:
eksctl anywhere generate support-bundle will execute a support bundle on your cluster, collecting relevant information, archiving it locally, and performing analysis of the results.
eksctl anywhere generate support-bundle-config will generate a Support Bundle config yaml file for you to customize.
Do not add personally identifiable information (PII) or other confidential or sensitive information to your support bundle.
If you provide the support bundle to get support from AWS, it will be accessible to other AWS services, including AWS Support.
Collecting a Support Bundle and running analyzers
eksctl anywhere generate support-bundle
generate support-bundle will allow you to quickly collect relevant logs and cluster resources and save them locally in an archive file.
This archive can then be used to aid in further troubleshooting and debugging.
If you provide a cluster configuration file containing your cluster spec using the -f flag, generate support-bundle will customize the auto-generated support bundle collectors and analyzers to match the state of your cluster.
If you provide a support bundle configuration file using the --bundle-config flag, for example one generated with generate support-bundle-config, generate support-bundle will use the provided configuration when collecting information from your cluster and analyzing the results.
Flags:
--bundle-config string Bundle Config file to use when generating support bundle
-f, --filename string Filename that contains EKS-A cluster configuration
-h, --help help for support-bundle
--since string Collect pod logs in the latest duration like 5s, 2m, or 3h.
--since-time string Collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z
-w, --w-config string Kubeconfig file to use when creating support bundle for a workload cluster
Collecting and analyzing a bundle
You only need to run a single command to generate a support bundle, collect information and analyze the output:
eksctl anywhere generate support-bundle -f myCluster.yaml
This command will collect the information from your cluster
and run an analysis of the collected information.
The collected information will be saved to your local disk in an archive which can be used for
debugging and obtaining additional in-depth support.
The analysis will be printed to your console.
Collect phase:
$ ./bin/eksctl anywhere generate support-bundle -f ./testcluster100.yaml
Collecting support bundle cluster-info
Collecting support bundle cluster-resources
Collecting support bundle secret
Collecting support bundle logs
Analyzing support bundle
Analysis phase:
Analyze Results
------------
Check PASS
Title: gitopsconfigs.anywhere.eks.amazonaws.com
Message: gitopsconfigs.anywhere.eks.amazonaws.com is present on the cluster
------------
Check PASS
Title: vspheredatacenterconfigs.anywhere.eks.amazonaws.com
Message: vspheredatacenterconfigs.anywhere.eks.amazonaws.com is present on the cluster
------------
Check PASS
Title: vspheremachineconfigs.anywhere.eks.amazonaws.com
Message: vspheremachineconfigs.anywhere.eks.amazonaws.com is present on the cluster
------------
Check PASS
Title: capv-controller-manager Status
Message: capv-controller-manager is running.
------------
Check PASS
Title: capv-controller-manager Status
Message: capv-controller-manager is running.
------------
Check PASS
Title: coredns Status
Message: coredns is running.
------------
Check PASS
Title: cert-manager-webhook Status
Message: cert-manager-webhook is running.
------------
Check PASS
Title: cert-manager-cainjector Status
Message: cert-manager-cainjector is running.
------------
Check PASS
Title: cert-manager Status
Message: cert-manager is running.
------------
Check PASS
Title: capi-kubeadm-control-plane-controller-manager Status
Message: capi-kubeadm-control-plane-controller-manager is running.
------------
Check PASS
Title: capi-kubeadm-bootstrap-controller-manager Status
Message: capi-kubeadm-bootstrap-controller-manager is running.
------------
Check PASS
Title: capi-controller-manager Status
Message: capi-controller-manager is running.
------------
Check PASS
Title: capi-controller-manager Status
Message: capi-controller-manager is running.
------------
Check PASS
Title: capi-kubeadm-control-plane-controller-manager Status
Message: capi-kubeadm-control-plane-controller-manager is running.
------------
Check PASS
Title: capi-kubeadm-control-plane-controller-manager Status
Message: capi-kubeadm-control-plane-controller-manager is running.
------------
Check PASS
Title: capi-kubeadm-bootstrap-controller-manager Status
Message: capi-kubeadm-bootstrap-controller-manager is running.
------------
Check PASS
Title: clusters.anywhere.eks.amazonaws.com
Message: clusters.anywhere.eks.amazonaws.com is present on the cluster
------------
Check PASS
Title: bundles.anywhere.eks.amazonaws.com
Message: bundles.anywhere.eks.amazonaws.com is present on the cluster
------------
Archive phase:
a support bundle has been created in the current directory: {"path": "support-bundle-2021-09-02T19_29_41.tar.gz"}
Generating a custom Support Bundle configuration for your EKS Anywhere Cluster
EKS Anywhere will automatically generate a support bundle based on your cluster configuration;
however, if you’d like to customize the support bundle to collect specific information,
you can generate your own support bundle configuration yaml for EKS Anywhere to run on your cluster.
eksctl anywhere generate support-bundle-config will generate a default support bundle configuration and print it as yaml.
eksctl anywhere generate support-bundle-config -f myCluster.yaml will generate a support bundle configuration customized to your cluster and print it as yaml.
To run a customized support bundle configuration yaml file on your cluster, save this output to a file and run the command eksctl anywhere generate support-bundle using the flag --bundle-config.
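The save-and-run workflow described above can be sketched as a small wrapper. The function name custom_support_bundle and the default file name bundle-config.yaml are hypothetical; the flags are the ones listed in this section.

```shell
# custom_support_bundle CLUSTER_SPEC [BUNDLE_CONFIG]
# Generates a support bundle configuration for the cluster, saves it to
# BUNDLE_CONFIG, then collects and analyzes a bundle using that file.
custom_support_bundle() {
  cluster_spec=$1
  bundle_config=${2:-bundle-config.yaml}   # hypothetical default file name

  # 1. Generate a configuration tailored to the cluster and save it.
  eksctl anywhere generate support-bundle-config -f "$cluster_spec" \
    > "$bundle_config" || return 1

  # 2. Edit "$bundle_config" at this point to add or remove collectors.

  # 3. Collect and analyze using the saved configuration.
  eksctl anywhere generate support-bundle --bundle-config "$bundle_config"
}

# Example usage:
#   custom_support_bundle myCluster.yaml
```

In practice you would run the first command, edit the file by hand, and only then run the second; the wrapper simply shows how the two commands fit together.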
eksctl anywhere generate support-bundle-config
Flags:
-f, --filename string Filename that contains EKS-A cluster configuration
-h, --help help for support-bundle-config
4.3.3 - Curated Packages Troubleshooting
Troubleshooting specific to curated packages
You must set and export the CURATED_PACKAGES_SUPPORT environment variable before running any commands for packages to activate the feature flag.
export CURATED_PACKAGES_SUPPORT=true
The major component of Curated Packages is the package controller. If the container is not running or not running correctly, packages will not be installed. Generally it should be debugged like any other Kubernetes application. The first step is to check that the pod is running.
kubectl get pods -n eksa-packages
You should see one pod running with two containers
NAME READY STATUS RESTARTS AGE
eks-anywhere-packages-6c7db8bc6f-xg6bq 2/2 Running 0 3m35s
The describe command might help to get more detail on why there is a problem
kubectl describe pods -n eksa-packages
Logs of the controller can be seen in a normal Kubernetes fashion
kubectl logs deploy/eks-anywhere-packages -n eksa-packages controller
The general state of the package can be seen through the custom resources
kubectl get packages,packagebundles,packagebundlecontrollers -A
This will generate output similar to this
NAMESPACE NAME PACKAGE AGE STATE CURRENTVERSION TARGETVERSION DETAIL
eksa-packages package.packages.eks.amazonaws.com/my-test Test 2m33s installing v0.1.1-8b3810e1514b7432e032794842425accc837757a-helm (latest) loading helm chart my-test: locating helm chart oci://public.ecr.aws/l0g8r8j6/hello-eks-anywhere tag sha256:64ea03b119d2421f9206252ff4af4bf7cdc2823c343420763e0e6fc20bf03b68: failed to download "oci://public.ecr.aws/l0g8r8j6/hello-eks-anywhere" at version "v0.1.1-8b3810e1514b7432e032794842425accc837757a-helm"
NAMESPACE NAME STATE
eksa-packages packagebundle.packages.eks.amazonaws.com/v1-21-1001 active
NAMESPACE NAME STATE
eksa-packages packagebundlecontroller.packages.eks.amazonaws.com/eksa-packages-bundle-controller active
Looking at the output, you can see the active packagebundlecontroller and packagebundle. The state of the package is “installing”.
Error: curated packages installation is not supported in this release
Error: curated packages installation is not supported in this release
Curated packages is supported behind a feature flag. You must set and export the CURATED_PACKAGES_SUPPORT environment variable before running any commands for packages.
export CURATED_PACKAGES_SUPPORT=true
Error: this command is currently not supported
Error: this command is currently not supported
Curated packages is supported behind a feature flag. You must set and export the CURATED_PACKAGES_SUPPORT environment variable.
export CURATED_PACKAGES_SUPPORT=true
Package controller not running
If you do not see a pod or various resources for the package controller, it may be that it is not installed.
No resources found in eksa-packages namespace.
Most likely the cluster was created with an older version of the EKS Anywhere CLI or the feature flag was not enabled. If you run the version command, it should return v0.9.0 or a later release.
Curated packages is supported behind a feature flag. You must set and export the CURATED_PACKAGES_SUPPORT environment variable.
export CURATED_PACKAGES_SUPPORT=true
During cluster creation, you should see messages after the cluster is created when the package controller and any packages are installed.
🎉 Cluster created!
----------------------------------------------------------------------------------------------------------------
The EKS Anywhere package controller and the EKS Anywhere Curated Packages
(referred to as “features”) are provided as “preview features” subject to the AWS Service Terms,
(including Section 2 (Betas and Previews)) of the same. During the EKS Anywhere Curated Packages Public Preview,
the AWS Service Terms are extended to provide customers access to these features free of charge.
These features will be subject to a service charge and fee structure at ”General Availability“ of the features.
----------------------------------------------------------------------------------------------------------------
Installing curated packages controller on workload cluster
package.packages.eks.amazonaws.com/my-harbor created
ImagePullBackOff on Package or Package Controller
If a package or the package controller fails to start with ImagePullBackOff, the pod status looks similar to this:
NAME READY STATUS RESTARTS AGE
eks-anywhere-packages-6589449669-q7rjr 0/2 ImagePullBackOff 0 13h
This is most likely because the machine running kubelet in your Kubernetes cluster cannot access the registry that holds the images, or because those images do not exist on that registry. Log in to the machine and check whether it can pull the images:
ctr image pull public.ecr.aws/eks-anywhere/eks-anywhere-packages@sha256:whateveritis
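To find the affected pods quickly, you can filter the pod listing by its STATUS column. This is a minimal sketch; pods_with_pull_errors is a hypothetical helper name, and it assumes the default column layout shown in the example output above (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods` style output on stdin and
# prints the names of pods whose STATUS is ImagePullBackOff or ErrImagePull.
# The header row (first line) is skipped.
pods_with_pull_errors() {
  awk 'NR > 1 && ($3 == "ImagePullBackOff" || $3 == "ErrImagePull") { print $1 }'
}
```

One way to use it is `kubectl get pods -n eksa-packages | pods_with_pull_errors`, then describe each listed pod to see which image reference failed.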
4.4 - EKS Anywhere curated package management
Common tasks for managing curated packages.
The main goal of EKS Anywhere curated packages is to make it easy to install, configure, and maintain operational components in an EKS Anywhere cluster. Curated packages let you run secure, tested operational components on EKS Anywhere clusters. See EKS Anywhere curated packages
for more details.
Check the existence of package controller
kubectl get pods -n eksa-packages | grep "eks-anywhere-packages"
Skip the following installation steps if the returned result is not empty.
Important
-
To install EKS Anywhere, create an EKS Anywhere cluster or review the EKS Anywhere system requirements. See the Getting started
guide for details.
-
Check if the version of eksctl anywhere
is v0.9.0
or above with the following commands:
-
Make sure cert-manager is up and running in the cluster.
Install package controller
-
Install the package controller
eksctl anywhere install packagecontroller --kube-version 1.21
-
Check the package controller
kubectl get pods -n eksa-packages
Example command output
NAME READY STATUS RESTARTS AGE
eks-anywhere-packages-57778bc88f-587tq 2/2 Running 0 16h
Curated package list
See packages
for the complete curated package list.
4.4.1 - Harbor
Install/upgrade/uninstall Harbor
Install
-
Generate the package configuration
eksctl anywhere generate package harbor --source cluster > harbor.yaml
-
Add the desired configuration to harbor.yaml
Please see complete configuration options
for all configuration options and their default values.
Important
- All configuration options are listed in dot notations (e.g.,
expose.tls.enabled
) in the doc, but they have to be transformed to hierarchical structures when specified in the config
section in the YAML spec.
- Harbor web portal is exposed through
NodePort
by default, and its default port number is 30003
with TLS enabled and 30002
with TLS disabled.
- TLS is enabled by default for connections to Harbor web portal, and a secret resource named
harbor-tls-secret
is required for that purpose. It can be provisioned through cert-manager, or manually from a self-signed certificate with the following command:
kubectl create secret tls harbor-tls-secret --cert=[path to certificate file] --key=[path to key file] -n eksa-packages
secretKey
has to be set as a string of 16 characters for encryption.
TLS example with auto certificate generation
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
name: my-harbor
namespace: eksa-packages
spec:
packageName: harbor
config: |-
secretKey: "use-a-secret-key"
externalURL: https://harbor.eksa.demo:30003
expose:
tls:
certSource: auto
auto:
commonName: "harbor.eksa.demo"
Non-TLS example
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
name: my-harbor
namespace: eksa-packages
spec:
packageName: harbor
config: |-
secretKey: "use-a-secret-key"
externalURL: http://harbor.eksa.demo:30002
expose:
tls:
enabled: false
-
Install Harbor
eksctl anywhere create packages -f harbor.yaml
-
Check Harbor
eksctl anywhere get packages
Example command output
NAME PACKAGE AGE STATE CURRENTVERSION TARGETVERSION DETAIL
my-harbor harbor 5m34s installed v2.5.0 v2.5.0 (latest)
Harbor web portal is accessible at whatever externalURL
is set to. See complete configuration options
for all default values.
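Two of the notes in the install steps are easy to get wrong: dotted option names must become nested YAML inside the config section, and secretKey must be exactly 16 characters. The helpers below are a minimal sketch of both checks; dot_to_yaml and valid_secret_key are hypothetical names, not EKS Anywhere commands.

```shell
# Hypothetical helper: print a dotted option path and its value as nested
# YAML with two-space indentation.
dot_to_yaml() {
  path=$1
  value=$2
  indent=""
  # Peel off one leading key per iteration while a dot remains.
  while [ "${path#*.}" != "$path" ]; do
    printf '%s%s:\n' "$indent" "${path%%.*}"
    path=${path#*.}
    indent="$indent  "
  done
  printf '%s%s: %s\n' "$indent" "$path" "$value"
}

# Hypothetical helper: succeed only if the candidate secretKey is exactly
# 16 characters long.
valid_secret_key() {
  [ "${#1}" -eq 16 ]
}

dot_to_yaml expose.tls.enabled false
# prints:
# expose:
#   tls:
#     enabled: false
```

The printed fragment still has to be indented under the config section of harbor.yaml, as in the examples above. The value use-a-secret-key from those examples passes valid_secret_key because it is 16 characters long.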

Upgrade
Note
- New versions of software packages will be automatically downloaded but not automatically installed. You can always manually run
eksctl
to check and install updates.
-
Verify a new bundle is available
eksctl anywhere get packagebundle
Example command output
NAME VERSION STATE
v1.21-1000 1.21 active (upgrade available)
v1.21-1001 1.21 inactive
-
Upgrade Harbor
eksctl anywhere upgrade packages --bundle-version v1.21-1001
-
Check Harbor
eksctl anywhere get packages
Example command output
NAME PACKAGE AGE STATE CURRENTVERSION TARGETVERSION DETAIL
my-harbor Harbor 14m installed v2.5.1 v2.5.1 (latest)
Uninstall
-
Uninstall Harbor
Important
- By default, PVCs created for jobservice and registry are not removed during a package delete operation, which can be changed by leaving
persistence.resourcePolicy
empty.
eksctl anywhere delete package my-harbor
4.4.2 - MetalLB
Install/upgrade/uninstall MetalLB
Install
-
Generate the package configuration
eksctl anywhere generate package metallb --source cluster > metallb.yaml
-
Add the desired configuration to metallb.yaml
Please see complete configuration options
for all configuration options and their default values.
Example package file with BGP configuration:
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
name: mylb
namespace: eksa-packages
spec:
packageName: metallb
config: |
peers:
- peer-address: 10.220.0.2
peer-asn: 65000
my-asn: 65002
address-pools:
- name: default
protocol: bgp
addresses:
- 10.220.0.90/32
- 10.220.0.97-10.220.0.120
Example package file with ARP configuration:
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
name: mylb
namespace: eksa-packages
spec:
packageName: metallb
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 10.220.0.90/32
- 10.220.0.97-10.220.0.120
-
Install MetalLB
eksctl anywhere create packages -f metallb.yaml
-
Validate the installation
eksctl anywhere get packages
Example command output
NAME PACKAGE AGE STATE CURRENTVERSION TARGETVERSION DETAIL
mylb metallb 22h installed 0.12.1-ce5b5de19014202cebd4ab4c091830a3b6dfea06 0.12.1-ce5b5de19014202cebd4ab4c091830a3b6dfea06 (latest)
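When sizing an address pool, it helps to know how many addresses each entry provides. The sketch below counts the addresses in a CIDR or start-end entry such as the ones in the example package files. The helper names are hypothetical, and the count assumes every address in a CIDR block is usable by the pool.

```shell
# Convert a dotted IPv4 address to its 32-bit integer value.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: print how many addresses one pool entry covers.
# Accepts CIDR notation (a.b.c.d/n) or an inclusive range (start-end).
pool_entry_size() {
  case $1 in
    */*) echo $(( 1 << (32 - ${1#*/}) )) ;;
    *-*) echo $(( $(ip_to_int "${1#*-}") - $(ip_to_int "${1%-*}") + 1 )) ;;
    *)   echo 1 ;;
  esac
}

pool_entry_size 10.220.0.90/32            # prints 1
pool_entry_size 10.220.0.97-10.220.0.120  # prints 24
```

With the two entries from the examples, the default pool therefore offers 25 addresses in total.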
Upgrade
MetalLB will automatically be upgraded when a new bundle is activated.
Uninstall
To uninstall MetalLB, delete the package:
eksctl anywhere delete package mylb
5 - Reference
Reference documents for EKS Anywhere configuration
5.1 - Config
Config reference for EKS Anywhere clusters
5.1.1 - Bare metal configuration
Full EKS Anywhere configuration reference for a Bare Metal cluster.
This is a generic template with detailed descriptions below for reference.
The following additional optional configuration can also be included:
To generate your own cluster configuration, follow instructions from the Bare Metal Create production cluster
section and modify it using descriptions below.
For information on how to add cluster configuration settings to this file for advanced node configuration, see Advanced Bare Metal cluster configuration
.
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
name: my-cluster-name
spec:
clusterNetwork:
cniConfig:
cilium: {}
pods:
cidrBlocks:
- 192.168.0.0/16
services:
cidrBlocks:
- 10.96.0.0/12
controlPlaneConfiguration:
count: 1
endpoint:
host: "<Control Plane Endpoint IP>"
machineGroupRef:
kind: TinkerbellMachineConfig
name: my-cluster-name-cp
datacenterRef:
kind: TinkerbellDatacenterConfig
name: my-cluster-name
kubernetesVersion: "1.22"
managementCluster:
name: my-cluster-name
workerNodeGroupConfigurations:
- count: 1
machineGroupRef:
kind: TinkerbellMachineConfig
name: my-cluster-name
name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
name: my-cluster-name
spec:
tinkerbellIP: "<Tinkerbell IP>"
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
name: my-cluster-name-cp
spec:
hardwareSelector: {}
osFamily: ubuntu
templateRef: {}
users:
- name: ec2-user
sshAuthorizedKeys:
- ssh-rsa AAAAB3NzaC1yc2... jwjones@833efcab1482.home.example.com
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
name: my-cluster-name
spec:
hardwareSelector: {}
osFamily: ubuntu
templateRef:
kind: TinkerbellTemplateConfig
name: my-cluster-name
users:
- name: ec2-user
sshAuthorizedKeys:
- ssh-rsa AAAAB3NzaC1yc2... jwjones@833efcab1482.home.example.com
Cluster Fields
name (required)
Name of your cluster (my-cluster-name
in this example).
clusterNetwork (required)
Specific network configuration for your Kubernetes cluster.
clusterNetwork.cniConfig (required)
CNI plugin to be installed in the cluster. The only supported value at the moment is cilium
.
clusterNetwork.pods.cidrBlocks[0] (required)
Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted.
This CIDR block should not conflict with the clusterNetwork.services.cidrBlocks
and network subnet range selected for the machines.
clusterNetwork.services.cidrBlocks[0] (required)
Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted.
This CIDR block should not conflict with the clusterNetwork.pods.cidrBlocks
and network subnet range selected for the machines.
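Before creating the cluster, you can confirm that the pod and service CIDR blocks (and the subnet of your machines) do not overlap. The following is a minimal sketch for IPv4 only; the helper names are hypothetical and not part of EKS Anywhere.

```shell
# Convert a dotted IPv4 address to its 32-bit integer value.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first and last address of a CIDR block as integers.
cidr_range() {
  prefix=${1#*/}
  size=$(( 1 << (32 - prefix) ))
  start=$(( $(ip_to_int "${1%/*}") / size * size ))
  echo "$start $(( start + size - 1 ))"
}

# Hypothetical helper: succeed (exit 0) if the two CIDR blocks overlap.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  # $1/$2 = start/end of the first block, $3/$4 = start/end of the second.
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

if cidrs_overlap 192.168.0.0/16 10.96.0.0/12; then
  echo "pod and service CIDR blocks overlap"
else
  echo "pod and service CIDR blocks are disjoint"
fi
```

Run the same check between each cluster CIDR block and the subnet used by your machines.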
clusterNetwork.dns.resolvConf.path (optional)
Path to the file with a custom DNS resolver configuration.
controlPlaneConfiguration (required)
Specific control plane configuration for your Kubernetes cluster.
controlPlaneConfiguration.count (required)
Number of control plane nodes.
This number needs to be odd to maintain etcd quorum.
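The reason for an odd count follows from majority quorum: a cluster of n members needs n/2 + 1 (integer division) members to agree, so it tolerates (n - 1)/2 failures. The sketch below prints both values for small clusters; the function names are illustrative only.

```shell
# Members required for a majority in an n-member etcd cluster.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Member failures the cluster can survive while keeping quorum.
etcd_fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

Four members tolerate one failure, the same as three, so an even count adds a node without adding resilience.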
controlPlaneConfiguration.endpoint.host (required)
A unique IP you want to use for the control plane in your EKS Anywhere cluster. Choose an IP in your network
range that does not conflict with other machines.
NOTE: This IP should be outside the network DHCP range as it is a floating IP that gets assigned to one of
the control plane nodes for kube-apiserver loadbalancing.
controlPlaneConfiguration.machineGroupRef (required)
Refers to the Kubernetes object with Tinkerbell-specific configuration for your nodes. See TinkerbellMachineConfig Fields
below.
controlPlaneConfiguration.taints
A list of taints to apply to the control plane nodes of the cluster.
Replaces the default control plane taint, node-role.kubernetes.io/master
. The default control plane components will tolerate the provided taints.
Modifying the taints associated with the control plane configuration will cause new nodes to be rolled-out, replacing the existing nodes.
NOTE: The taints provided will be used instead of the default control plane taint node-role.kubernetes.io/master
.
Any pods that you run on the control plane nodes must tolerate the taints you provide in the control plane configuration.
controlPlaneConfiguration.labels
A list of labels to apply to the control plane nodes of the cluster. This is in addition to the labels that
EKS Anywhere will add by default.
Modifying the labels associated with the control plane configuration will cause new nodes to be rolled out, replacing
the existing nodes.
datacenterRef
Refers to the Kubernetes object with Tinkerbell-specific configuration. See TinkerbellDatacenterConfig Fields
below.
kubernetesVersion (required)
The Kubernetes version you want to use for your cluster. Supported values: 1.22
, 1.21
, 1.20
managementCluster
Identifies the name of the management cluster.
If this is a standalone cluster or if it is serving as the management cluster for other workload clusters, this will be the same as the cluster name.
Bare Metal EKS Anywhere clusters do not yet support the creation of separate workload clusters.
workerNodeGroupConfigurations (required)
This takes in a list of node groups that you can define for your workers.
You may define one or more worker node groups.
workerNodeGroupConfigurations.count (required)
Number of worker nodes
workerNodeGroupConfigurations.machineGroupRef (required)
Refers to the Kubernetes object with Tinkerbell-specific configuration for your nodes. See TinkerbellMachineConfig Fields
below.
workerNodeGroupConfigurations.name (required)
Name of the worker node group (default: md-0)
workerNodeGroupConfigurations.taints
A list of taints to apply to the nodes in the worker node group.
Modifying the taints associated with a worker node group configuration will cause new nodes to be rolled-out, replacing the existing nodes associated with the configuration.
At least one node group must not have NoSchedule
or NoExecute
taints applied to it.
workerNodeGroupConfigurations.labels
A list of labels to apply to the nodes in the worker node group. This is in addition to the labels that
EKS Anywhere will add by default.
Modifying the labels associated with a worker node group configuration will cause new nodes to be rolled out, replacing
the existing nodes associated with the configuration.
TinkerbellDatacenterConfig Fields
tinkerbellIP
Required field to identify the IP address of the Tinkerbell service.
This IP address must be a unique IP in the network range that does not conflict with other IPs.
Once the Tinkerbell services move from the Admin machine to run on the target cluster, this IP address makes it possible for the stack to be used for future provisioning needs.
When separate management and workload clusters are supported in Bare Metal, the IP address becomes a necessity.
osImageURL
Optional field to replace the default operating system image.
This field is useful if you want to provide a customized operating system image or simply host the standard image locally.
See Artifacts
for details.
hookImagesURLPath
Optional field to replace the HookOS image.
This field is useful if you want to provide a customized HookOS image or simply host the standard image locally.
See Artifacts
for details.
Example TinkerbellDatacenterConfig.spec
spec:
tinkerbellIP: "192.168.0.10" # Available, routable IP
osImageURL: "http://my-web-server/ubuntu-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.gz" # Full URL to the OS Image hosted locally
hookImagesURLPath: "http://my-web-server/hook" # Path to the hook images. This path contains vmlinuz-x86_64 and initramfs-x86_64
This is the folder structure for my-web-server
:
my-web-server
├── hook
│ ├── initramfs-x86_64
│ └── vmlinuz-x86_64
└── ubuntu-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.gz
TinkerbellMachineConfig Fields
In the example, there are TinkerbellMachineConfig
sections for control plane (my-cluster-name-cp
) and worker (my-cluster-name
) machine groups.
The following fields identify information needed to configure the nodes in each of those groups.
NOTE: Currently, you can only have one machine group for all machines in the control plane, although you can have multiple machine groups for the workers.
hardwareSelector
Use fields under hardwareSelector
to add key/value pair labels to match particular machines that you identified in the CSV file where you defined the machines in your cluster.
Choose any label name you like.
For example, if you had added the label node=cp-machine
to the machines listed in your CSV file that you want to be control plane nodes, the following hardwareSelector
field would cause those machines to be added to the control plane:
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
name: my-cluster-name-cp
spec:
hardwareSelector:
node: "cp-machine"
osFamily (required)
Operating system on the machine. For example, bottlerocket
or ubuntu
.
templateRef (optional)
Identifies the template that defines the actions that will be applied to the TinkerbellMachineConfig.
See TinkerbellTemplateConfig fields below.
EKS Anywhere will generate default templates based on osFamily
during the create
command.
You can override this default template by providing your own template here.
users
The name of the user you want to configure to access your machines through SSH.
The default is ec2-user
.
Currently, only one user is supported.
users[0].sshAuthorizedKeys (optional)
The SSH public keys you want to configure to access your machines through SSH (as described below). Only one key is supported at this time.
users[0].sshAuthorizedKeys[0] (optional)
This is the SSH public key that will be placed in authorized_keys
on all EKS Anywhere cluster machines so you can SSH into
them. The user will be what is defined under name
above. For example:
ssh -i <private-key-file> <user>@<machine-IP>
If you do not specify a value, EKS Anywhere generates a key in your $(pwd)/<cluster-name>
folder.
When you generate a Bare Metal cluster configuration, the TinkerbellTemplateConfig
is kept internally and not shown in the generated configuration file.
TinkerbellTemplateConfig
settings define the actions done to install each node, such as get installation media, configure networking, add users, and otherwise configure the node.
Advanced users can override the default values set for TinkerbellTemplateConfig
.
They can also add their own Tinkerbell actions
to make personalized modifications to EKS Anywhere nodes.
The following shows two TinkerbellTemplateConfig
examples that you can add to your cluster configuration file to override the values that EKS Anywhere sets: one for Ubuntu and one for Bottlerocket.
Most actions used differ for different operating systems.
NOTE: For the stream-image
action, DEST_DISK
points to the device representing the entire hard disk (for example, /dev/sda
).
For UEFI-enabled images, such as Ubuntu, write actions use DEST_DISK
to point to the second partition (for example, /dev/sda2
), with the first being the EFI partition.
For the Bottlerocket image, which has 12 partitions, DEST_DISK
is partition 12 (for example, /dev/sda12
).
Device names will be different for different disk types.
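As an illustration of how partition device names vary, Linux appends the partition number directly to disk names that end in a letter (/dev/sda2) but inserts a p when the disk name ends in a digit (/dev/nvme0n1p2). The sketch below derives the partition path to use for DEST_DISK in write actions; partition_device is a hypothetical helper name.

```shell
# Hypothetical helper: print the partition device path for a given whole-disk
# device and partition number, following the Linux naming convention.
partition_device() {
  disk=$1
  num=$2
  case $disk in
    *[0-9]) printf '%sp%s\n' "$disk" "$num" ;;  # e.g. NVMe: nvme0n1 -> nvme0n1p2
    *)      printf '%s%s\n' "$disk" "$num" ;;   # e.g. SCSI/SATA: sda -> sda2
  esac
}

partition_device /dev/sda 2        # prints /dev/sda2
partition_device /dev/nvme0n1 12   # prints /dev/nvme0n1p12
```

Confirm the actual device names on your hardware before editing a template, because the stream-image action still needs the whole-disk device.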
Ubuntu TinkerbellTemplateConfig example
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
name: my-cluster-name
spec:
template:
global_timeout: 6000
id: ""
name: my-cluster-name
tasks:
- actions:
- environment:
COMPRESSED: "true"
DEST_DISK: /dev/sda
IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/ubuntu-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.gz
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: stream-image
timeout: 360
- environment:
CONTENTS: |
network:
version: 2
renderer: networkd
ethernets:
eno1:
dhcp4: true
DEST_DISK: /dev/sda2
DEST_PATH: /etc/netplan/config.yaml
DIRMODE: "0755"
FS_TYPE: ext4
GID: "0"
MODE: "0644"
UID: "0"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: write-netplan
timeout: 90
- environment:
CONTENTS: |
datasource:
Ec2:
metadata_urls: []
strict_id: false
system_info:
default_user:
name: tink
groups: [wheel, adm]
sudo: ["ALL=(ALL) NOPASSWD:ALL"]
shell: /bin/bash
manage_etc_hosts: localhost
warnings:
dsid_missing_source: off
DEST_DISK: /dev/sda2
DEST_PATH: /etc/cloud/cloud.cfg.d/10_tinkerbell.cfg
DIRMODE: "0700"
FS_TYPE: ext4
GID: "0"
MODE: "0600"
UID: "0"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: add-tink-cloud-init-config
timeout: 90
- environment:
CONTENTS: |
datasource: Ec2
DEST_DISK: /dev/sda2
DEST_PATH: /etc/cloud/ds-identify.cfg
DIRMODE: "0700"
FS_TYPE: ext4
GID: "0"
MODE: "0600"
UID: "0"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: add-tink-cloud-init-ds-config
timeout: 90
- environment:
BLOCK_DEVICE: /dev/sda2
FS_TYPE: ext4
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/kexec:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: kexec-image
pid: host
timeout: 90
name: my-cluster-name
volumes:
- /dev:/dev
- /dev/console:/dev/console
- /lib/firmware:/lib/firmware:ro
worker: '{{.device_1}}'
version: "0.1"
Bottlerocket TinkerbellTemplateConfig example
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
name: my-cluster-name
spec:
template:
global_timeout: 6000
id: ""
name: my-cluster-name
tasks:
- actions:
- environment:
COMPRESSED: "true"
DEST_DISK: /dev/sda
IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/bottlerocket-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.img.gz
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: stream-image
timeout: 360
- environment:
BOOTCONFIG_CONTENTS: |
kernel {
console = "tty0", "ttyS0,115200n8"
}
DEST_DISK: /dev/sda12
DEST_PATH: /bootconfig.data
DIRMODE: "0700"
FS_TYPE: ext4
GID: "0"
MODE: "0644"
UID: "0"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: write-bootconfig
timeout: 90
- environment:
CONTENTS: |
# Version is required, it will change as we support
# additional settings
version = 1
# "eno1" is the interface name
# Users may turn on dhcp4 and dhcp6 via boolean
[eno1]
dhcp4 = true
# Define this interface as the "primary" interface
# for the system. This IP is what kubelet will use
# as the node IP. If none of the interfaces has
# "primary" set, we choose the first interface in
# the file
primary = true
DEST_DISK: /dev/sda12
DEST_PATH: /net.toml
DIRMODE: "0700"
FS_TYPE: ext4
GID: "0"
MODE: "0644"
UID: "0"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: write-netconfig
timeout: 90
- environment:
HEGEL_URL: http://<hegel-ip>:50061
DEST_DISK: /dev/sda12
DEST_PATH: /user-data.toml
DIRMODE: "0700"
FS_TYPE: ext4
GID: "0"
MODE: "0644"
UID: "0"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
name: write-user-data
timeout: 90
- name: "reboot"
image: public.ecr.aws/eks-anywhere/tinkerbell/hub/reboot:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
timeout: 90
volumes:
- /worker:/worker
version: "0.1"
TinkerbellTemplateConfig Fields
The values in the TinkerbellTemplateConfig
fields are created from the contents of the CSV file used to generate a configuration.
The template contains actions that are performed on a Bare Metal machine when it first boots up to be provisioned.
For advanced users, you can add these fields to your cluster configuration file if you have special needs to do so.
While there are fields that apply to all provisioned operating systems, actions are specific to each operating system.
Examples below describe actions for Ubuntu and Bottlerocket operating systems.
template.global_timeout
Sets the timeout value for completing the configuration. Set to 6000 (100 minutes) by default.
template.id
Not set by default.
template.tasks
Within the TinkerbellTemplateConfig template
under tasks
is a set of actions.
The following descriptions cover the actions shown in the example templates for Ubuntu and Bottlerocket:
template.tasks.actions.name.stream-image (Ubuntu and Bottlerocket)
The stream-image
action streams the selected image to the machine you are provisioning. It identifies:
- environment.COMPRESSED: When set to
true
, Tinkerbell expects IMG_URL
to be a compressed image, which Tinkerbell will uncompress when it writes the contents to disk.
- environment.DEST_DISK: The hard disk on which the operating system is deployed. The default is the first SCSI disk (/dev/sda), but can be changed for other disk types.
- environment.IMG_URL: The operating system tarball (ubuntu or other) to stream to the machine you are configuring.
- image: Container image needed to perform the steps needed by this action.
- timeout: Sets the amount of time (in seconds) that Tinkerbell has to stream the image, uncompress it, and write it to disk before timing out. Consider increasing this limit from the default 600 to a higher limit if this action is timing out.
Ubuntu-specific actions
template.tasks.actions.name.write-netplan (Ubuntu)
The write-netplan
action writes Ubuntu network configuration information to the machine (see Netplan
for details). It identifies:
- environment.CONTENTS.network.version: Identifies the network version.
- environment.CONTENTS.network.renderer: Defines the service to manage networking. By default, the
networkd
systemd service is used.
- environment.CONTENTS.network.ethernets: Network interface to external network (eno1, by default) and whether or not to use dhcp4 (true, by default).
- environment.DEST_DISK: Destination block storage device partition where the operating system is copied. By default, /dev/sda2 is used (sda1 is the EFI partition).
- environment.DEST_PATH: File where the networking configuration is written (/etc/netplan/config.yaml, by default).
- environment.DIRMODE: Linux directory permissions bits to use when creating directories (0755, by default)
- environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
- environment.GID: The Linux group ID to set on file. Set to 0 (root group) by default.
- environment.MODE: The Linux permission bits to set on file (0644, by default).
- environment.UID: The Linux user ID to set on file. Set to 0 (root user) by default.
- image: Container image used to perform the steps needed by this action.
- timeout: Time needed to complete the action, in seconds.
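The DIRMODE and MODE values in these actions are standard Linux octal permission bits. The sketch below expands such a value into the rwx form that ls -l shows, which can help when reviewing a template. The function name mode_to_string is hypothetical, and it reads only the last three octal digits (owner, group, other).

```shell
# Hypothetical helper: expand an octal mode such as 0644 into rwx notation.
mode_to_string() {
  out=""
  # Keep the last three digits and separate them with spaces.
  for d in $(printf '%s\n' "$1" | sed 's/.*\(...\)$/\1/; s/./& /g'); do
    case $d in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
    esac
  done
  printf '%s\n' "$out"
}

mode_to_string 0644   # prints rw-r--r--
mode_to_string 0755   # prints rwxr-xr-x
```

For example, the 0600 mode used for the cloud-init files expands to rw-------, meaning only the owner (root, UID 0) can read or write them.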
template.tasks.actions.add-tink-cloud-init-config (Ubuntu)
The add-tink-cloud-init-config
action configures cloud-init features to further configure the operating system. See cloud-init Documentation
for details. It identifies:
- environment.CONTENTS.datasource: Identifies Ec2 (Ec2.metadata_urls) as the data source and sets
Ec2.strict_id: false
to prevent cloud-init from producing warnings about this datasource.
- environment.CONTENTS.system_info: Creates the
tink
user, gives it administrative group privileges (wheel, adm) and passwordless sudo privileges, and sets the default shell (/bin/bash).
- environment.CONTENTS.manage_etc_hosts: Updates the system’s
/etc/hosts
file with the hostname. Set to localhost
by default.
- environment.CONTENTS.warnings: Sets dsid_missing_source to
off
.
- environment.DEST_DISK: Destination block storage device partition where the operating system is located (
/dev/sda2
, by default).
- environment.DEST_PATH: Location of the cloud-init configuration file on disk (
/etc/cloud/cloud.cfg.d/10_tinkerbell.cfg
, by default)
- environment.DIRMODE: Linux directory permissions bits to use when creating directories (0700, by default)
- environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
- environment.GID: The Linux group ID to set on file. Set to 0 (root group) by default.
- environment.MODE: The Linux permission bits to set on file (0600, by default).
- environment.UID: The Linux user ID to set on file. Set to 0 (root user) by default.
- image: Container image used to perform the steps needed by this action.
- timeout: Time needed to complete the action, in seconds.
template.tasks.actions.add-tink-cloud-init-ds-config (Ubuntu)
The add-tink-cloud-init-ds-config
action configures cloud-init data store features. This identifies the location of your metadata source once the machine is up and running. It identifies:
- environment.CONTENTS.datasource: Sets the datasource. Uses Ec2, by default.
- environment.DEST_DISK: Destination block storage device partition where the operating system is located (/dev/sda2, by default).
- environment.DEST_PATH: Location of the data store identity configuration file on disk (/etc/cloud/ds-identify.cfg, by default)
- environment.DIRMODE: Linux directory permissions bits to use when creating directories (0700, by default)
- environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
- environment.GID: The Linux group ID to set on file. Set to 0 (root group) by default.
- environment.MODE: The Linux permission bits to set on file (0600, by default).
- environment.UID: The Linux user ID to set on file. Set to 0 (root user) by default.
- image: Container image used to perform the steps needed by this action.
- timeout: Time needed to complete the action, in seconds.
template.tasks.actions.kexec-image (Ubuntu)
The kexec-image
action performs provisioning activities on the machine, then allows kexec to pivot the kernel to use the system installed on disk. This action identifies:
- environment.BLOCK_DEVICE: Disk partition on which the operating system is installed (/dev/sda2, by default)
- environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
- image: Container image used to perform the steps needed by this action.
- pid: PID namespace used by the action container. Set to host, by default.
- timeout: Time needed to complete the action, in seconds.
- volumes: Identifies mount points that need to be remounted to point to locations in the installed system.
There are known issues related to drivers with some hardware that may make it necessary to replace the kexec-image action with a full reboot.
If you require a full reboot, you can change the kexec-image setting as follows:
actions:
- name: "reboot"
image: public.ecr.aws/l0g8r8j6/tinkerbell/hub/reboot-action:latest
timeout: 90
volumes:
- /worker:/worker
Bottlerocket-specific actions
template.tasks.actions.write-bootconfig (Bottlerocket)
The write-bootconfig action identifies the location on the machine to put content needed to boot the system from disk.
- environment.BOOTCONFIG_CONTENTS.kernel: Add kernel parameters that are passed to the kernel when the system boots.
- environment.DEST_DISK: Identifies the block storage device that holds the boot partition.
- environment.DEST_PATH: Identifies the file holding boot configuration data (
/bootconfig.data
in this example).
- environment.DIRMODE: The Linux permissions assigned to the boot directory.
- environment.FS_TYPE: The filesystem type associated with the boot partition.
- environment.GID: The group ID associated with files and directories created on the boot partition.
- environment.MODE: The Linux permissions assigned to files in the boot partition.
- environment.UID: The user ID associated with files and directories created on the boot partition. UID 0 is the root user.
- image: Container image used to perform the steps needed by this action.
- timeout: Time needed to complete the action, in seconds.
template.tasks.actions.write-netconfig (Bottlerocket)
The write-netconfig action configures networking for the system.
- environment.CONTENTS: Add network values, including:
version = 1
(version number), [eno1]
(external network interface), dhcp4 = true
(turns on dhcp4), and primary = true
(identifies this interface as the primary interface used by kubelet).
- environment.DEST_DISK: Identifies the block storage device that holds the network configuration information.
- environment.DEST_PATH: Identifies the file holding network configuration data (
/net.toml
in this example).
- environment.DIRMODE: The Linux permissions assigned to the directory holding network configuration settings.
- environment.FS_TYPE: The filesystem type associated with the partition holding network configuration settings.
- environment.GID: The group ID associated with files and directories created on the partition. GID 0 is the root group.
- environment.MODE: The Linux permissions assigned to files in the partition.
- environment.UID: The user ID associated with files and directories created on the partition. UID 0 is the root user.
- image: Container image used to perform the steps needed by this action.
template.tasks.actions.write-user-data (Bottlerocket)
The write-user-data action configures the Tinkerbell Hegel service, which provides the metadata store for Tinkerbell.
- environment.HEGEL_URL: The IP address and port number of the Tinkerbell Hegel service.
- environment.DEST_DISK: Identifies the block storage device that holds the network configuration information.
- environment.DEST_PATH: Identifies the file holding network configuration data (/net.toml in this example).
- environment.DIRMODE: The Linux permissions assigned to the directory holding network configuration settings.
- environment.FS_TYPE: The filesystem type associated with the partition holding network configuration settings.
- environment.GID: The group ID associated with files and directories created on the partition. GID 0 is the root group.
- environment.MODE: The Linux permissions assigned to files in the partition.
- environment.UID: The user ID associated with files and directories created on the partition. UID 0 is the root user.
- image: Container image used to perform the steps needed by this action.
- timeout: Time needed to complete the action, in seconds.
template.tasks.actions.reboot (Bottlerocket)
The reboot action defines how the system restarts to bring up the installed system.
- image: Container image used to perform the steps needed by this action.
- timeout: Time needed to complete the action, in seconds.
- volumes: The volume (directory) to mount into the container from the installed system.
version
Matches the current version of the Tinkerbell template.
Custom Tinkerbell action examples
By creating your own custom Tinkerbell actions, you can add to or modify the installed operating system so those changes take effect when the installed system first starts (from a reboot or pivot).
The following example shows how to add a .deb package (openssl) to an Ubuntu installation:
- environment:
    BLOCK_DEVICE: /dev/sda1
    CHROOT: "y"
    CMD_LINE: apt -y update && apt -y install openssl
    DEFAULT_INTERPRETER: /bin/sh -c
    FS_TYPE: ext4
  image: public.ecr.aws/eks-anywhere/tinkerbell/hub/cexec:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
  name: install-openssl
  timeout: 90
The following example shows how to add a new user (tinkerbell) to an installed Ubuntu system. CHROOT is quoted, as in the previous example, so that YAML parsers do not read the bare y as a boolean:
- environment:
    BLOCK_DEVICE: <block device path> # E.g. /dev/sda1
    FS_TYPE: ext4
    CHROOT: "y"
    DEFAULT_INTERPRETER: "/bin/sh -c"
    CMD_LINE: "useradd --password $(openssl passwd -1 tinkerbell) --shell /bin/bash --create-home --groups sudo tinkerbell"
  image: public.ecr.aws/eks-anywhere/tinkerbell/hub/cexec:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
  name: "create-user"
  timeout: 90
Look for more examples as they are added to the Tinkerbell examples page.
5.1.2 - vSphere configuration
Full EKS Anywhere configuration reference for a VMware vSphere cluster.
This is a generic template with detailed descriptions below for reference.
Additional optional configuration that can also be included is described in the Optional configuration section (5.1.3).
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: ""
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
    taints:
    - key: ""
      value: ""
      effect: ""
    labels:
      "<key1>": ""
      "<key2>": ""
  datacenterRef:
    kind: VSphereDatacenterConfig
    name: my-cluster-datacenter
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
  kubernetesVersion: "1.22"
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
    name: md-0
    taints:
    - key: ""
      value: ""
      effect: ""
    labels:
      "<key1>": ""
      "<key2>": ""
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereDatacenterConfig
metadata:
  name: my-cluster-datacenter
spec:
  datacenter: ""
  server: ""
  network: ""
  insecure:
  thumbprint: ""
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: my-cluster-machines
spec:
  diskGiB:
  datastore: ""
  folder: ""
  numCPUs:
  memoryMiB:
  osFamily: ""
  resourcePool: ""
  storagePolicyName: ""
  template: ""
  users:
  - name: ""
    sshAuthorizedKeys:
    - ""
Cluster Fields
name (required)
Name of your cluster (my-cluster-name in this example).
clusterNetwork (required)
Specific network configuration for your Kubernetes cluster.
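clusterNetwork.cniConfig (required)
CNI plugin to be installed in the cluster, specified as in the template above (cniConfig with cilium: {}). Releases up to 0.7.x used the cni field instead. The only supported value at the moment is cilium.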
clusterNetwork.pods.cidrBlocks[0] (required)
Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted.
This CIDR block should not conflict with the network subnet range selected for the VMs.
clusterNetwork.services.cidrBlocks[0] (required)
Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted.
This CIDR block should not conflict with the network subnet range selected for the VMs.
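The two no-conflict rules above can be checked before cluster creation. The following is a minimal sketch using Python's standard ipaddress module; the VM subnet 10.0.0.0/24 and the helper name find_conflicts are assumptions for illustration and are not part of EKS Anywhere.

```python
import ipaddress


def find_conflicts(cluster_cidrs, vm_subnet):
    """Return the cluster CIDR blocks that overlap the VM subnet."""
    vm_net = ipaddress.ip_network(vm_subnet)
    return [
        cidr
        for cidr in cluster_cidrs
        if ipaddress.ip_network(cidr).overlaps(vm_net)
    ]


pods_cidr = "192.168.0.0/16"      # clusterNetwork.pods.cidrBlocks[0]
services_cidr = "10.96.0.0/12"    # clusterNetwork.services.cidrBlocks[0]
vm_subnet = "10.0.0.0/24"         # hypothetical VM network range

conflicts = find_conflicts([pods_cidr, services_cidr], vm_subnet)
print(conflicts if conflicts else "no conflicts")
```

If the list is not empty, choose a different pods or services CIDR block, or a different VM network range.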
clusterNetwork.dns.resolvConf.path (optional)
Path to the file with a custom DNS resolver configuration.
controlPlaneConfiguration (required)
Specific control plane configuration for your Kubernetes cluster.
controlPlaneConfiguration.count (required)
Number of control plane nodes
controlPlaneConfiguration.machineGroupRef (required)
Refers to the Kubernetes object with vSphere-specific configuration for your nodes. See VSphereMachineConfig Fields below.
controlPlaneConfiguration.endpoint.host (required)
A unique IP you want to use for the control plane VM in your EKS Anywhere cluster. Choose an IP in your network range that does not conflict with other VMs.
NOTE: This IP should be outside the network DHCP range, as it is a floating IP that gets assigned to one of the control plane nodes for kube-apiserver load balancing. Suggestions on how to ensure this IP does not cause issues during the cluster creation process are here.
controlPlaneConfiguration.taints
A list of taints to apply to the control plane nodes of the cluster.
The taints you provide are used instead of the default control plane taint, node-role.kubernetes.io/master. The default control plane components will tolerate the provided taints.
Modifying the taints associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.
NOTE: Any pods that you run on the control plane nodes must tolerate the taints you provide in the control plane configuration.
controlPlaneConfiguration.labels
A list of labels to apply to the control plane nodes of the cluster. This is in addition to the labels that
EKS Anywhere will add by default.
Modifying the labels associated with the control plane configuration will cause new nodes to be rolled out, replacing
the existing nodes.
workerNodeGroupConfigurations (required)
This takes in a list of node groups that you can define for your workers.
You may define one or more worker node groups.
workerNodeGroupConfigurations.count (required)
Number of worker nodes
workerNodeGroupConfigurations.machineGroupRef (required)
Refers to the Kubernetes object with vSphere-specific configuration for your nodes. See VSphereMachineConfig Fields below.
workerNodeGroupConfigurations.name (required)
Name of the worker node group (default: md-0)
workerNodeGroupConfigurations.taints
A list of taints to apply to the nodes in the worker node group.
Modifying the taints associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.
At least one node group must not have NoSchedule or NoExecute taints applied to it.
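The rule that at least one node group must stay free of NoSchedule and NoExecute taints can be checked against a parsed cluster spec. This is a minimal sketch, assuming the worker node groups have already been loaded into Python dictionaries shaped like the template above; the function name has_schedulable_group and the sample group names are hypothetical.

```python
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


def has_schedulable_group(worker_node_groups):
    """Return True if at least one group carries no NoSchedule/NoExecute taint."""
    for group in worker_node_groups:
        effects = {taint.get("effect") for taint in group.get("taints") or []}
        if not effects & BLOCKING_EFFECTS:
            return True
    return False


# Hypothetical workerNodeGroupConfigurations entries.
groups = [
    {"name": "md-0", "count": 1,
     "taints": [{"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}]},
    {"name": "md-1", "count": 2,
     "taints": [{"key": "spot", "value": "true", "effect": "PreferNoSchedule"}]},
]

print(has_schedulable_group(groups))
```

Here md-1 only has a PreferNoSchedule taint, so the set of groups satisfies the rule.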
workerNodeGroupConfigurations.labels
A list of labels to apply to the nodes in the worker node group. This is in addition to the labels that
EKS Anywhere will add by default.
Modifying the labels associated with a worker node group configuration will cause new nodes to be rolled out, replacing
the existing nodes associated with the configuration.
externalEtcdConfiguration.count
Number of etcd members
externalEtcdConfiguration.machineGroupRef
Refers to the Kubernetes object with vSphere-specific configuration for your etcd members. See VSphereMachineConfig Fields below.
datacenterRef
Refers to the Kubernetes object with vSphere environment-specific configuration. See VSphereDatacenterConfig Fields below.
kubernetesVersion (required)
The Kubernetes version you want to use for your cluster. Supported values: 1.22, 1.21, 1.20
VSphereDatacenterConfig Fields
datacenter (required)
The vSphere datacenter to deploy the EKS Anywhere cluster on. For example, SDDC-Datacenter.
network (required)
The VM network to deploy your EKS Anywhere cluster on.
server (required)
The vCenter server fully qualified domain name or IP address. If the server IP is used, the thumbprint must be set or insecure must be set to true.
insecure (optional)
Set insecure to true if the vCenter server does not have a valid certificate. (Default: false)
thumbprint (required if insecure=false)
The SHA1 thumbprint of the vCenter server certificate, which is only required if you have a self-signed certificate.
There are several ways to obtain your vCenter thumbprint. The easiest way, if you have govc installed, is to run:
govc about.cert -thumbprint -k
Another way is from the vCenter web UI: go to Administration/Certificate Management and click view details of the machine certificate. The format of this thumbprint does not exactly match the required format, though, and you will need to add : to separate each pair of hexadecimal digits.
Another way to get the thumbprint is to use this command with your server’s certificate in a file named ca.crt:
openssl x509 -sha1 -fingerprint -in ca.crt -noout
If you specify the wrong thumbprint, an error message will be printed with the expected thumbprint. If no valid certificate is being used, insecure must be set to true.
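For a thumbprint copied from the vCenter web UI, the colon separators can be added with a few lines of code. This is a minimal sketch; the function name format_thumbprint and the sample value are made up for illustration, and the sketch only assumes that a SHA1 thumbprint is 40 hexadecimal digits.

```python
import string


def format_thumbprint(raw):
    """Insert ':' between each pair of hex digits of a SHA1 thumbprint."""
    digits = "".join(ch for ch in raw if ch not in ": \t\n")
    if len(digits) != 40 or any(ch not in string.hexdigits for ch in digits):
        raise ValueError("expected 40 hexadecimal digits for a SHA1 thumbprint")
    return ":".join(digits[i:i + 2] for i in range(0, len(digits), 2))


# Made-up value, not a real certificate thumbprint.
print(format_thumbprint("0123456789ABCDEF0123456789ABCDEF01234567"))
```

The result can then be pasted into the thumbprint field of the VSphereDatacenterConfig.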
VSphereMachineConfig Fields
memoryMiB (optional)
Size of RAM on virtual machines (Default: 8192)
numCPUs (optional)
Number of CPUs on virtual machines (Default: 2)
osFamily (optional)
Operating System on virtual machines. Permitted values: ubuntu, bottlerocket (Default: bottlerocket)
diskGiB (optional)
Size of disk on virtual machines if snapshots aren’t included (Default: 25)
users (optional)
The users you want to configure to access your virtual machines. Only one is permitted at this time
users[0].name (optional)
The name of the user you want to configure to access your virtual machines through ssh.
The default is ec2-user if osFamily=bottlerocket and capv if osFamily=ubuntu.
users[0].sshAuthorizedKeys (optional)
The SSH public keys you want to configure to access your virtual machines through ssh (as described below). Only 1 is supported at this time.
users[0].sshAuthorizedKeys[0] (optional)
This is the SSH public key that will be placed in authorized_keys on all EKS Anywhere cluster VMs so you can ssh into them. The user will be what is defined under name above. For example:
ssh -i <private-key-file> <user>@<VM-IP>
If you do not specify a value, the default is to generate a key in your $(pwd)/<cluster-name> folder.
template (optional)
The VM template to use for your EKS Anywhere cluster. This template was created when you imported the OVA file into vSphere.
This is a required field if you are using Bottlerocket OVAs.
datastore (required)
The vSphere datastore to deploy your EKS Anywhere cluster on.
folder (required)
The VM folder for your EKS Anywhere cluster VMs. This allows you to organize your VMs. If the folder does not exist, it will be created for you. If the folder is blank, the VMs will go in the root folder.
resourcePool (required)
The vSphere resource pool for your VMs in the EKS Anywhere cluster. Examples of resource pool values include:
- If there is no resource pool: /<datacenter>/host/<cluster-name>/Resources
- If there is a resource pool: /<datacenter>/host/<cluster-name>/Resources/<resource-pool-name>
- The wild card option */Resources also often works.
storagePolicyName (optional)
The storage policy name associated with your VMs.
Optional vSphere Credentials
Use the following environment variables to configure Cloud Provider and CSI Driver with different credentials.
EKSA_VSPHERE_CP_USERNAME
Username for Cloud Provider (Default: $EKSA_VSPHERE_USERNAME).
EKSA_VSPHERE_CP_PASSWORD
Password for Cloud Provider (Default: $EKSA_VSPHERE_PASSWORD).
EKSA_VSPHERE_CSI_USERNAME
Username for CSI Driver (Default: $EKSA_VSPHERE_USERNAME).
EKSA_VSPHERE_CSI_PASSWORD
Password for CSI Driver (Default: $EKSA_VSPHERE_PASSWORD).
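The fallback behavior described by the defaults above can be pictured as follows. This is only an illustrative sketch of the documented defaults, not EKS Anywhere's actual implementation; the function name resolve_credentials and the sample values are hypothetical.

```python
import os


def resolve_credentials(env=os.environ):
    """Resolve Cloud Provider and CSI Driver credentials with the documented defaults."""
    base_user = env.get("EKSA_VSPHERE_USERNAME")
    base_pass = env.get("EKSA_VSPHERE_PASSWORD")
    return {
        "cloud_provider": (
            env.get("EKSA_VSPHERE_CP_USERNAME", base_user),
            env.get("EKSA_VSPHERE_CP_PASSWORD", base_pass),
        ),
        "csi_driver": (
            env.get("EKSA_VSPHERE_CSI_USERNAME", base_user),
            env.get("EKSA_VSPHERE_CSI_PASSWORD", base_pass),
        ),
    }


# Hypothetical environment: only the CSI Driver gets its own credentials.
sample_env = {
    "EKSA_VSPHERE_USERNAME": "admin-user",
    "EKSA_VSPHERE_PASSWORD": "admin-pass",
    "EKSA_VSPHERE_CSI_USERNAME": "csi-user",
    "EKSA_VSPHERE_CSI_PASSWORD": "csi-pass",
}
resolved = resolve_credentials(sample_env)
print(resolved["cloud_provider"][0], resolved["csi_driver"][0])
```

In this sample the Cloud Provider falls back to the main vSphere user, while the CSI Driver uses its dedicated user.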
5.1.3 - Optional configuration
Config reference to optional features for EKS Anywhere clusters
5.1.3.1 - CNI plugin configuration
EKS Anywhere cluster yaml cni plugin specification reference
Specifying CNI Plugin in EKS Anywhere cluster spec
EKS Anywhere currently supports two CNI plugins: Cilium and Kindnet. Only one of them can be selected for a cluster, and the plugin cannot be changed once the cluster is created.
Up until the 0.7.x releases, the plugin had to be specified using the cni field on cluster spec.
Starting with release 0.8, the plugin should be specified using the new cniConfig field as follows:
- For selecting Cilium as the CNI plugin:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
    cniConfig:
      cilium: {}
EKS Anywhere selects this as the default plugin when generating a cluster config.
- Or for selecting Kindnetd as the CNI plugin:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
    cniConfig:
      kindnetd: {}
NOTE: EKS Anywhere allows specifying only 1 plugin for a cluster and does not allow switching the plugins
after the cluster is created.
Policy Configuration options for Cilium plugin
Cilium accepts policy enforcement modes from the users to determine the allowed traffic between pods.
The allowed values for this mode are: default, always and never.
Please refer to the official Cilium documentation for more details on how each mode affects the communication within the cluster and choose a mode accordingly.
You can choose to not set this field so that Cilium will be launched with the default mode.
Starting with release 0.8, Cilium’s policy enforcement mode can be set through the cluster spec as follows:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
    cniConfig:
      cilium:
        policyEnforcementMode: "always"
Please note that if the always mode is selected, all communication between pods is blocked unless NetworkPolicy objects allowing communication are created.
In order to ensure that the cluster gets created successfully, EKS Anywhere will create the required
NetworkPolicy objects for all its core components. But it is up to the user to create the NetworkPolicy
objects needed for the user workloads once the cluster is created.
Network policies created by EKS Anywhere for “always” mode
As mentioned above, if Cilium is configured with policyEnforcementMode set to always, EKS Anywhere creates NetworkPolicy objects to enable communication between its core components. These policies are created based on the type of cluster as follows:
- For a self-managed/management cluster, EKS Anywhere will create NetworkPolicy resources in the following namespaces allowing all ingress/egress traffic by default:
- kube-system
- eksa-system
- All core Cluster API namespaces:
- capi-system
- capi-kubeadm-bootstrap-system
- capi-kubeadm-control-plane-system
- etcdadm-bootstrap-provider-system
- etcdadm-controller-system
- cert-manager
- Infrastructure provider’s namespace (for instance, capd-system OR capv-system)
- If GitOps is enabled, then the GitOps namespace (flux-system by default)
This is the NetworkPolicy that will be created in these namespaces for the self-managed cluster:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress-egress
  namespace: test
spec:
  podSelector: {}
  ingress:
  - {}
  egress:
  - {}
  policyTypes:
  - Ingress
  - Egress
- For a workload cluster managed by another EKS Anywhere cluster, EKS Anywhere will create a NetworkPolicy resource only in the following namespace by default:
- kube-system
For workload clusters using Kubernetes version 1.21 and higher, the ingress/egress of pods in the kube-system namespace will be limited to other pods only in the kube-system namespace by using the following NetworkPolicy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress-egress
  namespace: test
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
  policyTypes:
  - Ingress
  - Egress
For workload clusters using Kubernetes version 1.20, the NetworkPolicy in kube-system will
allow ingress/egress from all pods. This is because Kubernetes versions prior to 1.21 do not
set the default labels on the namespaces so EKS Anywhere cannot use a namespace selector.
This NetworkPolicy will ensure that the cluster gets created successfully. Later the cluster admin can edit/replace it if required.
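The version-dependent choice for workload clusters described above can be summarized in code. This is an illustrative sketch of the documented behavior only, not EKS Anywhere's implementation; the function name kube_system_policy is hypothetical.

```python
def kube_system_policy(kubernetes_version):
    """Build the kube-system NetworkPolicy described for a workload cluster."""
    major, minor = (int(part) for part in kubernetes_version.split(".")[:2])
    if (major, minor) >= (1, 21):
        # Namespaces carry the kubernetes.io/metadata.name label by default,
        # so traffic can be limited to kube-system.
        selector = {
            "namespaceSelector": {
                "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
            }
        }
        ingress, egress = [{"from": [selector]}], [{"to": [selector]}]
    else:
        # No default namespace labels: allow all ingress/egress.
        ingress, egress = [{}], [{}]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-all-ingress-egress", "namespace": "kube-system"},
        "spec": {
            "podSelector": {},
            "ingress": ingress,
            "egress": egress,
            "policyTypes": ["Ingress", "Egress"],
        },
    }


for version in ("1.20", "1.22"):
    print(version, kube_system_policy(version)["spec"]["ingress"])
```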
Switching the Cilium policy enforcement mode
The policy enforcement mode for Cilium can be changed as a part of cluster upgrade
through the cli upgrade command.
- Switching to always mode: When switching from default/never to always mode, EKS Anywhere will create the required NetworkPolicy objects for its core components (listed above). This will ensure that the cluster gets upgraded successfully. But it is up to the user to create the NetworkPolicy objects required for the user workloads.
- Switching from always mode: When switching from always to default mode, EKS Anywhere will not delete any of the existing NetworkPolicy objects, including the ones required for EKS Anywhere components (listed above). The user must delete NetworkPolicy objects as needed.
Node IPs configuration option
Starting with release v0.10, the node-cidr-mask-size flag for the Kubernetes controller manager (kube-controller-manager) is configurable via the EKS Anywhere cluster spec. Because clusterNetwork.nodes is an optional field, it is not generated in the EKS Anywhere spec by the generate clusterconfig command. The nodes block needs to be added manually to the cluster spec under the clusterNetwork section:
clusterNetwork:
  pods:
    cidrBlocks:
    - 192.168.0.0/16
  services:
    cidrBlocks:
    - 10.96.0.0/12
  cniConfig:
    cilium: {}
  nodes:
    cidrMaskSize: 24
If the user does not specify the clusterNetwork.nodes field in the cluster yaml spec, the value for this flag defaults to 24 for IPv4.
Please note that this mask size needs to be greater than the pods CIDR mask size. In the above spec, the pod CIDR mask size is 16 and the node CIDR mask size is 24. This gives the cluster 256 blocks of /24 networks. For example, node1 will get 192.168.0.0/24, node2 will get 192.168.1.0/24, node3 will get 192.168.2.0/24 and so on.
To support more than 256 nodes, the pods CIDR block needs to be large enough, and the per-node block small enough (that is, a larger node CIDR mask size), to provide that many node blocks.
For instance, to support 1024 nodes, a user can do any of the following:
- Set the pods cidr blocks to 192.168.0.0/16 and node cidr mask size to 26
- Set the pods cidr blocks to 192.168.0.0/15 and node cidr mask size to 25
Please note that the per-node block set by node-cidr-mask-size needs to be large enough to accommodate the number of pods you want to run on each node. A size of 24 will give enough IP addresses for about 250 pods per node, whereas a size of 26 will only give you about 60 IPs.
This is an immutable field, and the value can’t be updated once the cluster has been created.
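The arithmetic behind these numbers can be reproduced with Python's standard ipaddress module. This is a minimal sketch; the helper name node_capacity is hypothetical, and the order in which blocks are handed to nodes is decided by kube-controller-manager, so the listed subnets only illustrate how the pods CIDR is divided.

```python
import ipaddress
from itertools import islice


def node_capacity(pods_cidr, node_mask_size):
    """Return (number of node blocks, addresses per node block)."""
    pods_net = ipaddress.ip_network(pods_cidr)
    if node_mask_size <= pods_net.prefixlen:
        raise ValueError("node CIDR mask size must be greater than the pods CIDR mask size")
    node_blocks = 2 ** (node_mask_size - pods_net.prefixlen)
    addresses_per_node = 2 ** (pods_net.max_prefixlen - node_mask_size)
    return node_blocks, addresses_per_node


for cidr, mask in [("192.168.0.0/16", 24), ("192.168.0.0/16", 26), ("192.168.0.0/15", 25)]:
    blocks, addresses = node_capacity(cidr, mask)
    print(f"{cidr} with /{mask}: {blocks} node blocks, {addresses} addresses each")

# First three /24 blocks carved out of the default pods CIDR.
first_blocks = [
    str(net)
    for net in islice(ipaddress.ip_network("192.168.0.0/16").subnets(new_prefix=24), 3)
]
print(first_blocks)
```

Both 1024-node options yield the same number of node blocks, but the /15 pods CIDR leaves 128 addresses per node instead of 64.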
5.1.3.2 - IAM for Pods configuration
EKS Anywhere cluster spec for Pod IAM (IRSA)
IAM Role for Service Account on EKS Anywhere clusters with self-hosted signing keys
IAM Roles for Service Account (IRSA) enables applications running in clusters to authenticate with AWS services using IAM roles. The current solution for leveraging this in EKS Anywhere involves creating your own OIDC provider for the cluster, and hosting your cluster’s public service account signing key. The public keys, along with the OIDC discovery document, should be hosted somewhere that AWS STS can discover them. The steps below assume the keys will be hosted on a publicly accessible S3 bucket. Refer to this doc to ensure that the s3 bucket is publicly accessible.
The steps below are based on the guide for configuring IRSA for DIY Kubernetes, with modifications specific to EKS Anywhere’s cluster provisioning workflow. The main modification is the process of generating the keys.json document. As per the original guide, the user has to create the service account signing keys, and then use them to create the keys.json document prior to cluster creation. This order is reversed for EKS Anywhere clusters, so you will create the cluster first, then retrieve the service account signing key generated by the cluster, and use it to create the keys.json document. The sections below show how to do this in detail.
Create an OIDC provider and make its discovery document publicly accessible
- Create an s3 bucket to host the public signing keys and OIDC discovery document for your cluster as per this section. Ensure you follow all the steps and save the $HOSTNAME and $ISSUER_HOSTPATH.
- Create the OIDC discovery document as follows:
cat <<EOF > discovery.json
{
    "issuer": "https://$ISSUER_HOSTPATH",
    "jwks_uri": "https://$ISSUER_HOSTPATH/keys.json",
    "authorization_endpoint": "urn:kubernetes:programmatic_authorization",
    "response_types_supported": [
        "id_token"
    ],
    "subject_types_supported": [
        "public"
    ],
    "id_token_signing_alg_values_supported": [
        "RS256"
    ],
    "claims_supported": [
        "sub",
        "iss"
    ]
}
EOF
- Upload it to the publicly accessible S3 bucket:
aws s3 cp --acl public-read ./discovery.json s3://$S3_BUCKET/.well-known/openid-configuration
- Create an OIDC provider for your cluster. Set the Provider URL to https://$ISSUER_HOSTPATH, and audience to sts.amazonaws.com.
- Note down the Provider field of the OIDC provider after it is created.
- Assign an IAM role to this OIDC provider.
- To do so from the AWS console, select and click on the OIDC provider, and click on Assign role at the top right.
- Select Create a new role.
- In the Select type of trusted entity section, choose Web identity.
- In the Choose a web identity provider section:
- For Identity provider, choose the auto selected Identity Provider URL for your cluster.
- For Audience, choose sts.amazonaws.com.
- Choose Next: Permissions.
- In the Attach Policy section, select the IAM policy that has the permissions that you want your applications running in the pods to use.
- Continue with the next sections of adding tags if desired and a suitable name for this role and create the role.
- After the role is created, note down the name of this IAM Role as OIDC_IAM_ROLE. After the cluster is created, you can create service accounts and grant them this role by editing the trust relationship of this role. The last section shows how to do this.
Create the EKS Anywhere cluster
- When creating the EKS Anywhere cluster, you need to configure the kube-apiserver’s service-account-issuer flag so it can issue and mount projected service account tokens in pods. For this, use the value obtained in the first section for $ISSUER_HOSTPATH as the service-account-issuer. Configure the kube-apiserver by setting this value through the EKS Anywhere cluster spec as follows:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  podIamConfig:
    serviceAccountIssuer: https://$ISSUER_HOSTPATH
Set the remaining fields in the cluster spec as required and create the cluster using the eksctl anywhere create cluster command.
Generate keys.json and make it publicly accessible
- The cluster provisioning workflow generates a pair of service account signing keys. Retrieve the public signing key generated and used by the cluster, and create a keys.json document containing the public signing key.
kubectl get secret ${CLUSTER_NAME}-sa -n eksa-system -o jsonpath={.data.tls\\.crt} | base64 --decode > ${CLUSTER_NAME}-sa.pub
wget https://raw.githubusercontent.com/aws/amazon-eks-pod-identity-webhook/master/hack/self-hosted/main.go -O keygenerator.go
go run keygenerator.go -key ${CLUSTER_NAME}-sa.pub | jq '.keys += [.keys[0]] | .keys[1].kid = ""' > keys.json
- Upload the keys.json document to the s3 bucket.
aws s3 cp --acl public-read ./keys.json s3://$S3_BUCKET/keys.json
Deploy pod identity webhook
- After hosting the service account public signing key and OIDC discovery documents, the applications running in pods can start accessing the desired AWS resources, as long as the pod is mounted with the right service account tokens. This part of configuring the pods with the right service account tokens and env vars is automated by the amazon pod identity webhook. Once the webhook is deployed, it mutates any pods launched using service accounts annotated with eks.amazonaws.com/role-arn.
- Check out this commit of the amazon-eks-pod-identity-webhook.
- Set the $KUBECONFIG env var to the path of the kubeconfig file for the EKS Anywhere cluster.
- Run the following command:
make cluster-up IMAGE=amazon/amazon-eks-pod-identity-webhook:a65cc3d
In order to grant certain service accounts access to the desired AWS resources, edit the trust relationship for the OIDC provider’s IAM Role (OIDC_IAM_ROLE) created in the first section, and add in the desired service accounts.
- Choose the role in the console to open it for editing.
- Choose the Trust relationships tab, and then choose Edit trust relationship.
- Find the line that looks similar to the following:
"$ISSUER_HOSTPATH:aud": "sts.amazonaws.com"
Change the line to look like the following line. Replace aud with sub, and replace KUBERNETES_SERVICE_ACCOUNT_NAMESPACE and KUBERNETES_SERVICE_ACCOUNT_NAME with the Kubernetes namespace that the service account exists in and the name of your Kubernetes service account, respectively.
"$ISSUER_HOSTPATH:sub": "system:serviceaccount:KUBERNETES_SERVICE_ACCOUNT_NAMESPACE:KUBERNETES_SERVICE_ACCOUNT_NAME"
Refer to this doc for different ways of configuring one or multiple service accounts through the condition operators in the trust relationship.
- Choose Update Trust Policy to finish.
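To avoid typos when editing the trust relationship by hand, the condition line can be generated. This is a minimal sketch; the function name, the issuer value, and the service account are placeholders, and wrapping the line in a StringEquals operator is an assumption about how the condition appears in your role's trust policy.

```python
import json


def service_account_condition(issuer_hostpath, namespace, name):
    """Build the trust-policy condition for one Kubernetes service account."""
    return {
        "StringEquals": {
            f"{issuer_hostpath}:sub": f"system:serviceaccount:{namespace}:{name}"
        }
    }


# Placeholder values: substitute your own $ISSUER_HOSTPATH, namespace, and name.
condition = service_account_condition(
    "example-host/example-bucket", "default", "my-service-account"
)
print(json.dumps(condition, indent=2))
```

Note the order inside the subject string: the namespace comes first, then the service account name.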
5.1.3.3 - etcd configuration
EKS Anywhere cluster yaml etcd specification reference
Unstacked etcd topology (recommended)
There are two types of etcd topologies for configuring a Kubernetes cluster:
- Stacked: The etcd members and control plane components are colocated (run on the same node/machines)
- Unstacked/External: With the unstacked or external etcd topology, etcd members have dedicated machines and are not colocated with control plane components
The unstacked etcd topology is recommended for a HA cluster for the following reasons:
- External etcd topology decouples the control plane components and etcd member.
So if a control plane-only node fails, or if there is a memory leak in a component like kube-apiserver, it won’t directly impact an etcd member.
- Etcd is resource intensive, so it is safer to have dedicated nodes for etcd, since it could use more disk space or higher bandwidth.
Having a separate etcd cluster for these reasons could ensure a more resilient HA setup.
EKS Anywhere supports both topologies.
In order to configure a cluster with the unstacked/external etcd topology, you need to configure your cluster by updating the configuration file before creating the cluster.
This is a generic template with detailed descriptions below for reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
    cniConfig:
      cilium: {}
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: ""
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-name-cp
  datacenterRef:
    kind: VSphereDatacenterConfig
    name: my-cluster-name
  # etcd configuration
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-name-etcd
  kubernetesVersion: "1.22"
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-name
    name: md-0
externalEtcdConfiguration (under Cluster)
This field accepts any configuration parameters for running external etcd.
count (required)
This determines the number of etcd members in the cluster.
The recommended number is 3.
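machineGroupRef (required)
Refers to the Kubernetes object with provider-specific configuration for your etcd members (a VSphereMachineConfig in the template above).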
5.1.3.4 - AWS IAM Authenticator configuration
EKS Anywhere cluster yaml specification AWS IAM Authenticator reference
AWS IAM Authenticator support (optional)
EKS Anywhere can create clusters that support AWS IAM Authenticator-based API server authentication.
In order to add IAM Authenticator support, you need to configure your cluster by updating the configuration file before creating the cluster.
This is a generic template with detailed descriptions below for reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  # IAM Authenticator support
  identityProviderRefs:
  - kind: AWSIamConfig
    name: aws-iam-auth-config
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: AWSIamConfig
metadata:
  name: aws-iam-auth-config
spec:
  awsRegion: ""
  backendMode:
  - ""
  mapRoles:
  - roleARN: arn:aws:iam::XXXXXXXXXXXX:role/myRole
    username: myKubernetesUsername
    groups:
    - ""
  mapUsers:
  - userARN: arn:aws:iam::XXXXXXXXXXXX:user/myUser
    username: myKubernetesUsername
    groups:
    - ""
  partition: ""
identityProviderRefs (Under Cluster)
List of identity providers you want configured for the Cluster.
This would include a reference to the AWSIamConfig object with the configuration below.
awsRegion (required)
- Description: awsRegion can be any region in the aws partition that the IAM roles exist in.
- Type: string
backendMode (required)
- Description: backendMode configures the IAM authenticator server’s backend mode (i.e. where to source mappings from). Of the modes offered by AWS IAM Authenticator, EKS Anywhere supports EKSConfigMap and CRD. For more details, refer to backendMode.
- Type: string
mapRoles, mapUsers (recommended for EKSConfigMap
backend)
partition
- Description: This field is used to set the aws partition that the IAM roles are present in. Default value is aws.
- Type: string
5.1.3.5 - OIDC configuration
EKS Anywhere cluster yaml specification OIDC reference
OIDC support (optional)
EKS Anywhere can create clusters that support API server OIDC authentication.
In order to add OIDC support, you need to configure your cluster by updating the configuration file before creating the cluster.
This is a generic template with detailed descriptions below for reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  # OIDC support
  identityProviderRefs:
  - kind: OIDCConfig
    name: my-cluster-name
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: OIDCConfig
metadata:
  name: my-cluster-name
spec:
  clientId: ""
  groupsClaim: ""
  groupsPrefix: ""
  issuerUrl: "https://x"
  requiredClaims:
  - claim: ""
    value: ""
  usernameClaim: ""
  usernamePrefix: ""
identityProviderRefs (Under Cluster)
List of identity providers you want configured for the Cluster.
This would include a reference to the OIDCConfig object with the configuration below.
clientId (required)
- Description: ClientId defines the client ID for the OpenID Connect client
- Type: string
groupsClaim (optional)
- Description: GroupsClaim defines the name of a custom OpenID Connect claim for specifying user groups
- Type: string
groupsPrefix (optional)
- Description: GroupsPrefix defines a string to be prefixed to all groups to prevent conflicts with other authentication strategies
- Type: string
issuerUrl (required)
- Description: IssuerUrl defines the URL of the OpenID issuer, only HTTPS scheme will be accepted
- Type: string
requiredClaims (optional)
List of RequiredClaim objects listed below.
Only one is supported at this time.
requiredClaims[0] (optional)
- Description: RequiredClaim defines a key=value pair that describes a required claim in the ID Token
- Type: object
usernameClaim (optional)
- Description: UsernameClaim defines the OpenID claim to use as the user name.
Note that claims other than the default (‘sub’) are not guaranteed to be unique and immutable.
- Type: string
usernamePrefix (optional)
- Description: UsernamePrefix defines a string to be prefixed to all usernames.
If not provided, username claims other than ‘email’ are prefixed by the issuer URL to avoid clashes.
To skip any prefixing, provide the value ‘-’.
- Type: string
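The constraints described above (two required fields, an HTTPS-only issuer URL, and at most one required claim) can be checked before creating a cluster. The following is a minimal sketch, not part of EKS Anywhere: the function name validate_oidc_spec is hypothetical, and it assumes the OIDCConfig spec has already been loaded into a Python dict.

```python
from urllib.parse import urlparse


def validate_oidc_spec(spec: dict) -> list[str]:
    """Return a list of problems found in an OIDCConfig spec (empty if none).

    Hypothetical helper: it only encodes the rules stated in this reference.
    """
    problems = []

    # clientId and issuerUrl are the two required fields.
    for field in ("clientId", "issuerUrl"):
        if not spec.get(field):
            problems.append(f"{field} is required")

    # Only the HTTPS scheme is accepted for the issuer URL.
    issuer = spec.get("issuerUrl", "")
    if issuer and urlparse(issuer).scheme != "https":
        problems.append("issuerUrl must use the https scheme")

    # Only one requiredClaims entry is supported at this time.
    if len(spec.get("requiredClaims") or []) > 1:
        problems.append("only one requiredClaims entry is supported")

    return problems
```

An empty list means the spec passed these checks; it does not guarantee that the identity provider itself is reachable or correctly configured.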
5.1.3.6 - GitOpsConfig configuration
Configuration reference for GitOps cluster management.
GitOps Support (Optional)
EKS Anywhere can create clusters that support GitOps configuration management with Flux.
In order to add GitOps support, you need to configure your cluster by updating the configuration file before creating the cluster.
We currently support two types of configurations: FluxConfig and GitOpsConfig.
Flux Configuration
The flux configuration spec has three optional fields, regardless of the chosen git provider.
Flux Configuration Spec Details
systemNamespace (optional)
- Description: Namespace in which to install the gitops components in your cluster. Defaults to flux-system.
- Type: string
clusterConfigPath (optional)
- Description: The path relative to the root of the git repository where EKS Anywhere will store the cluster configuration files. Defaults to the cluster name
- Type: string
branch (optional)
- Description: The branch to use when committing the configuration. Defaults to main.
- Type: string
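To make the three defaults above concrete, here is a small sketch of how they could be applied to a FluxConfig spec held as a Python dict. The helper name with_flux_defaults is hypothetical, and treating an empty string the same as a missing field is an assumption of this sketch, not documented behavior.

```python
def with_flux_defaults(spec: dict, cluster_name: str) -> dict:
    """Return a copy of a FluxConfig spec with the documented defaults filled in."""
    defaults = {
        "systemNamespace": "flux-system",   # default namespace for the gitops components
        "clusterConfigPath": cluster_name,  # defaults to the cluster name
        "branch": "main",                   # default branch for commits
    }
    merged = dict(spec)
    for key, value in defaults.items():
        # Assumption: a missing or empty value means "not set".
        if not merged.get(key):
            merged[key] = value
    return merged
```

For example, with_flux_defaults({"branch": "dev"}, "my-cluster-name") keeps the dev branch and fills in the other two fields.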
EKS Anywhere currently supports two git providers for FluxConfig: Github and Git.
Github provider
Please note that for the Flux config to work successfully with the Github provider, the environment variable EKSA_GITHUB_TOKEN needs to be set with a valid GitHub PAT.
This is a generic template with detailed descriptions below for reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  #GitOps Support
  gitOpsRef:
    name: my-github-flux-provider
    kind: FluxConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-github-flux-provider
spec:
  systemNamespace: "my-alternative-flux-system-namespace"
  clusterConfigPath: "path-to-my-clusters-config"
  branch: "main"
  github:
    personal: true
    repository: myClusterGitopsRepo
    owner: myGithubUsername
---
github Configuration Spec Details
repository (required)
- Description: The name of the repository where EKS Anywhere will store your cluster configuration, and sync it to the cluster. If the repository exists, we will clone it from the git provider; if it does not exist, we will create it for you.
- Type: string
owner (required)
- Description: The owner of the Github repository; either a Github username or Github organization name. The Personal Access Token used must belong to the owner if this is a personal repository, or have permissions over the organization if this is not a personal repository.
- Type: string
personal (optional)
- Description: Is the repository a personal or organization repository? If personal, this value is true; otherwise, false. If using an organizational repository (i.e. personal is false), the owner field will be used as the organization when authenticating to github.com.
- Default: true
- Type: boolean
Git provider
Before you create a cluster using the Git provider, you will need to set and export the EKSA_GIT_KNOWN_HOSTS and EKSA_GIT_PRIVATE_KEY environment variables.
EKSA_GIT_KNOWN_HOSTS
EKS Anywhere uses the provided known hosts file to verify the identity of the git provider when connecting to it with SSH.
The EKSA_GIT_KNOWN_HOSTS environment variable should be a path to a known hosts file containing entries for the git server to which you’ll be connecting.
For example, if you wanted to provide a known hosts file which allows you to connect to and verify the identity of github.com using a private key based on the key algorithm ecdsa, you can use the OpenSSH utility ssh-keyscan to obtain the known host entry used by github.com for the ecdsa key type.
EKS Anywhere supports ecdsa, rsa, and ed25519 key types, which can be specified via the sshKeyAlgorithm field of the git provider config.
ssh-keyscan -t ecdsa github.com >> my_eksa_known_hosts
This will produce a file which contains known-hosts entries for the ecdsa key type supported by github.com, mapping the host to the key-type and public key.
github.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg=
EKS Anywhere will use the content of the file at the path EKSA_GIT_KNOWN_HOSTS to verify the identity of the remote git server, and the provided known hosts file must contain an entry for the remote host and key type.
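The requirement that the known hosts file contain an entry for both the remote host and the key type can be checked ahead of time. The sketch below is illustrative only: has_known_host_entry and KEY_TYPE_PREFIXES are hypothetical names, the mapping from the sshKeyAlgorithm values to OpenSSH key-type strings is an assumption, and hashed host names or marker lines are not handled.

```python
# Assumed mapping from sshKeyAlgorithm values to the key-type strings
# that appear in the second column of an OpenSSH known hosts entry.
KEY_TYPE_PREFIXES = {
    "ecdsa": "ecdsa-sha2-",
    "rsa": "ssh-rsa",
    "ed25519": "ssh-ed25519",
}


def has_known_host_entry(known_hosts_text: str, host: str, algorithm: str) -> bool:
    """True if the text holds a plain (unhashed) entry for host and algorithm."""
    prefix = KEY_TYPE_PREFIXES[algorithm]
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "hosts key-type key" entry
        hosts, key_type = fields[0], fields[1]
        # The first field may list several comma-separated host names.
        if host in hosts.split(",") and key_type.startswith(prefix):
            return True
    return False
```

Applied to the ecdsa entry shown above, the check succeeds for github.com with the ecdsa algorithm and fails for rsa, which is the mismatch that would prevent host verification.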
EKSA_GIT_PRIVATE_KEY
The EKSA_GIT_PRIVATE_KEY environment variable should be a path to the private key file associated with a valid SSH public key registered with your Git provider.
This key must have permission to both read from and write to your repository.
The key can use the key algorithms rsa, ecdsa, and ed25519.
This key file must have restricted file permissions, allowing only the owner to read and write, such as octal permissions 600.
If your private key file is passphrase protected, you must also set EKSA_GIT_SSH_KEY_PASSPHRASE with that value.
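A quick pre-flight check for the environment variables and the file permission requirement might look like the following sketch. The function names are hypothetical, and interpreting "restricted" as "no permission bits for group or others" is an assumption that mode 600 satisfies.

```python
import os
import stat

# Variables the Git provider section says must be set and exported.
REQUIRED_GIT_ENV = ("EKSA_GIT_KNOWN_HOSTS", "EKSA_GIT_PRIVATE_KEY")


def missing_git_env(environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_GIT_ENV if not environ.get(name)]


def is_owner_only(path: str) -> bool:
    """True if group and others have no permissions on the file (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

For example, missing_git_env(os.environ) lists what still needs to be exported, and is_owner_only(os.environ["EKSA_GIT_PRIVATE_KEY"]) reports whether the key file is readable by anyone other than its owner.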
This is a generic template with detailed descriptions below for reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  #GitOps Support
  gitOpsRef:
    name: my-git-flux-provider
    kind: FluxConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-git-flux-provider
spec:
  systemNamespace: "my-alternative-flux-system-namespace"
  clusterConfigPath: "path-to-my-clusters-config"
  branch: "main"
  git:
    repositoryUrl: ssh://git@github.com/myAccount/myClusterGitopsRepo.git
    sshKeyAlgorithm: ecdsa
---
git Configuration Spec Details
repositoryUrl (required)
- Description: The URL of an existing repository where EKS Anywhere will store your cluster configuration and sync it to the cluster.
- Type: string
sshKeyAlgorithm (optional)
- Description: The SSH key algorithm of the private key specified via EKSA_GIT_PRIVATE_KEY. Defaults to ecdsa.
- Type: string
Supported SSH key algorithm types are ecdsa, rsa, and ed25519.
Be sure that this SSH key algorithm matches the private key file provided by EKSA_GIT_PRIVATE_KEY and that the known hosts entry for the key type is present in EKSA_GIT_KNOWN_HOSTS.
GitOps Configuration
Warning
GitOps Config will be deprecated in v0.11.0 in favor of the Flux Config described above.
Please note that for the GitOps config to work successfully, the environment variable EKSA_GITHUB_TOKEN needs to be set with a valid GitHub PAT.
This is a generic template with detailed descriptions below for reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  #GitOps Support
  gitOpsRef:
    name: my-gitops
    kind: GitOpsConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: GitOpsConfig
metadata:
  name: my-gitops
spec:
  flux:
    github:
      personal: true
      repository: myClusterGitopsRepo
      owner: myGithubUsername
      fluxSystemNamespace: ""
      clusterConfigPath: ""
GitOps Configuration Spec Details
flux (required)
- Description: Our supported gitops provider is flux. This is the only supported value.
- Type: object
Flux Configuration Spec Details
github (required)
- Description: github is the only currently supported git provider. This defines your github configuration to be used by EKS Anywhere and flux.
- Type: object
github Configuration Spec Details
repository (required)
- Description: The name of the repository where EKS Anywhere will store your cluster configuration, and sync it to the cluster.
If the repository exists, we will clone it from the git provider; if it does not exist, we will create it for you.
- Type: string
owner (required)
- Description: The owner of the Github repository; either a Github username or Github organization name. The Personal Access Token used must belong to the owner if this is a personal repository, or have permissions over the organization if this is not a personal repository.
- Type: string
personal (optional)
- Description: Is the repository a personal or organization repository? If personal, this value is true; otherwise, false. If using an organizational repository (i.e. personal is false), the owner field will be used as the organization when authenticating to github.com.
- Default: true
- Type: boolean
clusterConfigPath (optional)
- Description: The path relative to the root of the git repository where EKS Anywhere will store the cluster configuration files.
- Default: clusters/$MANAGEMENT_CLUSTER_NAME
- Type: string
fluxSystemNamespace (optional)
- Description: Namespace in which to install the gitops components in your cluster.
- Default: flux-system
- Type: string
branch (optional)
- Description: The branch to use when committing the configuration.
- Default: main
- Type: string
5.1.3.7 - Proxy configuration
EKS Anywhere cluster yaml specification proxy configuration reference
Proxy support (optional)
You can configure EKS Anywhere to use a proxy to connect to the Internet. This is the generic template with proxy configuration for your reference:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  proxyConfiguration:
    httpProxy: http-proxy-ip:port
    httpsProxy: https-proxy-ip:port
    noProxy:
      - list of no proxy endpoints
Proxy Configuration Spec Details
proxyConfiguration (required)
- Description: top level key; required to use proxy.
- Type: object
httpProxy (required)
- Description: HTTP proxy to use to connect to the internet; must be in the format IP:port
- Type: string
- Example:
httpProxy: 192.168.0.1:3218
httpsProxy (required)
- Description: HTTPS proxy to use to connect to the internet; must be in the format IP:port
- Type: string
- Example:
httpsProxy: 192.168.0.1:3218
noProxy (optional)
- Description: list of endpoints that should not be routed through the proxy; can be an IP, CIDR block, or a domain name
- Type: list of strings
- Example:
noProxy:
  - localhost
  - 192.168.0.1
  - 192.168.0.0/16
  - .example.com
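To illustrate the field formats above, the sketch below validates the IP:port form required for httpProxy and httpsProxy, and shows one plausible way a noProxy list of IPs, CIDR blocks, and domain names could be matched against a destination. Both function names are hypothetical, and the matching rules (exact match, CIDR containment, leading-dot suffix match) are assumptions for illustration; the components that actually consume noProxy may behave differently.

```python
import ipaddress


def is_ip_port(value: str) -> bool:
    """True if value has the form IP:port, e.g. 192.168.0.1:3218 (IPv4 only)."""
    host, sep, port = value.rpartition(":")
    if not sep or not port.isdecimal() or not 0 < int(port) < 65536:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def bypasses_proxy(host: str, no_proxy: list[str]) -> bool:
    """Assumed matching rules: CIDR containment, '.domain' suffix, or exact match."""
    for entry in no_proxy:
        if "/" in entry:
            # CIDR block: matches only when host is an IP inside the network.
            try:
                if ipaddress.ip_address(host) in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # host is a name, or entry is not a valid network
        elif entry.startswith("."):
            # Leading dot: matches any name under that domain.
            if host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False
```

With the example list above, 192.168.5.4 and api.example.com would bypass the proxy under these assumed rules, while 10.0.0.1 would not.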
5.1.3.8 - Registry Mirror configuration
EKS Anywhere cluster yaml specification for registry mirror configuration
Registry Mirror Support (optional)
You can configure EKS Anywhere to use a private registry as a mirror for pulling the required images.
The following cluster spec shows an example of how to configure registry mirror:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  registryMirrorConfiguration:
    endpoint: <private registry IP or hostname>
    port: <private registry port>
    caCertContent: |
      -----BEGIN CERTIFICATE-----
      MIIF1DCCA...
      ...
      es6RXmsCj...
      -----END CERTIFICATE-----
Registry Mirror Configuration Spec Details
registryMirrorConfiguration (required)
- Description: top level key; required to use a private registry.
- Type: object
endpoint (required)
- Description: IP address or hostname of the private registry for pulling images
- Type: string
- Example:
endpoint: 192.168.0.1
port (optional)
- Description: Port for the private registry. This is an optional field. If a port is not specified, the default HTTPS port 443 is used.
- Type: string
- Example:
port: 443
caCertContent (optional)
- Description: Certificate Authority (CA) certificate for the private registry. When using self-signed certificates, it is necessary to pass this parameter in the cluster spec.
It is also possible to configure caCertContent by exporting an environment variable:
export EKSA_REGISTRY_MIRROR_CA="/path/to/certificate-file"
- Type: string
- Example:
caCertContent: |
  -----BEGIN CERTIFICATE-----
  MIIF1DCCA...
  ...
  es6RXmsCj...
  -----END CERTIFICATE-----
Import images into a private registry
You can use the import-images command to pull images from public.ecr.aws and push them to your private registry.
Starting with release 0.8, the import-images command also pulls the cilium chart from public.ecr.aws and pushes it to the registry mirror. It requires the registry credentials for performing a login. Set the following environment variables for the login:
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
docker login https://<private registry endpoint>
...
eksctl anywhere import-images -f cluster-spec.yaml
Docker configurations
It is necessary to add the private registry’s CA certificate to the list of CA certificates on the admin machine if your registry uses self-signed certificates.
For Linux, you can place your certificate here: /etc/docker/certs.d/<private-registry-endpoint>/ca.crt
For Mac, you can follow this guide to add the certificate to your keychain: https://docs.docker.com/desktop/mac/#add-tls-certificates
Note
You may need to restart Docker after adding the certificates.
Registry configurations
Depending on what registry you decide to use, you will need to create the following projects:
bottlerocket
eks-anywhere
eks-distro
isovalent
cilium-chart
For example, if a registry is available at private-registry.local, then the following projects will have to be created:
https://private-registry.local/bottlerocket
https://private-registry.local/eks-anywhere
https://private-registry.local/eks-distro
https://private-registry.local/isovalent
https://private-registry.local/cilium-chart
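The list above can be generated for any registry endpoint. This is a small sketch with hypothetical names (REQUIRED_PROJECTS, project_urls); appending a non-default port to the host is an assumption, since the example above only shows the default HTTPS port.

```python
# Project names taken from the list in this section.
REQUIRED_PROJECTS = ["bottlerocket", "eks-anywhere", "eks-distro", "isovalent", "cilium-chart"]


def project_urls(endpoint: str, port: str = "443") -> list[str]:
    """Return the project URLs to create on the registry at endpoint."""
    # Assumption: the port is omitted from the URL when it is the default HTTPS port.
    host = endpoint if port == "443" else f"{endpoint}:{port}"
    return [f"https://{host}/{project}" for project in REQUIRED_PROJECTS]
```

Calling project_urls("private-registry.local") reproduces the five URLs listed above.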
5.2 - Bare Metal
Preparing a Bare Metal provider for EKS Anywhere
5.2.1 - Requirements for EKS Anywhere on Bare Metal
Bare Metal provider requirements for EKS Anywhere
To run EKS Anywhere on Bare Metal, you need to meet the hardware and networking requirements described below.
Administrative machine
Set up an Administrative machine as described in Install EKS Anywhere.
Compute server requirements
The minimum number of physical machines needed to run EKS Anywhere in a non-production mode is:
- Control plane physical machines: 1
- Worker physical machines: 1
The recommended number of physical machines for production is at least:
- Control plane physical machines: 3
- Worker physical machines: 2
You will need an additional, temporary machine for each control plane node grouping and worker node grouping later when you go to upgrade a node.
That machine must have the same specs as all the machines in that group.
This comes from the need to use the same template to populate data on the disks for all nodes in a group.
The compute hardware you need for your Bare Metal cluster must meet the following capacity requirements:
- CPU: 2
- Memory: 8GB RAM
- Storage: 25GB
Network requirements
Each machine should include the following features:
- Network Interface Cards: At least one NIC is required. It must be capable of netbooting from PXE.
- IPMI integration (recommended): An IPMI implementation (such as Dell iDRAC, Redfish-compatible, legacy, or HP iLO) on the computer’s motherboard or on a separate expansion card. This feature is used to allow remote management of the machine, such as turning the machine on and off.
NOTE: IPMI is not required for an EKS Anywhere cluster. However, without IPMI, upgrades are not supported and you will have to physically turn machines off and on when appropriate.
Here are other network requirements:
- All EKS Anywhere machines, including the Admin, control plane and worker machines, must be on the same layer 2 network and have network connectivity to the BMC (IPMI, Redfish, and so on). The hardware does not need to be on the same layer 2 network as the BMC, but the Admin machine and management cluster do need routes configured so they can communicate with the BMC API.
- You must be able to run DHCP on the control plane/worker machine network.
NOTE: If you have another DHCP service running on the network, you need to prevent it from interfering with the EKS Anywhere DHCP service. You can do that by configuring the other DHCP service to explicitly block all MAC addresses and exclude all IP addresses that you plan to use with your EKS Anywhere clusters.
- The administrative machine and the target workload environment will need network access to:
- public.ecr.aws
- anywhere-assets.eks.amazonaws.com: To download the EKS Anywhere binaries, manifests and OVAs
- distro.eks.amazonaws.com: To download EKS Distro binaries and manifests
- d2glxqk2uabbnd.cloudfront.net: For EKS Anywhere and EKS Distro ECR container images
- Two IP addresses routable from the cluster, but excluded from DHCP offering. One IP address is to be used as the Control Plane Endpoint IP or kube-vip VIP address. The other is for the Tinkerbell IP address on the target cluster. Below are some suggestions to ensure that these IP addresses are never handed out by your DHCP server. You may need to contact your network engineer to manage these addresses.
- Pick IP addresses reachable from the cluster subnet that are excluded from the DHCP range, or
- Create an IP reservation for these addresses on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address.
NOTE: When you set up your cluster configuration YAML file, the endpoint and Tinkerbell addresses are set in the ControlPlaneConfiguration.endpoint.host and tinkerbellIP fields, respectively.
- Ports must be open to the Admin machine and cluster machines as described in Ports and protocols.
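The first suggestion for the two reserved addresses (choosing IPs that sit outside the DHCP range) can be sanity-checked with the standard ipaddress module. This is a sketch under stated assumptions: the function name reserved_ip_problems and the example DHCP range are hypothetical, and it does not cover the alternative of creating reservations on the DHCP server.

```python
import ipaddress


def reserved_ip_problems(dhcp_start: str, dhcp_end: str, reserved: dict[str, str]) -> list[str]:
    """Report reserved addresses that collide with each other or with the DHCP range."""
    start = ipaddress.ip_address(dhcp_start)
    end = ipaddress.ip_address(dhcp_end)
    problems = []
    # The endpoint and Tinkerbell addresses must be two different IPs.
    if len(set(reserved.values())) != len(reserved):
        problems.append("reserved addresses must be distinct")
    for name, value in reserved.items():
        ip = ipaddress.ip_address(value)
        if start <= ip <= end:
            problems.append(f"{name} {ip} falls inside the DHCP range")
    return problems
```

For instance, with a hypothetical DHCP range of 10.10.50.100 to 10.10.50.200, the pair {"endpoint host": "10.10.50.10", "tinkerbellIP": "10.10.50.11"} produces no problems.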
Validated hardware
Through extensive testing in a variety of on premises customer environments during our beta phase, we expect Amazon EKS Anywhere on bare metal to run on most generic hardware that meets the above requirements.
In addition, we have collaborated with our hardware original equipment manufacturer (OEM) partners to provide you a list of validated hardware:
Bare metal servers | IPMI | NIC | OS
Dell PowerEdge R740 | iDRAC9 | Mellanox ConnectX-4 LX 25GbE | Validated with Ubuntu v20.04.1
Dell PowerEdge R640 (PowerFlex) | iDRAC9 | Mellanox ConnectX-4 LX 25GbE | Validated with Ubuntu v20.04.1
SuperServer SYS-510P-M | IPMI2.0/Redfish API | Intel® Ethernet Controller i350 2x 1GbE | Validated with Ubuntu v20.04.1 and Bottlerocket v1.8.0
Dell PowerEdge R240 | iDRAC9 | Broadcom 57414 Dual Port 10/25GbE | Validated with Ubuntu v20.04 and Bottlerocket v1.8.0
HPE ProLiant DL20 | iLO5 | HPE 361i 1G | Validated with Ubuntu v20.04 and Bottlerocket v1.8.0
HPE ProLiant DL160 Gen10 | iLO5 | HPE Eth 10/25Gb 2P 640SFP28 A | Validated with Ubuntu v20.04.1
Dell PowerEdge R340 | iDRAC9 | Broadcom 57416 Dual Port 10GbE | Validated with Ubuntu v20.04.1 and Bottlerocket v1.8.0
HPE ProLiant DL360 | iLO5 | HPE Ethernet 1Gb 4-port 331i | Validated with Ubuntu v20.04.1
Lenovo ThinkSystem SR650 V2 | XClarity Controller Enterprise v7.92 | Intel I350 1GbE RJ45 4-port OCP; Marvell QL41232 10/25GbE SFP28 2-Port PCIe Ethernet Adapter | Validated with Ubuntu v20.04.1
5.2.2 - Preparing Bare Metal for EKS Anywhere
Set up a Bare Metal cluster to prepare it for EKS Anywhere
After gathering the hardware described in Bare Metal Requirements, you need to prepare the hardware and create a CSV file describing that hardware.
Prepare hardware
To prepare your computer hardware for EKS Anywhere, you need to connect your computer hardware and do some configuration.
Once the hardware is in place, you need to:
- Obtain IP and MAC addresses for your machines' NICs.
- Obtain IP addresses for your machines' IPMI interfaces.
- Obtain the gateway address for your network to reach the Internet.
- Obtain the IP addresses for your DNS servers.
- Make sure the following settings are in place:
- UEFI is enabled on all target cluster machines
- PXE boot is enabled for the NIC on each machine for which you provided the MAC address. This is the interface on which the operating system will be provisioned.
- PXE is set as the first device in each machine’s boot order
- IPMI over LAN is enabled on the IPMI interfaces
- Go to the IPMI settings for each machine and set the IP address (bmc_ip), username (bmc_username), and password (bmc_password) to use later in the CSV file.
Prepare hardware inventory
Create a CSV file to provide information about all physical machines that you are ready to add to your target Bare Metal cluster.
This file will be used:
- When you generate the hardware file to be included in the cluster creation process described in the Create Bare Metal production cluster Getting Started guide.
- To provide information that is passed to each machine from the Tinkerbell DHCP server when the machine is initially PXE booted.
The following is an example of an EKS Anywhere Bare Metal hardware CSV file:
hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
eksa-cp01,10.10.44.1,root,PrZ8W93i,CC:48:3A:00:00:01,10.10.50.2,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-cp02,10.10.44.2,root,Me9xQf93,CC:48:3A:00:00:02,10.10.50.3,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-cp03,10.10.44.3,root,Z8x2M6hl,CC:48:3A:00:00:03,10.10.50.4,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-wk01,10.10.44.4,root,B398xRTp,CC:48:3A:00:00:04,10.10.50.5,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=worker,/dev/sda
eksa-wk02,10.10.44.5,root,w7EenR94,CC:48:3A:00:00:05,10.10.50.6,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=worker,/dev/sda
The CSV file is a comma-separated list of values in a plain text file, holding information about the physical machines in the datacenter that are intended to be a part of the cluster creation process.
Each line represents a physical machine (not a virtual machine).
The following sections describe each value.
hostname
The hostname assigned to the machine.
bmc_ip
The IP address assigned to the IPMI interface on the machine.
bmc_username
The username assigned to the IPMI interface on the machine.
bmc_password
The password associated with the bmc_username assigned to the IPMI interface on the machine.
mac
The MAC address of the network interface card (NIC) that provides access to the host computer.
ip_address
The IP address providing access to the host computer.
netmask
The netmask associated with the ip_address value.
In the example above, a /23 subnet mask is used, allowing you to use up to 510 IP addresses in that range.
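The 510 figure follows from the netmask: 255.255.254.0 leaves 9 host bits, giving 2^9 = 512 addresses, minus the network and broadcast addresses. As a quick check, here is a sketch using Python's standard ipaddress module with the first host from the example CSV (the function name usable_hosts is hypothetical):

```python
import ipaddress


def usable_hosts(ip_address: str, netmask: str) -> int:
    """Number of assignable host addresses in the subnet containing ip_address."""
    network = ipaddress.ip_interface(f"{ip_address}/{netmask}").network
    # Subtract the network address and the broadcast address.
    return network.num_addresses - 2


network = ipaddress.ip_interface("10.10.50.2/255.255.254.0").network
print(network.prefixlen)                            # 23
print(usable_hosts("10.10.50.2", "255.255.254.0"))  # 510
```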
gateway
IP address of the interface that provides access (the gateway) to the Internet.
nameservers
The IP address of each server that you want to provide DNS service to the cluster. As in the example above, multiple addresses are separated with a pipe (|) character.
labels
The optional labels field can consist of a key/value pair to use in conjunction with the hardwareSelector field when you set up your Bare Metal configuration.
The key/value pair is connected with an equal (=) sign.
For example, a TinkerbellMachineConfig with a hardwareSelector containing type: cp will match entries in the CSV containing type=cp in its label definition.
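The matching rule described above can be sketched in a few lines. The function names parse_label and selector_matches are hypothetical, and the sketch handles only a single key=value pair per labels field, as in the example CSV; it is an illustration of the rule, not the matching code EKS Anywhere uses.

```python
def parse_label(field: str) -> dict[str, str]:
    """Parse a CSV labels field such as 'type=cp' into {'type': 'cp'}."""
    if not field:
        return {}  # the labels field is optional
    key, sep, value = field.partition("=")
    if not sep or not key:
        raise ValueError(f"label must look like key=value, got {field!r}")
    return {key: value}


def selector_matches(selector: dict[str, str], labels_field: str) -> bool:
    """True if every key/value in the hardwareSelector appears in the CSV label."""
    labels = parse_label(labels_field)
    return all(labels.get(key) == value for key, value in selector.items())
```

A selector of {"type": "cp"} matches the eksa-cp rows in the example CSV (type=cp) and not the eksa-wk rows (type=worker).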
disk
The device name of the disk on which the operating system will be installed.
For example, it could be /dev/sda for the first SCSI disk or /dev/nvme0n1 for the first NVMe storage device.
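Because a malformed inventory file is easy to produce by hand, it can help to lint the CSV before using it. The following sketch uses only the Python standard library; validate_hardware_csv is a hypothetical helper, and the specific checks (header order, colon-separated MAC format, pipe-separated nameservers, optional bmc_ip) are assumptions drawn from the example and field descriptions above rather than the validation EKS Anywhere performs.

```python
import csv
import io
import ipaddress
import re

EXPECTED_HEADER = ["hostname", "bmc_ip", "bmc_username", "bmc_password", "mac",
                   "ip_address", "netmask", "gateway", "nameservers", "labels", "disk"]
MAC_RE = re.compile(r"^(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def _is_ip(value: str) -> bool:
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def validate_hardware_csv(text: str) -> list[str]:
    """Return a list of problems found in hardware CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    for row in reader:
        line_no = reader.line_num
        # DictReader marks short rows with None values and long rows with a None key.
        if None in row or None in row.values():
            problems.append(f"line {line_no}: expected {len(EXPECTED_HEADER)} columns")
            continue
        if not MAC_RE.match(row["mac"]):
            problems.append(f"line {line_no}: invalid MAC {row['mac']!r}")
        addresses = [row["ip_address"], row["gateway"], *row["nameservers"].split("|")]
        if row["bmc_ip"]:  # assumption: BMC details may be empty when IPMI is not used
            addresses.append(row["bmc_ip"])
        for value in addresses:
            if not _is_ip(value):
                problems.append(f"line {line_no}: invalid IP address {value!r}")
        try:
            ipaddress.ip_network(f"0.0.0.0/{row['netmask']}")
        except ValueError:
            problems.append(f"line {line_no}: invalid netmask {row['netmask']!r}")
        if row["labels"] and "=" not in row["labels"]:
            problems.append(f"line {line_no}: label must look like key=value")
    return problems
```

Reading the file with open(path).read() and passing the text to validate_hardware_csv returns an empty list for the example inventory shown earlier.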
5.2.3 - Netbooting and Tinkerbell for Bare Metal
Overview of Netbooting and Tinkerbell for EKS Anywhere on Bare Metal
EKS Anywhere uses Tinkerbell to provision machines for a Bare Metal cluster.
Understanding what Tinkerbell is and how it works with EKS Anywhere can help you take advantage of advanced provisioning features or overcome provisioning problems you encounter.
As someone deploying an EKS Anywhere cluster on Bare Metal, you have several opportunities to interact with Tinkerbell:
- Create a hardware CSV file: You are required to create a hardware CSV file that contains an entry for every physical machine you want to add at cluster creation time.
- Create an EKS Anywhere cluster: By modifying the Bare Metal configuration file used to create a cluster, you can change some Tinkerbell settings or add actions to define how the operating system on each machine is configured.
- Monitor provisioning: You can follow along with the Tinkerbell Overview on this page to monitor the progress of your hardware provisioning, as Tinkerbell finds machines and attempts to PXE boot, configure, and restart them.
Using Tinkerbell on EKS Anywhere
The sections below step through how Tinkerbell is integrated with EKS Anywhere to deploy a Bare Metal cluster.
While based on features described in the Tinkerbell Documentation, EKS Anywhere has modified and added to Tinkerbell components such that the entire Tinkerbell stack is now Kubernetes-friendly and can run on a Kubernetes cluster.
The information that Tinkerbell uses to provision machines for the target EKS Anywhere cluster needs to be gathered in a CSV file with the following format:
hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
eksa-cp01,10.10.44.1,root,PrZ8W93i,CC:48:3A:00:00:01,10.10.50.2,255.255.254.0,10.10.50.1,8.8.8.8,type=cp,/dev/sda
...
Each physical, bare metal machine is represented by a comma-separated list of information on a single line.
It includes information needed to identify each machine (the NIC’s MAC address), PXE boot the machine, point to the disk to install on, and then configure and start the installed system.
See Preparing hardware inventory for details on the content and format of that file.
Modify the cluster specification file
Before you create a cluster using the Bare Metal configuration file, you can make Tinkerbell-related changes to that file.
In particular, TinkerbellDatacenterConfig fields, TinkerbellMachineConfig fields, and Tinkerbell Actions can be added or modified.
Tinkerbell actions vary based on the operating system you choose for your EKS Anywhere cluster.
Actions are stored internally and not shown in the generated cluster specification file, so you must add those sections yourself to change from the defaults (see Ubuntu TinkerbellTemplateConfig example and Bottlerocket TinkerbellTemplateConfig example for details).
In most cases, you don’t need to touch the default actions.
However, you might want to modify an action (for example to change kexec to a reboot action if the hardware requires it) or add an action to further configure the installed system.
Examples in Advanced Bare Metal cluster configuration show a few actions you might want to add.
Once you have made all your modifications, you can go ahead and create the cluster.
The next section describes how Tinkerbell works during cluster creation to provision your Bare Metal machines and prepare them to join the EKS Anywhere cluster.
Overview of Tinkerbell in EKS Anywhere
When you run the command to create an EKS Anywhere Bare Metal cluster, a set of Tinkerbell components start up on the Admin machine.
One of these components runs in a container on Docker, while other components run as either controllers or services in pods on the Kubernetes kind cluster that is started up on the Admin machine.
Tinkerbell components include boots, hegel, rufio, and tink.
Tinkerbell boots service
The boots service runs in a single container to handle the DHCP service and Netbooting activities.
In particular, boots hands out IP addresses, serves iPXE binaries via HTTP and TFTP, delivers an iPXE script to the provisioned machines, and runs a syslog server.
Boots is different from the other Tinkerbell services because the DHCP service it runs must listen directly to layer 2 traffic.
(The kind cluster running on the Admin machine doesn’t have the ability to have pods listening on layer 2 networks, which is why boots is run directly on Docker instead, with host networking enabled.)
Because boots is running as a container in Docker, you can see its output by viewing the logs of the boots container with the docker logs command.
From the logs output, you will see iPXE try to netboot each machine.
If the process doesn’t get all the information it wants from the DHCP server, it will time out.
You can see iPXE loading variables, loading a kernel and initramfs (via DHCP), then booting into that kernel and initramfs: in other words, you will see everything that happens with iPXE before it switches over to the kernel and initramfs.
The kernel, initramfs, and all images retrieved later are obtained remotely over HTTP and HTTPS.
Tinkerbell hegel, rufio, and tink components
After boots comes up on Docker, a small Kubernetes kind cluster starts up on the Admin machine.
Other Tinkerbell components run as pods on that kind cluster. Those components include:
- hegel: Manages Tinkerbell’s metadata service.
The hegel service gets its metadata from the hardware specification stored in Kubernetes in the form of custom resources.
The format that it serves is similar to the EC2 metadata format.
- rufio: Handles talking to BMCs (which manage things like starting and stopping systems with IPMI).
The rufio Kubernetes controller sets things such as power state, persistent boot order, and eventually other services (like NTP, LDAP, and TLS certificates).
BMC authentication is managed with Kubernetes secrets.
- tink: The tink service consists of three components: tink server, tink controller, and tink worker.
The tink controller manages hardware data, templates you want to execute, and the workflows that each target the specific hardware you are provisioning.
The tink worker is a small binary that runs inside of HookOS and talks to the tink server.
The worker sends the tink server its MAC address and asks the server for workflows to run.
The tink worker will then go through each action, one-by-one, and try to execute it.
To see those services and controllers running on the kind bootstrap cluster, type:
kubectl get pods -n eksa-system
NAME READY STATUS RESTARTS AGE
hegel-sbchp 1/1 Running 0 3d
rufio-controller-manager-5dcc568c79-9kllz 1/1 Running 0 3d
tink-controller-manager-54dc786db6-tm2c5 1/1 Running 0 3d
tink-server-5c494445bc-986sl 1/1 Running 0 3d
Provisioning hardware with Tinkerbell
After you start up the cluster create process, the following is the general workflow that Tinkerbell performs to begin provisioning the bare metal machines and prepare them to become part of the EKS Anywhere target cluster.
You can set up kubectl on the Admin machine to access the bootstrap cluster and follow along:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
Power up the nodes
Tinkerbell starts by finding a node from the hardware list (based on MAC address) and contacting it to identify a baseboard management job (BMJ) that runs a set of baseboard management tasks (BMT).
To see that information, list the bmj and bmt resources across all namespaces:
kubectl get bmj -A
NAMESPACE NAME AGE
eksa-system mycluster-md-0-1656099863422-vxh2-provision 12m
kubectl get bmt -A
NAMESPACE NAME AGE
eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-0 55s
eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-1 51s
eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-2 47s
The following shows snippets from the bmt output that represent the three tasks: Power Off, enable PXE boot, and Power On.
kubectl describe bmt -n eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-0
...
Task:
  Power Action: Off
Status:
  Completion Time: 2022-06-27T20:32:59Z
  Conditions:
    Status: True
    Type: Completed
kubectl describe bmt -n eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-1
...
Task:
  One Time Boot Device Action:
    Device:
      pxe
    Efi Boot: true
Status:
  Completion Time: 2022-06-27T20:33:04Z
  Conditions:
    Status: True
    Type: Completed
kubectl describe bmt -n eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-2
Task:
  Power Action: on
Status:
  Completion Time: 2022-06-27T20:33:10Z
  Conditions:
    Status: True
    Type: Completed
Rufio converts the baseboard management jobs into task objects, then executes each task. To see the Rufio logs, type:
kubectl logs -n eksa-system rufio-controller-manager-5dcc568c79-9kllz | less
PXE boots
Next the boots service PXE boots the machine and begins streaming the HookOS (vmlinuz
and initramfs
) to the machine.
HookOS runs in memory and provides the installation environment.
To watch the boots log messages as each node boots, type:
docker logs boots
You can search the output for vmlinuz
and initramfs
to watch as the HookOS is downloaded and booted from memory on each machine.
Running workflows
Once the HookOS is up, Tinkerbell begins running the tasks and actions contained in the workflows.
This is coordinated between the tink worker, running in memory within the HookOS on the machine, and the tink server on the kind cluster.
To see the workflows being run, type the following:
kubectl get workflows.tinkerbell.org -n eksa-system
NAME TEMPLATE STATE
mycluster-md-0-1656099863422-vxh2 mycluster-md-0-1656099863422-vxh2 STATE_RUNNING
This shows the workflow for the first machine that is being provisioned.
Add -o yaml
to see details of that workflow template:
kubectl get workflows.tinkerbell.org -n eksa-system -o yaml
...
status:
  state: STATE_RUNNING
  tasks:
  - actions:
    - environment:
        COMPRESSED: "true"
        DEST_DISK: /dev/sda
        IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/ubuntu-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.gz
      image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-11
      name: stream-image
      seconds: 35
      startedAt: "2022-06-27T20:37:39Z"
      status: STATE_SUCCESS
...
You can see that the first action in the workflow is to stream (stream-image
) the operating system to the destination disk (DEST_DISK
) on the machine.
In this example, the Ubuntu operating system that will be copied to disk (/dev/sda
) is being served from the location specified by IMG_URL.
The action was successful (STATE_SUCCESS) and it took 35 seconds.
Each action and its status is shown in this output for the whole workflow.
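To scan a long workflow for actions that have not yet succeeded, the YAML output can be reduced to one line per action. The following is a minimal sketch and not part of EKS Anywhere: the function name workflow_action_status is hypothetical, and it assumes each action lists its name before its status, as in the sample output above.

```shell
# Hypothetical helper: print "<action-name> <status>" for each action found in
# workflow YAML read from stdin. Assumes that `name:` precedes `status:`
# within each action, as in the sample output above.
workflow_action_status() {
  awk '
    # Remember the most recent name, with or without a leading list dash.
    $1 == "name:"              { name = $2 }
    $1 == "-" && $2 == "name:" { name = $3 }
    # Only action-level status lines carry a STATE_ value.
    $1 == "status:" && $2 ~ /^STATE_/ { print name, $2 }
  '
}

# Usage (requires kubectl access to the bootstrap cluster):
#   kubectl get workflows.tinkerbell.org -n eksa-system -o yaml | workflow_action_status
```

Piping the result through grep -v STATE_SUCCESS would leave only the actions that are still pending, running, or failed.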
To see details of the default actions for each supported operating system, see the Ubuntu TinkerbellTemplateConfig example
and Bottlerocket TinkerbellTemplateConfig example
.
In general, the actions include:
- Streaming the operating system image to disk on each machine.
- Configuring the network interfaces on each machine.
- Setting up the cloud-init or similar service to add users and otherwise configure the system.
- Identifying the data source to add to the system.
- Setting the kernel to pivot to the installed system (using kexec) or having the system reboot to bring up the installed system from disk.
If all goes well, you will see all actions set to STATE_SUCCESS, except for the kexec-image action. That should show as STATE_RUNNING for as long as the machine is running.
You can review the CAPT logs to see provisioning activity.
For example, at the start of a new provisioning event, you would see something like the following:
kubectl logs -n capt-system capt-controller-manager-9f8b95b-frbq | less
..."Created BMCJob to get hardware ready for provisioning"...
You can follow this output to see the machine as it goes through the provisioning process.
After the node is initialized, completes all the Tinkerbell actions, and is booted into the installed operating system (Ubuntu or Bottlerocket), the new system starts cloud-init to do further configuration.
At this point, the system will reach out to the Tinkerbell hegel service to get the hegel metadata.
If something goes wrong, viewing hegel files can help you understand why a stuck system that has booted into Ubuntu has not joined the cluster yet.
To see the hegel files, get the internal IP address for one of the new nodes. Then find the name of the hegel pod and display its log, searching for the IP address of the node:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP ...
eksa-da04 Ready control-plane,master 9m5s v1.22.10-eks-7dc61e8 10.80.30.23
kubectl get pods -n eksa-system | grep hegel
hegel-n7ngs
kubectl logs -n eksa-system hegel-n7ngs
..."Retrieved IP peer IP..."userIP":"10.80.30.23...
If the log shows you are getting requests from the node, the problem is not a cloud-init issue.
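Searching the hegel log by eye is error-prone when several nodes share an address prefix. The sketch below counts log lines that mention one exact IPv4 address. The function name count_requests_from is hypothetical, and the only assumption about the log format is that the node IP appears in each request line, as in the snippet above.

```shell
# Hypothetical helper: count lines on stdin that mention exactly the given
# IPv4 address (10.80.30.2 does not match 10.80.30.23).
count_requests_from() {
  # Escape the dots so they match literally in the extended regex.
  ip_re=$(printf '%s' "$1" | sed 's/\./\\./g')
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c -E "(^|[^0-9.])${ip_re}([^0-9.]|\$)" || true
}

# Usage (pod name and IP address taken from the example above):
#   kubectl logs -n eksa-system hegel-n7ngs | count_requests_from 10.80.30.23
```

A count of 0 suggests the node never reached hegel, which points toward a network or cloud-init problem on the node rather than a problem in hegel itself.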
After the first machine successfully completes the workflow, each other machine repeats the same process until the initial set of machines is all up and running.
Tinkerbell moves to target cluster
Once the initial set of machines is up and the EKS Anywhere cluster is running, all the Tinkerbell services and components (including boots) are moved to the new target cluster and run as pods on that cluster.
Those services are deleted on the kind cluster on the Admin machine.
Reviewing the status
At this point, you can change your kubectl credentials to point at the new target cluster to get information about Tinkerbell services on the new cluster. For example:
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
First check that the Tinkerbell pods are all running by listing pods from the eksa-system namespace:
kubectl get pods -n eksa-system
NAME READY STATUS RESTARTS AGE
boots-5dc66b5d4-klhmj 1/1 Running 0 3d
hegel-sbchp 1/1 Running 0 3d
rufio-controller-manager-5dcc568c79-9kllz 1/1 Running 0 3d
tink-controller-manager-54dc786db6-tm2c5 1/1 Running 0 3d
tink-server-5c494445bc-986sl 1/1 Running 0 3d
Next, check the list of Tinkerbell machines.
If all of the machines were provisioned successfully, you should see true
under the READY column for each one.
kubectl get tinkerbellmachine -A
NAMESPACE NAME CLUSTER STATE READY INSTANCEID MACHINE
eksa-system mycluster-control-plane-template-1656099863422-pqq2q mycluster true tinkerbell://eksa-system/eksa-da04 mycluster-72p72
You can also check the machines themselves.
Watch the PHASE change from Provisioning to Provisioned to Running.
The Running phase indicates that the machine is now running as a node on the new cluster:
kubectl get machines -n eksa-system
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
mycluster-72p72 mycluster eksa-da04 tinkerbell://eksa-system/eksa-da04 Running 7m25s v1.22.10-eks-1-22-8
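With many machines, it can help to list only those that have not yet reached the Running phase. This is a rough sketch with a hypothetical function name. It matches the word Running anywhere on the line instead of relying on column positions, because empty columns (such as NODENAME during provisioning) shift the fields in the plain-text output.

```shell
# Hypothetical helper: from `kubectl get machines` table output on stdin,
# print the names of machines whose row does not contain the word "Running".
machines_not_running() {
  awk 'NR > 1 && $0 !~ /(^|[ \t])Running([ \t]|$)/ { print $1 }'
}

# Usage (requires kubectl access to the target cluster):
#   kubectl get machines -n eksa-system | machines_not_running
```

Empty output means every listed machine reports the Running phase.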
Once you have confirmed that all your machines are successfully running as nodes on the target cluster, there is not much for Tinkerbell to do.
It stays around to continue running the DHCP service and to be available to add more machines to the cluster.
5.2.4 - Customize HookOS for EKS Anywhere on Bare Metal
Customizing HookOS for EKS Anywhere on Bare Metal
To initially PXE boot bare metal machines used in EKS Anywhere clusters, Tinkerbell acquires a kernel and initial ramdisk that is referred to as the HookOS.
A default HookOS is provided when you create an EKS Anywhere cluster.
However, there may be cases where you want to override the default HookOS, such as to add drivers required to boot your particular type of hardware.
The following procedure describes how to get the Tinkerbell stack’s Hook/Linuxkit OS built locally.
For more information on Tinkerbell’s Hook Installation Environment, see the Tinkerbell Hook repo
.
-
Clone the hook repo or your fork of that repo:
git clone https://github.com/tinkerbell/hook.git
cd hook/
-
Pull down the commit that EKS Anywhere is tracking for Hook:
git checkout -b <new-branch> 029ef8f0711579717bfd14ac5eb63cdc3e658b1d
NOTE: This commit number can be obtained from the EKS-A build tooling repo
.
-
Make changes shown in the following diff
in the Makefile
located in the root of the repo using your favorite editor.
diff --git a/Makefile b/Makefile
index 66b7f48..f9fc283 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,4 @@
-ORG ?= quay.io/tinkerbell
+ORG ?= localhost:5000/tinkerbell
ARCH := $(shell uname -m)
GIT_VERSION ?= $(shell git log -1 --format="%h")
@@ -53,13 +53,13 @@ dev-bootkitBuild:
cd bootkit; docker buildx build --load -t $(ORG)/hook-bootkit:0.0 .
bootkitBuild:
- cd bootkit; docker buildx build --platform linux/amd64,linux/arm64 --push -t $(ORG)/hook-bootkit:0.0 .
+ cd bootkit; docker buildx build --platform linux/amd64 --push -t $(ORG)/hook-bootkit:0.0 .
dev-tink-dockerBuild:
cd tink-docker; docker buildx build --load -t $(ORG)/hook-docker:0.0 .
tink-dockerBuild:
- cd tink-docker; docker buildx build --platform linux/amd64,linux/arm64 --push -t $(ORG)/hook-docker:0.0 .
+ cd tink-docker; docker buildx build --platform linux/amd64 --push -t $(ORG)/hook-docker:0.0 .
The changes above set the ORG variable to a local registry (localhost:5000) and limit the docker build command to the immediately required platform to save time.
-
Modify the hook.yaml
file located in the root of the repo with the following changes:
diff --git a/hook.yaml b/hook.yaml
index 0c5d789..b51b35e 100644
--- a/hook.yaml
+++ b/hook.yaml
@@ -1,5 +1,5 @@
kernel:
- image: quay.io/tinkerbell/hook-kernel:5.10.85
+ image: localhost:5000/tinkerbell/hook-kernel:5.10.85
cmdline: "console=tty0 console=ttyS0 console=ttyAMA0 console=ttysclp0"
init:
- linuxkit/init:v0.8
@@ -42,7 +42,7 @@ services:
binds:
- /var/run:/var/run
- name: docker
- image: quay.io/tinkerbell/hook-docker:0.0
+ image: localhost:5000/tinkerbell/hook-docker:0.0
capabilities:
- all
net: host
@@ -64,7 +64,7 @@ services:
- /var/run/docker
- /var/run/worker
- name: bootkit
- image: quay.io/tinkerbell/hook-bootkit:0.0
+ image: localhost:5000/tinkerbell/hook-bootkit:0.0
capabilities:
- all
The changes above switch the hook-docker, hook-bootkit, and hook-kernel images to the local registry (localhost:5000).
NOTE: You may also need to modify the hook.yaml
file if you want to add or change components that are used to build up the image. So far, for example, we have needed to change versions of init
and getty
and inject SSH keys. Take a look at the LinuxKit Examples
site for examples.
-
If you are only making changes to bootkit or tink-docker, make your planned custom modifications to the files under hook.
-
If you are modifying the kernel, such as to change kernel config parameters to add or modify drivers, follow these steps:
- Change into kernel directory and make a local image for amd64 architecture:
cd kernel; make kconfig_amd64
docker run --rm -ti -v $(pwd):/src:z quay.io/tinkerbell/kconfig
- You can now navigate to the source code and run the UI for configuring the kernel:
cd linux-5-10
make menuconfig
- Once you have changed the necessary kernel configuration parameters, copy the new configuration:
cp .config /src/config-5.10.x-x86_64
Exit out of container into the repo’s kernel directory and run make:
/linux-5.10.85 # exit
user1 % make
-
Install Linuxkit based on instructions from the LinuxKit
page.
-
Ensure that the linuxkit
tool is in your PATH:
export PATH=$PATH:/home/tink/linuxkit/bin
-
Start a local registry:
docker run -d -p 5000:5000 --name registry registry:2
-
Compile by running the following in the root of the repo:
make dist
-
Artifacts will be put under the dist
directory in the repo’s root:
./initramfs-aarch64
./initramfs-x86_64
./vmlinuz-aarch64
./vmlinuz-x86_64
-
To use the kernel (vmlinuz
) and initial ram disk (initramfs
) when you build your cluster, see the description of the hookImagesURLPath
field in your Bare Metal configuration
file.
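The Makefile and hook.yaml edits in the procedure above can also be applied with sed instead of an editor. The following is a sketch under stated assumptions, not an official script: the function name use_local_registry is hypothetical, and the substitutions rewrite every matching occurrence in the file, which may touch more lines than the diffs show. Review the result with git diff before building.

```shell
# Hypothetical helper: rewrite FILE so that quay.io/tinkerbell references
# point at the local registry and multi-arch builds are reduced to amd64.
use_local_registry() {
  file=$1
  tmp=$(mktemp)
  # Write to a temp file first, then copy back so file permissions are kept.
  sed -e 's|quay\.io/tinkerbell|localhost:5000/tinkerbell|g' \
      -e 's|--platform linux/amd64,linux/arm64|--platform linux/amd64|' \
      "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage, from the root of the hook repo:
#   use_local_registry Makefile
#   use_local_registry hook.yaml
#   git diff
```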
5.3 - VMware vSphere
Preparing a VMware vSphere provider for EKS Anywhere
5.3.1 - Requirements for EKS Anywhere on VMware vSphere
Preparing a VMware vSphere provider for EKS Anywhere
To run EKS Anywhere, you will need:
Prepare Administrative machine
Set up an Administrative machine as described in Install EKS Anywhere
.
Prepare a VMware vSphere environment
To prepare a VMware vSphere environment to run EKS Anywhere, you need the following:
-
A vSphere 7+ environment running vCenter
-
Capacity to deploy 6-10 VMs
-
DHCP service
running in vSphere environment in the primary VM network for your workload cluster
-
One network in vSphere to use for the cluster. This network must have inbound access into vCenter
-
An OVA
imported into vSphere and converted into a template for the workload VMs
-
User credentials to create VMs and attach networks, etc
-
One IP address routable from cluster but excluded from DHCP offering.
This IP address is to be used as the Control Plane Endpoint IP or kube-vip VIP address
Below are some suggestions to ensure that this IP address is never handed out by your DHCP server.
You may need to contact your network engineer.
- Pick an IP address reachable from the cluster subnet that is excluded from the DHCP range, OR
- Alter the DHCP ranges to leave out one or more IP addresses at the top and/or the bottom of the range, OR
- Create an IP reservation for this IP on your DHCP server. This is usually accomplished by adding
a dummy mapping of this IP address to a non-existent MAC address.
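Before settling on an endpoint address, you can check on paper that it falls outside the DHCP pool. The sketch below does that comparison in plain shell arithmetic. The function names are hypothetical, and the addresses in the usage example are made-up values, not recommendations.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies within the inclusive range $2..$3.
ip_in_range() {
  ip=$(ip_to_int "$1")
  range_start=$(ip_to_int "$2")
  range_end=$(ip_to_int "$3")
  [ "$ip" -ge "$range_start" ] && [ "$ip" -le "$range_end" ]
}

# Usage with a hypothetical DHCP pool of 10.80.30.100-10.80.30.200:
if ip_in_range 10.80.30.23 10.80.30.100 10.80.30.200; then
  echo "10.80.30.23 is inside the DHCP pool; choose another address"
else
  echo "10.80.30.23 is outside the DHCP pool"
fi
```

This only checks the range arithmetic. It cannot tell whether the DHCP server has other pools or reservations, so confirm the result with your network engineer.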
Each VM will require:
- 2 vCPUs
- 8GB RAM
- 25GB Disk
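Combining the per-VM figures with the 6-10 VM capacity requirement gives the total resources to reserve. A small worked calculation, with a hypothetical function name:

```shell
# Print the aggregate resources for N VMs at 2 vCPUs, 8 GB RAM,
# and 25 GB disk each (the per-VM figures listed above).
eksa_vm_capacity() {
  n=$1
  echo "vcpus=$((n * 2)) ram_gb=$((n * 8)) disk_gb=$((n * 25))"
}

eksa_vm_capacity 6    # prints: vcpus=12 ram_gb=48 disk_gb=150
eksa_vm_capacity 10   # prints: vcpus=20 ram_gb=80 disk_gb=250
```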
The administrative machine and the target workload environment will need network access to:
- public.ecr.aws
- anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
- distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
- d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
- api.github.com (only if GitOps is enabled)
You need to get the following information before creating the cluster:
-
Static IP Addresses:
You will need one IP address for the management cluster control plane endpoint, and a separate one for the control plane of each workload cluster you add.
Let’s say you are going to have the management cluster and two workload clusters.
For those, you would need three IP addresses, one for each.
All of those addresses will be configured the same way in the configuration file you will generate for each cluster.
A static IP address will be used for each control plane VM in your EKS Anywhere cluster.
Choose IP addresses in your network range that do not conflict with other VMs and make sure they are excluded from your DHCP offering.
An IP address will be the value of the property controlPlaneConfiguration.endpoint.host
in the config file of the management cluster.
A separate IP address must be assigned for each workload cluster.

-
vSphere Datacenter Name: The vSphere datacenter to deploy the EKS Anywhere cluster on.

-
VM Network Name: The VM network to deploy your EKS Anywhere cluster on.

-
vCenter Server Domain Name: The vCenter server fully qualified domain name or IP address. If the server IP is used, the thumbprint must be set or insecure must be set to true.

-
thumbprint (required if insecure=false): The SHA1 thumbprint of the vCenter server certificate which is only required if you have a self-signed certificate for your vSphere endpoint.
There are several ways to obtain your vCenter thumbprint.
If you have govc installed
, you can run the following command in the Administrative machine terminal, and take a note of the output:
govc about.cert -thumbprint -k
-
template: The VM template to use for your EKS Anywhere cluster.
This template was created when you imported the OVA file into vSphere.

-
datastore: The vSphere datastore
to deploy your EKS Anywhere cluster on.

-
folder:
The folder parameter in VSphereMachineConfig allows you to organize the VMs of an EKS Anywhere cluster.
With this, each cluster can be organized as a folder in vSphere.
You will have a separate folder for the management cluster and each cluster you are adding.

-
resourcePool:
The vSphere resource pool for the VMs in your EKS Anywhere cluster. If there is a resource pool, specify it as: /<datacenter>/host/<resource-pool-name>/Resources
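For the thumbprint item above, govc is only one of the several ways mentioned. If govc is not installed, openssl can print the SHA1 fingerprint of the certificate that vCenter presents. This is a sketch that assumes openssl is available on the Administrative machine and that vCenter listens on port 443. The function name fingerprint_value is hypothetical, and <vcenter-host> is a placeholder.

```shell
# Hypothetical helper: strip the label from `openssl x509 -fingerprint`
# output (for example "SHA1 Fingerprint=AB:CD:..."), leaving only the digest.
fingerprint_value() {
  sed -n 's/^.*Fingerprint=//p'
}

# Usage (requires network access to vCenter):
#   openssl s_client -connect <vcenter-host>:443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -fingerprint -sha1 \
#     | fingerprint_value
```

Compare the value with what your vSphere administrator reports before trusting it, since a fingerprint fetched over the network is only as trustworthy as the connection used to fetch it.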

5.3.2 - Preparing vSphere for EKS Anywhere
Set up a vSphere cluster to prepare it for EKS Anywhere
Create a VM and template folder (Optional):
For each user that needs to create workload clusters, have the vSphere administrator create a VM and template folder.
That folder will host:
- The VMs of the Control plane and Data plane nodes of each cluster.
- A nested folder for the management cluster and another one for each workload cluster.
- Each cluster VM in its own nested folder under this folder.
User permissions should be set up to:
- Only allow the user to see and create EKS Anywhere resources in that folder and its nested folders.
- Prevent the user from having visibility and control over the whole vSphere cluster domain and its sub-child objects (datacenter, resource pools and other folders).
In your EKS Anywhere configuration file you will reference a path under this folder associated with the cluster you create.
Add a vSphere folder
Follow these steps to create the user’s vSphere folder:
- From vCenter, select the Menus/VM and Template tab.
- Select either a datacenter or another folder as a parent object for the folder that you want to create.
- Right-click the parent object and click New Folder.
- Enter a name for the folder and click OK.
For more details, see the vSphere Create a Folder
documentation.
Set up vSphere roles and user permission
You need a vSphere username with the right privileges to let you create EKS Anywhere clusters on top of your vSphere cluster.
You then need to import the latest release of the EKS Anywhere OVA template to your vSphere cluster to use it to provision your cluster nodes.
Add a vCenter User
Ask your vSphere administrator to add a vCenter user that will be used for the provisioning of the EKS Anywhere cluster in VMware vSphere.
- Log in with the vSphere Client to the vCenter Server.
- Specify the user name and password for a member of the vCenter Single Sign-On Administrators group.
- Navigate to the vCenter Single Sign-On user configuration UI.
- From the Home menu, select Administration.
- Under Single Sign On, click Users and Groups.
- If vsphere.local is not the currently selected domain, select it from the drop-down menu.
You cannot add users to other domains.
- On the Users tab, click Add.
- Enter a user name and password for the new user.
- The maximum number of characters allowed for the user name is 300.
- You cannot change the user name after you create a user.
The password must meet the password policy requirements for the system.
- Click Add.
For more details, see vSphere Add vCenter Single Sign-On Users
documentation.
Create and define user roles
When you add a user for creating clusters, that user initially has no privileges to perform management operations.
So you have to add this user to groups with the required permissions, or assign a role or roles with the required permission to this user.
Three roles are needed to be able to create the EKS Anywhere cluster:
-
Create a global custom role: For example, you could name this EKS Anywhere Global.
Define it for the user on the vCenter domain level and its children objects.
Create this role with the following privileges:
> Content Library
* Add library item
* Check in a template
* Check out a template
* Create local library
> vSphere Tagging
* Assign or Unassign vSphere Tag
* Assign or Unassign vSphere Tag on Object
* Create vSphere Tag
* Create vSphere Tag Category
* Delete vSphere Tag
* Delete vSphere Tag Category
* Edit vSphere Tag
* Edit vSphere Tag Category
* Modify UsedBy Field For Category
* Modify UsedBy Field For Tag
-
Create a user custom role: The second role is also a custom role that you could call, for example, EKS Anywhere User.
Define this role with the following objects and children objects.
- The resource pool level and its children objects.
This is the resource pool that the EKS Anywhere VMs will be part of.
- The storage object level and its children objects.
This is the storage that will be used to store the cluster VMs.
- The network VLAN object level and its children objects.
This is the network that will host the cluster VMs.
- The VM and Template folder level and its children objects.
Create this role with the following privileges:
> Content Library
* Add library item
* Check in a template
* Check out a template
* Create local library
> Datastore
* Allocate space
* Browse datastore
* Low level file operations
> Folder
* Create folder
> vSphere Tagging
* Assign or Unassign vSphere Tag
* Assign or Unassign vSphere Tag on Object
* Create vSphere Tag
* Create vSphere Tag Category
* Delete vSphere Tag
* Delete vSphere Tag Category
* Edit vSphere Tag
* Edit vSphere Tag Category
* Modify UsedBy Field For Category
* Modify UsedBy Field For Tag
> Network
* Assign network
> Resource
* Assign virtual machine to resource pool
> Scheduled task
* Create tasks
* Modify task
* Remove task
* Run task
> Profile-driven storage
* Profile-driven storage view
> Storage views
* View
> vApp
* Import
> Virtual machine
* Change Configuration
- Add existing disk
- Add new disk
- Add or remove device
- Advanced configuration
- Change CPU count
- Change Memory
- Change Settings
- Configure Raw device
- Extend virtual disk
- Modify device settings
- Remove disk
* Edit Inventory
- Create from existing
- Create new
- Remove
* Interaction
- Power off
- Power on
* Provisioning
- Clone template
- Clone virtual machine
- Create template from virtual machine
- Customize guest
- Deploy template
- Mark as template
- Read customization specifications
* Snapshot management
- Create snapshot
- Remove snapshot
- Revert to snapshot
-
Create a default Administrator role: The third role is the default system role Administrator, which you assign to the user on the folder level and its children objects (VMs and OVA templates) that the vSphere administrator created for you.
To create a role and define privileges, check the Create a vCenter Server Custom Role
and Defined Privileges
pages.
Deploy an OVA Template
If the user creating the cluster has permission and network access to create and tag a template, you can skip these steps because EKS Anywhere will automatically download the OVA and create the template if it can. If the user does not have the permissions or network access to create and tag the template, follow this guide. The OVA contains the operating system (Ubuntu or Bottlerocket) for a specific EKS-D Kubernetes release and EKS-A version. The following example uses Ubuntu as the operating system, but a similar workflow would work for Bottlerocket.
Steps to deploy the Ubuntu OVA
- Go to the artifacts
page and download the OVA template with the newest EKS-D Kubernetes release to your computer.
- Log in to the vCenter Server.
- Right-click the folder you created above and select Deploy OVF Template.
The Deploy OVF Template wizard opens.
- On the Select an OVF template page, select the Local file option, specify the location of the OVA template you downloaded to your computer, and click Next.
- On the Select a name and folder page, enter a unique name for the virtual machine or leave the default generated name, if you do not have other templates with the same name within your vCenter Server virtual machine folder.
The default deployment location for the virtual machine is the inventory object where you started the wizard, which is the folder you created above. Click Next.
- On the Select a compute resource page, select the resource pool in which to run the deployed VM template, and click Next.
- On the Review details page, verify the OVF or OVA template details and click Next.
- On the Select storage page, select a datastore to store the deployed OVF or OVA template and click Next.
- On the Select networks page, select a source network and map it to a destination network. Click Next.
- On the Ready to complete page, review the page and click Finish.
For details, see Deploy an OVF or OVA Template
To build your own Ubuntu OVA template check the Building your own Ubuntu OVA section in the following link
.
To use the deployed OVA template to create the VMs for the EKS Anywhere cluster, you have to tag it with specific values for the os
and eksdRelease
keys.
The value of the os
key is the operating system of the deployed OVA template, which is ubuntu
in our scenario.
The value of the eksdRelease
key is the string kubernetes
followed by the EKS-D release used in the deployed OVA template.
Check the following Customize OVAs
page for more details.
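To avoid typos when composing the eksdRelease tag by hand, the name can be assembled from its parts. This sketch infers the format from the example tag eksdRelease:kubernetes-1-21-eks-8 used later in this section. The function name is hypothetical, and the artifacts page remains the authoritative source for the exact tag of each OVA.

```shell
# Hypothetical helper: build an eksdRelease tag name from a Kubernetes
# minor version (for example 1.21) and an EKS-D release number (for example 8).
eksd_release_tag() {
  k8s_version=$(printf '%s' "$1" | tr . -)
  printf 'eksdRelease:kubernetes-%s-eks-%s\n' "$k8s_version" "$2"
}

eksd_release_tag 1.21 8   # prints: eksdRelease:kubernetes-1-21-eks-8
```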
Steps to tag the deployed OVA template:
- Go to the artifacts
page and take notes of the tags and values associated with the OVA template you deployed in the previous step.
- In the vSphere Client, select Menu > Tags & Custom Attributes.
- Select the Tags tab and click Tags.
- Click New.
- In the Create Tag dialog box, copy the
os
tag name associated with your OVA that you took notes of, which in our case is os:ubuntu
and paste it as the name for the first tag required.
- Specify the tag category
os
if it exists, or create it if it does not.
- Click Create.
- Repeat steps 2-4.
- In the Create Tag dialog box, copy the
eksdRelease
tag name associated with your OVA that you took notes of, which in our case is eksdRelease:kubernetes-1-21-eks-8
and paste it as the name for the second tag required.
- Specify the tag category
eksdRelease
if it exists, or create it if it does not.
- Click Create.
- Navigate to the VM and Template tab.
- Select the folder that was created.
- Select deployed template and click Actions.
- From the drop-down menu, select Tags and Custom Attributes > Assign Tag.
- Select the tags we created from the list and confirm the operation.
5.3.3 - Customize OVAs: Ubuntu
Customizing Imported Ubuntu OVAs
You may need to make specific configuration changes on the imported OVA template before using it to create or update EKS-A clusters.
Set up SSH Access for Imported OVA
An SSH user and key need to be configured to allow SSH login to the VM template.
Clone template to VM
Create an environment variable to hold the name of modified VM/template
export VM=<vm-name>
Clone the imported OVA template to create VM
govc vm.clone -on=false -vm=<full-path-to-imported-template> -folder=<full-path-to-folder-that-will-contain-the-VM> -ds=<datastore> $VM
Create a metadata.yaml file
instance-id: cloud-vm
local-hostname: cloud-vm
network:
  version: 2
  ethernets:
    nics:
      match:
        name: ens*
      dhcp4: yes
Create a userdata.yaml file
#cloud-config
users:
  - default
  - name: <username>
    primary_group: <username>
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: sudo, wheel
    ssh_import_id: None
    lock_passwd: true
    ssh_authorized_keys:
    - <user's ssh public key>
Export environment variable containing the cloud-init metadata and userdata
export METADATA=$(gzip -c9 <metadata.yaml | { base64 -w0 2>/dev/null || base64; }) \
USERDATA=$(gzip -c9 <userdata.yaml | { base64 -w0 2>/dev/null || base64; })
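Because the encoded values are opaque, it can be worth confirming that they decode back to the original YAML before attaching them to the VM. The sketch below wraps the same gzip and base64 pipeline in a function and adds the reverse operation. The function names are hypothetical, and base64 -d assumes GNU coreutils or a compatible base64.

```shell
# Encode a file the same way as the export command above:
# gzip at maximum compression, then single-line base64.
encode_guestinfo() {
  gzip -c9 <"$1" | { base64 -w0 2>/dev/null || base64; }
}

# Reverse the encoding: base64-decode stdin, then decompress.
decode_guestinfo() {
  base64 -d | gzip -dc
}

# Usage: print the decoded metadata and compare it with metadata.yaml
#   printf '%s' "$METADATA" | decode_guestinfo | diff - metadata.yaml
```

No output from diff means the encoded value round-trips cleanly.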
Assign metadata and userdata to VM’s guestinfo
govc vm.change -vm "${VM}" \
-e guestinfo.metadata="${METADATA}" \
-e guestinfo.metadata.encoding="gzip+base64" \
-e guestinfo.userdata="${USERDATA}" \
-e guestinfo.userdata.encoding="gzip+base64"
Power the VM on
govc vm.power -on "$VM"
Customize the VM
Once the VM is powered on and fetches an IP address, ssh into the VM using your private key corresponding to the public key specified in userdata.yaml
ssh -i <private-key-file> <username>@<VM-IP>
At this point, you can make the desired configuration changes on the VM. The following sections describe some of the things you may want to do:
Add a Certificate Authority
Copy your CA certificate under /usr/local/share/ca-certificates
and run sudo update-ca-certificates
which will place the certificate under the /etc/ssl/certs
directory.
Add Authentication Credentials for a Private Registry
If /etc/containerd/config.toml
is not present initially, the default configuration can be generated by running the containerd config default > /etc/containerd/config.toml
command. To configure a credential for a specific registry, create/modify the /etc/containerd/config.toml
as follows:
# explicitly use v2 config format
version = 2
# The registry host has to be a domain name or IP. Port number is also
# needed if the default HTTPS or HTTP port is not used.
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry1-host:port".auth]
username = ""
password = ""
auth = ""
identitytoken = ""
# The registry host has to be a domain name or IP. Port number is also
# needed if the default HTTPS or HTTP port is not used.
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry2-host:port".auth]
username = ""
password = ""
auth = ""
identitytoken = ""
Restart containerd service with the sudo systemctl restart containerd
command.
Convert VM to a Template
After you have customized the VM, you need to convert it to a template.
Reset the machine-id and power off the VM
This step is needed because of a known issue in Ubuntu
which results in the cloned VMs getting the same DHCP IP address.
echo -n > /etc/machine-id
rm /var/lib/dbus/machine-id
ln -s /etc/machine-id /var/lib/dbus/machine-id
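Before powering the VM off, you may want to confirm that the reset took effect. This is a minimal check with a hypothetical function name. It takes an optional root directory so it can be tried against a scratch directory as well as the live system.

```shell
# Hypothetical check: succeed only if <root>/etc/machine-id exists and is
# empty, and <root>/var/lib/dbus/machine-id is a symbolic link.
# <root> defaults to the empty string, which means the live filesystem.
machine_id_is_reset() {
  root=${1:-}
  [ -f "$root/etc/machine-id" ] && [ ! -s "$root/etc/machine-id" ] \
    && [ -L "$root/var/lib/dbus/machine-id" ]
}

# Usage on the VM:
#   machine_id_is_reset && echo "machine-id reset" || echo "machine-id NOT reset"
```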
Power the VM down
govc vm.power -off "$VM"
Take a snapshot of the VM
It is recommended to take a snapshot of the VM as it reduces the provisioning time for the machines and makes cluster creation faster.
If you do snapshot the VM, you will not be able to customize the disk size of your cluster VMs. If you prefer not to take a snapshot, skip this step.
govc snapshot.create -vm "$VM" root
Convert VM to template
govc vm.markastemplate $VM
Tag the template appropriately as described here
Use this customized template to create/upgrade EKS Anywhere clusters
5.3.4 - Import OVAs
Importing EKS Anywhere OVAs to vSphere
If you want to specify an OVA template, you will need to import OVA files into vSphere before you can use them in your EKS Anywhere cluster.
This guide was written using VMware Cloud on AWS,
but the VMware OVA import guide can be found here
.
Note
If you don’t specify a template in the cluster spec file, EKS Anywhere will use the proper default one for the Kubernetes minor version and OS family you specified in the spec file.
If the template doesn’t exist, it will import the appropriate OVA into vSphere and add the necessary tags.
The default OVA for a Kubernetes minor version + OS family will change over time, for example, when a new EKS Distro version is released. In that case, new clusters will use the new OVA (EKS Anywhere will import it automatically).
Warning
Do not power on the imported OVA directly as it can cause some undesired configurations on the OS template and affect cluster creation. If you want to explore or modify the OS, please follow the instructions to
customize the OVA.
EKS Anywhere supports the following operating system families
- Bottlerocket (default)
- Ubuntu
A list of OVAs for this release can be found on the artifacts page
.
Using vCenter Web User Interface
-
Right click on your Datacenter, select Deploy OVF Template

-
Select an OVF template using URL or selecting a local OVF file and click on Next. If you are not able to select an
OVF template using URL, download the file and use Local file option.
Note: If you are using Bottlerocket OVAs, please select local file option.

-
Select a folder where you want to deploy your OVF package (most of our OVF templates are under SDDC-Datacenter
directory) and click on Next. You cannot have an OVF template with the same name in one directory. For workload
VM templates, leave the Kubernetes version in the template name for reference. A workload VM template will
support at least one prior Kubernetes minor version.

-
Select any compute resource (from cluster-1, 10.2.34.5, etc.) to run the deployed VM and click on Next

-
Review the details and click Next.
-
Accept the agreement and click Next.
-
Select the appropriate storage (e.g. “WorkloadDatastore”) and click Next.
-
Select destination network (e.g. “sddc-cgw-network-1”) and click Next.
-
Finish.
-
Snapshot the VM. Right click on the imported VM and select Snapshots -> Take Snapshot…
(It is highly recommended that you snapshot the VM. This will reduce the time it takes to provision
machines and cluster creation will be faster. If you prefer not to take snapshot, skip to step 13)

-
Name your snapshot (e.g. “root”) and click Create.

-
Snapshots for the imported VM should now show up under the Snapshots tab for the VM.

-
Right click on the imported VM and select Template and Convert to Template

Steps to deploy a template using GOVC (CLI)
To deploy a template using govc, you must first ensure that you have GOVC installed. You need to set and export three environment variables to run govc: GOVC_USERNAME, GOVC_PASSWORD and GOVC_URL.
1. Import the template to a content library in vCenter, using a URL or a local OVA file.
   Using URL:
   govc library.import -k -pull <library name> <URL for the OVA file>
   Using a file from the local machine:
   govc library.import <library name> <path to OVA file on local machine>
2. Deploy the template.
   govc library.deploy -pool <resource pool> -folder <folder location to deploy template> /<library name>/<template name> <name of new VM>
   2a. If using a Bottlerocket template for a Kubernetes version newer than 1.21, resize disk 1 to 22G.
   govc vm.disk.change -vm <template name> -disk.label "Hard disk 1" -size 22G
   2b. If using a Bottlerocket template for Kubernetes version 1.20 or 1.21, resize disk 2 to 20G.
   govc vm.disk.change -vm <template name> -disk.label "Hard disk 2" -size 20G
3. Take a snapshot of the VM. (Taking a snapshot is highly recommended: it reduces the time it takes to provision machines, so cluster creation is faster. If you prefer not to take a snapshot, skip this step.) This example snapshots a VM named ubuntu-2004-kube-v1.22.6 and names the snapshot root.
   govc snapshot.create -vm ubuntu-2004-kube-v1.22.6 root
4. Mark the new VM as a template.
   govc vm.markastemplate <name of new VM>
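The disk resize rule in steps 2a and 2b can be captured in a small helper so that scripts pick the right disk for a given Kubernetes version. This is only a sketch: the function name bottlerocket_disk_resize and its "label:size" output format are hypothetical, and versions older than 1.20 are rejected because the steps above do not cover them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<disk label>:<size>" for a Bottlerocket template,
# following steps 2a and 2b above.
bottlerocket_disk_resize() {
  local version="$1" minor
  case "$version" in
    1.*) minor="${version#1.}" ;;            # strip the leading "1."
    *) echo "unsupported version: $version" >&2; return 1 ;;
  esac
  minor="${minor%%.*}"                       # drop a patch suffix such as ".6"
  case "$minor" in
    ''|*[!0-9]*) echo "unsupported version: $version" >&2; return 1 ;;
  esac
  if [ "$minor" -ge 22 ]; then
    echo "Hard disk 1:22G"                   # step 2a: newer than 1.21
  elif [ "$minor" -ge 20 ]; then
    echo "Hard disk 2:20G"                   # step 2b: 1.20 or 1.21
  else
    echo "no resize rule documented for $version" >&2
    return 1
  fi
}

# Example use (not executed here):
#   resize=$(bottlerocket_disk_resize 1.22)
#   govc vm.disk.change -vm "<template name>" \
#     -disk.label "${resize%%:*}" -size "${resize##*:}"
```

Keeping the rule in one function avoids hard-coding the disk label in several places if you automate template imports for more than one Kubernetes version.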
Important Additional Steps to Tag the OVA
Using vCenter UI
Tag to indicate OS family
- Select the template that was newly created in the steps above and navigate to Summary -> Tags.

- Click Assign -> Add Tag to create a new tag and attach it

- Name the tag os:ubuntu or os:bottlerocket

Tag to indicate eksd release
- Select the template that was newly created in the steps above and navigate to Summary -> Tags.

- Click Assign -> Add Tag to create a new tag and attach it

- Name the tag eksdRelease:{eksd release for the selected OVA}, for example eksdRelease:kubernetes-1-22-eks-6 for the 1.22 OVA. You can find the rest of the eksd releases in the previous section. If this is the first time you add an eksdRelease tag, you need to create the category first: click “Create New Category” and name it eksdRelease.

Using govc
Tag to indicate OS family
- Create tag category
govc tags.category.create -t VirtualMachine os
- Create tags os:ubuntu and os:bottlerocket
govc tags.create -c os os:bottlerocket
govc tags.create -c os os:ubuntu
- Attach the newly created tag that matches the template's OS family to the template
govc tags.attach os:bottlerocket <Template Path>
govc tags.attach os:ubuntu <Template Path>
- Verify tag is attached to the template
govc tags.ls <Template Path>
Tag to indicate eksd release
- Create tag category
govc tags.category.create -t VirtualMachine eksdRelease
- Create the proper eksd release tag, depending on your template. You can find the eksd releases in the previous section. For example, eksdRelease:kubernetes-1-22-eks-6 for the 1.22 template.
govc tags.create -c eksdRelease eksdRelease:kubernetes-1-22-eks-6
- Attach newly created tag to the template
govc tags.attach eksdRelease:kubernetes-1-22-eks-6 <Template Path>
- Verify tag is attached to the template
govc tags.ls <Template Path>
Note
If the tags are not applied exactly as shown above, EKS Anywhere template validation will fail and the CLI will abort.
After you are done you can use the template for your workload cluster.
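Because the validation depends on exact tag strings, it can help to generate them rather than type them. The sketch below builds the two tag names used above. The function names are hypothetical, and the eksdRelease pattern is generalized from the single example eksdRelease:kubernetes-1-22-eks-6, so confirm the real release name for your OVA before using it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an eksdRelease tag from a Kubernetes minor version
# (such as 1.22) and an EKS Distro release number (such as 6).
# Assumption: release names follow the pattern of the example in this section.
eksd_release_tag() {
  local k8s="$1" release="$2"
  printf 'eksdRelease:kubernetes-%s-eks-%s\n' "${k8s//./-}" "$release"
}

# Hypothetical helper: build the OS family tag, accepting only the two
# families this guide lists.
os_tag() {
  case "$1" in
    ubuntu|bottlerocket) printf 'os:%s\n' "$1" ;;
    *) echo "unsupported OS family: $1" >&2; return 1 ;;
  esac
}

# Example use (not executed here):
#   govc tags.attach "$(os_tag ubuntu)" "<Template Path>"
#   govc tags.attach "$(eksd_release_tag 1.22 6)" "<Template Path>"
```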
5.3.5 - Custom DHCP Configuration
Create a custom DHCP configuration for your vSphere deployment
If your vSphere deployment is not configured with DHCP, you may want to run your own DHCP server.
It may be necessary to turn off DHCP snooping on your switch to get DHCP working across VM servers.
If you are running your administration machine in vSphere, it would most likely be easiest to run the DHCP server on that machine.
This example is for Ubuntu.
Install
Install DHCP server
sudo apt-get install isc-dhcp-server
Edit /etc/dhcp/dhcpd.conf and update the IP address range, subnet, mask, etc. to suit your configuration, similar to this:
default-lease-time 600;
max-lease-time 7200;
ddns-update-style none;
authoritative;
subnet 10.8.105.0 netmask 255.255.255.0 {
range 10.8.105.9 10.8.105.41;
option subnet-mask 255.255.255.0;
option routers 10.8.105.1;
option domain-name-servers 147.149.1.69;
}
Add the main NIC device interface, such as eth0 (this example uses ens160), to /etc/default/isc-dhcp-server:
INTERFACESv4="ens160"
Restart DHCP
service isc-dhcp-server restart
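A common mistake in a configuration like the one above is a range or routers address that falls outside the declared subnet. The following sketch checks membership with plain shell arithmetic; the function names ip_to_int and in_subnet are hypothetical helpers, not part of isc-dhcp-server.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1                                  # split the address on dots
  echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
}

# Succeed if address $1 belongs to the subnet given by network $2 and netmask $3.
in_subnet() {
  local addr net mask
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "$2")
  mask=$(ip_to_int "$3")
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Example use with the values from the sample configuration (not executed here):
#   in_subnet 10.8.105.9  10.8.105.0 255.255.255.0 && echo "range start ok"
#   in_subnet 10.8.105.41 10.8.105.0 255.255.255.0 && echo "range end ok"
#   in_subnet 10.8.105.1  10.8.105.0 255.255.255.0 && echo "router ok"
```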
Verify your configuration
This example assumes the ens160 interface:
tcpdump -ni ens160 port 67 -vvvv
tcpdump: listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes
09:13:54.297704 IP (tos 0xc0, ttl 64, id 40258, offset 0, flags [DF], proto UDP (17), length 327)
10.8.105.12.68 > 10.8.105.5.67: [udp sum ok] BOOTP/DHCP, Request from 00:50:56:90:56:cf, length 299, xid 0xf7a5aac5, secs 50310, Flags [none] (0x0000)
Client-IP 10.8.105.12
Client-Ethernet-Address 00:50:56:90:56:cf
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Request
Client-ID Option 61, length 19: hardware-type 255, 2d:1a:a1:33:00:02:00:00:ab:11:f2:c8:ef:ba:aa:5a:2f:33
Parameter-Request Option 55, length 11:
Subnet-Mask, Default-Gateway, Hostname, Domain-Name
Domain-Name-Server, MTU, Static-Route, Classless-Static-Route
Option 119, NTP, Option 120
MSZ Option 57, length 2: 576
Hostname Option 12, length 15: "prod-etcd-m8ctd"
END Option 255, length 0
09:13:54.299762 IP (tos 0x0, ttl 64, id 56218, offset 0, flags [DF], proto UDP (17), length 328)
10.8.105.5.67 > 10.8.105.12.68: [bad udp cksum 0xe766 -> 0x502f!] BOOTP/DHCP, Reply, length 300, xid 0xf7a5aac5, secs 50310, Flags [none] (0x0000)
Client-IP 10.8.105.12
Your-IP 10.8.105.12
Server-IP 10.8.105.5
Client-Ethernet-Address 00:50:56:90:56:cf
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: ACK
Server-ID Option 54, length 4: 10.8.105.5
Lease-Time Option 51, length 4: 600
Subnet-Mask Option 1, length 4: 255.255.255.0
Default-Gateway Option 3, length 4: 10.8.105.1
Domain-Name-Server Option 6, length 4: 147.149.1.69
END Option 255, length 0
PAD Option 0, length 0, occurs 26
5.3.6 -
- public.ecr.aws
- anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
- distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
- d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
- api.github.com (only if GitOps is enabled)
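To confirm that the hosts listed above are reachable from your administrative machine, a loop like the following may be useful. It is a sketch that assumes each host answers over HTTPS on port 443 and that curl is installed; the variable and function names are hypothetical. curl exits successfully on any HTTP response, so an HTTP error status still counts as reachable.

```shell
#!/usr/bin/env bash
# Hosts taken from the list above.
EKSA_ENDPOINTS="public.ecr.aws anywhere-assets.eks.amazonaws.com distro.eks.amazonaws.com d2glxqk2uabbnd.cloudfront.net api.github.com"

# Try an HTTPS connection to each host and report the result.
# Assumption: HTTPS on port 443 is the relevant protocol for every host.
check_endpoints() {
  local host
  for host in $EKSA_ENDPOINTS; do
    if curl --silent --output /dev/null --connect-timeout 5 "https://$host"; then
      echo "reachable: $host"
    else
      echo "UNREACHABLE: $host"
    fi
  done
}

# Example use (not executed here):
#   check_endpoints
```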
5.4 - Security best practices
Using security best practices with your EKS Anywhere deployments
If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our vulnerability reporting page.
Please do not create a public GitHub issue for security problems.
This guide provides advice about best practices for EKS Anywhere specific security concerns.
For a more complete treatment of Kubernetes security generally, please refer to the official Kubernetes documentation on Securing a Cluster and the Amazon EKS Best Practices Guide for Security.
The Shared Responsibility Model and EKS-A
AWS Cloud Services follow the Shared Responsibility Model,
where AWS is responsible for security “of” the cloud, while the customer is responsible for security “in” the cloud.
However, EKS Anywhere is an open-source tool and the distribution of responsibility differs from that of a managed cloud service like EKS.
AWS Responsibilities
AWS is responsible for building and delivering a secure tool.
This tool will provision an initially secure Kubernetes cluster.
AWS is responsible for vetting and securely sourcing the services and tools packaged with EKS Anywhere and the cluster it creates (such as CoreDNS, Cilium, Flux, CAPI, and govc).
The EKS Anywhere build and delivery infrastructure, or supply chain, is secured to the standard of any AWS service and AWS takes responsibility for the secure and reliable delivery of a quality product which provisions a secure and stable Kubernetes cluster.
When the eksctl anywhere plugin is executed, EKS Anywhere components are automatically downloaded from AWS.
eksctl will then perform checksum verification on the components to ensure their authenticity.
AWS is responsible for the secure development and testing of the EKS Anywhere controller and associated custom resource definitions.
AWS is responsible for the secure development and testing of the EKS Anywhere CLI,
and ensuring it handles sensitive data and cluster resources securely.
End user responsibilities
The end user is responsible for the entire EKS Anywhere cluster after it has been provisioned.
AWS provides a mechanism to upgrade the cluster in-place, but it is the responsibility of the end user to perform that upgrade using the provided tools.
End users are responsible for operating their clusters in accordance with Kubernetes security best practices,
and for the ongoing security of the cluster after it has been provisioned.
This includes but is not limited to:
- creation or modification of RBAC roles and bindings
- creation or modification of namespaces
- modification of the default container network interface plugin
- configuration of network ingress and load balancing
- use and configuration of container storage interfaces
- the inclusion of add-ons and other services
End users are also responsible for:
- The hardware and software which make up the infrastructure layer (such as vSphere, ESXi, physical servers, and physical network infrastructure).
- The ongoing maintenance of the cluster nodes, including the underlying guest operating systems. Additionally, while EKS Anywhere provides a streamlined process for upgrading a cluster to a new Kubernetes version, it is the responsibility of the user to perform the upgrade as necessary.
- Any applications which run “on” the cluster, including their secure operation, least privilege, and use of well-known and vetted container images.
EKS Anywhere Security Best Practices
This section captures EKS Anywhere specific security best practices.
Please read this section carefully and follow any guidance to ensure the ongoing security and reliability of your EKS Anywhere cluster.
Critical Namespaces
EKS Anywhere creates and uses resources in several critical namespaces.
All of the EKS Anywhere managed namespaces should be treated as sensitive and access should be limited to only the most trusted users and processes.
Allowing additional access or modifying the existing RBAC resources could potentially allow a subject to access the namespace and the resources that it contains.
This could lead to the exposure of secrets or the failure of your cluster due to modification of critical resources.
Here are rules you should follow when dealing with critical namespaces:
- Avoid creating Roles in these namespaces or providing users access to them with ClusterRoles. For more information about creating limited roles for day-to-day administration and development, please see the official introduction to Role Based Access Control (RBAC).
- Do not modify existing Roles in these namespaces, bind existing roles to additional subjects, or create new Roles in the namespace.
- Do not modify existing ClusterRoles or bind them to additional subjects.
- Avoid using the cluster-admin role, as it grants permissions over all namespaces.
- No subjects except for the most trusted administrators should be permitted to perform ANY action in the critical namespaces.
The critical namespaces include:
- eksa-system
- capv-system
- flux-system
- capi-system
- capi-webhook-system
- capi-kubeadm-control-plane-system
- capi-kubeadm-bootstrap-system
- cert-manager
- kube-system (as with any Kubernetes cluster, this namespace is critical to the functioning of your cluster and should be treated with the highest level of sensitivity.)
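One way to review who has access to the critical namespaces is to list the Roles and RoleBindings in each of them. The loop below is a minimal sketch, assuming kubectl is configured with a kubeconfig that can read RBAC objects; the variable and function names are hypothetical. It does not cover ClusterRoleBindings, which also grant access across namespaces and should be reviewed separately.

```shell
#!/usr/bin/env bash
# Namespaces taken from the list above.
CRITICAL_NAMESPACES="eksa-system capv-system flux-system capi-system capi-webhook-system capi-kubeadm-control-plane-system capi-kubeadm-bootstrap-system cert-manager kube-system"

# Print the Roles and RoleBindings defined in each critical namespace so that
# unexpected subjects can be spotted during a review.
audit_critical_namespaces() {
  local ns
  for ns in $CRITICAL_NAMESPACES; do
    echo "== $ns =="
    kubectl get roles,rolebindings --namespace "$ns" --output wide
  done
}

# Example use (not executed here):
#   audit_critical_namespaces
```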
Secrets
EKS Anywhere stores sensitive information, like the vSphere credentials and GitHub Personal Access Token, in the cluster as native Kubernetes secrets.
These secret objects are namespaced, for example in the eksa-system and flux-system namespaces, and limiting access to the sensitive namespaces will ensure that these secrets will not be exposed.
Additionally, limit access to the underlying node. Access to the node could allow access to the secret content.
EKS Anywhere does not currently support encryption-at-rest for Kubernetes secrets.
EKS Anywhere support for Key Management Services (KMS) is planned.
The EKS Anywhere kubeconfig file
eksctl anywhere create cluster creates an EKS Anywhere-based Kubernetes cluster and outputs a kubeconfig file with administrative privileges to the $PWD/$CLUSTER_NAME directory.
By default, this kubeconfig file uses certificate-based authentication and contains the user certificate data for the administrative user.
The kubeconfig file grants administrative privileges over your cluster to the bearer, and the certificate key should be treated as you would any other private key or administrative password.
The EKS Anywhere-generated kubeconfig file should only be used for interacting with the cluster via eksctl anywhere commands, such as upgrade, and for the most privileged administrative tasks.
For more information about creating limited roles for day-to-day administration and development, please see the official introduction to Role Based Access Control (RBAC).
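Treating the kubeconfig like a private key usually includes restricting its file permissions to the owning user. The helper below is a sketch of that one step; the function name is hypothetical, and the exact kubeconfig file name inside $PWD/$CLUSTER_NAME is left as a placeholder because it is not stated in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make a kubeconfig readable and writable by its owner only.
restrict_kubeconfig() {
  local file="$1"
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }
  chmod 600 "$file"
}

# Example use (not executed here):
#   restrict_kubeconfig "$PWD/$CLUSTER_NAME/<kubeconfig file>"
```

File permissions only protect against other local users, so this complements running the CLI on a trusted workstation rather than replacing it.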
GitOps
GitOps enabled EKS Anywhere clusters maintain a copy of their cluster configuration in the user provided Git repository.
This configuration acts as the source of truth for the cluster.
Changes made to this configuration will be reflected in the cluster configuration.
AWS recommends that you gate any changes to this repository with mandatory pull request reviews.
Carefully review pull requests for changes which could impact the availability of the cluster (such as scaling nodes to 0 and deleting the cluster object) or contain secrets.
GitHub Personal Access Token
Treat the GitHub PAT used with EKS Anywhere as you would any highly privileged secret, as it could potentially be used to make changes to your cluster by modifying the contents of the cluster configuration file through the GitHub.com API.
- Never commit the PAT to a Git repository
- Never share the PAT via untrusted channels
- Never grant non-administrative subjects access to the flux-system namespace where the PAT is stored as a native Kubernetes secret.
Executing EKS Anywhere
Ensure that you execute eksctl anywhere create cluster on a trusted workstation in order to protect the values of sensitive environment variables and the EKS Anywhere generated kubeconfig file.
SSH Access to Cluster Nodes and ETCD Nodes
EKS Anywhere provides the option to configure an ssh authorized key for access to underlying nodes in a cluster, via vsphereMachineConfig.Users.sshAuthorizedKeys.
This grants the associated private key the ability to connect to the cluster via ssh as the user capv with sudo permissions.
The associated private key should be treated as extremely sensitive, as sudo access to the cluster and ETCD nodes can permit access to secret object data and potentially confer arbitrary control over the cluster.
VMware OVAs
Only download OVAs for cluster nodes from official sources, and do not allow untrusted users or processes to modify the templates used by EKS Anywhere for provisioning nodes.
Keeping Bottlerocket up to date
EKS Anywhere provides the latest operating system patches with every release. It is recommended that you keep your clusters up to date with the latest EKS Anywhere release to ensure you get the latest security updates.
Bottlerocket is an EKS Anywhere supported operating system that can be kept up to date without requiring a cluster update. The Bottlerocket Update Operator is a Kubernetes update operator that coordinates Bottlerocket updates on hosts in the cluster. Please follow the instructions here to install the Bottlerocket update operator.
EKS Anywhere Baremetal clusters run directly on physical servers in a datacenter. Make sure that the physical infrastructure, including the network, is secure before running EKS Anywhere clusters.
Please follow industry best practices for securing your network and datacenter, including but not limited to the following:
- Only allow trusted devices on the network
- Secure the network using a firewall
- Never source hardware from an untrusted vendor
- Inspect and verify the metal servers you are using for the clusters are the ones you intended to use
- If possible, use a separate L2 network for EKS Anywhere baremetal clusters
- Conduct thorough audits of access, users, logs and other exploitable venues periodically
Benchmark tests for cluster hardening
EKS Anywhere creates clusters with server hardening configurations out of the box, via the use of security flags and opinionated default templates. You can verify the security posture of your EKS Anywhere cluster by using a tool called kube-bench, which checks whether Kubernetes is deployed securely.
kube-bench runs checks documented in the CIS Benchmark for Kubernetes, such as pod specification file permissions, disabling insecure arguments, and so on.
Refer to the EKS Anywhere CIS Self-Assessment Guide for more information on how to evaluate the security configurations of your EKS Anywhere cluster.
5.4.1 - CIS Self-Assessment Guide
CIS Benchmark Self-Assessment Guide for EKS Anywhere clusters
The CIS Benchmark self-assessment guide serves to help EKS Anywhere users evaluate the level of security of the hardened cluster configuration against Kubernetes benchmark controls from the Center for Internet Security (CIS). This guide will walk through the various controls and provide updated example commands to audit compliance in EKS Anywhere clusters.
You can verify the security posture of your EKS Anywhere cluster by using a tool called kube-bench. The ideal way to run the benchmark tests on your EKS Anywhere cluster is to apply the Kube-bench Job YAMLs to the cluster. This runs the kube-bench tests on a Pod on the cluster, and the logs of the Pod provide the test results.
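Since the results arrive as Pod logs, a short filter can condense them into pass, fail, and warning counts. The sketch below relies on kube-bench prefixing each check line with [PASS], [FAIL], or [WARN]. The function name is hypothetical, and the Job name kube-bench in the usage comment is an assumption about the Job YAML you applied.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kube-bench log output on stdin and print
# how many checks passed, failed, or produced a warning.
summarize_kube_bench() {
  awk '
    /^\[PASS\]/ { pass++ }
    /^\[FAIL\]/ { fail++ }
    /^\[WARN\]/ { warn++ }
    END { printf "pass=%d fail=%d warn=%d\n", pass, fail, warn }
  '
}

# Example use (not executed here; assumes the Job is named kube-bench):
#   kubectl logs job/kube-bench | summarize_kube_bench
```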
Kube-bench currently does not support unstacked etcd topology (which is the default for EKS Anywhere), so the following checks are skipped in the default kube-bench Job YAML. If you created your EKS Anywhere cluster with stacked etcd configuration, you can apply the stacked etcd Job YAML instead.
Check number | Check description
1.1.7 | Ensure that the etcd pod specification file permissions are set to 644 or more restrictive
1.1.8 | Ensure that the etcd pod specification file ownership is set to root:root
1.1.11 | Ensure that the etcd data directory permissions are set to 700 or more restrictive
1.1.12 | Ensure that the etcd data directory ownership is set to etcd:etcd
The following tests are also skipped, because they are not applicable or enforce settings that might make the cluster unstable.
Check number | Check description | Reason for skipping
Controlplane node configuration | |
1.2.6 | Ensure that the --kubelet-certificate-authority argument is set as appropriate | When generating serving certificates, functionality could break in conjunction with hostname overrides which are required for certain cloud providers
1.2.16 | Ensure that the admission control plugin PodSecurityPolicy is set | Enabling Pod Security Policy can cause applications to unexpectedly fail
1.2.32 | Ensure that the --encryption-provider-config argument is set as appropriate | Enabling encryption changes how data can be recovered as data is encrypted
1.2.33 | Ensure that encryption providers are appropriately configured | Enabling encryption changes how data can be recovered as data is encrypted
Worker node configuration | |
4.2.6 | Ensure that the --protect-kernel-defaults argument is set to true | System level configurations are required before provisioning the cluster in order for this argument to be set to true
4.2.10 | Ensure that the --tls-cert-file and --tls-private-key-file arguments are set as appropriate | When generating serving certificates, functionality could break in conjunction with hostname overrides which are required for certain cloud providers
Note
Running kube-bench on Bottlerocket controlplane nodes currently produces false negatives with respect to pod specification file (manifest) permissions, since the default configuration does not include the paths in which Bottlerocket places these manifests. This issue is being tracked here.
5.5 - Packages
List of EKS Anywhere curated packages
Curated package list
5.5.1 - Harbor configuration
Harbor is an open source trusted cloud native registry project that stores, signs, and scans content. Harbor extends the open source Docker Distribution by adding the functionalities usually required by users such as security, identity and management. Having a registry closer to the build and run environment can improve the image transfer efficiency. Harbor supports replication of images between registries, and also offers advanced security features such as user management, access control and activity auditing.
Configuration options for Harbor
5.5.1.1 - v2.5.0
Trivy, Notary and Chartmuseum are not supported at this moment.
Configuring Harbor in EKS Anywhere package spec
The following table lists the configurable parameters of the Harbor package spec and the default values.
Parameter | Description | Default
General | |
externalURL | The external URL for Harbor core service | https://127.0.0.1:30003
imagePullPolicy | The image pull policy | IfNotPresent
logLevel | The log level: debug, info, warning, error or fatal | info
harborAdminPassword | The initial password of the Harbor admin account. Change it from the portal after launching Harbor | Harbor12345
secretKey | The key used for encryption. Must be a string of 16 chars | ""
Expose | |
expose.type | How to expose the service: nodePort or loadBalancer; other values will be ignored and the creation of the service will be skipped. | nodePort
expose.tls.enabled | Enable TLS or not. | true
expose.tls.certSource | The source of the TLS certificate. Set as auto, secret or none and fill the information in the corresponding section: 1) auto: generate the TLS certificate automatically 2) secret: read the TLS certificate from the specified secret. The TLS certificate can be generated manually or by cert manager 3) none: configure no TLS certificate. | secret
expose.tls.auto.commonName | The common name used to generate the certificate. It’s necessary when expose.tls.certSource is set to auto |
expose.tls.secret.secretName | The name of the secret which contains keys named: tls.crt - the certificate; tls.key - the private key | harbor-tls-secret
expose.nodePort.name | The name of the NodePort service | harbor
expose.nodePort.ports.http.port | The service port Harbor listens on when serving HTTP | 80
expose.nodePort.ports.http.nodePort | The node port Harbor listens on when serving HTTP | 30002
expose.nodePort.ports.https.port | The service port Harbor listens on when serving HTTPS | 443
expose.nodePort.ports.https.nodePort | The node port Harbor listens on when serving HTTPS | 30003
expose.loadBalancer.name | The name of the service | harbor
expose.loadBalancer.IP | The IP address of the loadBalancer. It only works when the loadBalancer supports assigning an IP address | ""
expose.loadBalancer.ports.httpPort | The service port Harbor listens on when serving HTTP | 80
expose.loadBalancer.ports.httpsPort | The service port Harbor listens on when serving HTTPS | 30002
expose.loadBalancer.annotations | The annotations attached to the loadBalancer service | {}
expose.loadBalancer.sourceRanges | List of IP address ranges to assign to loadBalancerSourceRanges | []
Internal TLS | |
internalTLS.enabled | Enable TLS for the components (core, jobservice, portal, and registry) | true
Persistence | |
persistence.resourcePolicy | Set it to keep to avoid removing PVCs during a helm delete operation. Leaving it empty will delete PVCs after the chart is deleted. Does not affect PVCs created for internal database and redis components. | keep
persistence.persistentVolumeClaim.registry.size | The size of the volume | 5Gi
persistence.persistentVolumeClaim.registry.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning | ""
persistence.persistentVolumeClaim.jobservice.size | The size of the volume | 1Gi
persistence.persistentVolumeClaim.jobservice.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning | ""
persistence.persistentVolumeClaim.database.size | The size of the volume. If an external database is used, the setting will be ignored | 1Gi
persistence.persistentVolumeClaim.database.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external database is used, the setting will be ignored | ""
persistence.persistentVolumeClaim.redis.size | The size of the volume. If an external Redis is used, the setting will be ignored | 1Gi
persistence.persistentVolumeClaim.redis.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external Redis is used, the setting will be ignored | ""
5.5.1.2 - v2.5.1
Notary and Chartmuseum are not supported at this moment.
Configuring Harbor in EKS Anywhere package spec
The following table lists the configurable parameters of the Harbor package spec and the default values.
Parameter | Description | Default
General | |
externalURL | The external URL for Harbor core service | https://127.0.0.1:30003
imagePullPolicy | The image pull policy | IfNotPresent
logLevel | The log level: debug, info, warning, error or fatal | info
harborAdminPassword | The initial password of the Harbor admin account. Change it from the portal after launching Harbor | Harbor12345
secretKey | The key used for encryption. Must be a string of 16 chars | ""
Expose | |
expose.type | How to expose the service: nodePort or loadBalancer; other values will be ignored and the creation of the service will be skipped. | nodePort
expose.tls.enabled | Enable TLS or not. | true
expose.tls.certSource | The source of the TLS certificate. Set as auto, secret or none and fill the information in the corresponding section: 1) auto: generate the TLS certificate automatically 2) secret: read the TLS certificate from the specified secret. The TLS certificate can be generated manually or by cert manager 3) none: configure no TLS certificate. | secret
expose.tls.auto.commonName | The common name used to generate the certificate. It’s necessary when expose.tls.certSource is set to auto |
expose.tls.secret.secretName | The name of the secret which contains keys named: tls.crt - the certificate; tls.key - the private key | harbor-tls-secret
expose.nodePort.name | The name of the NodePort service | harbor
expose.nodePort.ports.http.port | The service port Harbor listens on when serving HTTP | 80
expose.nodePort.ports.http.nodePort | The node port Harbor listens on when serving HTTP | 30002
expose.nodePort.ports.https.port | The service port Harbor listens on when serving HTTPS | 443
expose.nodePort.ports.https.nodePort | The node port Harbor listens on when serving HTTPS | 30003
expose.loadBalancer.name | The name of the service | harbor
expose.loadBalancer.IP | The IP address of the loadBalancer. It only works when the loadBalancer supports assigning an IP address | ""
expose.loadBalancer.ports.httpPort | The service port Harbor listens on when serving HTTP | 80
expose.loadBalancer.ports.httpsPort | The service port Harbor listens on when serving HTTPS | 30002
expose.loadBalancer.annotations | The annotations attached to the loadBalancer service | {}
expose.loadBalancer.sourceRanges | List of IP address ranges to assign to loadBalancerSourceRanges | []
Internal TLS | |
internalTLS.enabled | Enable TLS for the components (core, jobservice, portal, and registry) | true
Persistence | |
persistence.resourcePolicy | Set it to keep to avoid removing PVCs during a helm delete operation. Leaving it empty will delete PVCs after the chart is deleted. Does not affect PVCs created for internal database and redis components. | keep
persistence.persistentVolumeClaim.registry.size | The size of the volume | 5Gi
persistence.persistentVolumeClaim.registry.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning | ""
persistence.persistentVolumeClaim.jobservice.size | The size of the volume | 1Gi
persistence.persistentVolumeClaim.jobservice.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning | ""
persistence.persistentVolumeClaim.database.size | The size of the volume. If an external database is used, the setting will be ignored | 1Gi
persistence.persistentVolumeClaim.database.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external database is used, the setting will be ignored | ""
persistence.persistentVolumeClaim.redis.size | The size of the volume. If an external Redis is used, the setting will be ignored | 1Gi
persistence.persistentVolumeClaim.redis.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external Redis is used, the setting will be ignored | ""
persistence.persistentVolumeClaim.trivy.size | The size of the volume | 5Gi
persistence.persistentVolumeClaim.trivy.storageClass | Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning | ""
Trivy | |
trivy.enabled | The flag to enable Trivy scanner | true
trivy.vulnType | Comma-separated list of vulnerability types. Possible values: os and library. | os,library
trivy.severity | Comma-separated list of severities to be checked | UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL
trivy.skipUpdate | The flag to disable Trivy DB downloads from GitHub | false
trivy.offlineScan | The flag prevents Trivy from sending API requests to identify dependencies. | false
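Two of the parameters above have strict formats that are easy to get wrong: secretKey must be exactly 16 characters, and trivy.severity is a comma-separated list. The sketch below generates a random 16-character key and checks a severity list. Both function names are hypothetical, and the set of accepted severities is assumed to be the five values shown in the default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a random 16-character value for secretKey.
# 8 random bytes rendered as hexadecimal give exactly 16 characters.
generate_secret_key() {
  head -c 8 /dev/urandom | od -An -tx1 | tr -dc '0-9a-f'
}

# Hypothetical helper: succeed only if every item in a comma-separated list
# is one of the severities from the trivy.severity default.
valid_trivy_severity() {
  local list="$1" item
  [ -n "$list" ] || return 1
  local IFS=,
  for item in $list; do
    case "$item" in
      UNKNOWN|LOW|MEDIUM|HIGH|CRITICAL) ;;
      *) return 1 ;;
    esac
  done
}

# Example use (not executed here):
#   generate_secret_key
#   valid_trivy_severity "HIGH,CRITICAL" && echo "severity list ok"
```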
5.5.2 - MetalLB Configuration
MetalLB is a load-balancer implementation for on-premises Kubernetes clusters, using standard routing protocols.
5.5.2.1 - v0.12.1
FRRouting is currently not supported for MetalLB.
Parameter |
Description |
Default |
General |
|
|
address-pools[] |
List of address-pool objects. Address-pools list the IP addresses that MetalLB is allowed to allocate, along with settings for how to advertise those addresses over BGP once assigned. You can have as many address pools as you want. Example:
address-pools: - name: default protocol: bgp addresses: - 10.220.0.90/30 |
None |
peers[] |
List of peer objects. The peers list tells MetalLB what BGP routers to connect to. There is one entry for each router you want to peer with. Example:
peers: - peer-address: 10.220.0.2 peer-asn: 65000 my-asn: 65002 |
|
bgp-communities |
(optional) BGP community aliases. Instead of using hard-to-read BGP community numbers in address pool advertisement configurations, you can define alias names here and use those elsewhere in the configuration. Example:
bgp-communities: no-export: 65535:65281 |
|
address-pool |
|
|
name |
A name for the address pool. Services can request allocation from a specific address pool using this name, by listing this name under the metallb.universe.tf/address-pool annotation. |
|
protocol |
Protocol can be used to select how the announcement is done. Supported values are bgp and layer2. |
|
addresses |
A list of IP address ranges over which MetalLB has authority. You can list multiple ranges in a single pool; they all share the same settings. Each range can be either a CIDR prefix or an explicit start-end range of IPs. Examples: addresses: - 198.51.100.0/24 - 192.168.0.150-192.168.0.200 |
|
avoid-buggy-ips |
(optional) If true, MetalLB will not allocate any address that ends in .0 or .255. Some old, buggy consumer devices mistakenly block traffic to such addresses under the guise of smurf protection. Such devices have become fairly rare, but the option is here if you encounter serving issues. |
false |
auto-assign |
(optional) If false, MetalLB will not automatically allocate any address in this pool. Addresses can still explicitly be requested via loadBalancerIP or the address-pool annotation. |
true |
bgp-advertisements[] |
(optional) A list of bgp-advertisement objects, when protocol=bgp. Each address that gets assigned out of this pool will turn into this many advertisements. For most simple setups, you’ll probably just want one. The default value for this field is a single advertisement with all parameters set to their respective defaults. |
All Default |
peer |
|
|
peer-address |
The target IP address for the BGP session. |
|
peer-asn |
The BGP AS number that MetalLB expects to see advertised by the router. |
|
my-asn |
The BGP AS number that MetalLB should speak as. |
|
peer-port |
(optional) the TCP port to talk to. |
179 |
source-address |
(optional) The source IP address to use when establishing the BGP session. The address must be configured on a local network interface. |
|
hold-time |
(optional) The proposed value of the BGP Hold Time timer. Refer to BGP reference material to understand what setting this implies. |
|
keepalive-time |
(optional) The keepalive interval to be used in the BGP session. |
hold-time / 3 |
router-id |
(optional) The router ID to use when connecting to this peer. |
Node IP |
password |
(optional) Password for TCPMD5 authenticated BGP sessions offered by some peers. |
|
ebgp-multihop |
(optional) Whether eBGP multihop is permitted. Note that it is always on in the native BGP mode. |
|
node-selectors |
(optional) The nodes that should connect to this peer. A node matches if at least one of the node selectors matches. Within one selector, a node matches if all the matchers are satisfied. The semantics of each selector are the same as the label- and set-based selectors in Kubernetes, documented at Labels and Selectors
. By default, all nodes are selected.
node-selectors: # Match by label=value - match-labels: kubernetes.io/hostname: prod-01 # Match by ‘key OP values’ expressions - match-expressions: key: beta.kubernetes.io/arch operator: In values: [amd64, arm] |
|
bgp-advertisement |
|
|
aggregation-length |
(optional) How much you want to aggregate up the IP address before advertising. For example, advertising 1.2.3.4 with aggregation-length=24 would end up advertising 1.2.3.0/24. For the majority of setups, you’ll want to keep this at the default of 32, which advertises the entire IP address unmodified. |
32 |
aggregation-length-v6 |
(optional) How much you want to aggregate up the IPv6 address before advertising. For example, advertising 2001:0db8:85a3:0000:0000:8a2e:0370:7334 with aggregation-length-v6=64 would end up advertising 2001:0db8:85a3:0000:0000:0000:0000:0000/64. For the majority of setups, you’ll want to keep this at the default of 128, which advertises the entire IP address unmodified. |
128 |
localpref |
(optional) The value of the BGP “local preference” attribute for this advertisement. Only used with IBGP peers (i.e. peers where peer-asn is the same as my-asn). |
|
communities[] |
(optional) BGP communities to attach to this advertisement. Communities are given in the standard two-part form asn:community number. You can also use alias names. |
|
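The effect of aggregation-length on an IPv4 advertisement can be worked through with a little bit arithmetic. The following is a minimal sketch, not MetalLB code: the function name aggregate_ipv4 is hypothetical, and it only illustrates how the advertised prefix is derived by masking the assigned address.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetalLB): print the prefix that would be
# advertised for an IPv4 address under a given aggregation-length.
aggregate_ipv4() {
  local ip=$1 len=$2
  local a b c d
  IFS=. read -r a b c d <<<"$ip"
  # Pack the four octets into one 32-bit integer.
  local addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  # Keep only the top $len bits; the remaining host bits are zeroed.
  local mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  local net=$(( addr & mask ))
  printf '%d.%d.%d.%d/%d\n' \
    $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
    $(( (net >> 8) & 255 )) $(( net & 255 )) "$len"
}

aggregate_ipv4 1.2.3.4 24   # prints 1.2.3.0/24
aggregate_ipv4 1.2.3.4 32   # prints 1.2.3.4/32 (the default: address unmodified)
```

With the default of 32 the mask keeps every bit, which is why the address is advertised unchanged.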
5.6 - What's New?
Added
- Added support for EKS Anywhere on bare metal with provider tinkerbell. EKS Anywhere on bare metal supports the complete provisioning cycle, including power on/off and PXE boot, for standing up a cluster with the given hardware data.
- Support for node CIDR mask config exposed via the cluster spec. #488
Changed
- Upgraded cilium from 1.9 to 1.10. #1124
- Changes for EKS Anywhere packages v0.10.0
Fixed
- Fix issue using self-signed certificates for registry mirror #1857
Fixed
- Fix issue by avoiding processing Snow images when URI is empty
Added
- Adding support to EKS Anywhere for a generic git provider as the source of truth for GitOps configuration management. #9
- Allow users to configure Cloud Provider and CSI Driver with different credentials. #1730
- Support to install, configure and maintain operational components that are secure and tested by Amazon on EKS Anywhere clusters. #2083
- A new Workshop section has been added to EKS Anywhere documentation.
- Added support for curated packages behind a feature flag #1893
Fixed
- Fix issue specifying proxy configuration for helm template command #2009
Fixed
- Fix issue with upgrading cluster from a previous minor version #1819
Fixed
- Fix issue with downloading artifacts #1753
Added
- SSH keys and Users are now mutable #1208
- OIDC configuration is now mutable #676
- Add support for Cilium’s policy enforcement mode #726
Changed
- Install Cilium networking through Helm instead of static manifest
v0.7.2
- 2022-02-28
Fixed
- Fix issue with downloading artifacts #1327
v0.7.1
- 2022-02-25
Added
- Support for taints in worker node group configurations #189
- Support for taints in control plane configurations #189
- Support for labels in worker node group configuration #486
- Allow removal of worker node groups using the
eksctl anywhere upgrade
command #1054
v0.7.0
- 2022-01-27
Added
- Support for
aws-iam-authenticator
as an authentication option in EKS-A clusters #90
- Support for multiple worker node groups in EKS-A clusters #840
- Support for IAM Role for Service Account (IRSA) #601
- New command
upgrade plan cluster
lists core component changes affected by upgrade cluster
#499
- Support for workload cluster’s control plane and etcd upgrade through GitOps #1007
- Upgrading a Flux managed cluster previously required manual steps. These steps have now been automated.
#759
, #1019
- Cilium CNI will now be upgraded by the
upgrade cluster
command #326
Changed
- EKS-A now uses Cluster API (CAPI) v1.0.1 and v1beta1 manifests, upgrading from v0.3.23 and v1alpha3 manifests.
- Kubernetes components and etcd now use TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as the
configured TLS cipher suite #657
,
#759
- Automated git repository structure changes during Flux component
upgrade
workflow #577
v0.6.0 - 2021-10-29
Added
- Support to create and manage workload clusters #94
- Support for upgrading eks-anywhere components #93
, Cluster upgrades
- IMPORTANT: Currently, upgrading existing Flux-managed clusters requires performing a few additional steps. The fix for upgrading the existing clusters will be published in the 0.6.1 release to improve the upgrade experience.
- k8s CIS compliance #193
- Support bundle improvements #92
- Ability to upgrade control plane nodes before worker nodes #100
- Ability to use your own container registry #98
- Make namespace configurable for anywhere resources #177
Fixed
- Fix ova auto-import issue for multi-datacenter environments #437
- OVA import via EKS-A CLI sometimes fails #254
- Add proxy configuration to etcd nodes for bottlerocket #195
Removed
- overrideClusterSpecFile field in cluster config
v0.5.0
Added
5.7 - Frequently Asked Questions
Frequently asked questions about EKS Anywhere
AuthN / AuthZ
How do my applications running on EKS Anywhere authenticate with AWS services using IAM credentials?
You can leverage the IAM Role for Service Account (IRSA) feature.
See the IRSA reference guide for details.
Does EKS Anywhere support OIDC (including Azure AD and AD FS)?
Yes, EKS Anywhere can create clusters that support API server OIDC authentication.
This means you can federate authentication through AD FS locally or through Azure AD, along with other IDPs that support the OIDC standard.
In order to add OIDC support to your EKS Anywhere clusters, you need to configure your cluster by updating the configuration file before creating the cluster.
Please see the OIDC reference
for details.
Does EKS Anywhere support LDAP?
EKS Anywhere does not support LDAP out of the box.
However, you can look into the Dex LDAP Connector
.
Can I use AWS IAM for Kubernetes resource access control on EKS Anywhere?
Yes, you can install the aws-iam-authenticator
on your EKS Anywhere cluster to achieve this.
Miscellaneous
Can I connect my EKS Anywhere cluster to EKS?
Yes, you can install EKS Connector to connect your EKS Anywhere cluster to AWS EKS.
EKS Connector is a software agent that you can install on the EKS Anywhere cluster that enables the cluster to communicate back to AWS.
Once connected, you can immediately see the EKS Anywhere cluster with workload and cluster configuration information on the EKS console, alongside your EKS clusters.
How does the EKS Connector authenticate with AWS?
During start-up, the EKS Connector generates and stores an RSA key-pair as Kubernetes secrets.
It also registers with AWS using the public key and the activation details from the cluster registration configuration file.
The EKS Connector needs AWS credentials to receive commands from AWS and to send the response back.
Whenever it requires AWS credentials, it uses its private key to sign the request and invokes AWS APIs to request the credentials.
How does the EKS Connector authenticate with my Kubernetes cluster?
The EKS Connector acts as a proxy and forwards the EKS console requests to the Kubernetes API server on your cluster.
In the initial release, the connector uses impersonation
with its service account secrets to interact with the API server.
Therefore, you need to associate the connector’s service account with a ClusterRole,
which gives permission to impersonate AWS IAM entities.
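As an illustration of the impersonation permission described above, the following sketch writes a ClusterRole and ClusterRoleBinding to a local file for review. It relies on standard Kubernetes RBAC, but every name in it is an assumption: the file name, the example-connector-impersonation role name, the eks-connector service account and namespace, and the placeholder IAM ARN would all need to match your own installation.

```shell
#!/usr/bin/env bash
# Write an example manifest to a local file (hypothetical file name).
# Nothing is applied to a cluster by this snippet.
cat > connector-impersonation-example.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-connector-impersonation   # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["users"]
    verbs: ["impersonate"]
    resourceNames:
      # Placeholder ARN: list the IAM identities the connector may impersonate.
      - "arn:aws:iam::111122223333:user/example-user"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: example-connector-impersonation   # hypothetical name
subjects:
  - kind: ServiceAccount
    name: eks-connector        # assumed service account name
    namespace: eks-connector   # assumed namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: example-connector-impersonation
EOF
echo "wrote connector-impersonation-example.yaml"
```

After reviewing the file, it could be applied with kubectl apply -f connector-impersonation-example.yaml. Restricting resourceNames to specific ARNs keeps the connector from impersonating arbitrary users.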
How do I enable an AWS user account to view my connected cluster through the EKS console?
For each AWS user or other IAM identity, add a cluster role binding to the Kubernetes cluster with the appropriate permissions for that IAM identity.
Additionally, each of these IAM entities should be associated with the IAM policy
to invoke the EKS Connector on the cluster.
Can I use Amazon Controllers for Kubernetes (ACK) on EKS Anywhere?
Yes, you can leverage AWS services from your EKS Anywhere clusters on-premises through Amazon Controllers for Kubernetes (ACK)
.
Can I deploy EKS Anywhere on other clouds?
EKS Anywhere can be installed on any infrastructure with the required VMware vSphere versions.
See EKS Anywhere vSphere prerequisite
documentation.
How can I manage EKS Anywhere at scale?
You can perform cluster life cycle and configuration management at scale through GitOps-based tools.
EKS Anywhere offers git-driven cluster management through the integrated Flux Controller.
See Manage cluster with GitOps
documentation for details.
Can I run EKS Anywhere on ESXi?
No. EKS Anywhere is dependent on the vSphere cluster API provider CAPV and it uses the vCenter API.
There would need to be a change to the upstream project to support ESXi.
5.8 - Troubleshooting
Troubleshooting reference for your EKS Anywhere Cluster
Read more about troubleshooting
in the tasks section.
5.9 - Support
Support for EKS Anywhere
EKS Anywhere support licenses are available to AWS customers who pay for enterprise support.
If you would like business support for your EKS Anywhere clusters please contact your Technical Account Manager (TAM) for details.
EKS Anywhere is an open source project and it is supported by the community.
If you have a problem, open an issue
and someone will get back to you as soon as possible.
If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our vulnerability reporting page
.
Please do not create a public GitHub issue for security problems.
5.10 - Artifacts
Artifacts associated with this release: OVAs and images.
Artifacts for EKS Anywhere Bare Metal clusters are listed below.
If you like, you can download these images and serve them locally to speed up cluster creation.
See descriptions of the osImageURL
and hookImagesURLPath
fields for details.
Kubernetes 1.20:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-20/ubuntu-v1.20.15-eks-d-1-20-17-eks-a-11-amd64.gz
Kubernetes 1.21:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-21/ubuntu-v1.21.13-eks-d-1-21-15-eks-a-11-amd64.gz
Kubernetes 1.22:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/ubuntu-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.gz
Kubernetes 1.21:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-21/bottlerocket-v1.21.13-eks-d-1-21-15-eks-a-11-amd64.img.gz
Kubernetes 1.22:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/bottlerocket-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.img.gz
kernel:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/hook/029ef8f0711579717bfd14ac5eb63cdc3e658b1d/vmlinuz-x86_64
initial ramdisk:
https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/hook/029ef8f0711579717bfd14ac5eb63cdc3e658b1d/initramfs-x86_64
vSphere OVAs
Bottlerocket OVAs
Bottlerocket vends its VMware variant OVAs using a secure distribution tool called tuftool. Follow the instructions below to download a Bottlerocket OVA.
- Install Rust and Cargo
curl https://sh.rustup.rs -sSf | sh
- Install tuftool using Cargo
CARGO_NET_GIT_FETCH_WITH_CLI=true cargo install --force tuftool
- Download the root role tuftool will use to download the OVA
curl -O "https://cache.bottlerocket.aws/root.json"
sha512sum -c <<<"e9b1ea5f9b4f95c9b55edada4238bf00b12845aa98bdd2d3edb63ff82a03ada19444546337ec6d6806cbf329027cf49f7fde31f54d551c5e02acbed7efe75785 root.json"
- Export the desired Kubernetes Version. EKS Anywhere currently supports 1.22, 1.21 and 1.20
export KUBEVERSION="1.22"
- Download the OVA
OVA="bottlerocket-vmware-k8s-${KUBEVERSION}-x86_64-v1.8.0.ova"
tuftool download . --target-name "${OVA}" \
--root ./root.json \
--metadata-url "https://updates.bottlerocket.aws/2020-07-07/vmware-k8s-${KUBEVERSION}/x86_64/" \
--targets-url "https://updates.bottlerocket.aws/targets/"
Bottlerocket Tags
OS Family - os:bottlerocket
EKS-D Release
1.22 - eksdRelease:kubernetes-1-22-eks-8
1.21 - eksdRelease:kubernetes-1-21-eks-15
1.20 - eksdRelease:kubernetes-1-20-eks-17
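The tag list above can be captured in a small lookup so that the right pair of tags is printed for the Kubernetes version you exported earlier. This is only a sketch: bottlerocket_tags is a hypothetical helper, and the mapping mirrors the three versions listed for this release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the tags listed above for a Bottlerocket OVA
# of the given Kubernetes version. The mapping is valid for this release only.
bottlerocket_tags() {
  local kube_version=$1 eksd
  case "$kube_version" in
    1.22) eksd="kubernetes-1-22-eks-8" ;;
    1.21) eksd="kubernetes-1-21-eks-15" ;;
    1.20) eksd="kubernetes-1-20-eks-17" ;;
    *) echo "unsupported Kubernetes version: ${kube_version}" >&2; return 1 ;;
  esac
  echo "os:bottlerocket"
  echo "eksdRelease:${eksd}"
}

# Falls back to 1.22 when KUBEVERSION is not set.
bottlerocket_tags "${KUBEVERSION:-1.22}"
```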
Ubuntu with Kubernetes 1.22 OVA
Ubuntu with Kubernetes 1.21 OVA
Ubuntu with Kubernetes 1.20 OVA
Building your own Ubuntu OVA for vSphere
The EKS Anywhere project OVA building process leverages the upstream image-builder repository.
If you want to build an OVA with a custom Ubuntu base image to use for an EKS Anywhere cluster, please follow the instructions below.
Having access to a vSphere environment and docker running locally are prerequisites for building your own images.
Required vSphere Permissions
Virtual machine
Inventory:
Configuration:
- Change configuration
- Add new disk
- Add or remove device
- Change memory
- Change settings
- Set annotation
Interaction:
- Power on
- Power off
- Console interaction
- Configure CD media
- Device connection
Snapshot management:
Provisioning
Resource Pool
- Assign vm to resource pool
Datastore
- Allocate space
- Browse data
- Low level file operations
Network
Steps to build an OVA
- Spin up a builder-base docker container and exec into it. Please use the most recent tag for the image on its repository here
docker run -it public.ecr.aws/eks-distro-build-tooling/builder-base:latest bash
- Clone the eks-anywhere-build-tooling repo.
git clone https://github.com/aws/eks-anywhere-build-tooling.git
- Navigate to the image-builder directory.
cd eks-anywhere-build-tooling/projects/kubernetes-sigs/image-builder
- Get the vSphere connection details and create a json file named
vsphere.json
with the following template.
{
"cluster": "<vSphere cluster name>",
"datacenter": "<datacenter name on vSphere>",
"datastore": "<datastore to be used on vSphere>",
"folder": "<folder path to use for building ova>",
"network": "<dhcp enabled network name>",
"resource_pool": "<vSphere resource pool to use>",
"vcenter_server": "<vSphere server URL>",
"username": "<vSphere username>",
"password": "<vSphere password>",
"template": "",
"insecure_connection": "false",
"linked_clone": "false",
"convert_to_template": "false",
"create_snapshot": "true"
}
- Export the vSphere connection data file, escaping all the quotes
export VSPHERE_CONNECTION_DATA=\"$(cat vsphere.json | jq -c . | sed 's/"/\\"/g')\"
- Download the most recent release bundle manifest and get the latest URLs for
etcdadm
and crictl
for the intended Kubernetes version.
wget https://anywhere-assets.eks.amazonaws.com/bundle-release.yaml
- Export the CRICTL_URL and ETCDADM_HTTP_SOURCE environment variables with the URLs from the previous step.
export CRICTL_URL=<crictl url>
export ETCDADM_HTTP_SOURCE=<etcdadm url>
- Create a library on vSphere for image-builder.
govc library.create "CodeBuild"
- Update the Ubuntu configuration file with the new custom ISO URL and its checksum at
image-builder/images/capi/packer/ova/ubuntu-2004.json
- Set up image-builder and run the OVA build for the Kubernetes version.
RELEASE_BRANCH=1-22 make release-ova-ubuntu-2004
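Because the build depends on several environment variables exported in the steps above, it can help to confirm they are all set before running make. The following is a minimal sketch; check_build_env is a hypothetical helper, not part of the build tooling.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: verify that the variables exported in the
# steps above are set and non-empty before starting the OVA build.
check_build_env() {
  local missing=0 name
  for name in VSPHERE_CONNECTION_DATA CRICTL_URL ETCDADM_HTTP_SOURCE; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (commented out so the snippet has no side effects):
# check_build_env && RELEASE_BRANCH=1-22 make release-ova-ubuntu-2004
```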
Images
The various images for EKS Anywhere can be found in the EKS Anywhere ECR repository
.
The various images for EKS Distro can be found in the EKS Distro ECR repository
.
5.11 - Ports and protocols
Ports used with an EKS Anywhere cluster
EKS Anywhere requires that various ports on control plane and worker nodes be open.
Some Kubernetes-specific ports need open access only from other Kubernetes nodes, while others are exposed externally.
Beyond Kubernetes ports, someone managing an EKS Anywhere cluster must also have external access to ports on the underlying EKS Anywhere provider (such as VMware) and to external tooling (such as Jenkins).
If you are responsible for network firewall rules between nodes on your EKS Anywhere clusters, the following tables describe both Kubernetes and EKS Anywhere-specific ports you should be aware of.
Kubernetes control plane
The following table represents the ports published by the Kubernetes project that must be accessible on any Kubernetes control plane.
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 6443 | Kubernetes API server | All
TCP | Inbound | 10250 | Kubelet API | Self, Control plane
TCP | Inbound | 10259 | kube-scheduler | Self
TCP | Inbound | 10257 | kube-controller-manager | Self
Although etcd ports are included in the control plane section, you can also host your own etcd cluster externally or on custom ports.
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 2379-2380 | etcd server client API | kube-apiserver, etcd
Use the following to access the SSH service on the control plane and etcd nodes:
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 22 | SSHD server | SSH clients
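To check firewall rules against the control plane ports listed above, a short probe script can be useful. The sketch below is an illustration only: expand_port_range and probe_port are hypothetical helpers, the host address in the usage comment is a placeholder, and the probe relies on the bash /dev/tcp feature plus the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Expand a range such as "2379-2380" into "2379 2380".
# A single port such as "6443" is returned unchanged.
expand_port_range() {
  local first=${1%-*} last=${1#*-} port out=""
  for (( port = first; port <= last; port++ )); do
    out+="${out:+ }${port}"
  done
  echo "$out"
}

# Report whether a TCP connection to host:port can be opened within 2 seconds.
probe_port() {
  local host=$1 port=$2
  if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open   ${host}:${port}"
  else
    echo "closed ${host}:${port}"
  fi
}

# Example usage against a placeholder control plane address:
# for range in 6443 10250 10259 10257 2379-2380 22; do
#   for port in $(expand_port_range "$range"); do
#     probe_port 192.0.2.10 "$port"
#   done
# done
```

A successful TCP connect only shows that the port is reachable from the machine running the script, so run it from each network segment that needs access.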
Kubernetes worker nodes
The following table represents the ports published by the Kubernetes project that must be accessible from worker nodes.
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 10250 | Kubelet API | Self, Control plane
TCP | Inbound | 30000-32767 | NodePort Services | All
The API server port is sometimes switched to 443.
Alternatively, the default port is kept as is and the API server is put behind a load balancer that listens on 443 and routes requests to the API server on the default port.
Use the following to access the SSH service on the worker nodes:
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 22 | SSHD server | SSH clients
On the Admin machine for a Bare Metal provider, the following ports need to be accessible to all the nodes in the cluster, from the same layer 2 network, for the initial PXE boot:
Protocol | Direction | Port Range | Purpose | Used By
UDP | Inbound | 67 | boots DHCP | All nodes, for network boot
UDP | Inbound | 69 | boots TFTP | All nodes, for network boot
TCP | Inbound | 80 | boots HTTP | All nodes, for network boot
TCP | Inbound | 42113 | tink-server gRPC | All nodes, talk to Tinkerbell
TCP | Inbound | 50061 | hegel HTTP | All nodes, talk to Tinkerbell
VMware provider
The following table displays ports that need to be accessible from the VMware provider running EKS Anywhere:
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 443 | vCenter Server | vCenter API endpoint
TCP | Inbound | 6443 | Kubernetes API server | Kubernetes API endpoint
TCP | Inbound | 2379 | Manager | Etcd API endpoint
TCP | Inbound | 2380 | Manager | Etcd API endpoint
A variety of control plane management tools are available to use with EKS Anywhere.
One example is Jenkins.
Protocol | Direction | Port Range | Purpose | Used By
TCP | Inbound | 8080 | Jenkins Server | HTTP Jenkins endpoint
TCP | Inbound | 8443 | Jenkins Server | HTTPS Jenkins endpoint
5.12 - Release Alerts
SNS Alerts for EKS Anywhere release
EKS Anywhere uses Amazon Simple Notification Service (SNS) to notify you when a new release is available.
It is recommended that your clusters are kept up to date with the latest EKS Anywhere release.
Follow the instructions below to subscribe to SNS notifications.
- Sign in to your AWS Account
- Select us-east-1 region
- Go to the SNS Console
- In the left navigation pane, choose “Subscriptions”
- On the Subscriptions page, choose “Create subscription”
- On the Create subscription page, in the Details section enter the following information
- Choose Create Subscription
- In a few minutes, you will receive an email asking you to confirm the subscription
- Click the confirmation link in the email
5.13 - eksctl anywhere CLI reference
Details on the options and parameters for eksctl anywhere CLI
The eksctl
CLI, with the EKS Anywhere plugin added, lets you create and manage EKS Anywhere clusters.
While a cluster is running, most EKS Anywhere administration can be done using kubectl
or other native Kubernetes tools.
Use this page as a reference to useful eksctl anywhere
command examples for working with EKS Anywhere clusters.
Available eksctl anywhere
commands include:
create cluster
To create an EKS Anywhere cluster
delete cluster
To delete an EKS Anywhere cluster
generate
[clusterconfig
| support-bundle
| support-bundle-config
] To generate cluster and support configs
help
To get help information
upgrade
To upgrade a workload cluster
version
To get the EKS Anywhere version
Options used with multiple commands include:
-h
or --help
To get help for a command or subcommand
-v int
or --verbosity int
To set log level verbosity from 0-9
-f filename
or --filename filename
To identify the filename containing the cluster config
--force-cleanup
To force deletion of previously created bootstrap cluster
-w string
or --w-config string
To identify the kubeconfig file when needed to create a support bundle or upgrade a cluster
Other available options and arguments are listed with the command examples that follow.
eksctl anywhere generate
With eksctl anywhere generate
, you can output sets of cluster resources to create a new cluster
or troubleshoot an existing cluster.
Here are some examples.
eksctl anywhere generate clusterconfig
Using eksctl anywhere generate clusterconfig
you can generate a cluster configuration
for a specific provider (-p
or --provider
provider_name). Here are examples:
Generate a configuration file to create an EKS Anywhere cluster for a vsphere
provider:
export CLUSTER_NAME=vsphere01
eksctl anywhere generate clusterconfig ${CLUSTER_NAME} -p vsphere > ${CLUSTER_NAME}.yaml
Generate a configuration file to create an EKS Anywhere cluster for a Docker provider:
export CLUSTER_NAME=docker01
eksctl anywhere generate clusterconfig ${CLUSTER_NAME} -p docker > ${CLUSTER_NAME}.yaml
Once you have generated the yaml configuration file, edit that file to add configuration information before you use the file to create your cluster.
See local
and production
cluster creation procedures for details.
eksctl anywhere generate support-bundle-config
If you would like to customize your support bundle, you can generate a support bundle configuration file (support-bundle-config
),
edit that file to choose the data you want to gather,
then gather the selected data into a support bundle (support-bundle
).
Generate a support bundle config file (then edit that file to select the log data you want to gather):
export CLUSTER_NAME=vsphere01
eksctl anywhere generate support-bundle-config > ${CLUSTER_NAME}_bundle_config.yaml
eksctl anywhere generate support-bundle
Once you have a bundle config file, generate a support bundle from an existing EKS Anywhere cluster.
Additional options available for this command include:
--bundle-config string
To identify the bundle config file to use to generate the support bundle
--since string
To collect pod logs in the latest duration like 5s, 2m, or 3h.
--since-time string
To collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z
Here is an example:
export CLUSTER_NAME=vsphere01
eksctl anywhere generate support-bundle --bundle-config ${CLUSTER_NAME}_bundle_config.yaml \
-w KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig \
--since 2h -f ${CLUSTER_NAME}_bundle.yaml
The example just shown:
- Uses
${CLUSTER_NAME}_bundle.yaml
as the file to hold the results
- Collects pod logs for the past two hours (2h)
- Identifies the bundle config file to use (
${CLUSTER_NAME}_bundle_config.yaml
)
- Identifies the
.kubeconfig
file to use for a workload cluster
To change the command to generate a support bundle that gathers pod logs starting from a specific date (September 8, 2021) and time (1:27 PM):
export CLUSTER_NAME=vsphere01
eksctl anywhere generate support-bundle --bundle-config ${CLUSTER_NAME}_bundle_config.yaml \
-w KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig \
--since-time 2021-09-08T13:27:00Z -f ${CLUSTER_NAME}_bundle.yaml
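The --since-time option expects an RFC3339 datetime, in which the month, day, hour, minute, and second are each two digits. A minimal sketch for building such a value from individual date parts follows; rfc3339_utc is a hypothetical helper, and it assumes the time is given in UTC.

```shell
#!/usr/bin/env bash
# Hypothetical helper: format UTC date parts as an RFC3339 timestamp.
# Arguments: year month day hour minute second
rfc3339_utc() {
  # 10#$n forces base 10 so inputs such as "08" are not parsed as octal.
  printf '%04d-%02d-%02dT%02d:%02d:%02dZ\n' \
    "$((10#$1))" "$((10#$2))" "$((10#$3))" \
    "$((10#$4))" "$((10#$5))" "$((10#$6))"
}

# September 8, 2021 at 1:27 PM UTC
SINCE_TIME=$(rfc3339_utc 2021 9 8 13 27 0)
echo "--since-time ${SINCE_TIME}"   # prints --since-time 2021-09-08T13:27:00Z
```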
eksctl anywhere create cluster
Create an EKS Anywhere cluster from a cluster configuration file you generated (and modified) earlier.
This example sets verbosity to most verbose (-v 9
):
export CLUSTER_NAME=vsphere01
eksctl anywhere create cluster -v 9 -f ${CLUSTER_NAME}.yaml
See local
and production
cluster creation procedures for details.
eksctl anywhere upgrade cluster
Upgrade an existing EKS Anywhere cluster.
This example uses maximum verbosity and forces a cleanup of the previously created bootstrap cluster:
export CLUSTER_NAME=vsphere01
eksctl anywhere upgrade cluster -f ${CLUSTER_NAME}.yaml --force-cleanup -v9 \
-w KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
For more information on this and other ways to upgrade a cluster, see Upgrade cluster
.
eksctl anywhere delete cluster
Delete an existing EKS Anywhere cluster.
This example deletes all VMs and forces the deletion of the previously created bootstrap cluster:
export CLUSTER_NAME=vsphere01
eksctl anywhere delete cluster -f ${CLUSTER_NAME}.yaml \
--force-cleanup \
-w KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
For more information on deleting a cluster, see Delete cluster
.
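The upgrade, delete, and support-bundle examples all point at the same kubeconfig location under the cluster's directory. A small sketch that builds that path once and checks that the file exists before a command is run could look like this; both function names are hypothetical, and the path layout is taken from the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kubeconfig path used in the examples above.
# Arguments: cluster name, optional base directory (defaults to $PWD).
cluster_kubeconfig() {
  local name=$1 base=${2:-$PWD}
  echo "${base}/${name}/${name}-eks-a-cluster.kubeconfig"
}

# Print the path only if the file exists; otherwise report and return non-zero.
require_kubeconfig() {
  local path
  path=$(cluster_kubeconfig "$@")
  if [ ! -f "$path" ]; then
    echo "kubeconfig not found: ${path}" >&2
    return 1
  fi
  echo "$path"
}

# Example usage (commented out so the snippet has no side effects):
# KUBECONFIG_PATH=$(require_kubeconfig vsphere01) || exit 1
```

Failing early on a missing kubeconfig avoids starting a delete or upgrade against the wrong working directory.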
eksctl anywhere version
View the version of eksctl anywhere
:
eksctl anywhere version
v0.5.0
eksctl anywhere help
Use eksctl anywhere help
or the -h
option to see general options or options specific to a particular set of commands.
View general help information using help
:
eksctl anywhere help
Use eksctl anywhere to build your own self-managing cluster on your hardware with the best of Amazon EKS
Usage:
eksctl anywhere [command]
Available Commands:
create Create resources
delete Delete resources
generate Generate resources
help Help about any command
upgrade Upgrade resources
version Get the eksctl version
Flags:
-h, --help help for eksctl
-v, --verbosity int Set the log level verbosity
Use "eksctl [command] --help" for more information about a command.
...
Display help options for generating a support bundle:
eksctl anywhere generate support-bundle -h
This command is used to create a support bundle to troubleshoot a cluster
Usage:
eksctl anywhere generate support-bundle -f my-cluster.yaml [flags]
Flags:
--bundle-config string Bundle Config file to use when generating support bundle
-f, --filename string Filename that contains EKS-A cluster configuration
-h, --help help for support-bundle
--since string Collect pod logs in the latest duration like 5s, 2m, or 3h.
--since-time string Collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z
-w, --w-config string Kubeconfig file to use when creating support bundle for a workload cluster
Global Flags:
-v, --verbosity int Set the log level verbosity
Display options for creating a cluster:
eksctl anywhere create cluster -h
This command is used to create workload clusters
Usage:
eksctl anywhere create cluster [flags]
Flags:
-f, --filename string Filename that contains EKS-A cluster configuration
--force-cleanup Force deletion of previously created bootstrap cluster
-h, --help help for cluster
Global Flags:
-v, --verbosity int Set the log level verbosity
6 - Community
Guidelines for community contribution
We work hard to provide a high-quality Kubernetes installer for EKS,
and we greatly value feedback and contributions from our community. Please
review the contribution guidelines
before
submitting any issues
or
pull requests
to ensure we have
all the necessary information to respond to your bug report or contribution
effectively. If you have a concern with a security vulnerability, please
review our reporting a vulnerability policy
.
6.1 - Contributing Guidelines
How to best contribute to the project
Thank you for your interest in contributing to our project. Whether it’s a bug report, new feature, correction, or additional
documentation, we greatly value feedback and contributions from our community.
Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
information to effectively respond to your bug report or contribution.
General Guidelines
Pull Requests
Make sure to keep Pull Requests small and functional to make them easier to review, understand, and look up in commit history.
This repository uses “Squash and Commit” to keep our history clean and make it easier to revert changes based on PR.
Adding the appropriate documentation, unit tests and e2e tests as part of a feature is the responsibility of the
feature owner, whether it is done in the same Pull Request or not.
Pull Requests should follow the “subject: message” format, where the subject describes what part of the code is being
modified.
Refer to the template
for more information on what goes into a PR description.
Design Docs
A contributor proposes a design with a PR on the repository to allow for revisions and discussions.
If a design needs to be discussed before formulating a document for it, make use of GitHub Discussions to
involve the community on the discussion.
GitHub Discussions
GitHub Discussions are used for feature requests (that don’t have actionable items/issues), questions, and anything else
the community would like to share.
Categories:
- Q/A - Questions
- Proposals - Feature requests and other suggestions
- Show and tell - Anything that the community would like to share
- General - Everything else (possibly announcements as well)
GitHub Issues
GitHub Issues are used to file bugs, work items, and feature requests with actionable items/issues (Please refer to the
“Reporting Bugs/Feature Requests” section below for more information).
Labels:
- “<area>” - area of project that issue is related to (create, upgrade, flux, test, etc.)
- “priority/p<n>” - priority of task based on following numbers
- p0: need to do right away
- p1: don’t have a set time but need to do
- p2: not currently being tracked (backlog)
- “status/<status>” - status of the issue (notstarted, implementation, etc.)
- “kind/<kind>” - type of issue (bug, feature, enhancement, docs, etc.)
Refer to the template
for more information on
what goes into an issue description.
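As an illustration of the label scheme above, the following sketch sorts a label into one of the four families. The function name is hypothetical, and treating any label without a slash as an area label is an assumption drawn from the examples given (create, upgrade, flux, test).

```shell
# Hypothetical helper: classify an issue label according to the scheme above.
classify_label() {
  case "$1" in
    priority/p[0-2]) echo "priority" ;;
    priority/*)      echo "invalid" ;;  # only p0, p1 and p2 are described
    status/?*)       echo "status" ;;
    kind/?*)         echo "kind" ;;
    */*|"")          echo "invalid" ;;
    *)               echo "area" ;;     # assumption: bare names are area labels
  esac
}

classify_label "priority/p1"
classify_label "kind/bug"
```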
GitHub Milestones
GitHub Milestones are used to plan work that is currently being tracked.
- next: changes for next release
- next+1: won’t make next release but the following
- techdebt: used to keep track of techdebt items, separate ongoing effort from release action items
- oncall: used to keep track of issues needing active follow-up
- backlog: items that don’t have a home in the others
GitHub Projects (or tasks within a GitHub Issue)
GitHub Projects are used to keep track of bigger features that are made up of a collection of issues.
Certain features can also have a tracking issue that contains a checklist of tasks that
link to other issues.
Reporting Bugs/Feature Requests
We welcome you to use the GitHub issue tracker to report bugs or suggest features that have actionable items/issues
(as opposed to introducing a feature request on GitHub Discussions).
When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn’t already
reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
- A reproducible test case or series of steps
- The version of the code being used
- Any modifications you’ve made relevant to the bug
- Anything unusual about your environment or deployment
Contributing via Pull Requests
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
- You are working against the latest source on the main branch.
- You check existing open, and recently merged, pull requests to make sure someone else hasn’t addressed the problem already.
- You open an issue to discuss any significant work - we would hate for your time to be wasted.
To send us a pull request, please:
- Fork the repository.
- Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it
will be hard for us to focus on your change.
- Ensure local tests pass.
- Commit to your fork using clear commit messages.
- Send us a pull request, answering any default questions in the pull request interface.
- Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
GitHub provides additional documentation on forking a repository
and
creating a pull request
.
Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the
default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ‘help wanted’
and ‘good first issue’ issues is a great place to start.
Code of Conduct
This project has adopted the Amazon Open Source Code of Conduct
.
For more information see the Code of Conduct FAQ
or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.
Security issue notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our
vulnerability reporting page
. Please do not create a
public GitHub issue.
Licensing
See the LICENSE
file for our project’s licensing. We will ask you to confirm the licensing of your contribution.
6.2 - Contributing to EKS Anywhere documentation
Guidelines for contributing to EKS Anywhere documentation
EKS Anywhere documentation uses the Hugo
site generator and the Docsy
theme. To get started contributing:
Style issues
-
EKS Anywhere: Always refer to EKS Anywhere as EKS Anywhere and NOT EKS-A or EKS-Anywhere.
-
Line breaks: Put each sentence on its own line and don’t do a line break in the middle of a sentence.
We are using a modified Semantic Line Breaking
in that we are requiring a break at the end of every sentence, but not at commas or other semantic boundaries.
-
Headings: Use sentence case in headings. So do “Cluster specification reference” and not “Cluster Specification Reference”
-
Cross references: To cross reference to another doc in the EKS Anywhere docs set, use relref in the link so that Hugo will test it and fail the build for links not found. Also, use relative paths to point to other content in the docs set. Here is an example of a cross reference (code and results):
See the [troubleshooting section](/docs/tasks/troubleshoot/) page.
See the troubleshooting section
page.
-
Notes, Warnings, etc.: You can use this form for notes:
Note
<put note here, multiple paragraphs are allowed>
Note
<put note here, multiple paragraphs are allowed>
-
Embedding content: If you want to read in content from a separate file, you can use the following format.
Do this if you think the content might be useful in multiple pages:
-
General style issues: Unless otherwise instructed, follow the Kubernetes Documentation Style Guide
for formatting and presentation guidance.
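The line-break rule in the style list above (one sentence per line) lends itself to a rough automated check. The sketch below is a heuristic only and not part of the docs build; the function name is hypothetical, and it will also flag abbreviations that are followed by a capitalized word.

```shell
# Print "line-number:text" for lines that appear to hold more than one sentence.
# Heuristic: sentence-ending punctuation, whitespace, then an uppercase letter.
multi_sentence_lines() {
  grep -nE '[.!?] +[A-Z]'
}

# Usage (page.md is a placeholder file name):
#   multi_sentence_lines < page.md
printf 'One sentence.\nTwo here. And more.\n' | multi_sentence_lines
```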
Where to put content
- Images: Put all images into the EKS Anywhere GitHub site’s docs/static/images
directory.
- Yaml examples: Put full yaml file examples into the EKS Anywhere GitHub site’s docs/static/manifests
directory.
In kubectl examples, you can point to those files using:
https://anywhere.eks.amazonaws.com/manifests/whatever.yaml
- Generic instructions for creating a cluster should go into the getting started
section in either:
- Instructions that are specific to an EKS Anywhere provider should go into the appropriate provider section. Currently, vSphere
is the only supported provider.
- Add integrations to cluster
: Add names of suggested third-party tools. Then link the names of providers to:
- EKS Anywhere docs instructions for configuring that feature, if instructions are available or
- Somewhere on the third-party site, if there are no instructions available on the EKS Anywhere site
- Compare EKS Anywhere and EKS
: Add supported third-party solutions to the Amazon EKS Anywhere column.
Only link to the partner page for now.
- Workshop content should contain organized links to existing documentation pages.
The workshop content should not duplicate existing documentation pages or contain guides that are not part of the main documentation.
Contributing docs for third-party solutions
To contribute documentation describing how to use third-party software products or projects with EKS Anywhere, follow these guidelines.
Docs for third-party software in EKS Anywhere
Documentation PRs for EKS Anywhere that describe third-party software that is included in EKS Anywhere are acceptable, provided they meet the quality standards described in the tips below. This includes:
- Software bundled with EKS Anywhere (for example, Cilium docs
)
- Supported platforms on which EKS Anywhere runs (for example, VMware vSphere
)
- Curated software that is packaged by the EKS Anywhere project to run EKS Anywhere. This includes documentation for Harbor local registry, Ingress controller, and Prometheus, Grafana, and Fluentd monitoring and logging.
Docs for third-party software NOT in EKS Anywhere
Documentation for software that is not part of EKS Anywhere software can still be added to EKS Anywhere docs by meeting one of the following criteria:
- Partners: Documentation PRs for software from vendors listed on the EKS Anywhere Partner page
can be considered to add to the EKS Anywhere docs.
Links point to partners from the Compare EKS Anywhere to EKS
page and other content can be added to EKS Anywhere documentation for features from those partners.
Contact the AWS container partner team if you are interested in becoming a partner: aws-container-partners@amazon.com
- Cluster integrations: Separate, less stringent criteria can be met for a third-party vendor to be listed on the Add cluster integrations
page.
Tips for contributing third-party docs
The Kubernetes docs project itself describes a similar approach to docs covering third-party software in the How Docs Handle Third Party and Dual Sourced Content
blog.
In line with these general guidelines, we recommend that even acceptable third-party docs contributions to EKS Anywhere:
- Not be dual-sourced: The project does not allow content that is already published somewhere else.
You can provide links to that content, if it is relevant. Heavily rewriting such content to be EKS Anywhere-specific might be acceptable.
- Not be marketing oriented: The content shouldn’t sell third-party products or make vague claims of quality.
- Not be outside the scope of EKS Anywhere: Just because some projects or products of a partner are appropriate for EKS Anywhere docs, it doesn’t mean that any project or product by that partner can be documented in EKS Anywhere.
- Stick to the facts: So, for example, docs about third-party software could say: “To set up load balancer ABC, do XYZ” or “Make these modifications to improve speed and efficiency.” It should not make blanket statements like: “ABC load balancer is the best one in the industry.”
- EKS features: Features that relate to EKS which runs in AWS or requires an AWS account should link to the official documentation
as much as possible.
6.3 - Code of Conduct
Details on the project code of conduct
This project has adopted the
Amazon Open Source Code of Conduct
.
For more information, see the
Code of Conduct FAQ
or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.
6.4 - Project governance
Roles and responsibilities of the project
This document lays out the guidelines under which the EKS Anywhere project will be governed.
The goal is to make sure that the roles and responsibilities are well-defined and clarify how decisions are made.
Roles
In the context of EKS Anywhere, we consider the following roles:
- Users … everyone using EKS Anywhere, typically willing to provide feedback on EKS Anywhere by proposing features and/or filing issues.
- Contributors … everyone contributing code, documentation, examples, testing infra, and participating in feature proposals as well as design discussions.
- Maintainers … are responsible for engaging with and assisting contributors to iterate on the contributions until they reach acceptable quality.
Maintainers can decide whether the contributions can be accepted into the project or rejected.
Communication
The primary mechanism for communication will be via the #eks
channel
on the Kubernetes Slack community.
All features and bug fixes will be tracked as issues in GitHub.
All decisions will be documented in GitHub issues.
In the future, we may consider using a public mailing list, which can be better archived.
Release Management
The release process will be governed by AWS and will coincide with the release of EKS.
Roadmap Planning
Maintainers will share roadmap and release versions as milestones in GitHub.
7 - Welcome to EKS Anywhere Workshop!
The intent of this workshop is to educate users about EKS Anywhere and its different use cases.
As part of this workshop we also cover how to provision and manage EKS Anywhere clusters, run workloads, and leverage observability tools like Prometheus and Grafana to monitor the EKS Anywhere cluster.
We recommend this workshop for Cloud Architects, SREs, DevOps engineers, and other IT Professionals.
7.1 - Introduction
The following topics are covered as part of this chapter:
- EKS Anywhere service overview
- Benefits & service considerations
- Frequently asked questions (FAQs)
7.1.1 - Overview
What is the purpose of this workshop?
The purpose of this workshop is to provide a more prescriptive walkthrough of building, deploying, and operating an EKS Anywhere cluster. It uses existing content from the documentation in a more condensed format for those wishing to get started.
EKS Anywhere Overview
Amazon EKS Anywhere is a new deployment option for Amazon EKS that allows customers to create and operate Kubernetes clusters on customer-managed infrastructure, supported by AWS. Customers can now run Amazon EKS Anywhere on their own on-premises infrastructure using VMware vSphere starting today, with support for other deployment targets in the near future, including support for bare metal coming in 2022.
Amazon EKS Anywhere helps simplify the creation and operation of on-premises Kubernetes clusters with default component configurations while providing tools for automating cluster management. It builds on the strengths of Amazon EKS Distro: the same Kubernetes distribution that powers Amazon EKS on AWS. AWS supports all Amazon EKS Anywhere components including the integrated 3rd-party software, so that customers can reduce their support costs and avoid maintenance of redundant open-source and third-party tools. In addition, Amazon EKS Anywhere gives customers on-premises Kubernetes operational tooling that’s consistent with Amazon EKS. You can leverage the EKS console to view all of your Kubernetes clusters (including EKS Anywhere clusters) running anywhere, through the EKS Connector
(public preview).

7.1.2 - Benefits & Use cases
Here are some key customer benefits of using Amazon EKS Anywhere:
- Simplify on-premises Kubernetes management - Amazon EKS Anywhere helps simplify the creation and operation of on-premises Kubernetes clusters with default component configurations while providing tools for automating cluster management.
- One stop support - AWS supports all Amazon EKS Anywhere components including the integrated 3rd-party software, so that customers can reduce their support costs and avoid maintenance of redundant open-source and third-party tools.
- Consistent and reliable - Amazon EKS Anywhere gives you on-premises Kubernetes operational tooling that’s consistent with Amazon EKS. It builds on the strengths of Amazon EKS Distro and provides open-source software that’s up-to-date and patched, so you can have a Kubernetes environment on-premises that is more reliable than self-managed Kubernetes offerings.
Use-cases supported by EKS Anywhere
EKS Anywhere is suitable for the following use-cases:
- Hybrid cloud consistency - You may have lots of Kubernetes workloads on Amazon EKS but also need to operate Kubernetes clusters on-premises. Amazon EKS Anywhere offers strong operational consistency with Amazon EKS so you can standardize your Kubernetes operations based on a unified toolset.
- Disconnected environment - You may need to secure your applications in a disconnected environment or run applications in areas without internet connectivity. Amazon EKS Anywhere allows you to deploy and operate highly-available clusters with the same Kubernetes distribution that powers Amazon EKS on AWS.
- Application modernization - Amazon EKS Anywhere empowers you to modernize your on-premises applications, removing the heavy lifting of keeping up with upstream Kubernetes and security patches, so you can focus on your core business value.
- Data sovereignty - You may want to keep your large data sets on-premises due to legal requirements concerning the location of the data. Amazon EKS Anywhere brings the trusted Amazon EKS Kubernetes distribution and tools to where your data needs to be.
7.1.3 - Customer FAQ
AuthN / AuthZ
How do my applications running on EKS Anywhere authenticate with AWS services using IAM credentials?
You can leverage the IAM Roles for Service Accounts (IRSA)
feature.
See the IRSA reference
guide for details.
Does EKS Anywhere support OIDC (including Azure AD and AD FS)?
Yes, EKS Anywhere can create clusters that support API server OIDC authentication.
This means you can federate authentication through AD FS locally or through Azure AD, along with other IDPs that support the OIDC standard.
In order to add OIDC support to your EKS Anywhere clusters, you need to configure your cluster by updating the configuration file before creating the cluster.
Please see the OIDC reference
for details.
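As a rough sketch of what that configuration change looks like, the shell function below prints the two pieces involved: an identityProviderRefs entry in the Cluster spec and a separate OIDCConfig object. The field names are given as best recalled from the OIDC reference and should be verified against it; the function name, issuer URL, and client ID are hypothetical placeholders, and the Cluster portion is a fragment to merge into a full cluster config rather than a complete spec.

```shell
# Hypothetical helper: print an OIDC fragment for an EKS Anywhere cluster config.
# Arguments: cluster name, issuer URL, client ID (all placeholders).
oidc_fragment() {
  cat <<EOF
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: $1
spec:
  identityProviderRefs:
  - kind: OIDCConfig
    name: $1
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: OIDCConfig
metadata:
  name: $1
spec:
  issuerUrl: $2
  clientId: $3
EOF
}

oidc_fragment dev-cluster https://idp.example.com kubernetes
```

Because the fragment is applied before cluster creation, it belongs in the same file that is passed to the create command.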
Does EKS Anywhere support LDAP?
EKS Anywhere does not support LDAP out of the box.
However, you can look into the Dex LDAP Connector
.
Can I use AWS IAM for Kubernetes resource access control on EKS Anywhere?
Yes, you can install the aws-iam-authenticator
on your EKS Anywhere cluster to achieve this.
Miscellaneous
Can I connect my EKS Anywhere cluster to EKS?
Yes, you can install EKS Connector to connect your EKS Anywhere cluster to Amazon EKS.
EKS Connector is a software agent that you can install on the EKS Anywhere cluster that enables the cluster to communicate back to AWS.
Once connected, you can immediately see the EKS Anywhere cluster with workload and cluster configuration information on the EKS console, alongside your EKS clusters.
How does the EKS Connector authenticate with AWS?
During start-up, the EKS Connector generates and stores an RSA key-pair as Kubernetes secrets.
It also registers with AWS using the public key and the activation details from the cluster registration configuration file.
The EKS Connector needs AWS credentials to receive commands from AWS and to send the response back.
Whenever it requires AWS credentials, it uses its private key to sign the request and invokes AWS APIs to request the credentials.
How does the EKS Connector authenticate with my Kubernetes cluster?
The EKS Connector acts as a proxy and forwards the EKS console requests to the Kubernetes API server on your cluster.
In the initial release, the connector uses impersonation
with its service account secrets to interact with the API server.
Therefore, you need to associate the connector’s service account with a ClusterRole,
which gives permission to impersonate AWS IAM entities.
How do I enable an AWS user account to view my connected cluster through the EKS console?
For each AWS user or other IAM identity, you should add cluster role binding to the Kubernetes cluster with the appropriate permission for that IAM identity.
Additionally, each of these IAM entities should be associated with the IAM policy
to invoke the EKS Connector on the cluster.
Can I use AWS Controllers for Kubernetes (ACK) on EKS Anywhere?
Yes, you can leverage AWS services from your EKS Anywhere clusters on-premises through AWS Controllers for Kubernetes (ACK)
.
Can I deploy EKS Anywhere on other clouds?
EKS Anywhere can be installed on any infrastructure with the required VMware vSphere versions.
See EKS Anywhere vSphere prerequisite
documentation.
How can I manage EKS Anywhere at scale?
You can perform cluster life cycle and configuration management at scale through GitOps-based tools.
EKS Anywhere offers git-driven cluster management through the integrated Flux Controller.
See Manage cluster with GitOps
documentation for details.
Can I run EKS Anywhere on ESXi?
No. EKS Anywhere depends on the Cluster API Provider vSphere (CAPV), which uses the vCenter API.
There would need to be a change to the upstream project to support ESXi.
7.2 - Provisioning
This chapter walks through the following:
- Overview of provisioning
- Prerequisites for creating an EKS Anywhere cluster
- Provisioning a new EKS Anywhere cluster
- Verifying the cluster installation
7.2.1 - Overview
EKS Anywhere uses the eksctl
executable to create a Kubernetes cluster in your environment.
Currently it allows you to create and delete clusters in a vSphere environment.
You can run cluster create and delete commands from an Ubuntu or Mac administrative machine.
To create a cluster, you need to create a specification file that includes all of your vSphere details and information about your EKS Anywhere cluster.
Running the eksctl anywhere create cluster
command from your admin machine creates the workload cluster in vSphere.
It does this by first creating a temporary bootstrap cluster to direct the workload cluster creation.
Once the workload cluster is created, the cluster management resources are moved to your workload cluster and the local bootstrap cluster is deleted.
Once your workload cluster is created, a KUBECONFIG file is stored on your admin machine with RBAC admin permissions for the workload cluster.
You’ll be able to use that file with kubectl
to set up and deploy workloads.
For a detailed description, see Cluster creation workflow
.
Here’s a diagram that explains the process visually.
EKS Anywhere Create Cluster

7.2.2 - Admin machine setup
EKS Anywhere will create and manage Kubernetes clusters on multiple providers.
Currently we support creating development clusters locally using Docker and production clusters using Bare Metal or VMware vSphere.
Creating an EKS Anywhere cluster begins with setting up an Administrative machine where you will run Docker and add some binaries.
From there, you create the cluster for your chosen provider.
See Create cluster workflow
for an overview of the cluster creation process.
To create an EKS Anywhere cluster you will need eksctl
and the eksctl-anywhere
plugin.
This will let you create a cluster in multiple providers for local development or production workloads.
Administrative machine prerequisites
Via Homebrew (macOS and Linux)
Warning
EKS Anywhere only works on computers with an x86_64 (amd64) processor architecture.
It currently will not work on computers with Apple Silicon or Arm based processors.
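A quick way to confirm this on the administrative machine is to inspect the output of uname -m. The sketch below is a hypothetical pre-flight check, not something shipped with EKS Anywhere; it assumes that x86_64 and amd64 are the only strings reported for supported machines.

```shell
# Hypothetical pre-flight check: succeed only for x86_64/amd64 machines.
is_supported_arch() {
  case "$1" in
    x86_64|amd64) return 0 ;;
    *)            return 1 ;;  # for example arm64 or aarch64
  esac
}

if is_supported_arch "$(uname -m)"; then
  echo "architecture supported"
else
  echo "unsupported architecture: $(uname -m)"
fi
```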
You can install eksctl
and eksctl-anywhere
with homebrew
.
This package will also install kubectl
and the aws-iam-authenticator
which will be helpful to test EKS Anywhere clusters.
brew install aws/tap/eks-anywhere
Manually (macOS and Linux)
Install the latest release of eksctl
.
The EKS Anywhere plugin requires eksctl
version 0.66.0 or newer.
curl "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" \
--silent --location \
| tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin/
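To confirm that the installed eksctl meets the 0.66.0 minimum, a dotted-version comparison can help. The sketch below is a hypothetical helper that relies on sort -V, which is available in GNU coreutils and recent BSD sort; it also assumes that the version string passed in is a plain dotted number.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# Relies on "sort -V" (version sort) being available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage, assuming "eksctl version" prints a bare version number:
#   version_ge "$(eksctl version)" 0.66.0 || echo "please upgrade eksctl"
version_ge 0.100.0 0.66.0 && echo "0.100.0 is new enough"
```

Version sort is used instead of a plain string comparison so that 0.100.0 is correctly treated as newer than 0.66.0.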
Install the eksctl-anywhere
plugin.
export EKSA_RELEASE="0.10.0" OS="$(uname -s | tr A-Z a-z)" RELEASE_NUMBER=14
curl "https://anywhere-assets.eks.amazonaws.com/releases/eks-a/${RELEASE_NUMBER}/artifacts/eks-a/v${EKSA_RELEASE}/${OS}/amd64/eksctl-anywhere-v${EKSA_RELEASE}-${OS}-amd64.tar.gz" \
--silent --location \
| tar xz ./eksctl-anywhere
sudo mv ./eksctl-anywhere /usr/local/bin/
Upgrade eksctl-anywhere
If you installed eksctl-anywhere
via homebrew you can upgrade the binary with
brew update
brew upgrade eks-anywhere
If you installed eksctl-anywhere
manually you should follow the installation steps to download the latest release.
You can verify your installed version with
eksctl anywhere version
Deploy a cluster
Once you have the tools installed you can deploy a local cluster or production cluster in the next steps.
7.2.3 - Local cluster setup
EKS Anywhere docker provider deployments
EKS Anywhere supports a Docker provider for development and testing use cases only.
This allows you to try EKS Anywhere on your local system before deploying to a supported provider.
To install the EKS Anywhere binaries and see system requirements please follow the installation guide
.
Steps
-
Generate a cluster config
CLUSTER_NAME=dev-cluster
eksctl anywhere generate clusterconfig $CLUSTER_NAME \
--provider docker > $CLUSTER_NAME.yaml
The command above creates a file named $CLUSTER_NAME.yaml (dev-cluster.yaml in this example) with the contents below in the path where it is executed.
The configuration specification is divided into two sections:
- Cluster
- DockerDatacenterConfig
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev-cluster
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
  datacenterRef:
    kind: DockerDatacenterConfig
    name: dev-cluster
  externalEtcdConfiguration:
    count: 1
  kubernetesVersion: "1.21"
  managementCluster:
    name: dev-cluster
  workerNodeGroupConfigurations:
  - count: 1
    name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: DockerDatacenterConfig
metadata:
  name: dev-cluster
spec: {}
- Apart from the base configuration, you can add additional optional configuration to enable supported features:
-
Create Cluster: Create your cluster either with or without curated packages:
-
Cluster creation without curated packages installation
eksctl anywhere create cluster -f $CLUSTER_NAME.yaml
Example command output
Performing setup and validations
✅ validation succeeded {"validation": "docker Provider setup is valid"}
Creating new bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific setup
Creating new workload cluster
Installing networking on workload cluster
Installing cluster-api providers on workload cluster
Moving cluster management from bootstrap to workload cluster
Installing EKS-A custom components (CRD and controller) on workload cluster
Creating EKS-A CRDs instances on workload cluster
Installing AddonManager and GitOps Toolkit on workload cluster
GitOps field not specified, bootstrap flux skipped
Deleting bootstrap cluster
🎉 Cluster created!
-
Cluster creation with optional curated packages
Note
- It is optional to install curated packages as part of the cluster creation.
- The eksctl anywhere version should be later than v0.9.0.
- If including curated packages during cluster creation, please set the environment variable:
export CURATED_PACKAGES_SUPPORT=true
- Post-creation installation and detailed package configurations can be found here.
-
Discover curated-packages to install
eksctl anywhere list packages --source registry --kube-version 1.21
Example command output
Package Version(s)
------- ----------
harbor 2.5.0-4324383d8c5383bded5f7378efb98b4d50af827b
-
Generate a curated-packages config
The example shows how to install the harbor
package from the curated package list
.
eksctl anywhere generate package harbor --source registry --kube-version 1.21 > packages.yaml
-
Create a cluster
# Create a cluster with curated packages installation
eksctl anywhere create cluster -f $CLUSTER_NAME.yaml --install-packages packages.yaml
Example command output
Performing setup and validations
✅ validation succeeded {"validation": "docker Provider setup is valid"}
Creating new bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific setup
Creating new workload cluster
Installing networking on workload cluster
Installing cluster-api providers on workload cluster
Moving cluster management from bootstrap to workload cluster
Installing EKS-A custom components (CRD and controller) on workload cluster
Creating EKS-A CRDs instances on workload cluster
Installing AddonManager and GitOps Toolkit on workload cluster
GitOps field not specified, bootstrap flux skipped
Deleting bootstrap cluster
🎉 Cluster created!
----------------------------------------------------------------------------------------------------------------
The EKS Anywhere package controller and the EKS Anywhere Curated Packages
(referred to as “features”) are provided as “preview features” subject to the AWS Service Terms,
(including Section 2 (Betas and Previews)) of the same. During the EKS Anywhere Curated Packages Public Preview,
the AWS Service Terms are extended to provide customers access to these features free of charge.
These features will be subject to a service charge and fee structure at ”General Availability“ of the features.
----------------------------------------------------------------------------------------------------------------
Installing curated packages controller on workload cluster
package.packages.eks.amazonaws.com/my-harbor created
-
Use the cluster
Once the cluster is created you can use it with the generated KUBECONFIG
file in your local directory
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl get ns
Example command output
NAME STATUS AGE
capd-system Active 21m
capi-kubeadm-bootstrap-system Active 21m
capi-kubeadm-control-plane-system Active 21m
capi-system Active 21m
capi-webhook-system Active 21m
cert-manager Active 22m
default Active 23m
eksa-system Active 20m
kube-node-lease Active 23m
kube-public Active 23m
kube-system Active 23m
You can now use the cluster like you would any Kubernetes cluster.
Deploy the test application with:
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
Verify the test application in the deploy test application section
.
Next steps:
-
See the Cluster management
section for more information on common operational tasks like scaling and deleting the cluster.
-
See the Package management
section for more information on post-creation curated packages installation.
To verify that a cluster control plane is up and running, use the kubectl
command to show that the control plane pods are all running.
kubectl get po -A -l control-plane=controller-manager
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-57b99f579f-sd85g 2/2 Running 0 47m
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-79cdf98fb8-ll498 2/2 Running 0 47m
capi-system capi-controller-manager-59f4547955-2ks8t 2/2 Running 0 47m
capi-webhook-system capi-controller-manager-bb4dc9878-2j8mg 2/2 Running 0 47m
capi-webhook-system capi-kubeadm-bootstrap-controller-manager-6b4cb6f656-qfppd 2/2 Running 0 47m
capi-webhook-system capi-kubeadm-control-plane-controller-manager-bf7878ffc-rgsm8 2/2 Running 0 47m
capi-webhook-system capv-controller-manager-5668dbcd5-v5szb 2/2 Running 0 47m
capv-system capv-controller-manager-584886b7bd-f66hs 2/2 Running 0 47m
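When there are many pods, scanning the STATUS column by eye is error prone. The sketch below is a hypothetical filter that reads the kubectl output shown above from standard input and prints only the pods that are not Running; it assumes the six-column layout produced by the -A flag, where STATUS is the fourth column.

```shell
# Hypothetical filter: list pods whose STATUS column is not "Running".
# Expects "kubectl get po -A" style output (NAMESPACE NAME READY STATUS ...).
not_running_pods() {
  # NR > 1 skips the header row.
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 " " $4 }'
}

# Usage:
#   kubectl get po -A -l control-plane=controller-manager | not_running_pods
```

An empty result means every listed control plane pod reported Running.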
You may also check the status of the cluster control plane resource directly.
This can be especially useful to verify clusters with multiple control plane nodes after an upgrade.
kubectl get kubeadmcontrolplanes.controlplane.cluster.x-k8s.io
NAME INITIALIZED API SERVER AVAILABLE VERSION REPLICAS READY UPDATED UNAVAILABLE
supportbundletestcluster true true v1.20.7-eks-1-20-6 1 1 1
To verify that the expected number of cluster worker nodes are up and running, use the kubectl
command to show that nodes are Ready
.
This will confirm that the expected number of worker nodes are present.
Worker nodes are named using the cluster name followed by the worker node group name (example: my-cluster-md-0).
kubectl get nodes
NAME STATUS ROLES AGE VERSION
supportbundletestcluster-md-0-55bb5ccd-mrcf9 Ready <none> 4m v1.20.7-eks-1-20-6
supportbundletestcluster-md-0-55bb5ccd-zrh97 Ready <none> 4m v1.20.7-eks-1-20-6
supportbundletestcluster-mdrwf Ready control-plane,master 5m v1.20.7-eks-1-20-6
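To compare the number of Ready worker nodes with the count in your cluster configuration, the node listing can be reduced to a single number. The sketch below is a hypothetical helper that reads kubectl get nodes output from standard input; it assumes, as in the sample output above, that worker nodes show <none> in the ROLES column.

```shell
# Hypothetical helper: count nodes that are Ready and have no role assigned.
# Assumption: worker nodes show "<none>" in the ROLES column, as in the sample.
count_ready_workers() {
  awk 'NR > 1 && $2 == "Ready" && $3 == "<none>" { n++ } END { print n + 0 }'
}

# Usage:
#   kubectl get nodes | count_ready_workers
```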
To test a workload in your cluster you can try deploying the hello-eks-anywhere
.
7.2.4 - Preparation needed for hosting EKS Anywhere on vSphere
Create a VM and template folder (Optional):
For each user that needs to create workload clusters, have the vSphere administrator create a VM and template folder.
That folder will host:
- The VMs of the Control plane and Data plane nodes of each cluster.
- A nested folder for the management cluster and another one for each workload cluster.
- Each cluster VM in its own nested folder under this folder.
User permissions should be set up to:
- Only allow the user to see and create EKS Anywhere resources in that folder and its nested folders.
- Prevent the user from having visibility and control over the whole vSphere cluster domain and its sub-child objects (datacenter, resource pools and other folders).
In your EKS Anywhere configuration file you will reference a path under this folder associated with the cluster you create.
Add a vSphere folder
Follow these steps to create the user’s vSphere folder:
- From vCenter, select the Menus/VM and Template tab.
- Select either a datacenter or another folder as a parent object for the folder that you want to create.
- Right-click the parent object and click New Folder.
- Enter a name for the folder and click OK.
For more details, see the vSphere Create a Folder documentation.
Set up vSphere roles and user permission
You need a vSphere username with the right privileges to let you create EKS Anywhere clusters on top of your vSphere cluster.
You then need to import the latest release of the EKS Anywhere OVA template to your vSphere cluster so that it can be used to provision your cluster nodes.
Add a vCenter User
Ask your vSphere administrator to add a vCenter user that will be used for the provisioning of the EKS Anywhere cluster in VMware vSphere.
- Log in with the vSphere Client to the vCenter Server.
- Specify the user name and password for a member of the vCenter Single Sign-On Administrators group.
- Navigate to the vCenter Single Sign-On user configuration UI.
- From the Home menu, select Administration.
- Under Single Sign On, click Users and Groups.
- If vsphere.local is not the currently selected domain, select it from the drop-down menu.
You cannot add users to other domains.
- On the Users tab, click Add.
- Enter a user name and password for the new user.
- The maximum number of characters allowed for the user name is 300.
- You cannot change the user name after you create a user.
The password must meet the password policy requirements for the system.
- Click Add.
For more details, see the vSphere Add vCenter Single Sign-On Users documentation.
Create and define user roles
When you add a user for creating clusters, that user initially has no privileges to perform management operations.
So you have to add this user to groups with the required permissions, or assign a role or roles with the required permission to this user.
Three roles are needed to be able to create the EKS Anywhere cluster:
-
Create a global custom role: For example, you could name this EKS Anywhere Global.
Define it for the user on the vCenter domain level and its children objects.
Create this role with the following privileges:
> Content Library
* Add library item
* Check in a template
* Check out a template
* Create local library
> vSphere Tagging
* Assign or Unassign vSphere Tag
* Assign or Unassign vSphere Tag on Object
* Create vSphere Tag
* Create vSphere Tag Category
* Delete vSphere Tag
* Delete vSphere Tag Category
* Edit vSphere Tag
* Edit vSphere Tag Category
* Modify UsedBy Field For Category
* Modify UsedBy Field For Tag
-
Create a user custom role: The second role is also a custom role that you could call, for example, EKS Anywhere User.
Define this role with the following objects and children objects.
- The pool resource level and its children objects.
This is the resource pool that the EKS Anywhere VMs will be part of.
- The storage object level and its children objects.
This is the storage that will be used to store the cluster VMs.
- The network VLAN object level and its children objects.
This is the network that will host the cluster VMs.
- The VM and Template folder level and its children objects.
Create this role with the following privileges:
> Content Library
* Add library item
* Check in a template
* Check out a template
* Create local library
> Datastore
* Allocate space
* Browse datastore
* Low level file operations
> Folder
* Create folder
> vSphere Tagging
* Assign or Unassign vSphere Tag
* Assign or Unassign vSphere Tag on Object
* Create vSphere Tag
* Create vSphere Tag Category
* Delete vSphere Tag
* Delete vSphere Tag Category
* Edit vSphere Tag
* Edit vSphere Tag Category
* Modify UsedBy Field For Category
* Modify UsedBy Field For Tag
> Network
* Assign network
> Resource
* Assign virtual machine to resource pool
> Scheduled task
* Create tasks
* Modify task
* Remove task
* Run task
> Profile-driven storage
* Profile-driven storage view
> Storage views
* View
> vApp
* Import
> Virtual machine
* Change Configuration
- Add existing disk
- Add new disk
- Add or remove device
- Advanced configuration
- Change CPU count
- Change Memory
- Change Settings
- Configure Raw device
- Extend virtual disk
- Modify device settings
- Remove disk
* Edit Inventory
- Create from existing
- Create new
- Remove
* Interaction
- Power off
- Power on
* Provisioning
- Clone template
- Clone virtual machine
- Create template from virtual machine
- Customize guest
- Deploy template
- Mark as template
- Read customization specifications
* Snapshot management
- Create snapshot
- Remove snapshot
- Revert to snapshot
-
Create a default Administrator role: The third role is the default system role Administrator. Assign it to the user on the folder level and its children objects (VMs and OVA templates) for the folder that the vSphere administrator created for you.
To create a role and define privileges, check the Create a vCenter Server Custom Role and Defined Privileges pages.
Deploy an OVA Template
If the user creating the cluster has permission and network access to create and tag a template, you can skip these steps because EKS Anywhere will automatically download the OVA and create the template if it can. If the user does not have the permissions or network access to create and tag the template, follow this guide. The OVA contains the operating system (Ubuntu or Bottlerocket) for a specific EKS-D Kubernetes release and EKS-A version. The following example uses Ubuntu as the operating system, but a similar workflow would work for Bottlerocket.
Steps to deploy the Ubuntu OVA
- Go to the artifacts page and download the OVA template with the newest EKS-D Kubernetes release to your computer.
- Log in to the vCenter Server.
- Right-click the folder you created above and select Deploy OVF Template.
The Deploy OVF Template wizard opens.
- On the Select an OVF template page, select the Local file option, specify the location of the OVA template you downloaded to your computer, and click Next.
- On the Select a name and folder page, enter a unique name for the virtual machine or leave the default generated name, if you do not have other templates with the same name within your vCenter Server virtual machine folder.
The default deployment location for the virtual machine is the inventory object where you started the wizard, which is the folder you created above. Click Next.
- On the Select a compute resource page, select the resource pool where to run the deployed VM template, and click Next.
- On the Review details page, verify the OVF or OVA template details and click Next.
- On the Select storage page, select a datastore to store the deployed OVF or OVA template and click Next.
- On the Select networks page, select a source network and map it to a destination network. Click Next.
- On the Ready to complete page, review the page and click Finish.
For details, see Deploy an OVF or OVA Template.
To build your own Ubuntu OVA template, check the Building your own Ubuntu OVA section.
To use the deployed OVA template to create the VMs for the EKS Anywhere cluster, you have to tag it with specific values for the os and eksdRelease keys.
The value of the os key is the operating system of the deployed OVA template, which is ubuntu in our scenario.
The value of the eksdRelease key holds kubernetes and the EKS-D release used in the deployed OVA template.
Check the Customize OVAs page for more details.
Steps to tag the deployed OVA template:
- Go to the artifacts page and take note of the tags and values associated with the OVA template you deployed in the previous step.
- In the vSphere Client, select Menu > Tags & Custom Attributes.
- Select the Tags tab and click Tags.
- Click New.
- In the Create Tag dialog box, copy the os tag name associated with your OVA that you noted earlier, which in our case is os:ubuntu, and paste it as the name for the first tag required.
- Specify the tag category os if it exists, or create it if it does not exist.
- Click Create.
- Repeat steps 2-4.
- In the Create Tag dialog box, copy the eksdRelease tag name associated with your OVA that you noted earlier, which in our case is eksdRelease:kubernetes-1-21-eks-8, and paste it as the name for the second tag required.
- Specify the tag category eksdRelease if it exists, or create it if it does not exist.
- Click Create.
- Navigate to the VM and Template tab.
- Select the folder that was created.
- Select the deployed template and click Actions.
- From the drop-down menu, select Tags and Custom Attributes > Assign Tag.
- Select the tags we created from the list and confirm the operation.
To run EKS Anywhere, you will need:
Prepare Administrative machine
Set up an Administrative machine as described in Install EKS Anywhere.
Prepare a VMware vSphere environment
To prepare a VMware vSphere environment to run EKS Anywhere, you need the following:
- A vSphere 7+ environment running vCenter
- Capacity to deploy 6-10 VMs
- DHCP service running in the vSphere environment in the primary VM network for your workload cluster
- One network in vSphere to use for the cluster. This network must have inbound access into vCenter
- An OVA imported into vSphere and converted into a template for the workload VMs
- User credentials to create VMs and attach networks, etc.
- One IP address routable from the cluster but excluded from the DHCP offering.
This IP address is to be used as the Control Plane Endpoint IP or kube-vip VIP address.
Below are some suggestions to ensure that this IP address is never handed out by your DHCP server.
You may need to contact your network engineer.
- Pick an IP address reachable from the cluster subnet that is excluded from the DHCP range, OR
- Alter the DHCP ranges to leave out one or more IP addresses at the top and/or the bottom of the range, OR
- Create an IP reservation for this IP on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address.
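Before settling on a control plane endpoint IP, it can help to confirm that it really falls outside the DHCP pool. The following is a minimal sketch in shell, assuming IPv4 addresses and a single contiguous DHCP range; the function names ip_to_int and outside_dhcp_range and the sample addresses are hypothetical and not part of EKS Anywhere.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  # Split the address on dots into $1..$4.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) when IP $1 lies outside the inclusive range $2..$3.
outside_dhcp_range() {
  local ip start end
  ip=$(ip_to_int "$1")
  start=$(ip_to_int "$2")
  end=$(ip_to_int "$3")
  [ "$ip" -lt "$start" ] || [ "$ip" -gt "$end" ]
}

# Usage (sample values only):
#   outside_dhcp_range 192.168.1.10 192.168.1.100 192.168.1.200 && echo "safe to use"
```

This only checks the range arithmetic. It does not detect whether another host already uses the address, so a reservation on the DHCP server is still worth considering.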
Each VM will require:
- 2 vCPUs
- 8GB RAM
- 25GB Disk
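As a rough sizing aid, the per-VM figures above can be multiplied by the number of VMs you plan to deploy. The helper below is only an illustrative sketch (the function name vm_capacity is hypothetical); it uses the 2 vCPU, 8GB RAM, and 25GB disk values from this list and ignores any hypervisor overhead.

```shell
# Print total vCPUs, RAM (GB), and disk (GB) for a given number of VMs,
# using the per-VM figures listed above (2 vCPUs, 8 GB RAM, 25 GB disk).
vm_capacity() {
  echo "$(( $1 * 2 )) vCPUs, $(( $1 * 8 )) GB RAM, $(( $1 * 25 )) GB disk"
}

# Usage: totals for the low and high end of the 6-10 VM range.
vm_capacity 6
vm_capacity 10
```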
The administrative machine and the target workload environment will need network access to:
- public.ecr.aws
- anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
- distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
- d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
- api.github.com (only if GitOps is enabled)
You need to get the following information before creating the cluster:
-
Static IP Addresses:
You will need one IP address for the management cluster control plane endpoint, and a separate one for the control plane of each workload cluster you add.
Let’s say you are going to have the management cluster and two workload clusters.
For those, you would need three IP addresses, one for each.
All of those addresses will be configured the same way in the configuration file you will generate for each cluster.
A static IP address will be used for each control plane VM in your EKS Anywhere cluster.
Choose IP addresses in your network range that do not conflict with other VMs and make sure they are excluded from your DHCP offering.
An IP address will be the value of the property controlPlaneConfiguration.endpoint.host
in the config file of the management cluster.
A separate IP address must be assigned for each workload cluster.

-
vSphere Datacenter Name: The vSphere datacenter to deploy the EKS Anywhere cluster on.

-
VM Network Name: The VM network to deploy your EKS Anywhere cluster on.

-
vCenter Server Domain Name: The vCenter server fully qualified domain name or IP address. If the server IP is used, the thumbprint must be set or insecure must be set to true.

-
thumbprint (required if insecure=false): The SHA1 thumbprint of the vCenter server certificate which is only required if you have a self-signed certificate for your vSphere endpoint.
There are several ways to obtain your vCenter thumbprint.
If you have govc installed, you can run the following command in the Administrative machine terminal, and take a note of the output:
govc about.cert -thumbprint -k
-
template: The VM template to use for your EKS Anywhere cluster.
This template was created when you imported the OVA file into vSphere.

-
datastore: The vSphere datastore to deploy your EKS Anywhere cluster on.

-
folder:
The folder parameter in VSphereMachineConfig allows you to organize the VMs of an EKS Anywhere cluster.
With this, each cluster can be organized as a folder in vSphere.
You will have a separate folder for the management cluster and each cluster you are adding.

-
resourcePool:
The vSphere Resource pools for your VMs in the EKS Anywhere cluster. If there is a resource pool: /<datacenter>/host/<resource-pool-name>/Resources

7.2.5 - vSphere cluster
EKS Anywhere supports a vSphere provider for production grade EKS Anywhere deployments.
EKS Anywhere allows you to provision and manage Amazon EKS on your own infrastructure.
This document walks you through setting up EKS Anywhere in a way that:
- Deploys an initial cluster on your vSphere environment. That cluster can be used as a self-managed cluster (to run workloads) or a management cluster (to create and manage other clusters)
- Deploys zero or more workload clusters from the management cluster
If your initial cluster is a management cluster, it is intended to stay in place so you can use it later to modify, upgrade, and delete workload clusters.
Using a management cluster makes it faster to provision and delete workload clusters.
Also it lets you keep vSphere credentials for a set of clusters in one place: on the management cluster.
The alternative is to simply use your initial cluster to run workloads.
Important
Creating an EKS Anywhere management cluster is the recommended model.
Separating management features into a separate, persistent management cluster
provides a cleaner model for managing the lifecycle of workload clusters (to create, upgrade, and delete clusters), while workload clusters run user applications.
This approach also reduces provider permissions for workload clusters.
Prerequisite Checklist
EKS Anywhere needs to be run on an administrative machine that has certain machine requirements.
An EKS Anywhere deployment will also require the availability of certain resources from your VMware vSphere deployment.
Steps
The following steps are divided into two sections:
- Create an initial cluster (used as a management or self-managed cluster)
- Create zero or more workload clusters from the management cluster
Create an initial cluster
Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a self-managed cluster (for running workloads itself).
All steps listed below should be executed on the admin machine with reachability to the vSphere environment where the EKS Anywhere clusters are created.
-
Generate an initial cluster config (named mgmt-cluster for this example):
export MGMT_CLUSTER_NAME=mgmt-cluster
eksctl anywhere generate clusterconfig $MGMT_CLUSTER_NAME \
--provider vsphere > $MGMT_CLUSTER_NAME.yaml
The command above creates a config file named mgmt-cluster.yaml in the path where it is executed. Refer to vsphere configuration for information on configuring this cluster config for a vSphere provider.
The configuration specification is divided into three sections:
- Cluster
- VSphereDatacenterConfig
- VSphereMachineConfig
Some key considerations and configuration parameters:
-
Create at least two control plane nodes, three worker nodes, and three etcd nodes for a production cluster, to provide high availability and rolling upgrades.
-
The osFamily parameter (operating system on virtual machines) in VSphereMachineConfig is set to bottlerocket by default. Permitted values: ubuntu, bottlerocket.
-
The recommended mode of deploying etcd on EKS Anywhere production clusters is unstacked (etcd members have dedicated machines and are not collocated with control plane components). More information here. The generated config file comes with external etcd enabled already. So leave this part as it is.
-
Apart from the base configuration, you can optionally add additional configuration to enable supported EKS Anywhere functionalities.
As of now, you have to decide which features you want to enable on your cluster before cluster creation. Otherwise, enabling them post-creation will require you to delete and recreate the cluster. However, the next EKS-A release will remove this limitation.
-
To enable managing cluster resources using GitOps, you would need to enable GitOps configurations on the initial/management cluster. You cannot enable GitOps on workload clusters if you have enabled it on the initial/management cluster. If you want to manage the deployment of Kubernetes resources on a workload cluster, you would need to bootstrap Flux against that workload cluster manually to be able to deploy Kubernetes resources to it using GitOps.
-
Modify the initial cluster generated config (mgmt-cluster.yaml) as follows:
The generated config file comes with several fields that have empty values. Fill them in with the values gathered on the prerequisites page.
Refer to vsphere configuration for more information on the configuration that can be used for a vSphere provider.
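To spot fields that still need a value before you run the create command, you could scan the config file for empty entries. The sketch below assumes that empty values appear in the generated file as a key followed by an empty quoted string (for example datacenter: ""); that representation, the function name empty_fields, and the sample keys in the comments are assumptions for illustration.

```shell
# List lines of the form `key: ""` (fields left with empty values),
# prefixed with their line numbers. $1 is the path to the cluster config.
# Assumption: empty fields are written as an empty double-quoted string.
empty_fields() {
  grep -nE '^[[:space:]]*[A-Za-z]+:[[:space:]]*""[[:space:]]*$' "$1"
}

# Usage (not run here):
#   empty_fields "$MGMT_CLUSTER_NAME.yaml"
# Example of the kind of output to expect:
#   12:  datacenter: ""
#   15:  server: ""
```

The function exits non-zero when nothing matches, which makes it easy to use in a pre-flight check such as: if empty_fields file.yaml; then echo "fill these in first"; fi.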
-
Set Credential Environment Variables
Before you create the initial/management cluster, you will need to set and export these environment variables for your vSphere user name and password. Make sure you use single quotes around the values so that your shell does not interpret the values.
# vCenter User Credentials
export GOVC_URL='[vCenter Server Domain Name]' # Example: https://sample.exampledomain.com
export GOVC_USERNAME='[vSphere user name]' # Example: USER1@exampledomain
export GOVC_PASSWORD='[vSphere password]'
export GOVC_INSECURE=true
export EKSA_VSPHERE_USERNAME='[vSphere user name]' # Example: USER1@exampledomain
export EKSA_VSPHERE_PASSWORD='[vSphere password]'
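A quick check that the variables are actually set can save a failed cluster creation attempt. The following is a minimal sketch; the function name missing_vars is hypothetical, and which variables you pass to it is up to you.

```shell
# Print the name of every listed variable that is unset or empty.
# Arguments: one or more environment variable names.
missing_vars() {
  for name in "$@"; do
    # Indirectly read the variable called $name; empty if unset.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Usage (not run here):
#   missing_vars GOVC_URL GOVC_USERNAME GOVC_PASSWORD \
#     EKSA_VSPHERE_USERNAME EKSA_VSPHERE_PASSWORD
```

No output means every listed variable has a value. Note that a password containing a dollar sign, such as 'p$ss', is preserved only when assigned in single quotes. In double quotes the shell would try to expand $ss.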
-
Set License Environment Variable
If you are creating a licensed cluster, set and export the license variable (see License cluster
if you are licensing an existing cluster):
export EKSA_LICENSE='my-license-here'
-
Now you are ready to create a cluster with the basic settings.
Important
If you plan to enable other components such as GitOps, OIDC, or IAM for Pods, skip creating the cluster now and add the configuration for those components to your generated config file first. Otherwise, you would need to recreate the cluster, as mentioned above.
After you have finished adding all the needed configuration to your configuration file (mgmt-cluster.yaml) and set your credential environment variables, you are ready to create the cluster. Run the create command with the option -v 9 to get the highest level of verbosity, in case you need to troubleshoot any issue that happens during cluster creation. You may also want to redirect the output to a file so you can look at it later.
eksctl anywhere create cluster -f $MGMT_CLUSTER_NAME.yaml \
-v 9 > $MGMT_CLUSTER_NAME-$(date "+%Y%m%d%H%M").log 2>&1
-
With the completion of the above steps, the management EKS Anywhere cluster is created on the configured vSphere environment under a sub-folder of the EKS Anywhere folder. You can see the cluster VMs from the vSphere console as below:

-
Once the cluster is created, a folder with the cluster name is created on the admin machine. It contains the kubeconfig file and the cluster configuration file used to create the cluster, in addition to the generated SSH key pair that you can use to SSH into the VMs of the cluster.
Output
eks-a-id_rsa mgmt-cluster-eks-a-cluster.kubeconfig
eks-a-id_rsa.pub mgmt-cluster-eks-a-cluster.yaml
-
Now you can use your cluster with the generated KUBECONFIG file:
export KUBECONFIG=${PWD}/${MGMT_CLUSTER_NAME}/${MGMT_CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl cluster-info
The cluster endpoint in the output of this command would be the controlPlaneConfiguration.endpoint.host provided in the mgmt-cluster.yaml config file.
-
Check the cluster nodes:
To check that the cluster completed, list the machines to see the control plane, etcd, and worker nodes:
kubectl get machines -A
Example command output
NAMESPACE NAME PROVIDERID PHASE VERSION
eksa-system mgmt-b2xyz vsphere:/xxxxx Running v1.21.2-eks-1-21-5
eksa-system mgmt-etcd-r9b42 vsphere:/xxxxx Running
eksa-system mgmt-md-8-6xr-rnr vsphere:/xxxxx Running v1.21.2-eks-1-21-5
...
The etcd machine doesn’t show the Kubernetes version because it doesn’t run the kubelet service.
-
Check the initial/management cluster’s CRD:
To ensure you are looking at the initial/management cluster, list the CRD to see that the name of its management cluster is itself:
kubectl get clusters mgmt -o yaml
Example command output
...
  kubernetesVersion: "1.21"
  managementCluster:
    name: mgmt
  workerNodeGroupConfigurations:
...
Note
The initial cluster is now ready to deploy workload clusters.
However, if you just want to use it to run workloads, you can deploy pod workloads directly on the initial cluster without deploying a separate workload cluster and skip the section on running separate workload clusters.
Create separate workload clusters
Follow these steps if you want to use your initial cluster to create and manage separate workload clusters. All steps listed below should be executed on the same admin machine that the management cluster was created on.
-
Generate a workload cluster config:
export WORKLOAD_CLUSTER_NAME='w01-cluster'
export MGMT_CLUSTER_NAME='mgmt-cluster'
eksctl anywhere generate clusterconfig $WORKLOAD_CLUSTER_NAME \
--provider vsphere > $WORKLOAD_CLUSTER_NAME.yaml
The command above creates a file named w01-cluster.yaml with similar contents to the mgmt-cluster.yaml file that was generated for the management cluster in the previous section. It will be generated in the path where it is executed.
The same key considerations and configuration parameters mentioned above for the initial cluster apply to the workload cluster as well.
-
Refer to the initial config described earlier for the required and optional settings.
The main differences are that you must have a new cluster name and cannot use the same vSphere resources.
-
Modify the generated workload cluster config parameters the same way you did in the generated configuration file of the management cluster. The only differences are in the following fields:
-
controlPlaneConfiguration.endpoint.host:
Use a different IP address for the Cluster field controlPlaneConfiguration.endpoint.host for each workload cluster. This address must also differ from the one used for the management cluster.
-
managementCluster.name:
By default, the value of this field is the same as the cluster name when you generate the configuration file. Because this workload cluster is to be managed by the management cluster, change the value to the management cluster name.
managementCluster:
  name: mgmt-cluster # the name of the initial/management cluster
-
VSphereMachineConfig.folder
It’s recommended to have a separate folder path for each cluster you add for organization purposes.
folder: /Example Datacenter/vm/EKS Anywhere/w01-cluster
Other than that all other parameters will be configured the same way.
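Because a workload cluster config that still names itself as its own management cluster would not be managed by the management cluster, it may be worth checking the field before running the create command. The helper below is a rough sketch that reads the name line directly following managementCluster: in the file; the function name management_cluster_name is hypothetical, and it assumes the unquoted layout shown in the example above rather than parsing YAML in general.

```shell
# Print the value of managementCluster.name from a cluster config file ($1).
# Assumption: the `name:` line directly follows `managementCluster:`,
# and the value is unquoted.
management_cluster_name() {
  awk '
    /^[ \t]*managementCluster:[ \t]*$/ { in_block = 1; next }
    in_block && /^[ \t]*name:/         { print $2; exit }
    in_block                           { in_block = 0 }
  ' "$1"
}

# Usage (not run here):
#   [ "$(management_cluster_name "$WORKLOAD_CLUSTER_NAME.yaml")" = "$MGMT_CLUSTER_NAME" ] \
#     || echo "managementCluster.name does not point at the management cluster"
```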
-
Create a workload cluster
Important
If you plan to enable other components such as OIDC or IAM for Pods, skip creating the cluster now and add the configuration for those components to your generated config file first. Otherwise, you would need to recreate the cluster. If GitOps has been enabled on the initial/management cluster, you do not have the option to enable GitOps on the workload cluster, as the goal of using GitOps is to centrally manage all of your clusters.
To create a new workload cluster from your management cluster run this command, identifying:
- The workload cluster yaml file
- The initial cluster’s credentials (this causes the workload cluster to be managed from the management cluster)
eksctl anywhere create cluster \
-f $WORKLOAD_CLUSTER_NAME.yaml \
--kubeconfig $MGMT_CLUSTER_NAME/$MGMT_CLUSTER_NAME-eks-a-cluster.kubeconfig \
-v 9 > $WORKLOAD_CLUSTER_NAME-$(date "+%Y%m%d%H%M").log 2>&1
As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.
-
With the completion of the above steps, the workload EKS Anywhere cluster is created on the configured vSphere environment under a sub-folder of the EKS Anywhere folder. You can see the cluster VMs from the vSphere console as below:

-
Once the cluster is created, a folder with the cluster name is created on the admin machine. It contains the kubeconfig file and the cluster configuration file used to create the cluster, in addition to the generated SSH key pair that you can use to SSH into the VMs of the cluster.
Output
eks-a-id_rsa w01-cluster-eks-a-cluster.kubeconfig
eks-a-id_rsa.pub w01-cluster-eks-a-cluster.yaml
-
You can list the workload clusters managed by the management cluster.
export KUBECONFIG=${PWD}/${MGMT_CLUSTER_NAME}/${MGMT_CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl get clusters
-
Check the workload cluster:
You can now use the workload cluster as you would any Kubernetes cluster.
Change your credentials to point to the kubeconfig file of the new workload cluster, then get the cluster info:
export KUBECONFIG=${PWD}/${WORKLOAD_CLUSTER_NAME}/${WORKLOAD_CLUSTER_NAME}-eks-a-cluster.kubeconfig
kubectl cluster-info
The cluster endpoint in the output of this command should be the controlPlaneConfiguration.endpoint.host provided in the w01-cluster.yaml config file.
-
To verify that the expected number of cluster worker nodes are up and running, use the kubectl command to show that nodes are Ready.
-
Test deploying an application with:
kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
Verify the test application in the deploy test application section.
-
Add more workload clusters:
To add more workload clusters, go through the same steps for creating the initial workload cluster, copying the config file to a new name (such as w02-cluster.yaml), modifying resource names, and running the create cluster command again.
See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.
7.3 - Packages
This chapter walks through the following:
7.3.1 - Harbor use cases
Proxy a public Amazon Elastic Container Registry (ECR) repository
This use case is to use Harbor to proxy and cache images from a public ECR repository, which helps limit the number of requests made to a public ECR repository, avoiding consuming too much bandwidth or being throttled by the registry server.
-
Login
Log in to the Harbor web portal with the default credential as shown below

-
Create a registry proxy
Navigate to Registries on the left panel, and then click on the NEW ENDPOINT button. Choose Docker Registry as the Provider, enter public-ecr as the Name, and enter https://public.ecr.aws/ as the Endpoint URL. Save it by clicking on OK.

-
Create a proxy project
Navigate to Projects on the left panel and click on the NEW PROJECT button. Enter proxy-project as the Project Name, check Public access level, turn on Proxy Cache, and choose public-ecr from the pull-down list. Save the configuration by clicking on OK.

-
Pull images
Note
harbor.eksa.demo:30003 should be replaced with whatever externalURL is set to in the Harbor package YAML file.
docker pull harbor.eksa.demo:30003/proxy-project/cloudwatch-agent/cloudwatch-agent:latest
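The image reference used in the pull command above is made up of three parts: the Harbor address, the proxy project name, and the repository path as it exists in the upstream registry. A small helper like the following can assemble it. This is an illustrative sketch only, and the function name harbor_proxy_ref is hypothetical.

```shell
# Build the image reference to pull through a Harbor proxy-cache project.
# $1: Harbor externalURL or host[:port] (a leading scheme is stripped)
# $2: proxy project name
# $3: upstream repository path and tag
harbor_proxy_ref() {
  host=${1#*://}   # drop "https://" or "http://" if present
  host=${host%/}   # drop a trailing slash if present
  printf '%s/%s/%s\n' "$host" "$2" "$3"
}

# Usage (the docker command is not run here):
#   docker pull "$(harbor_proxy_ref https://harbor.eksa.demo:30003 proxy-project cloudwatch-agent/cloudwatch-agent:latest)"
```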
Proxy a private Amazon Elastic Container Registry (ECR) repository
This use case is to use Harbor to proxy and cache images from a private ECR repository, which helps limit the number of requests made to a private ECR repository, avoiding consuming too much bandwidth or being throttled by the registry server.
-
Login
Log in to the Harbor web portal with the default credential as shown below

-
Create a registry proxy
In order for Harbor to proxy a remote private ECR registry, an IAM credential with the necessary permissions needs to be created. Usually, this takes three steps:
-
Policy
This is where you specify all necessary permissions. Please refer to private repository policies, IAM permissions for pushing an image, and ECR policy examples to figure out the minimal set of required permissions.
For simplicity, the built-in policy AdministratorAccess is used here.

-
User group
This is an easy way to manage a pool of users who share the same set of permissions by attaching the policy to the group.

-
User
Create a user and add it to the user group. In addition, please navigate to Security credentials to generate an access key. An access key consists of two parts: an access key ID and a secret access key. Please save both as they are used in the next step.

Navigate to Registries on the left panel, and then click on the NEW ENDPOINT button. Choose Aws ECR as the Provider, enter private-ecr as the Name, enter https://[ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/ as the Endpoint URL, use the access key ID part of the generated access key as the Access ID, and use the secret access key part as the Access Secret. Save it by clicking on OK.

-
Create a proxy project
Navigate to Projects on the left panel and click on the NEW PROJECT button. Enter proxy-private-project as the Project Name, check Public access level, turn on Proxy Cache, and choose private-ecr from the pull-down list. Save the configuration by clicking on OK.

-
Pull images
Create a repository in the target private ECR registry

Push an image to the created repository
docker pull alpine
docker tag alpine [ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/alpine:latest
docker push [ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/alpine:latest
Note
harbor.eksa.demo:30003 should be replaced with whatever externalURL is set to in the Harbor package YAML file.
docker pull harbor.eksa.demo:30003/proxy-private-project/alpine:latest
Repository replication from Harbor to a private Amazon Elastic Container Registry (ECR) repository
This use case is to use Harbor to replicate local images and charts to a private ECR repository in push mode. When a replication rule is set, all resources that match the defined filter patterns are replicated to the destination registry when the triggering condition is met.
-
Login
Log in to the Harbor web portal with the default credential as shown below

-
Create a nonproxy project

-
Create a registry proxy
In order for Harbor to proxy a remote private ECR registry, an IAM credential with the necessary permissions needs to be created. Usually, this takes three steps:
-
Policy
This is where you specify all necessary permissions. Please refer to private repository policies, IAM permissions for pushing an image, and ECR policy examples to figure out the minimal set of required permissions.
For simplicity, the built-in policy AdministratorAccess is used here.

-
User group
This is an easy way to manage a pool of users who share the same set of permissions by attaching the policy to the group.

-
User
Create a user and add it to the user group. In addition, please navigate to Security credentials to generate an access key. An access key consists of two parts: an access key ID and a secret access key. Please save both as they are used in the next step.

Navigate to Registries on the left panel, and then click on the NEW ENDPOINT button. Choose Aws ECR as the Provider, enter private-ecr as the Name, enter https://[ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/ as the Endpoint URL, use the access key ID part of the generated access key as the Access ID, and use the secret access key part as the Access Secret. Save it by clicking on OK.

-
Create a replication rule

-
Prepare an image
Note
harbor.eksa.demo:30003 should be replaced with whatever externalURL is set to in the Harbor package YAML file.
docker pull alpine
docker tag alpine:latest harbor.eksa.demo:30003/nonproxy-project/alpine:latest
-
Authenticate with Harbor with the default credential as shown below
Note: harbor.eksa.demo:30003 should be replaced with whatever externalURL is set to in the Harbor package YAML file.
docker logout
docker login harbor.eksa.demo:30003
-
Push images
Create a repository in the target private ECR registry

Note: harbor.eksa.demo:30003 should be replaced with whatever externalURL is set to in the Harbor package YAML file.
docker push harbor.eksa.demo:30003/nonproxy-project/alpine:latest
The image should appear in the target ECR repository shortly.
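Because the registry host appears in every command, the pull, tag, login and push steps above can be collected into one sketch that keeps the host in a single variable. The variable and function names are hypothetical; the script only prints the image reference unless `push_to_harbor` is called explicitly.

```shell
#!/usr/bin/env bash
# Collect the pull, tag, login and push steps with the registry host in one place.
# HARBOR_HOST must match the host and port of externalURL in the Harbor package YAML.
HARBOR_HOST="${HARBOR_HOST:-harbor.eksa.demo:30003}"
PROJECT="${PROJECT:-nonproxy-project}"

# Compose the full image reference inside the Harbor project.
harbor_image_ref() {
  local image="$1" tag="${2:-latest}"
  printf '%s/%s/%s:%s\n' "$HARBOR_HOST" "$PROJECT" "$image" "$tag"
}

# Pull an upstream image, retag it for Harbor, log in and push.
push_to_harbor() {
  local image="$1" tag="${2:-latest}" ref
  ref="$(harbor_image_ref "$image" "$tag")"
  docker pull "$image:$tag" &&
    docker tag "$image:$tag" "$ref" &&
    docker login "$HARBOR_HOST" &&
    docker push "$ref"
}

# Show the reference that would be pushed.
harbor_image_ref alpine
```

Running `push_to_harbor alpine` afterwards performs the actual push and requires a working Docker client and network access to Harbor.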

Set up trivy image scanner in an air-gapped environment
This use case manually imports the vulnerability database into Harbor's trivy scanner when Harbor is running in an air-gapped environment. All of the following commands assume Harbor is running in the default namespace.
-
Configure trivy
TLS example with auto certificate generation
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-harbor
  namespace: eksa-packages
spec:
  packageName: harbor
  config: |-
    secretKey: "use-a-secret-key"
    externalURL: https://harbor.eksa.demo:30003
    expose:
      tls:
        certSource: auto
        auto:
          commonName: "harbor.eksa.demo"
    trivy:
      skipUpdate: true
      offlineScan: true
Non-TLS example
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-harbor
  namespace: eksa-packages
spec:
  packageName: harbor
  config: |-
    secretKey: "use-a-secret-key"
    externalURL: http://harbor.eksa.demo:30002
    expose:
      tls:
        enabled: false
    trivy:
      skipUpdate: true
      offlineScan: true
If Harbor is already running without the above trivy configuration, run the following command to update both skipUpdate and offlineScan:
kubectl edit statefulsets/harbor-helm-trivy
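Since `kubectl edit` is interactive, a non-interactive alternative may be convenient. The sketch below only prints a `kubectl set env` command for review. It assumes that the chart maps skipUpdate and offlineScan to the environment variables `SCANNER_TRIVY_SKIP_UPDATE` and `SCANNER_TRIVY_OFFLINE_SCAN` on the container named trivy; confirm these names in your StatefulSet before relying on it.

```shell
#!/usr/bin/env bash
# Non-interactive alternative to editing the StatefulSet by hand.
# Assumption: the trivy container reads SCANNER_TRIVY_SKIP_UPDATE and
# SCANNER_TRIVY_OFFLINE_SCAN; verify the names in your deployment.
STATEFULSET="${STATEFULSET:-harbor-helm-trivy}"

# Print the kubectl command so it can be reviewed before it is run.
trivy_offline_cmd() {
  printf 'kubectl set env statefulset/%s -c trivy %s %s\n' \
    "$STATEFULSET" \
    "SCANNER_TRIVY_SKIP_UPDATE=true" \
    "SCANNER_TRIVY_OFFLINE_SCAN=true"
}

trivy_offline_cmd
```

The printed command can be copied into a terminal once the variable names have been checked against the output of `kubectl get statefulsets/harbor-helm-trivy -o yaml`.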
-
Download the vulnerability database to your local host
Please follow the oras installation instructions.
oras pull ghcr.io/aquasecurity/trivy-db:2 -a
-
Upload database to trivy pod from your local host
kubectl cp db.tar.gz harbor-helm-trivy-0:/home/scanner/.cache/trivy -c trivy
-
Set up database on Harbor trivy pod
kubectl exec -it harbor-helm-trivy-0 -c trivy -- bash
cd /home/scanner/.cache/trivy
mkdir db
mv db.tar.gz db
cd db
tar zxvf db.tar.gz
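After extraction, the db directory is expected to contain trivy.db and metadata.json. The following sketch checks a file listing for both names; the function name is hypothetical, and the demonstration at the end uses a simulated listing rather than a real pod.

```shell
#!/usr/bin/env bash
# Check a file listing (one name per line on stdin) for the two files that the
# trivy-db archive is expected to contain. Returns non-zero if either is missing.
has_trivy_db_files() {
  local listing
  listing="$(cat)"
  printf '%s\n' "$listing" | grep -qxF 'trivy.db' &&
    printf '%s\n' "$listing" | grep -qxF 'metadata.json'
}

# Local demonstration with a simulated listing.
if printf 'db.tar.gz\nmetadata.json\ntrivy.db\n' | has_trivy_db_files; then
  echo "database files present"
fi
```

Against a running cluster, the listing could come from `kubectl exec harbor-helm-trivy-0 -c trivy -- ls /home/scanner/.cache/trivy/db`, piped into `has_trivy_db_files`.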