I am facing a difficulty here; maybe the answer is simple, so if someone knows it, please comment.

I have created an EKS cluster using the following manifest.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: test-cluster
  region: us-west-2
  version: "1.29"

vpc:
  subnets:
    public:
      us-west-2a: { id: subnet-094d01de2dd2148c0 }
      us-west-2b: { id: subnet-04429e132a1f42826 }
      us-west-2c: { id: subnet-028a738bdafc344c6 }

nodeGroups:
  - name: ng-spot
    instanceType: t3.medium
    labels: { role: builders }
    desiredCapacity: 2
    minSize: 2
    maxSize: 4
    volumeSize: 30
    ssh:
      allow: true
      publicKeyName: techies
    tags:
      Name: ng-spot
    maxPodsPerNode: 110
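
Assuming the manifest above is saved as cluster.yaml (the file name is just an assumption), the cluster was created with eksctl along these lines:

eksctl create cluster -f cluster.yaml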

This cluster is for testing purposes, so I am using the t3.medium instance type with the maximum pod limit set to 110.

arun@ArunLAL555:~$ k get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-37-0.us-west-2.compute.internal    Ready    <none>   26m   v1.29.0-eks-5e0fdde
ip-192-168-86-42.us-west-2.compute.internal   Ready    <none>   26m   v1.29.0-eks-5e0fdde
arun@ArunLAL555:~$ kubectl get nodes -o jsonpath='{.items[*].status.allocatable.pods}{"\n"}'
110 110

This shows that I should be able to create 110 pods on each node.

arun@ArunLAL555:~$ k create deployment test-deploy --image nginx --replicas 50
deployment.apps/test-deploy created
arun@ArunLAL555:~$ k get po
NAME                           READY   STATUS              RESTARTS   AGE
test-deploy-859f95ffcc-2c5k6   0/1     ContainerCreating   0          19s
test-deploy-859f95ffcc-2p9rh   1/1     Running             0          19s
test-deploy-859f95ffcc-468wm   0/1     ContainerCreating   0          18s
.
.
test-deploy-859f95ffcc-xxm7z   0/1     ContainerCreating   0          18s
test-deploy-859f95ffcc-z88x6   1/1     Running             0          19s

Here, the remaining pods are not getting an IP:

arun@ArunLAL555:~$ k events po test-deploy-859f95ffcc-xxm7z
1s (x5 over 55s)         Warning   FailedCreatePodSandBox    Pod/test-deploy-859f95ffcc-m7t62                   (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "528eaad224c5578435db12a57a8fa7063a03423b28d57c681bab742cc8389a1a": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container

The following are the subnets and their IP availability:

arun@ArunLAL555:~$ aws eks describe-cluster --name test-cluster --query "cluster.resourcesVpcConfig.subnetIds"
[
    "subnet-094d01de2dd2148c0",
    "subnet-04429e132a1f42826",
    "subnet-028a738bdafc344c6"
]
arun@ArunLAL555:~$ aws ec2 describe-subnets --subnet-ids subnet-094d01de2dd2148c0 subnet-04429e132a1f42826 subnet-028a738bdafc344c6 --query 'Subnets[*].[SubnetId,AvailableIpAddressCount]' --output text
subnet-028a738bdafc344c6    8167
subnet-094d01de2dd2148c0    8185
subnet-04429e132a1f42826    8168
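
For completeness, the private IPs actually assigned to a node's ENIs can be listed like this (the instance ID below is a placeholder):

aws ec2 describe-network-interfaces \
    --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
    --query 'NetworkInterfaces[*].[NetworkInterfaceId,length(PrivateIpAddresses)]' \
    --output text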

I have updated the VPC CNI add-on:

arun@ArunLAL555:~$ kubectl describe daemonset aws-node --namespace kube-system | grep amazon-k8s-cni: | cut -d : -f 3
v1.16.0-eksbuild.1
arun@ArunLAL555:~$ aws eks create-addon --cluster-name test-cluster --addon-name vpc-cni --addon-version v1.17.1-eksbuild.1 \
    --service-account-role-arn arn:aws:iam::111122223333:role/AmazonEKSVPCCNIRole
{
    "addon": {
        "addonName": "vpc-cni",
        "clusterName": "test-cluster",
        "status": "CREATING",
        "addonVersion": "v1.17.1-eksbuild.1",
        "health": {
            "issues": []
        },
        "addonArn": "arn:aws:eks:us-west-2:111122223333:addon/test-cluster/vpc-cni/fec7333d-c1fc-c2fc-1287-c14beaa883f8",
        "createdAt": "2024-03-22T19:35:54.685000+05:30",
        "modifiedAt": "2024-03-22T19:35:54.703000+05:30",
        "serviceAccountRoleArn": "arn:aws:iam::111122223333:role/AmazonEKSVPCCNIRole",
        "tags": {}
    }
}
arun@ArunLAL555:~$ aws eks describe-addon --cluster-name test-cluster --addon-name vpc-cni --query addon.addonVersion --output text
v1.17.1-eksbuild.1
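
For reference, the vpc-cni add-on versions compatible with a given Kubernetes version can be listed with something like:

aws eks describe-addon-versions --addon-name vpc-cni --kubernetes-version 1.29 \
    --query 'addons[].addonVersions[].addonVersion' --output text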

After that, I terminated the existing instances, but since then the nodes are not getting Ready.

arun@ArunLAL555:~$ k get nodes
NAME                                           STATUS     ROLES    AGE     VERSION
ip-192-168-40-177.us-west-2.compute.internal   NotReady   <none>   86s     v1.29.0-eks-5e0fdde
ip-192-168-83-11.us-west-2.compute.internal    NotReady   <none>   3m29s   v1.29.0-eks-5e0fdde
arun@ArunLAL555:~$ k describe nodes ip-192-168-40-177.us-west-2.compute.internal

Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
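
The aws-node (VPC CNI) pods and their logs on the affected nodes can be checked with commands along these lines (the label and container name are the defaults used by the add-on):

kubectl get pods -n kube-system -l k8s-app=aws-node -o wide
kubectl logs -n kube-system -l k8s-app=aws-node -c aws-node --tail=50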

I would like to know why this is happening; if someone knows the answer, please comment.

  • First, why did the pods not get IPs even though the pod limit was set to the maximum?
  • Second, why are the nodes not Ready after updating the VPC CNI plugin?

2 Answers

  1. It looks like you have not replaced the sample service account role ARN from the documentation: "Replace 111122223333 with your account ID and AmazonEKSVPCCNIRole with the name of an existing IAM role that you've created." https://docs.aws.amazon.com/eks/latest/userguide/managing-vpc-cni.html#:~:text=Create%20the%20add%2Don%20using%20the%20AWS%20CLI. If that role does not exist, the CNI pods cannot get the permissions they need, which would explain the "cni plugin not initialized" condition; see the sketch below for one way to create it.
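
As a minimal sketch (assuming an IAM OIDC provider is already associated with the cluster), such a role can be created with eksctl, for example:

eksctl create iamserviceaccount \
    --cluster test-cluster \
    --namespace kube-system \
    --name aws-node \
    --role-name AmazonEKSVPCCNIRole \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy \
    --override-existing-serviceaccounts \
    --approve
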
Chopper

It appears your initial configuration for the maxPodsPerNode might be incorrect for the instance type you're using.

That requires a few steps, in short:

1- Calculate the correct max pods for your instance type.

2- Enable prefix delegation to support more IP addresses per ENI.

3- Update your node group configuration with the correct maxPodsPerNode value.

4- Scale your node group to apply changes.

5- Verify the status of nodes and pods to ensure they are functioning correctly.

6- Set environment variables to ensure sufficient IP address availability.

To determine the maximum number of Pods for your instance type (t3.medium), use the max-pods calculator script with the specific version of the Amazon VPC CNI plugin you're using:

curl -O https://raw.githubusercontent.com/awslabs/amazon-eks-ami/master/templates/al2/runtime/max-pods-calculator.sh
chmod +x max-pods-calculator.sh
./max-pods-calculator.sh --instance-type t3.medium --cni-version 1.17.1-eksbuild.1
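
Without prefix delegation, the result of that script follows the ENI formula: max pods = ENIs * (IPv4 addresses per ENI - 1) + 2. A t3.medium supports 3 ENIs with 6 IPv4 addresses each, so the default ceiling is far below 110:

# max pods = ENIs * (IPv4 addresses per ENI - 1) + 2
# t3.medium: 3 ENIs, 6 IPv4 addresses per ENI
echo $(( 3 * (6 - 1) + 2 ))   # prints 17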

To support a larger number of Pods per node, enable prefix delegation. Prefix delegation assigns /28 IPv4 prefixes to each ENI, so many more IP addresses are available per ENI. Like this:

kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true
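
You can verify the variable is set on the daemonset, for example:

kubectl describe daemonset aws-node -n kube-system | grep ENABLE_PREFIX_DELEGATION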

Then, update your ClusterConfig to reflect the correct maximum Pods per node. You can use eksctl to update the node group configuration, e.g.:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: test-cluster
  region: us-west-2
  version: "1.29"

vpc:
  subnets:
    public:
      us-west-2a: { id: subnet-094d01de2dd2148c0 }
      us-west-2b: { id: subnet-04429e132a1f42826 }
      us-west-2c: { id: subnet-028a738bdafc344c6 }

nodeGroups:
  - name: ng-spot
    instanceType: t3.medium
    labels: { role: builders }
    desiredCapacity: 2
    minSize: 2
    maxSize: 4
    volumeSize: 30
    ssh:
      allow: true
      publicKeyName: techies
    tags:
      Name: ng-spot
    maxPodsPerNode: 29

Then, recreate your node group or update your existing node group:

eksctl scale nodegroup --cluster test-cluster --name ng-spot --nodes 0
eksctl scale nodegroup --cluster test-cluster --name ng-spot --nodes 2
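
Scaling down and back up may reuse the node group's existing launch configuration, so if the new maxPodsPerNode does not take effect, one option is to recreate the node group from the updated config file (the file name cluster.yaml is an assumption):

eksctl delete nodegroup --cluster test-cluster --name ng-spot
eksctl create nodegroup --config-file cluster.yaml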

After making these changes, verify that your nodes are ready and the pods are correctly assigned IP addresses:

kubectl get nodes
kubectl describe nodes
kubectl get pods -o wide

Set environment variables for WARM_PREFIX_TARGET and WARM_IP_TARGET to ensure sufficient IP address availability:

kubectl set env ds aws-node -n kube-system WARM_PREFIX_TARGET=1
kubectl set env ds aws-node -n kube-system WARM_IP_TARGET=5

Good luck!

Max Haase