4-minute read
No time? Jump straight to the conclusion.
Do you run Kubernetes on AWS?
Do you use EKS?
You may even run EKS on Fargate: running EKS pods on the serverless Fargate compute service for containers has been generally available since December 3, 2019.
If so, you probably also use the AWS ALB Ingress Controller.
This post is about ALB Ingress Controller CrashLoopBackOffs.
What does an ALB Ingress Controller CrashLoopBackOff look like?
```
kubectl get po --all-namespaces

NAMESPACE     NAME                                      READY   STATUS
kube-system   alb-ingress-controller-53d5561b73-azh3p   0/1     CrashLoopBackOff
....
```
```
kubectl describe pod/alb-ingress-controller-53d5561b73-azh3p -n kube-system

Name: alb-ingress-controller-53d5561b73-azh3p
.....
Events:
  Type     Reason   Age                   From                                                          Message
  ----     ------   ----                  ----                                                          -------
  Warning  BackOff  96s (x86488 over 1d)  kubelet, fargate-ip-00-000-000-000.aws-zone.compute.internal  Back-off restarting failed container
```
```
kubectl logs pod/alb-ingress-controller-53d5561b73-azh3p -n kube-system

....
F0125 00:00:00.00000 1 main.go:94] failed to introspect vpcID from ec2Metadata due to RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/mac: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers), specify --aws-vpc-id instead if ec2Metadata is unavailable
```
The key piece of information in the trimmed output and logs is:

`specify --aws-vpc-id instead if ec2Metadata is unavailable`
What is the root cause of the error?
The alb-ingress-controller container needs to know the ID of the VPC it runs in.
The aws-alb-ingress-controller documentation is actually rather specific about this:
!!! tip
    If ec2metadata is unavailable from the controller pod, edit the following variables:

    - `--aws-vpc-id=vpc-xxxxxx`: vpc ID of the cluster.
    - `--aws-region=us-west-1`: AWS region of the cluster.

(Source: https://github.com/kubernetes-sigs/aws-alb-ingress-controller/blob/master/docs/guide/controller/setup.md#kubectl)
This is not the only piece of environment information the AWS ALB ingress container requires. alb-ingress-controller.yaml#L25-L67 lists the required and optional arguments:
```
--watch-namespace
--ingress-class=alb
# REQUIRED
--cluster-name=devCluster
--aws-vpc-id=vpc-xxxxxx
--aws-region=us-west-1
--aws-api-debug
--aws-max-retries
```
Both arguments, `--aws-vpc-id` and `--aws-region`, are discovered from ec2metadata if unspecified.
However, when running EKS on Fargate, ec2metadata is no longer available. A task metadata endpoint exists for ECS, but I could not find an EKS equivalent. Without the ec2metadata service, the alb-ingress-controller cannot fetch the required arguments. As a result, the container crashes again and again, and the pod ends up in CrashLoopBackOff.
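The startup behaviour can be sketched roughly as follows. This is a simplified shell illustration, not the controller's actual Go code; `fetch_metadata` is a hypothetical stand-in that simulates the metadata endpoint being unreachable on Fargate:

```shell
#!/bin/sh
# Hypothetical stand-in for the EC2 metadata client; on Fargate the
# 169.254.169.254 endpoint is unreachable, so it always fails here.
fetch_metadata() { return 1; }

# Mirrors the fallback logic: use --aws-vpc-id if given, else ask ec2metadata.
resolve_vpc_id() {
  flag="$1"   # value of --aws-vpc-id, empty if unspecified
  if [ -n "$flag" ]; then
    echo "$flag"
  elif vpc=$(fetch_metadata); then
    echo "$vpc"
  else
    echo "failed to introspect vpcID from ec2Metadata, specify --aws-vpc-id instead" >&2
    return 1
  fi
}

resolve_vpc_id "vpc-0abc123"                               # prints vpc-0abc123
resolve_vpc_id "" || echo "container exits -> CrashLoopBackOff"
```

With the flag set, the metadata service is never consulted, which is exactly why passing `--aws-vpc-id` (and `--aws-region`) fixes the crash.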
CrashLoopBackOff fix
Provide the vpc-id and the aws-region arguments when applying the alb-ingress-controller.yaml to the cluster. Assuming the EKS cluster was created using eksctl and both the AWS CLI and jq are installed, you can store the vpc-id and the aws-region in environment variables:
```shell
VPC_ID=$(aws cloudformation describe-stacks \
  --stack-name "eksctl-${CLUSTER_NAME}-cluster" \
  --query "Stacks[0].Outputs[?OutputKey=='VPC'].OutputValue" \
  --output text)

AWS_REGION=$(aws cloudformation describe-stacks \
  --stack-name "eksctl-${CLUSTER_NAME}-cluster" \
  --query "Stacks[0].Outputs[?OutputKey=='Endpoint'].OutputValue" \
  --output json | jq -r '.[] | sub(".eks.amazonaws.com"; "") | scan("\\.[^\\.]*$") | split(".")[1]')
```
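The jq filter strips the `.eks.amazonaws.com` suffix from the cluster endpoint and keeps the region segment in front of it. The same transformation can be checked with plain shell parameter expansion, using a made-up endpoint value for illustration:

```shell
# Hypothetical endpoint, shaped like the CloudFormation 'Endpoint' output:
ENDPOINT="https://ABC123EXAMPLE.gr7.eu-central-1.eks.amazonaws.com"

trimmed="${ENDPOINT%.eks.amazonaws.com}"  # strip the fixed suffix
region="${trimmed##*.}"                   # keep everything after the last dot
echo "$region"                            # prints eu-central-1
```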
The alb-ingress-controller.yaml manifest would be adapted to:
```yaml
...
# AWS VPC ID this ingress controller will use to create AWS resources.
# If unspecified, it will be discovered from ec2metadata.
- --aws-vpc-id=${VPC_ID}
# AWS region this ingress controller will operate in.
# If unspecified, it will be discovered from ec2metadata.
# List of regions: http://docs.aws.amazon.com/general/latest/gr/rande.html#vpc_region
- --aws-region=${AWS_REGION}
...
```
Use envsub, envsubst, or a similar substitution tool to replace the environment variables before applying the alb-ingress-controller.yaml manifest and creating the ALB ingress controller.
Conclusion
Creating an AWS EKS cluster on Fargate can be done with just a single flag:

```
eksctl create cluster ... --fargate
```

and is indeed a smooth process.
However, at least two additional arguments (`--aws-vpc-id` and `--aws-region`) are needed to run the AWS ALB Ingress Controller in such a cluster. These arguments can easily be obtained as shown in this post.
Thanks, this is exactly the problem I was having. I wish that AWS had better documentation for EKS Fargate. We've hit a lot of weird issues relating to IAM roles and things of the sort you have posted about here.
I am glad this write up is helpful.
Lothar, this article is the missing manual for the installer.
Without this I might never have gotten the controller to a ready state!
Thanks for taking the time to write this up.
Thanks Iain,
you are right. This post assumes an already installed and running ALB Ingress Controller.
Please see the AWS LoadBalancer Controller Installation Guide.
I also noticed the maintainers rebranded it to “AWS Load Balancer Controller”.