4-minute read
No time? Jump straight to the conclusion.
Do you run Kubernetes on AWS?
Do you use EKS?
You may even run EKS on Fargate: running EKS pods on the serverless Fargate compute service for containers has been generally available since December 3, 2019.
If so, you probably also use the AWS ALB Ingress Controller.
This post is about ALB Ingress Controller CrashLoopBackOffs.
What does an ALB Ingress Controller CrashLoopBackOff look like?
```
kubectl get po --all-namespaces

NAMESPACE     NAME                                      READY   STATUS
kube-system   alb-ingress-controller-53d5561b73-azh3p   0/1     CrashLoopBackOff
....
```
```
kubectl describe pod/alb-ingress-controller-53d5561b73-azh3p -n kube-system

Name: alb-ingress-controller-53d5561b73-azh3p
.....
Events:
  Type     Reason   Age                   From                                                          Message
  ----     ------   ----                  ----                                                          -------
  Warning  BackOff  96s (x86488 over 1d)  kubelet, fargate-ip-00-000-000-000.aws-zone.compute.internal  Back-off restarting failed container
```
```
kubectl logs pod/alb-ingress-controller-53d5561b73-azh3p -n kube-system

....
F0125 00:00:00.00000 1 main.go:94] failed to introspect vpcID from ec2Metadata due to RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/mac: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers), specify --aws-vpc-id instead if ec2Metadata is unavailable
```
The key piece of information in the trimmed output and logs is:

`specify --aws-vpc-id instead if ec2Metadata is unavailable`
What is the root cause of the error?
The alb-ingress-controller container needs to know the ID of the VPC it runs in.
The aws-alb-ingress-controller documentation is actually rather specific about this:
!!! tip
    If ec2metadata is unavailable from the controller pod, edit the following variables:

    - `--aws-vpc-id=vpc-xxxxxx`: vpc ID of the cluster.
    - `--aws-region=us-west-1`: AWS region of the cluster.

(Source: https://github.com/kubernetes-sigs/aws-alb-ingress-controller/blob/master/docs/guide/controller/setup.md#kubectl)
This is not the only piece of environment information the AWS ALB ingress container requires. alb-ingress-controller.yaml#L25-L67 lists the required and optional arguments:
```
--watch-namespace
--ingress-class=alb
# REQUIRED
--cluster-name=devCluster
--aws-vpc-id=vpc-xxxxxx
--aws-region=us-west-1
--aws-api-debug
--aws-max-retries
```
Both arguments, `--aws-vpc-id` and `--aws-region`, are discovered from ec2metadata if unspecified.
However, when running EKS on Fargate, ec2metadata is no longer available. A task metadata endpoint exists for ECS, but I could not find an EKS equivalent. Without the ec2metadata service, the alb-ingress-controller cannot fetch the required arguments. As a result, the container crashes again and again, and the pod ends up in CrashLoopBackOff.
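The startup behaviour can be sketched roughly as follows. This is a simplified shell illustration, not the controller's actual Go code; `fetch_metadata` is a hypothetical stand-in that simulates the metadata endpoint being unreachable on Fargate:

```shell
#!/bin/sh
# Hypothetical stand-in for the EC2 metadata client; on Fargate the
# 169.254.169.254 endpoint is unreachable, so it always fails here.
fetch_metadata() { return 1; }

# Mirrors the fallback logic: use --aws-vpc-id if given, else ask ec2metadata.
resolve_vpc_id() {
  flag="$1"   # value of --aws-vpc-id, empty if unspecified
  if [ -n "$flag" ]; then
    echo "$flag"
  elif vpc=$(fetch_metadata); then
    echo "$vpc"
  else
    echo "failed to introspect vpcID from ec2Metadata, specify --aws-vpc-id instead" >&2
    return 1
  fi
}

resolve_vpc_id "vpc-0abc123"                               # prints vpc-0abc123
resolve_vpc_id "" || echo "container exits -> CrashLoopBackOff"
```

With the flag set, the metadata service is never consulted, which is exactly why passing `--aws-vpc-id` (and `--aws-region`) fixes the crash.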
CrashLoopBackOff fix
Provide the vpc-id and the aws-region arguments when applying the alb-ingress-controller.yaml to the cluster. Assuming the EKS cluster was created using eksctl and both the AWS CLI and jq are installed, you can store the vpc-id and the aws-region in environment variables:
```shell
VPC_ID=$(aws cloudformation describe-stacks \
  --stack-name "eksctl-${CLUSTER_NAME}-cluster" \
  --query "Stacks[0].Outputs[?OutputKey=='VPC'].OutputValue" \
  --output text)

AWS_REGION=$(aws cloudformation describe-stacks \
  --stack-name "eksctl-${CLUSTER_NAME}-cluster" \
  --query "Stacks[0].Outputs[?OutputKey=='Endpoint'].OutputValue" \
  --output json | jq -r '.[] | sub(".eks.amazonaws.com"; "") | scan("\\.[^\\.]*$") | split(".")[1]')
```
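The jq filter strips the `.eks.amazonaws.com` suffix from the cluster endpoint and keeps the region segment in front of it. The same transformation can be checked with plain shell parameter expansion, using a made-up endpoint value for illustration:

```shell
# Hypothetical endpoint, shaped like the CloudFormation 'Endpoint' output:
ENDPOINT="https://ABC123EXAMPLE.gr7.eu-central-1.eks.amazonaws.com"

trimmed="${ENDPOINT%.eks.amazonaws.com}"  # strip the fixed suffix
region="${trimmed##*.}"                   # keep everything after the last dot
echo "$region"                            # prints eu-central-1
```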
The alb-ingress-controller.yaml manifest would be adapted to:
```yaml
...
# AWS VPC ID this ingress controller will use to create AWS resources.
# If unspecified, it will be discovered from ec2metadata.
- --aws-vpc-id=${VPC_ID}
# AWS region this ingress controller will operate in.
# If unspecified, it will be discovered from ec2metadata.
# List of regions: http://docs.aws.amazon.com/general/latest/gr/rande.html#vpc_region
- --aws-region=${AWS_REGION}
...
```
Use envsub, envsubst, or a similar substitution tool to replace the environment variables before applying the alb-ingress-controller.yaml manifest and creating the ALB ingress controller.
Conclusion
Creating an AWS EKS cluster on Fargate can be done with just a single flag:

```
eksctl create cluster ... --fargate
```

and is indeed a smooth process.
However, at least two additional arguments (`--aws-vpc-id` and `--aws-region`) are needed to run the AWS ALB Ingress Controller in such a cluster. These arguments can easily be obtained as shown in this post.
Thanks, this is exactly the problem I was having. I wish that AWS had better documentation for EKS Fargate. We've hit a lot of weird issues relating to IAM roles and things of the sort you have posted about here.
I am glad this write up is helpful.
Lothar, this article is the missing manual for the installer.
Without this I might never have gotten the controller to a ready state!
Thanks for taking the time to write this up.
Thanks Iain,
you are right. This post assumes an already installed and running ALB Ingress Controller.
Please see the AWS LoadBalancer Controller Installation Guide.
I also noticed the maintainers rebranded it to “AWS Load Balancer Controller”.