I am creating my infrastructure with Ansible. I create a VPC and an ECS cluster where I use Fargate to run my Docker containers. When the tasks start, they stop with an error saying the image could not be pulled from ECR: STOPPED (CannotPullContainerError: Error response from daem)

I know for a fact that the URI to the image and the IAM permissions are correct. I also don't think the issue is in the AMI or Docker image, because the same setup works fine when I create it through the AWS UI.

Therefore, I assume my VPC is misconfigured in a way that prevents it from connecting to ECR.

This is how I create my VPC and Subnet:

- name: "2) Create VPC"
  ec2_vpc_net:
    name: "{{vpc_name}}"
    state: present
    cidr_block: 10.10.0.0/16
    region: "{{ region_name }}"
  register: vpc_net

- name: "3) Create VPC Subnet"
  ec2_vpc_subnet:
    state: present
    map_public: yes
    vpc_id: "{{vpc_net.vpc.id}}"
    cidr: 10.10.0.0/24
    tags:
      name: "test-subnet"
  register: vpc_subnet

How can I ensure my VPC and subnet are configured correctly so that the Docker daemon can connect to the ECR repo and pull the image? Am I missing an important attribute?

Kyu96
  • If you check the ECR repository, do you see your image? – Vader Jan 09 '20 at 01:48
  • Yeah, the image is there and the URI is correct. @WeyderFerreira – Kyu96 Jan 09 '20 at 01:49
  • Are you sure that your Ansible role/playbook has access to AWS ECR? – Vader Jan 09 '20 at 01:54
  • I am 99% confident that this is the case. When I create the same setup through the ECS web UI (which automatically creates VPCs, subnets, security groups, etc.), create only the cluster, task definition and service with Ansible, and reference the VPC/subnets/security groups that the web UI created, it works. So my educated guess is that a difference in the VPC/subnet/security group configuration causes the issue. Maybe I need an IGW so that a connection to ECR can be established? For reference, here is my entire playbook: https://pastebin.com/tzNpqGHi @WeyderFerreira – Kyu96 Jan 09 '20 at 02:04
  • Is it possible to SSH into the system, manually log Docker in to ECR, and manually pull the image? – Ta Mu Jan 09 '20 at 08:21
  • I don't think I can SSH into a Fargate-managed instance. @Peter – Kyu96 Jan 09 '20 at 10:09
  • Do I need to add another network component, such as an internet gateway, a route table or a VPC endpoint? I also noticed that when I go through the UI, where everything works as it should, two subnets are created, not just one. Please let me know if I need to provide additional information to help solve the problem. @Peter – Kyu96 Jan 09 '20 at 10:21
  • For internal network communication on AWS you don't need LB. – Vader Jan 13 '20 at 14:24
  • @WeyderFerreira What's LB? – Kyu96 Jan 13 '20 at 15:00
  • @Kyu96 Load Balancer. In AWS it can be an ALB (Application Load Balancer), NLB (Network Load Balancer) or ELB (Elastic Load Balancer). – Vader Jan 13 '20 at 23:29
  • Can you edit your question to include the code that creates the Fargate service and task definition? – ydaetskcoR Jan 14 '20 at 18:00
  • It would also be helpful if you included the full error message that you're seeing. You should be able to see this by describing the task with the AWS CLI within a shortish period (2 hours?) of it stopping. – ydaetskcoR Jan 14 '20 at 18:02

2 Answers

The CannotPullContainerError error usually occurs when the ECS task has no access to the Internet and therefore cannot pull the image from the registry.

Make sure your networking is configured so that the tasks have access to the Internet; see https://stackoverflow.com/questions/48226547/aws-fargate-cannotpullcontainererror-500

You can confirm this manually with the following steps:

  1. Find the route table and check that it belongs to your VPC.
  2. Find the right NAT gateway (you need its NAT gateway ID). IMPORTANT: the NAT gateway must be launched in a PUBLIC subnet.
  3. Create a route through it:

    $ aws ec2 create-route --route-table-id "rtb-037148d7b5967a231" --destination-cidr-block "0.0.0.0/0" --nat-gateway-id "nat-0c137ae8a2b409088"

You can also use an internet gateway instead of a NAT gateway, but the main idea is the same: create connectivity from the service to the registry. Once you have that in place and verified, you can easily rewrite it as Ansible tasks.
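The internet gateway variant could be sketched as the Ansible tasks below. This is only a sketch under assumptions: it reuses the `vpc_net` and `vpc_subnet` results and the `region_name` variable from the question, the route table name `test-public-rt` is made up for illustration, and it has not been run against the asker's account.

```yaml
# Sketch only: assumes vpc_net and vpc_subnet were registered by the
# question's "Create VPC" and "Create VPC Subnet" tasks.
- name: "Create internet gateway"
  ec2_vpc_igw:
    vpc_id: "{{ vpc_net.vpc.id }}"
    region: "{{ region_name }}"
    state: present
  register: igw

- name: "Route the subnet's outbound traffic through the internet gateway"
  ec2_vpc_route_table:
    vpc_id: "{{ vpc_net.vpc.id }}"
    region: "{{ region_name }}"
    tags:
      Name: "test-public-rt"   # hypothetical name, used to look the table up on reruns
    subnets:
      - "{{ vpc_subnet.subnet.id }}"
    routes:
      - dest: 0.0.0.0/0
        gateway_id: "{{ igw.gateway_id }}"
  register: public_route_table
```

Note that a route to an internet gateway only helps a Fargate task if the task also receives a public IP address; tasks without one need the NAT gateway route instead.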

Update: to start an ECS task, use the ecs_task module: https://docs.ansible.com/ansible/latest/modules/ecs_task_module.html

To pull an image from the repository and start it, you can use these tasks:

- name: docker login
  shell: "$(aws ecr get-login --no-include-email --region {{ default_region }})"
  args:
    executable: "/bin/bash"

- name: pull latest app image
  docker_image:
    name: "{{ ecr_repository }}/myapp:{{ image_tag }}"
    force: yes

- name: run application with docker compose
  docker_service:  # module is called docker_service before Ansible 2.8
    project_name: "myapp"
    project_src: "{{ app_dir }}"  # path to a directory containing a docker-compose.yml
    pull: yes
    recreate: always  # run scheduler from a clean state
    remove_orphans: yes  # remove containers for services not defined in the compose file
    state: present  # specifying present is the same as running docker-compose up
    restarted: yes  # use with state present to restart all containers
  register: output

However, with Fargate the above is not needed, because everything is handled by the task definition.
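With Fargate, the networking that the image pull uses is set on the service rather than on a host. A minimal sketch of such a service follows, assuming a release of the `ecs_service` module that supports `assign_public_ip`; the service name and the `cluster_name`, `task_definition_name` and `security_group_id` variables are hypothetical placeholders, while `vpc_subnet` and `region_name` come from the question.

```yaml
# Sketch only: names and variables marked "hypothetical" are not from the question.
- name: "Create Fargate service"
  ecs_service:
    name: myapp-service                              # hypothetical
    cluster: "{{ cluster_name }}"                    # hypothetical variable
    task_definition: "{{ task_definition_name }}"    # hypothetical variable
    desired_count: 1
    launch_type: FARGATE
    network_configuration:
      subnets:
        - "{{ vpc_subnet.subnet.id }}"
      security_groups:
        - "{{ security_group_id }}"                  # hypothetical variable
      assign_public_ip: yes   # needed in a public subnet that routes via an internet gateway
    region: "{{ region_name }}"
    state: present
```

The security group referenced here must allow outbound HTTPS traffic, otherwise the pull fails even when the route exists.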

Most Wanted

ECR needs authentication and authorization before you can pull an image.

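In your Ansible ecs_taskdefinition task, make sure the execution_role_arn parameter points to a role (i.e. you need to create a role, for example my-task-role1). ECS uses the execution role to pull the image; task_role_arn only grants permissions to the code running inside the container.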

Attach the AWS managed policy AmazonECSTaskExecutionRolePolicy to the role. (AmazonEKSFargatePodExecutionRolePolicy is the counterpart for EKS pods on Fargate and does not apply to ECS tasks.)

This should provide you with the necessary permissions to download your image from ECR.
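As a rough sketch, the role and its use in the task definition might look like the tasks below. The role name `my-task-role1` comes from this answer; the family and container name `myapp`, the CPU and memory sizes, and the `ecr_repository`, `image_tag` and `region_name` variables are illustrative assumptions, as is the `iam_role.arn` key read from the registered result, which may differ between versions of the `iam_role` module.

```yaml
# Sketch only: values marked "assumed" are illustrative, not taken from the question.
- name: "Create the task execution role"
  iam_role:
    name: my-task-role1
    assume_role_policy_document: |
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": { "Service": "ecs-tasks.amazonaws.com" },
            "Action": "sts:AssumeRole"
          }
        ]
      }
    managed_policy:
      - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
    state: present
  register: execution_role

- name: "Register the Fargate task definition with the execution role"
  ecs_taskdefinition:
    family: myapp                      # assumed
    launch_type: FARGATE
    network_mode: awsvpc
    cpu: "256"                         # assumed size
    memory: "512"                      # assumed size
    execution_role_arn: "{{ execution_role.iam_role.arn }}"   # assumed return key
    region: "{{ region_name }}"
    containers:
      - name: myapp                    # assumed
        essential: true
        image: "{{ ecr_repository }}/myapp:{{ image_tag }}"
    state: present
```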


Policy descriptions from AWS Console:

  • AmazonECSTaskExecutionRolePolicy

    Provides access to other AWS service resources that are required to run Amazon ECS tasks

  • AmazonEKSFargatePodExecutionRolePolicy

    Provides access to other AWS service resources that are required to run Amazon EKS pods on AWS Fargate


Side-note: AWS Console creates ecsTaskExecutionRole automatically if you create a Task Definition there.

Description from "Create new Task Definition" in AWS Console:

This role is required by tasks to pull container images and publish container logs to Amazon CloudWatch on your behalf. If you do not have the ecsTaskExecutionRole already, we can create one for you.

Flo