
We have an instance within a private subnet that has a managed NAT gateway. On that instance, we are able to access the internet:

$ curl https://www.google.com/
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head>...

However, we are not able to reach the CloudWatch endpoint; for example, the following times out. (EDIT: My mistake, not the CloudWatch endpoint, but rather the site hosting the CloudWatch monitoring scripts.)

$ curl https://cloudwatch.s3.amazonaws.com

DNS is not the problem:

$ dig cloudwatch.s3.amazonaws.com
cloudwatch.s3.amazonaws.com. 2303 IN    CNAME   s3-1-w.amazonaws.com.
s3-1-w.amazonaws.com.   1   IN  A   54.231.72.59

Any ideas about what might be happening?

JustinHK
  • 131

3 Answers


I had the same issue and resolved it the same way JustinHK did below. I couldn't let it go, so I reached out to AWS to understand why it happened; their explanation should help clarify the behavior. Here's the breakdown:

  • The issue is not that traffic can't reach the destination; it's that the return traffic can't make it back to the origin correctly.
  • The public subnet (where the NAT gateway sits) has two ways to reach the destination: via the VPCE (VPC endpoint) or via the IGW (internet gateway). On the trip back it doesn't know which one to pick, so the request simply times out.
  • Routing prefers the most specific route, so adding the VPCE to the private subnet made the VPCE route the one that wins. It's worth mentioning that the request then doesn't go through the public subnet at all, since the private subnet now has its own VPCE route.

Depending on the setup you're running and whether you actually need the IGW for anything else besides reaching out to S3, one might either drop the IGW from the public subnet or drop the NAT gateway linkage between the private and public subnet. Both of those options should clean up the routing tables a bit while not breaking the solution.
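Assuming this diagnosis, the fix can be sketched with the AWS CLI as below; the VPC ID, region, and route-table ID are placeholders, not values from the question:

```shell
# Create a gateway VPC endpoint for S3 and associate it with the
# PRIVATE subnet's route table, so S3 traffic bypasses the NAT.
# The VPC ID, route-table ID, and region are placeholders.
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0abc1234 \
    --vpc-endpoint-type Gateway \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-0priv5678
```

The gateway endpoint adds a route for Amazon's S3 prefix list to the named route table, which is more specific than the 0.0.0.0/0 default and therefore takes precedence for S3 traffic.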


Adding an S3 endpoint in the private subnet resolved the issue.

It turns out that our problem was specific to accessing S3. Our setup at the time was:

  • NAT gateway running in public subnet
  • S3 endpoint in public subnet (its prefix-list route is more specific than, and so takes priority over, the internet gateway's default route)
  • A default rule for traffic in the private subnets to go through the NAT.

It appears that traffic from the private subnets was not getting routed through the NAT to S3, either over the public internet or through the S3 endpoint. I still do not know why.
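To see which route table actually carries the S3 route (and where each default route points), the VPC's route tables can be inspected with the AWS CLI; the VPC ID below is a placeholder:

```shell
# List every route table in the VPC with its routes; an S3 gateway
# endpoint shows up as a pl-... DestinationPrefixListId entry.
# The VPC ID is a placeholder.
aws ec2 describe-route-tables \
    --filters Name=vpc-id,Values=vpc-0abc1234 \
    --query 'RouteTables[].{RouteTable:RouteTableId,Routes:Routes[].[DestinationCidrBlock,DestinationPrefixListId,GatewayId]}' \
    --output table
```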

JustinHK
  • 131

First, the obvious: cloudwatch.s3.amazonaws.com is not one of the CloudWatch endpoints.

The CloudWatch endpoints take the form monitoring.[aws-region].amazonaws.com.

For example, in the us-west-2 region, the endpoint is https://monitoring.us-west-2.amazonaws.com.

http://docs.aws.amazon.com/general/latest/gr/rande.html#cw_region
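As a quick sanity check, one can build the regional endpoint URL and probe it from the instance (us-west-2 is only an example region):

```shell
# Construct the regional CloudWatch endpoint and probe it with curl.
# us-west-2 is an example; substitute your own region.
region="us-west-2"
endpoint="https://monitoring.${region}.amazonaws.com/"
echo "Testing ${endpoint}"
curl -sS --connect-timeout 5 -o /dev/null "${endpoint}" \
    && echo "reachable" \
    || echo "timed out or unreachable"
```

An unsigned request will get an error response from the service, but a completed TLS connection is enough to show that the route to the endpoint works.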

Also, DNS resolution in a VPC is handled at the VPC level by the Amazon-provided resolver, so it keeps working even when your routing, NAT, or other networking is misconfigured. The fact that DNS resolves therefore does not tell you whether you have Internet connectivity in general.