
We're using GCP and Cloud DNS to manage our domain and I'm trying to solve for these use cases:

  1. Have private records for things like Databases that can only be resolved within the company network (our VPC).
  2. Override public records with private IPs for alternative routing within the company network.
  3. Be able to issue DNS01 challenges and resolve the records both within our network and publicly. We need this because of how cert-manager works (we use it to issue certificates with Let's Encrypt).

I've tried solving this with a public and a private zone (i.e. split-horizon DNS); however, this only solves use cases 1 and 2. Even then, it only solves use case 2 if the private zone also holds a copy of every public record that has no private counterpart.

Use case 3 isn't met with this solution: our cert-manager server creates the challenge records in the public zone and then cannot resolve them from inside the VPC, where lookups are answered by the private zone. Due to the specifics of our setup, customizing cert-manager to resolve both zones via some local configuration isn't ideal. It would also be difficult to have the records created in both zones, so that isn't an ideal solution either.

What I'd like is for the private zone to forward requests to the public one if it doesn't have a record for a specific request. Is there a way of doing this, specifically using GCP Cloud DNS?

The ideal nslookup -> private zone -> public zone

Currently we have nslookup -> private zone -> error (NXDOMAIN) if no record

For example,

# While on my laptop
> nslookup ws1.example.com
...
Name:   ws1.example.com
Address: 34.111.111.111           # Public IP for web server

While on the GCP network

> nslookup db.example.com
...
Name:   db.example.com
Address: 10.10.0.2                # Private IP for a database
> nslookup ws1.example.com
...
Name:   ws1.example.com
Address: 10.0.0.10                # Private IP (from private zone) for web server

This works fine for use cases 1 and 2 but when we try to resolve a record that only exists on the public zone...

# While on my laptop
> nslookup ws1.example.com
...
Name:   ws1.example.com
Address: 34.111.111.111           # Public IP for web server
> nslookup ws2.example.com        # We only have this record in the public zone
...
Name:   ws2.example.com
Address: 34.111.111.112           # Public IP for another web server

While in the GCP VPC

> nslookup ws1.example.com
...
Name:   ws1.example.com
Address: 10.0.0.1                 # Private IP (override) for web server
> nslookup ws2.example.com        # We only have this record in the public zone
...
** server can't find ws2.example.com: NXDOMAIN
# Fails to resolve. Should look at private then public zone and resolve to 34.111.111.112.

Any suggestions?

As a workaround, for now, we've switched to using HTTP01 challenges for cert-manager but we'd prefer to use DNS01 if possible.

James

2 Answers


If example.com is both a private and public zone, then you must have resource records for ws2 in both the public and private zones. There is no failover from private to public. Each zone must be authoritative.

The key to understanding your problem: the query is performed at example.com. If the zone returns NXDOMAIN, that is the end of the lookup. DNS does not then move to another server to query for a different answer.
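To make the "no failover" point concrete, here is a toy model in Python. It is only a sketch: the two dictionaries stand in for the zones using the records from the question, and the `resolve` function is a hypothetical illustration of the behavior, not how Cloud DNS is implemented.

```python
# Toy model of split-horizon resolution. Zone contents mirror the
# question's examples; nothing here talks to a real DNS server.
PUBLIC_ZONE = {
    "ws1.example.com": "34.111.111.111",
    "ws2.example.com": "34.111.111.112",
}
PRIVATE_ZONE = {
    "db.example.com": "10.10.0.2",
    "ws1.example.com": "10.0.0.10",
}


def resolve(name: str, inside_vpc: bool) -> str:
    """Return the address for `name`, or "NXDOMAIN" if the zone lacks it."""
    # Exactly one zone answers each query: the private zone inside the
    # VPC, the public zone everywhere else.
    zone = PRIVATE_ZONE if inside_vpc else PUBLIC_ZONE
    # A missing name is a final answer. There is no fallback to the
    # other zone.
    return zone.get(name, "NXDOMAIN")


print(resolve("ws2.example.com", inside_vpc=False))  # 34.111.111.112
print(resolve("ws1.example.com", inside_vpc=True))   # 10.0.0.10
print(resolve("ws2.example.com", inside_vpc=True))   # NXDOMAIN
```

The last call is the failure from the question: the private zone is authoritative for example.com inside the VPC, so the public record for ws2 is never consulted.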

John Hanley

I know this is an old question that has already been answered very well by John Hanley above, but I'll post our solution here in case it helps anybody else.

We encountered a similar issue with split-horizon DNS recently, and solved it by implementing an Eventarc-triggered Cloud Function that replicates all changes in the public zone to the private zone, so we don't have to do this manually.
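For anyone wanting a starting point, the core of the replication logic can be sketched roughly as below. This is an illustration in Python, not our actual function: `mirror_change`, the `private_owned` set, and the record-set shape are all hypothetical names chosen for the example. Parsing the Eventarc event and applying the result to the private zone through the Cloud DNS API are deliberately left out.

```python
from typing import Dict, List, Set, Tuple

# A record set is keyed by (name, type); the value is its list of rrdatas.
RecordKey = Tuple[str, str]
Records = Dict[RecordKey, List[str]]


def mirror_change(
    additions: Records,
    deletions: Records,
    private_owned: Set[RecordKey],
) -> Tuple[Records, Records]:
    """Translate one public-zone change into the matching private-zone change.

    `private_owned` (an assumption of this sketch) lists record sets that the
    private zone manages itself, such as overrides and private-only names.
    Those are never touched by replication.
    """
    to_add = {k: v for k, v in additions.items() if k not in private_owned}
    to_delete = {k: v for k, v in deletions.items() if k not in private_owned}
    return to_add, to_delete


# Example using the records from the question.
private_owned = {("ws1.example.com.", "A"), ("db.example.com.", "A")}
additions = {
    ("ws1.example.com.", "A"): ["34.111.111.111"],  # overridden privately
    ("ws2.example.com.", "A"): ["34.111.111.112"],  # public only
}
to_add, to_delete = mirror_change(additions, {}, private_owned)
print(to_add)     # {('ws2.example.com.', 'A'): ['34.111.111.112']}
print(to_delete)  # {}
```

Keeping an explicit list of privately managed record sets is one way to stop the replication from overwriting overrides like ws1; other designs (labels, a naming convention) would work too.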

Disclaimer: our solution may not work for the cert-manager use case specifically, but should suffice for most other use cases.

amtc