If you have a limited number of clients, one option is to require them to download the data over a separate VPN, with one VPN account per customer. Many VPN services provide usage monitoring that you can hook into. Ideally, you would also configure the VPN as a partial (split-tunnel) VPN, or otherwise restrict it so that it only carries traffic going to the CDN. That way a user who accidentally leaves the VPN on after downloading their data doesn't send unrelated traffic through it.
However, if you don't want to use a dedicated VPN company and would rather use a CDN from a cloud provider such as Azure, Google Cloud, or AWS, you would probably be better off routing requests through something like AWS Lambda or Azure Functions and tracking data usage there. Even better, do the monitoring with the CDN's own metrics, if those are available.
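As a rough sketch of the metrics-based approach, per-customer usage can be totalled from whatever access records the CDN or function layer exposes. The log format here (`<customer_id> <path> <bytes_sent>`) and both function names are assumptions made up for illustration; real CDN logs have their own fields and would need their own parser.

```python
from collections import defaultdict


def parse_log_line(line):
    """Parse one line of a hypothetical log format: '<customer_id> <path> <bytes_sent>'."""
    customer_id, _path, bytes_sent = line.split()
    return customer_id, int(bytes_sent)


def bytes_per_customer(records):
    """Sum bytes served per customer from (customer_id, bytes_sent) pairs."""
    totals = defaultdict(int)
    for customer_id, bytes_sent in records:
        totals[customer_id] += bytes_sent
    return dict(totals)


if __name__ == "__main__":
    sample = ["acme /data/a.bin 100", "acme /data/b.bin 50", "globex /data/a.bin 7"]
    print(bytes_per_customer(parse_log_line(line) for line in sample))
```

This only works if each request can be attributed to a customer, for example through a per-customer path, key, or token that shows up in the logs.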
A separate VPN might be better than proxying the CDN data through a server because you don't need to keep dedicated servers of your own around for it.
One more thing you could do is provide your own custom download client/SDK for this data. This would be the most complex option and the easiest for customers to exploit, but potentially the least expensive of all these options (short of tracking usage via the CDN itself). The client would first authenticate with a middleman server and issue a request for some or all of the data. The server would then create some sort of auth token for the CDN on behalf of the client, with as short an expiry time as possible, and send that token back along with the real URL of the CDN. The client would then query the CDN using that token, renewing it by talking to the middleman again if necessary.
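The token step in that flow could look roughly like the sketch below, which uses an HMAC-signed, short-lived token. This is a generic illustration, not any particular CDN's scheme: cloud CDNs have their own signed-URL or token mechanisms, and you would use those in practice. `SECRET_KEY`, `issue_token`, and `verify_token` are hypothetical names, and the sketch assumes customer IDs contain no `:` characters.

```python
import hashlib
import hmac
import time

# Hypothetical secret shared between the middleman and whatever validates tokens at the CDN.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(customer_id, path, expires):
    message = f"{customer_id}:{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(customer_id, path, ttl_seconds=60, now=None):
    """Middleman side: issue a token for one path that expires after ttl_seconds."""
    now = int(time.time()) if now is None else int(now)
    expires = now + ttl_seconds
    return f"{customer_id}:{expires}:{_sign(customer_id, path, expires)}"


def verify_token(token, path, now=None):
    """Validation side: return the customer ID if the token is valid, else None."""
    now = int(time.time()) if now is None else int(now)
    try:
        customer_id, expires_str, signature = token.split(":")
        expires = int(expires_str)
    except ValueError:
        return None  # malformed token
    expected = _sign(customer_id, path, expires)
    # Constant-time comparison to avoid leaking signature bytes through timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    if now >= expires:
        return None  # expired; the client has to go back to the middleman
    return customer_id
```

Because the middleman sees every token request, it can log which customer asked for which path. It cannot see how many bytes were actually transferred, which is part of why this option is the easiest to exploit.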
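Besides doing everything on the CDN directly, this redirect-client option is almost certainly the cheapest for large datasets: the middleman only hands out tokens and URLs, so the data itself never passes through your own servers and there are almost no cloud data egress costs on top of what the CDN charges.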