5

We’re building a platform of different APIs (each API is written by a different team, in a different timezone). We want to implement unified analytics across all of the APIs, with one data lake as the single source of truth. We are using Apigee as the API proxy for all of them. Now the question is: where would you implement the analytics layer?

Our first thought was to implement it in Apigee and log an event for every API call, but we ran into problems providing the full request and response payloads: we use streaming configurations for performance reasons, which prevents Apigee from accessing the payload.

The other approach we considered was to obligate all services to send their own events - but is it really possible? Every team has their own schedule and it seems that analytics would always get lower priority, and how can we really make sure that every API call is being logged as expected?

How do your companies deal with analytics? Where would you implement such a layer? We would be grateful to hear your ideas.

Tulains Córdova

2 Answers

1

Honestly, your best bet is to just log everything to stdout and stderr, then have the OS handle log rotation and the like. Each service can ship its logs to your log aggregator using standard tools like systemd / syslog. I like Logstash for that, but Splunk will work too.

If you then need to ship to a third party, your internal log aggregator can decide what gets shipped and how often, which lets you manage cost and sensitivity.

This is by far the simplest approach, and the hardest to screw up.
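As a rough illustration of the "log everything to stdout" idea, here is a minimal Python sketch that writes one JSON line per API call. The function name `log_api_event` and the field names are made up for this example, not taken from any particular tool; the only assumption is that your aggregator can parse line-delimited JSON.

```python
import json
import sys
import time


def log_api_event(service, method, path, status, duration_ms, stream=sys.stdout):
    """Write one API call as a single JSON line and return the line written.

    The field names are hypothetical; agree on a shared schema across teams.
    """
    event = {
        "ts": time.time(),          # Unix timestamp of the call
        "service": service,         # which API produced the event
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
    }
    line = json.dumps(event, sort_keys=True)
    stream.write(line + "\n")       # one event per line, easy to ship and parse
    stream.flush()
    return line
```

A service would call something like `log_api_event("orders", "GET", "/orders/1", 200, 12.5)` at the end of each request and leave rotation and shipping to the platform.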

Paul
0

Now the question is: where would you implement the analytics layer?

In a separate microservice. The code that sends the event stream to it should be developed cooperatively by all your teams.

The other approach we considered was to obligate all services to send their own events - but is it really possible?

Why not? If you end up with too much data, just use sampling (log only 1/5, 1/10 or 1/100 of all successful requests, plus all failed ones).
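A sampling decision like that could look roughly like the sketch below. The name `should_log`, the default rate, and the choice to treat any status of 400 or above as "failed" are assumptions for illustration only.

```python
import random


def should_log(status_code, sample_rate=0.1, rng=random.random):
    """Decide whether to log a request.

    Failed requests (assumed here to be status >= 400) are always logged;
    successful ones are kept with probability sample_rate.
    """
    if status_code >= 400:
        return True
    # rng() returns a float in [0.0, 1.0), so a rate of 1.0 keeps everything
    # and a rate of 0.0 keeps nothing.
    return rng() < sample_rate
```

If you go this way, it probably makes sense to record the sample rate in each event so that counts can be scaled back up in the data lake.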

Every team has their own schedule and it seems that analytics would always get lower priority

I think this is not a technical question, but a question about the quality of your project management.

and how can we really make sure that every API call is being logged as expected?

I guess the same way you make sure your services operate at all: with tests at all levels and external monitoring.
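One possible form of external monitoring is a periodic reconciliation: compare how many calls the proxy saw per API with how many events reached the data lake, and flag the APIs that fall short. The sketch below assumes you can obtain both sets of counts somehow; `find_logging_gaps`, its inputs, and the 99% threshold are hypothetical.

```python
def find_logging_gaps(gateway_counts, event_counts, min_ratio=0.99):
    """Return {api: logged_ratio} for APIs logging fewer events than expected.

    gateway_counts: calls per API as counted at the proxy (assumed available).
    event_counts:   events per API found in the data lake for the same period.
    """
    gaps = {}
    for api, expected in gateway_counts.items():
        if expected == 0:
            continue  # no traffic, nothing to compare
        ratio = event_counts.get(api, 0) / expected
        if ratio < min_ratio:
            gaps[api] = ratio
    return gaps
```

For example, `find_logging_gaps({"orders": 100, "users": 200}, {"orders": 100, "users": 150})` flags only `users`, with a ratio of 0.75. If services sample their events, the event counts would need to be scaled by the sample rate before comparing.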