
I'm working on a large TypeScript project in Node.js with Prisma, where we have dozens of domain entities, some with 30+ fields and complex relations.

We're using the repository pattern to abstract data access. The challenge we're facing is how to avoid overfetching data when different use cases require different slices of the same entity.

Problem

For example, consider a "Shipment" entity with 30+ fields and some relations:

In one use case, I only need 5 fields and a few related fields.

In another use case, I need the full entity including its relations.

To handle this, we’ve had to create dozens of specific repository methods for different permutations of these fetch requirements.

This feels unsustainable as the app grows, because we could end up creating hundreds or even thousands of these methods.

What we've considered

Creating new repository methods per use case (leads to method explosion).

Creating a new abstraction for granular field selection/omission and relation handling - I gave this a go, and it got very complex very fast. Nested selections and TypeScript's type system complicate things, and I'd practically be recreating the ORM's existing selection abstractions.

Always fetching full entities (leads to overfetching).

Dropping the repository abstraction and calling Prisma directly, but this makes refactoring hard because every small schema change could affect 1000+ direct usages.
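For concreteness, here is a minimal sketch of the direction the second option describes, simplified far below the nested-relation cases that made it blow up. All names (`Shipment`, `shipmentViews`, etc.) are hypothetical:

```typescript
// Named-projection sketch: each entity declares a small, closed set of
// views, and one repository method covers all of them, with a return
// type that narrows based on the view name. All names are hypothetical.

interface Shipment {
  id: string;
  trackingNumber: string;
  status: string;
  origin: string;
  destination: string;
  weightKg: number;
  // ...30+ more fields in the real entity
}

// The closed set of views, each a subset of Shipment's keys.
const shipmentViews = {
  summary: ["id", "trackingNumber", "status"],
  routing: ["id", "origin", "destination"],
} as const;

type ShipmentViewName = keyof typeof shipmentViews;
type ShipmentView<V extends ShipmentViewName> = Pick<
  Shipment,
  (typeof shipmentViews)[V][number]
>;

interface ShipmentRepository {
  findById<V extends ShipmentViewName>(
    id: string,
    view: V,
  ): Promise<ShipmentView<V> | undefined>;
}

// In-memory implementation, for illustration only; a real one would
// translate the view's key list into a Prisma `select` object.
class InMemoryShipmentRepository implements ShipmentRepository {
  constructor(private readonly rows: Shipment[]) {}

  async findById<V extends ShipmentViewName>(id: string, view: V) {
    const row = this.rows.find((r) => r.id === id);
    if (!row) return undefined;
    const keys: readonly (keyof Shipment)[] = shipmentViews[view];
    return Object.fromEntries(keys.map((k) => [k, row[k]])) as ShipmentView<V>;
  }
}
```

This keeps the repository at one method per lookup, and adding a use case means adding a view literal. What it elides is exactly the part that hurt: typing nested relation selections.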

Question:

How do large codebases (particularly those written in TS) manage this kind of granularity in data fetching?

Is overfetching just accepted?

Is it reasonable to abandon the repository pattern in such scenarios?

Any insights from teams that have scaled this would be really helpful.

biitse

3 Answers


Is overfetching just accepted?

Yes. If you have chosen your domain model well, then it's usually more efficient to send the whole model and cache it than to maintain multiple versions of the same model and refetch. You will use less bandwidth across multiple use cases because the same object is reused.

I'm not saying there's never a case for a 'lite' object, but from your description it sounds like you are returning view models from your repository rather than domain/business models.

If you have a huge complex model, like shipment might be, then consider whether that really is the best model for the concepts. Which relationship objects can you split off and retrieve separately, just holding the IDs in the root object rather than the whole struct?
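To sketch that splitting concretely (hypothetical names, simplified well below what a real Shipment would carry):

```typescript
// Splitting a fat aggregate: the root keeps only the IDs of related
// aggregates, which get their own repositories and are fetched
// separately when a use case actually needs them. Names are hypothetical.

interface FatShipment {
  id: string;
  customer: { id: string; name: string; address: string };
  items: { id: string; sku: string; quantity: number }[];
}

interface Shipment {
  id: string;
  customerId: string; // fetch the Customer aggregate only when needed
  itemIds: string[];
}

function toSlimShipment(fat: FatShipment): Shipment {
  return {
    id: fat.id,
    customerId: fat.customer.id,
    itemIds: fat.items.map((item) => item.id),
  };
}
```

The root model stays cheap to fetch everywhere, and the use cases that need the customer or the items go to those repositories explicitly.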

Passing your domain model around rather than view models will greatly simplify the application and overall solution. Don't optimise unless you need to, and then don't optimise.

Ewan

If you absolutely can't afford overfetching (e.g. because it's a proven performance issue), you just have to embrace repository growth and do your best to keep its methods in order by sticking to a consistent scheme/convention.

  • If the complexity of your data abstraction layer really stems from complex business logic (and not accidental complexity), there is not much you can do about it anyway.
  • Performance optimizations rarely come for free. They are usually more verbose, less elegant, or higher-maintenance than a "beautiful" solution. You just have to decide whether the tradeoffs are worth it.

It's also worth giving a hard look to your entities and consider splitting the fattest ones.
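One such convention, sketched with hypothetical names: encode the projection and the lookup criteria into the method name, so the repository grows predictably instead of as an ad-hoc pile.

```typescript
// find<Projection>By<Criteria>: a naming scheme that keeps an inevitably
// growing repository navigable. Two or three projections per entity
// often cover most use cases. All names are hypothetical.

interface ShipmentSummary {
  id: string;
  trackingNumber: string;
  status: string;
}

interface ShipmentDetail extends ShipmentSummary {
  origin: string;
  destination: string;
  weightKg: number;
}

interface ShipmentRepository {
  findSummaryById(id: string): Promise<ShipmentSummary | undefined>;
  findSummariesByCustomerId(customerId: string): Promise<ShipmentSummary[]>;
  findDetailById(id: string): Promise<ShipmentDetail | undefined>;
}
```

Each method maps to exactly one ORM `select`, so a schema change touches the repository implementation rather than every call site.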


Make your model fetch data lazily.

To achieve this, split the model into semantic parts and inject it with an extended repository that is able to fetch these parts. Note that this repository interface should not be accessed outside the model, to preserve invariants.
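A minimal sketch of that shape, with hypothetical names: the model depends on a narrow loader interface, fetches the heavy part on first access, and caches the resulting promise.

```typescript
interface ShipmentItem {
  sku: string;
  quantity: number;
}

// Narrow interface the model is injected with; nothing outside the
// model should call it directly, so invariants stay inside the model.
interface ShipmentPartsLoader {
  loadItems(shipmentId: string): Promise<ShipmentItem[]>;
}

class Shipment {
  private itemsPromise?: Promise<ShipmentItem[]>;

  constructor(
    public readonly id: string,
    public readonly status: string,
    private readonly loader: ShipmentPartsLoader,
  ) {}

  // The heavy relation is fetched lazily, once, on demand.
  items(): Promise<ShipmentItem[]> {
    this.itemsPromise ??= this.loader.loadItems(this.id);
    return this.itemsPromise;
  }
}
```

Use cases that only touch `id` and `status` never pay for the items query; the loader is typically a thin adapter over the same repository/ORM.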

Basilevs