I'm developing an application that reads data from different data sources. The data should be pre-processed and then pass through a chain of steps (filters?) where it is processed and augmented. Finally, the data should be written to a common database.

I'm thinking of using a Pipe and Filter style to implement this. While learning about it, I came across these invariants of the style [here]:

Independent entities
---- Do not share state
---- Have no knowledge of other filters

Transformation
---- Incremental
---- Not dependent on order in the chain

I'm having trouble understanding these. Why are they considered invariants?

What happens if filters share state?
What happens if they have knowledge of other filters?
What if they need to depend on other filters (in my case a pre-processing step is a must), and what does "incremental" mean?

As far as I know, violating the invariants can erode the code over time. So if I use the Pipe and Filter style in my app, what kinds of things would violate these invariants, or what could I do that would end up violating them?

Can anyone help me with this?

prime

1 Answer

Why are those considered invariants?

Because they're the intrinsic things that define this sort of architecture. If you allow shared state or knowledge of other filters, you're no longer doing Pipe and Filter, you're doing something else.

The entire point of this architecture is to create independent, composable, parallelizable streams of work. That lets you optimize the processing by reordering the filters. It lets you scale, since the processing of entities can be farmed out to many machines. It makes development easier, since each filter can be implemented (and tested, and deployed) in isolation. And since these rules are uniform for all entities and all filters, you can build high-quality tooling around them.
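To make that concrete, here is a minimal sketch in Python of filters that respect the invariants. Everything in it (the `normalize_name` and `add_total` filters, the `pipeline` helper, the record fields) is hypothetical and only for illustration. Each filter is a generator that handles one record at a time (incremental), keeps no shared state, and knows nothing about the other filters.

```python
from functools import reduce

def normalize_name(records):
    # Hypothetical filter: only touches the "name" field.
    for r in records:
        yield {**r, "name": r["name"].strip().title()}

def add_total(records):
    # Hypothetical filter: only reads "price" and "qty", adds "total".
    for r in records:
        yield {**r, "total": r["price"] * r["qty"]}

def pipeline(*filters):
    # Compose filters left to right; each one consumes the previous stream.
    def run(source):
        return reduce(lambda stream, f: f(stream), filters, source)
    return run

source = [
    {"name": "  widget ", "price": 2, "qty": 3},
    {"name": "gadget", "price": 5, "qty": 1},
]

# Because the filters are independent, either order gives the same result.
forward = list(pipeline(normalize_name, add_total)(source))
backward = list(pipeline(add_total, normalize_name)(source))
```

Since neither filter reads what the other writes, they can be reordered, tested on their own, or run on different machines without changing the outcome.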

As soon as you start making exceptions to these rules, you start losing the benefits of having them. If you share state, then your entities cannot be trivially parallelized. If your filters depend on order, you can't reorder them to optimize the pipeline, and the implicit dependency can cause subtle errors.
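As a counterexample, here is a small Python sketch of what goes wrong once two filters share state. The `shared` dictionary and the `count_records` and `tag_with_count` filters are made up for this illustration. The second filter silently depends on the first one having already run.

```python
from functools import reduce

# Shared mutable state: this is the invariant violation.
shared = {"count": 0}

def count_records(records):
    # Hypothetical filter that writes to the shared state.
    for r in records:
        shared["count"] += 1
        yield r

def tag_with_count(records):
    # Hypothetical filter that reads the shared state.
    for r in records:
        yield {**r, "position": shared["count"]}

def run(filters, source):
    shared["count"] = 0  # reset between runs so the comparison is fair
    stream = reduce(lambda s, f: f(s), filters, source)
    return [r["position"] for r in stream]

source = [{"id": "a"}, {"id": "b"}, {"id": "c"}]

first = run([count_records, tag_with_count], source)
second = run([tag_with_count, count_records], source)
# first and second differ: the result now depends on filter order.
```

Nothing in either filter's signature reveals the dependency, so reordering the chain or running records in parallel changes the output without any error being raised.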

Telastyn