I'm attempting to architect a system that will collect estimates of various quantities (10+) for each of many (300K+) objects, and make decisions from the historical record of those estimates. For example, we have a number of human and automated processes that may attempt to ascertain an object's, say, PowerLevel, and as they report results that may (or may not) supersede each other, we'd like to track the history of these reports, along with metadata (dates, tuning parameters, code versions, etc.) from the processes. Specifically, we might want to run queries like "grouping by entity id, for each distinct attribute, find the most recent estimate" or "find all entities whose attributes had any updates run by codebase 573ae4".
I've never actually used an entity-attribute-value (EAV) schema in production, but this seems like a perfect use case for one, with additional metadata columns for the sources. Specifically, I'd envision a table like this:
entity_id | attribute_id | string_value | numeric_value | datetime_value | discovery_time | discovery_source | discovery_tuning_parameters | discovery_code_hash
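To make the idea concrete, here's a minimal sketch of that table and the two example queries, using SQLite via Python purely for portability (the real system would presumably be Postgres); the sample rows, source names, and the `entity_attribute` table name are hypothetical:

```python
import sqlite3

# One row per reported estimate; provenance columns sit alongside the value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity_attribute (
    entity_id            INTEGER NOT NULL,
    attribute_id         TEXT    NOT NULL,
    numeric_value        REAL,
    string_value         TEXT,
    discovery_time       TEXT    NOT NULL,
    discovery_source     TEXT,
    discovery_code_hash  TEXT
);
""")
rows = [
    (1, 'PowerLevel', 9000.0, None, '2024-01-01', 'scanner', '573ae4'),
    (1, 'PowerLevel', 9500.0, None, '2024-02-01', 'human',   'aa10ff'),
    (2, 'PowerLevel',   12.0, None, '2024-01-15', 'scanner', '573ae4'),
]
conn.executemany("INSERT INTO entity_attribute VALUES (?,?,?,?,?,?,?)", rows)

# "For each entity and attribute, find the most recent estimate"
# (a correlated subquery; in Postgres, DISTINCT ON would also work).
latest = conn.execute("""
    SELECT entity_id, attribute_id, numeric_value, discovery_time
    FROM entity_attribute AS ea
    WHERE discovery_time = (
        SELECT MAX(discovery_time)
        FROM entity_attribute
        WHERE entity_id = ea.entity_id AND attribute_id = ea.attribute_id
    )
""").fetchall()

# "Find all entities with any update run by codebase 573ae4"
touched = conn.execute("""
    SELECT DISTINCT entity_id
    FROM entity_attribute
    WHERE discovery_code_hash = '573ae4'
""").fetchall()
```

Both queries stay simple because every estimate, current or historical, lives in the same table with its provenance attached.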
But I've heard a lot of criticism of this type of schema, such as this answer. I'm hard-pressed to find anyone using EAV on non-legacy systems, and that makes me very hesitant; I may be reinventing a very obsolete wheel...
On the other hand, the only alternative I can think of is going full NoSQL (bleh) or some unholy hybrid like:
id | power | power_current_discovery_date | power_current_discovery_source | power_history (an array of hstores?) | foobar | foobar_current_discovery_date | ...
because we'd need metadata on each attribute.
Any thoughts? Is this one of the few times that EAV is a good fit? Thanks for your help!