0
  • 300k+ videos
  • 10+ millions of markers, pointing to timeranges in videos
{
  "markerCategory": "something",
  "markerDesc": "something-more-specific",
  "frameIn": 120001,
  "frameOut": 140002
},
{
  "markerCategory": "something-else",
  "markerDesc": "something-else-more-specific",
  "frameIn": 130001,
  "frameOut": 135002
}

Any Suggestions which database / search-index would perform best,
when searching for something along these lines:

Videos having events of category A AND category B in overlapping timeranges,
sorted by amount of covered time

Videos are currently exported from some proprietary relational database and stored in an Apache SOLR instance for searching.

  • Is there a specific name for those kind of queries ("inverted range queries" or some thing like that...) ?
  • Any Suggestions which technology would perform best, for those types of queries?
    I was thinking maybe elasticsearch?
gherkins
  • 103
  • 2

1 Answers1

1

Any Suggestions which technology would perform best, for those types of queries?

Any mainstream relational database system should be just fine. It appears the schema of your data is well-defined.

Any Suggestions which database / search-index would perform best, when searching for something along these lines:

Videos having events of category A AND category B in overlapping timeranges, sorted by amount of covered time

Sure, an index on (markerCategory, frameIn, frameOut) would likely be what you'd want. In an RDBMS (relational database management system) you could also add a computed / generated column to the table that stored the difference between frameIn and frameOut for the amount of covered time, and add it to the index too.

Unless your definition of covered time is the total span between both overlapping categories after they're joined. In such a case, an Indexed View might be able to be leveraged to persist the results of the join and the total span of covered time between the joined results, then making that indexable as well.

I was thinking maybe elasticsearch?

Elasticsearch is meant for indexing and searching large instances of text, akin to a search engine like Google. It's good for the concept of Full-Text Search. I don't believe your use case makes sense for it.

J.D.
  • 40,776
  • 12
  • 62
  • 141