MongoDB 5.0 introduced specialized time series collections to deal with such data. As I already stored some sensor metadata (configuration, specification, ...) in MongoDB, I decided to use these collections to store sensor readings next to the sensor metadata.
Following the docs, I stored each sensor reading as a single document like this (pseudocode):
{
    "timestamp": timestamp,
    "value": value,
    "metadata": {
        "sensorId": sensor_uid,
        "unit": sensor_unit,
        "type": sensor_type,
        "fromFile": reading_imported_from_file,
    },
}
Around 50 different sensors are read at the same time, which results in 50 documents with the same timestamp but different values and metadata.
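To make the one-document-per-reading layout concrete, here is a minimal sketch that builds the documents for a single read cycle. The helper name `build_reading_documents`, the shape of the `readings` tuples, and the sample values are made up for illustration; the pymongo calls in the trailing comments are only indicative, and the collection name `readings` is an assumption.

```python
from datetime import datetime, timezone


def build_reading_documents(timestamp, readings, from_file=False):
    """Return one document per sensor reading, all sharing `timestamp`.

    `readings` is a hypothetical iterable of
    (sensor_uid, sensor_unit, sensor_type, value) tuples.
    """
    return [
        {
            "timestamp": timestamp,
            "value": value,
            "metadata": {
                "sensorId": sensor_uid,
                "unit": sensor_unit,
                "type": sensor_type,
                "fromFile": from_file,
            },
        }
        for sensor_uid, sensor_unit, sensor_type, value in readings
    ]


# Made-up readings for 50 sensors sampled in the same cycle.
now = datetime(2022, 1, 1, tzinfo=timezone.utc)
readings = [(f"sensor_{i}", "degC", "temperature", 20.0 + i) for i in range(1, 51)]
docs = build_reading_documents(now, readings)

# With pymongo, the documents could then be stored roughly like this
# (assumed collection name "readings"):
# db.create_collection(
#     "readings",
#     timeseries={"timeField": "timestamp", "metaField": "metadata"},
# )
# db["readings"].insert_many(docs)
```

Every document repeats the full metadata block, which is what lets unit or type vary from one reading to the next.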
I am currently migrating our time series data storage from MongoDB to InfluxDB, as it seems to provide a sleeker API and already includes some basic data visualization. As described above, in MongoDB I used a single document per sensor reading, which might be considered bad practice in InfluxDB:
A measurement per sensor seems unnecessary and can add a significant amount of system overhead depending on the number of sensors you have. I’d suggest storing all sensor data in a single measurement with multiple fields, [...]
Based on this, I came up with the following data structure to pass to InfluxDB (Python dictionary pseudocode for influxdb-client):
{
    "time": 1,
    "measurement": measurement_name,
    "tags": {
        "location": location,
        "from_file": reading_imported_from_file,
    },
    "fields": {
        "sensor_1": reading_from_sensor_1,
        "sensor_2": reading_from_sensor_2,
        "sensor_3": reading_from_sensor_3,
    },
}
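As a runnable version of the pseudocode above, the following sketch assembles such a "wide" point from a mapping of sensor names to values. The helper `build_wide_point`, the measurement name, the location, and the readings are all hypothetical. Since InfluxDB tag values are strings, the boolean flag is converted explicitly here.

```python
def build_wide_point(measurement, time, location, from_file, readings):
    """Build one point holding all sensor values of a read cycle as fields.

    `readings` is a hypothetical mapping of sensor name -> numeric value.
    """
    return {
        "measurement": measurement,
        "time": time,
        "tags": {
            "location": location,
            # Tag values are stored as strings, so convert the flag up front.
            "from_file": str(from_file).lower(),
        },
        "fields": dict(readings),
    }


point = build_wide_point(
    measurement="experiment_1",  # made-up measurement name
    time=1,
    location="lab_a",            # made-up location
    from_file=False,
    readings={"sensor_1": 20.5, "sensor_2": 21.7, "sensor_3": 19.9},
)

# With influxdb-client, the dictionary could be written roughly like this
# (bucket name is an assumption):
# write_api.write(bucket="my-bucket", record=point)
```

Note that this structure has no slot for per-sensor attributes such as unit or type: the tags apply to the whole point, not to an individual field.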
However, I have not figured out how to store the other metadata such as sensorId, unit, or type. On the one hand, I could easily solve this by ignoring the aforementioned suggestion and using a single measurement per sensor. On the other hand, from a relational perspective this metadata should be tied to the sensorId and therefore be accessible from a sensor configuration/specification database using the sensorId as a key. Unfortunately, these values can change during a single measurement or experiment because device configurations change on-site, and those changes are not reflected in the configuration database.
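For comparison, the first option (one measurement per sensor) could look roughly like the sketch below, with unit and type carried as tags on every point so that a configuration change mid-experiment is recorded with the data. This is only an illustration of that option under made-up names and values, not a recommendation; the helper `build_per_sensor_points` and the sensor dictionaries are hypothetical.

```python
def build_per_sensor_points(time, location, from_file, sensors):
    """Build one point per sensor, using the sensorId as measurement name.

    `sensors` is a hypothetical list of dicts with the keys
    "sensorId", "unit", "type" and "value".
    """
    return [
        {
            "measurement": sensor["sensorId"],
            "time": time,
            "tags": {
                "location": location,
                "from_file": str(from_file).lower(),
                "unit": sensor["unit"],
                "type": sensor["type"],
            },
            "fields": {"value": sensor["value"]},
        }
        for sensor in sensors
    ]


# The same sensor is reconfigured from millivolts to volts between two reads.
first = build_per_sensor_points(
    1, "lab_a", False,
    [{"sensorId": "sensor_1", "unit": "mV", "type": "voltage", "value": 512.0}],
)
second = build_per_sensor_points(
    2, "lab_a", False,
    [{"sensorId": "sensor_1", "unit": "V", "type": "voltage", "value": 0.512}],
)
points = first + second
```

The cost is the one discussed in the quoted suggestion: 50 sensors produce 50 measurements and 50 points per read cycle instead of one.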
How can I solve this issue? Am I missing something, or do I simply have to accept this design/performance vs. ease-of-use tradeoff?