What I would do in this case is to have a simple structure with the potential for rapidly querying the data.
If I've understood you correctly, you have 1500 (no. of sensors) x 24 (hours per day) x 100 (retention time) = 360M records. This is not very small, but not exactly large either by today's standards.
What I would recommend is something like (db-fiddle here - DDL and DML also at the end of the post):
CREATE TABLE sensor_reading
(
sensor_id SMALLINT UNSIGNED NOT NULL,
reading_datetime DATETIME NOT NULL,
reading_type VARCHAR (50) NOT NULL, -- pressure, temp, altitude
reading_value INTEGER NOT NULL, -- could be FLOAT? not sure of exact values - make it as big as you think you need.
reading_unit VARCHAR (25) NOT NULL, -- if using S.I. possibly redundant.
reading_date DATE GENERATED ALWAYS AS (DATE(DATE_ADD(reading_datetime, INTERVAL 30 MINUTE))) VIRTUAL,
reading_hour TINYINT UNSIGNED GENERATED ALWAYS AS (EXTRACT(HOUR FROM (DATE_ADD(reading_datetime, INTERVAL 30 MINUTE)))) VIRTUAL,
UNIQUE KEY sensor_hour_ix (sensor_id, reading_date, reading_hour),
INDEX reading_value_ix (reading_hour)
);
This will only work on MySQL 5.7. On earlier versions, you can use TRIGGERs to fill in the reading_date and reading_hour fields - this time it will have to be actual stored data, rather than VIRTUAL (i.e. GENERATED) fields.
A good reason to have reading_hour is to prevent problems due to small granularity variations in your sensors' timings (see DDL - INSERTs). If you're going to load them all at once in the same transaction (per hour) and/or specify an exact time each and every time, then this shouldn't be an issue! You can look at DATETIME vs. TIMESTAMP in MySQL here.
The reason I have made the choice to "break out" the hour and date into separate fields is so that queries are v. easy and groupings can be done without much calculation on the client end!
This functionality is provided free and gratis by the server and you should take advantage of it - the calculation will always be correct - and you'll never type in 36000 instead of 3600 or similar error when extracting the hour and/or date! Even using triggers, with storage being relatively cheap nowadays, this is not a big cost to bear, and none at all for VIRTUAL columns.
You may find other indexes helpful depending on your queries - but you can probably get a fair bit done with the outline above!
@RickJames ' suggestion regarding the PRIMARY KEY is correct - using an AUTO_INCREMENT field here was the wrong way to go. I'm less sure about his partitioning scheme - personally, I would store "old" (i.e. > 100 days) in the ARCHIVE engine and I would copy it off using an event fired every day at, say, 03:00 or other time when there's not much activity on the server (+ remove it from the live table). It has been my experience that you never know when you might require old data and old trends can be useful - this way they are stored but won't take up a huge amount of space.
=================== DDL and DML ==================
CREATE TABLE sensor_reading
(
sensor_id SMALLINT UNSIGNED NOT NULL,
reading_datetime DATETIME NOT NULL,
reading_type VARCHAR (50) NOT NULL, -- pressure, temp, altitude
reading_value INTEGER NOT NULL, -- could be FLOAT? not sure of exact values - make it as big as you think you need.
reading_unit VARCHAR (25) NOT NULL, -- if using S.I. possibly redundant.
reading_date DATE GENERATED ALWAYS AS (DATE(DATE_ADD(reading_datetime, INTERVAL 30 MINUTE))) VIRTUAL,
reading_hour TINYINT UNSIGNED GENERATED ALWAYS AS (EXTRACT(HOUR FROM (DATE_ADD(reading_datetime, INTERVAL 30 MINUTE)))) VIRTUAL,
UNIQUE KEY sensor_hour_ix (sensor_id, reading_date, reading_hour),
INDEX reading_value_ix (reading_hour)
);
-- I have chosen to use +30 mins as the cutoff for the time,
-- you many find another interval to be better suited to your
-- requirements. It is arbitrary, but I'm assuming that a sensor
-- that measures by the hour might be a minute or two off, but
-- not > ~ 30 mins.
INSERT INTO sensor_reading
(sensor_id, reading_datetime, reading_type, reading_value, reading_unit) VALUES
(1, '2017-11-03 15:00:01.5', 'Temp', 12, 'Celsius'), -- should be hour 15 on the 11th.
(1, '2017-11-03 16:00:01.5', 'Temp', 14, 'Celsius'), -- should be hour 16 on the 3rd.
(2, '2017-11-04 15:59:01.5', 'Temp', 13, 'Celsius'), -- should be hour 16 on the 4th.
--
-- Now, datetimes around midnight
--
(1, '2017-11-03 23:59:01.5', 'Temp', 12, 'Celsius'), -- should be hour 0 on the 4th.
(5, '2017-11-03 00:00:01.5', 'Temp', 14, 'Celsius'), -- should be hour 0 on the 3rd.
(7, '2017-11-03 23:59:01.5', 'Temp', 12, 'Celsius'), -- should be hour 0 on the 4th
(2, '2017-11-04 00:59:01.5', 'Temp', 13, 'Celsius'); -- should be hour 1 on the 4th.
============ Results =========================
sensor_id reading_datetime reading_type reading_value reading_unit reading_date reading_hour
1 2017-11-03 15:00:02 Temp 12 Celsius 2017-11-03 15
1 2017-11-03 16:00:02 Temp 14 Celsius 2017-11-03 16
2 2017-11-04 15:59:02 Temp 13 Celsius 2017-11-04 16
1 2017-11-03 23:59:02 Temp 12 Celsius 2017-11-04 0
5 2017-11-03 00:00:02 Temp 14 Celsius 2017-11-03 0
7 2017-11-03 23:59:02 Temp 12 Celsius 2017-11-04 0
2 2017-11-04 00:59:02 Temp 13 Celsius 2017-11-04 1