General remarks:
Where is your PRIMARY KEY?
It looks to me as if this data is immutable? By that I mean a given call is a matter of historical record rather than being subject to multiple updates?
From here, we get (caveat - I don't know for sure that this is what we're dealing with):
A call option is a contract between a buyer and a seller to purchase a
certain stock at a certain price up until a defined expiration date.
The buyer of a call has the right, not the obligation, to exercise the
call and purchase the stocks. On the other hand, the seller of the
call has the obligation and not the right to deliver the stock if
assigned by the buyer.
You have no stock identifier - this puzzles me. Surely you should have a field like identifier CHAR(4) - or whatever the length of your system's identifiers is? If it's variable, use VARCHAR(n) - storage = n + 1 bytes.
You can store DATE and TIME together as a TIMESTAMP, which can hold values between '1970-01-01 00:00:01' (UTC) and '2038-01-19 03:14:07' (UTC). If you don't require more than 1s precision, this will save space - 4 bytes vs. 6.
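If you're migrating from separate quote_date and quote_time columns (which your existing indexes suggest you have), MySQL's two-argument TIMESTAMP() function combines them - a sketch:

```sql
-- combine a DATE and a TIME into a single TIMESTAMP value
SELECT TIMESTAMP(quote_date, quote_time) AS quote_ts
FROM calls;
```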
Also, re your PK: from here, you don't require a surrogate PRIMARY KEY (normally an INTEGER) if your data isn't changing much - which, as I surmised above, should be the case for what is, essentially, a logging table.
1st possibility:
So, I would design it like this (all of the code below is available on the fiddle here):
CREATE TABLE calls
(
  stock_id CHAR(4) NOT NULL,
  quote_ts TIMESTAMP NOT NULL,
  expiry_date DATE NOT NULL,
  delta DECIMAL(4,3) NOT NULL,
  mid DECIMAL(8,4) NOT NULL,
  PRIMARY KEY (stock_id, quote_ts, expiry_date, delta, mid)
);
You can have an index on delta as follows:
CREATE INDEX delta_ix ON calls (delta);
Notice that if you do this, then you get an INDEX SCAN when running your query (using EXPLAIN ANALYZE):
EXPLAIN
-> Filter: ((calls.delta >= 4.000) and (calls.delta <= 6.000)) (cost=0.55 rows=3) (actual time=0.0802..0.0847 rows=3 loops=1)
-> Covering index scan on calls using delta_ix (cost=0.55 rows=3) (actual time=0.0758..0.0794 rows=3 loops=1)
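The query itself isn't reproduced above; judging by the Filter predicate in the plan, it would be along these lines (the literal values are taken from the plan):

```sql
EXPLAIN ANALYZE
SELECT delta
FROM calls
WHERE delta >= 4.000 AND delta <= 6.000;
```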
2nd possibility:
Now, if expiration is determined by the number of days from the quote_ts, you could store that as a SMALLINT (or even TINYINT UNSIGNED if it's never > 255 days) and use the DATE_ADD() function to define expiry_date as a generated column (VIRTUAL) - no storage, the calculation is done on the fly. See the fiddle - test with your own system.
CREATE TABLE calls_bis
(
  stock_id CHAR(4) NOT NULL,
  quote_ts TIMESTAMP NOT NULL,
  expiry_days SMALLINT NOT NULL,
  delta DECIMAL(4,3) NOT NULL,
  mid DECIMAL(8,4) NOT NULL,
  expiry_date DATE AS (DATE_ADD(DATE(quote_ts), INTERVAL expiry_days DAY)) VIRTUAL,
  PRIMARY KEY (stock_id, quote_ts, expiry_days, delta, mid)
);
Then create the same index on delta and rerun the query - an INDEX SCAN again. Try a few different queries and check the sizes of the tables (with and without INDEXes).
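For the second table that means (the index name is my own choice), plus a size check via information_schema:

```sql
CREATE INDEX delta_bis_ix ON calls_bis (delta);

-- data vs. index footprint, in bytes
SELECT table_name, data_length, index_length
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name IN ('calls', 'calls_bis');
```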
There's a certain amount of overhead attached to each record - records are stored in pages (again, overhead) as are INDEXes - more overhead. So, to get a handle on your real data usage/record, you'd have to load 1M records and perform the queries at the bottom of the fiddle.
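One way to generate a test load of that size in MySQL 8.0 is a recursive CTE - a sketch only, with made-up values:

```sql
-- the default recursion limit is 1000, so raise it first
SET SESSION cte_max_recursion_depth = 1000000;

INSERT INTO calls (stock_id, quote_ts, expiry_date, delta, mid)
WITH RECURSIVE seq (n) AS
(
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM seq WHERE n < 1000000
)
SELECT 'ABCD',                                        -- dummy stock id
       TIMESTAMP('2025-01-01') + INTERVAL n MINUTE,   -- one quote per minute
       DATE(TIMESTAMP('2025-01-01') + INTERVAL n MINUTE) + INTERVAL 30 DAY,
       ROUND(RAND(), 3),                              -- dummy delta
       ROUND(RAND() * 100, 4)                         -- dummy mid
FROM seq;
```

The distinct quote_ts per row keeps the composite PRIMARY KEY unique despite the random values.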
A few points to note:
I'm not sure why you're retrieving 400k records from a table - are you aggregating over something?
I don't know how the delta and mid fields are produced/calculated/derived. If they can be GENERATED, there is potential to reduce the size of your table - and hence the time taken to scan it?
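For example, if mid turned out to be the bid/ask midpoint - pure guesswork on my part - it could be generated rather than stored:

```sql
-- sketch only: assumes mid = (bid + ask) / 2, which the question doesn't confirm
CREATE TABLE calls_gen
(
  stock_id CHAR(4)      NOT NULL,
  quote_ts TIMESTAMP    NOT NULL,
  bid      DECIMAL(8,4) NOT NULL,
  ask      DECIMAL(8,4) NOT NULL,
  mid      DECIMAL(8,4) AS ((bid + ask) / 2) VIRTUAL
);
```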
It's best to have NOT NULL on as many fields as possible - the more information you provide to the optimizer, the better the chance that it will do a good job. Maybe this isn't possible for the mid field, which can't be entered until the call is half-way through its term? (I don't know much about trading.)
Your two indexes (CREATE INDEX covering_index ON CALLS (quote_date, quote_time, expiration, delta); and CREATE INDEX covering_index ON CALLS (quote_date);) can be replaced by the PRIMARY KEY - the first field of a PK doesn't need an extra INDEX, as it's the leading field of the PK's own index anyway.
Instead of the BETWEEN operator, it's better to use the mathematical ones (<, <=, >=, >) - these are unambiguous - is BETWEEN inclusive or exclusive? Most servers will just transform this operator into the others anyway - see here for the confusion it causes!
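These two are equivalent in MySQL (BETWEEN is inclusive at both ends), but the second leaves no room for doubt:

```sql
SELECT * FROM calls WHERE delta BETWEEN 4.000 AND 6.000;

SELECT * FROM calls WHERE delta >= 4.000 AND delta <= 6.000;
```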
With the index delta_ix in place, it is used in any query which requests records based on values of delta - see the EXPLAIN ANALYZE in the fiddle.
The order of the fields in the PK should depend on your most frequent query or queries. Test for this.
If you'd like to elaborate on how the fields are produced, then we might have more opportunities to reduce record size and speed up your queries. P.S. welcome to dba.se!