I'm testing on a slightly different version than you (SQL Server 2019 Developer Edition), but I've been told that the storage engine stuff is mostly the same. Setting aside trivial reasons like blocking, I'd say there are three main reasons an ALTER TABLE statement might run longer than you'd prefer:
- The underlying table is large and there isn't a covering index available.
- The ALTER TABLE statement requires most or all pages to be modified.
- The ALTER TABLE statement has a lot of work to do and must run at MAXDOP 1.
Your ALTER TABLE statement hits all three. Some ALTER TABLE operations require all pages, or even all rows, to be modified. Changing a column definition from NULL to NOT NULL is one of those. I couldn't find a good way to show this, so I'll resort to undocumented commands that may not work on your version. First, create a small table:
DROP TABLE IF EXISTS dbo.ALTER_ME_FAST;
CREATE TABLE dbo.ALTER_ME_FAST (
    COL1 BIGINT NULL,
    COL2 BIGINT NULL,
    COL3 BIGINT NULL,
    COL4 BIGINT NULL,
    COL5 BIGINT NULL,
    COL6 BIGINT NULL,
    COL7 BIGINT NULL,
    COL8 BIGINT NULL,
    COL9 BIGINT NULL
);
INSERT INTO dbo.ALTER_ME_FAST WITH (TABLOCK)
SELECT TOP (687) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 0, 0, 0, 0, 0, 0, 0, 0
FROM master..spt_values;
I can see all of the allocated pages with the undocumented DMF sys.dm_db_database_page_allocations:
SELECT allocated_page_file_id, allocated_page_page_id, page_free_space_percent
FROM sys.dm_db_database_page_allocations
    (DB_ID(), OBJECT_ID(N'dbo.ALTER_ME_FAST'), 0, NULL, 'DETAILED')
WHERE page_type = 1; -- data pages only
The page ids stay the same after running this command:
ALTER TABLE dbo.ALTER_ME_FAST ALTER COLUMN COL5 BIGINT NOT NULL;
A before and after diff using DBCC PAGE shows that the old column was dropped and replaced with a new column for all rows:
[screenshot: DBCC PAGE before/after diff]
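If you want to reproduce the diff yourself, you can dump an individual page before and after the ALTER with DBCC PAGE. A minimal sketch, with placeholder arguments: substitute your own database name along with a file id and page id returned by the sys.dm_db_database_page_allocations query above.
DBCC TRACEON (3604); -- route DBCC PAGE output to the client instead of the error log
-- arguments: database name, file id, page id, print option (3 = per-row detail)
DBCC PAGE ('YourDatabase', 1, 8, 3);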
On my machine, running the same ALTER TABLE against a 25 million row version of the table, dbo.ALTER_ME_SLOW, takes 18 seconds. Here are the session wait stats for me:
[screenshot: session wait stats]
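To capture the same numbers on your system, one option (assuming SQL Server 2016 or later) is to query the session-level wait stats DMV from the same session right after the ALTER TABLE finishes:
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_exec_session_wait_stats
WHERE session_id = @@SPID -- only the current session's waits
ORDER BY wait_time_ms DESC;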
In my case, the resources available to the query aren't getting throttled. It simply has a lot of CPU work to do. SQL Server isn't giving us a lot of clues, so to understand where all of that CPU time is going we need to look deeper. One way to do that is through ETW tracing. Using PerfView during the ALTER TABLE operation shows that the call stacks that use the most CPU cycles all relate to changing data on the pages of the heap:
[screenshot: PerfView CPU call stacks]
This might seem unhelpful, but it isn't. On my server I don't have the option of reducing throttling to make the ALTER TABLE faster. One way to look at it: if I have to modify all of the rows anyway, why not do the work in parallel instead of serially? The following code ends up with the same result, but it can run in parallel:
DROP TABLE IF EXISTS dbo.ALTER_ME_NEW;
CREATE TABLE dbo.ALTER_ME_NEW (
    COL1 BIGINT NULL,
    COL2 BIGINT NULL,
    COL3 BIGINT NULL,
    COL4 BIGINT NULL,
    COL5 BIGINT NOT NULL,
    COL6 BIGINT NULL,
    COL7 BIGINT NULL,
    COL8 BIGINT NULL,
    COL9 BIGINT NULL
);
INSERT INTO dbo.ALTER_ME_NEW WITH (TABLOCK) -- TABLOCK enables a parallel insert (SQL Server 2016+, compatibility level 130+)
SELECT *
FROM dbo.ALTER_ME_SLOW
OPTION (MAXDOP 4);
On my machine the new code takes only six seconds. In practice, coming out ahead with such a technique requires sufficient server resources, a maintenance window, and a table structure that lends itself well to it. You mentioned in your question that you need to change the primary key. If your primary key is your clustered index then you'll need to rewrite the table's data again anyway, so making a copy in a heap might not be that bad. You can create the primary key and clustered index that you wanted, in parallel, after the insert has finished, as sketched below.
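A sketch of that last step, with placeholder names: PK_ALTER_ME_NEW is made up, and COL5 (the NOT NULL column) stands in for your real key column, which must be declared NOT NULL and contain unique values.
ALTER TABLE dbo.ALTER_ME_NEW
ADD CONSTRAINT PK_ALTER_ME_NEW
PRIMARY KEY CLUSTERED (COL5)
WITH (MAXDOP = 4); -- allow the index build to run in parallel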
I can't say for sure if the parallel insert approach is appropriate for your scenario. If you are already getting throttled with just an ALTER TABLE statement then it probably won't be. But it could be a good option to consider in the future if you need similar operations to be as fast as possible.