3

I currently use the following statement, for 10,000 rows it takes about 150 seconds. I tried removing the index on the target table, but this didn't help. Running the loop without the INSERT INTO takes less then 50ms. I need it to update about 300 million rows and I can't really wait 52 days (!) for it to complete.

Bottom line of the following update query is that I need to go over each row, perform calculations on a VARBINARY and extract proper values from it (we need to get rid of the packed VARBINARY fields), and store that in a new table.

FETCH NEXT FROM LocCities INTO @LocCity 
WHILE (@@FETCH_STATUS = 0)
BEGIN
    -- several sets, removed calculations for clarity
    SET @LocationId = Calculation1()
    SET @CityId = Calculation2()

    IF(@LocCity <> 0)
    BEGIN
        -- left out an inner loop here on the VARBINARY based on its length
        INSERT INTO LocationCities (LocationId, CityId)
        VALUES (@LocationId, @CityId)
    END
    FETCH NEXT FROM RespCursor INTO @TuningRow
END

I understand that I can use the WITH keyword with table hints, but I am not sure what to use. I expect the final update query to run in several hours, and hope there's a way to do that. I really can't wait almost two months ;).

Isn't there something similar like BULKINSERT that I can use?

Abel
  • 144
  • 1
  • 9

1 Answers1

5

I really don't think table hints or BULKINSERT are going to help you here - your approach is still to process each varbinary value one at a time, and this will be your downfall regardless - especially when you discard the idea of set-based queries because you "don't think it's possible."

Here's a set-based approach with no awful loops or cursors. This assumes that the pattern is always the same (LocationID is the first byte, and CityID is the next two).

DECLARE @x TABLE(x VARBINARY(32));

INSERT @x VALUES(0x010734),(0x030735040736),(0x030742050743060712);

;WITH n(n) AS 
(
  SELECT TOP (300) (number*3)+1 
  FROM [master].dbo.spt_values -- your own Numbers table is better
  WHERE [type] = N'P' ORDER BY number
)
-- INSERT dbo.LocationCities(LocationId, CityId)
SELECT 
  x.x,      -- comment this out before insert 
  LocationID = CONVERT(INT, SUBSTRING(x.x, n, 1)),
  CityID     = CONVERT(INT, SUBSTRING(x.x, n+1, 2))
FROM @x AS x INNER JOIN n ON LEN(x) > n.n;

Results:

x                        LocationID    CityID
---------------------    ----------    ------
0x010734                 1             1844
0x030735040736           3             1845
0x030735040736           4             1846
0x030742050743060712     3             1858
0x030742050743060712     5             1859
0x030742050743060712     6             1810

Some articles that will help you understand numbers tables and why generating sets in SQL Server is far superior to even the most efficient loop you can derive.

Aaron Bertrand
  • 181,950
  • 28
  • 405
  • 624