Copy large table
Copy table in SQL Server? It’s not big deal, one would say. There are plenty of options You can chose from to copy from one table to another. But what about copy of the very large table from one database to another?
Arithmetic overflow error converting IDENTITY to data type int
Does this error message sound familiar to You? I bet the answer is yes, since you are reading this post. One time, we found that error in the sql agent jobs that we use to copy from main analytics database to the archive. It is being used occasionally to get more insight from historical data.
What is the problem with Arithmetic overflow…?
At the time the above error was caught in the logs, in addition we started to experience storage capacity problems, and unfortunately we were limited to existing storage only. First things first, let’s reclaim some storage. Actions to mitigate it? Backup the database! Since it is just an archive, we were told, it can stay offline for some time… Backup made to the file server with much bigger capacity. Problem solved! hooray!! We had plenty of free space, but the archive was offline.
Another problem was, we did not have the archive anymore. But we needed to store historical data from analytics database somewhere, otherwise sooner or later it would have become not usable anymore due to the amount of data requiring processing. Impact on day to day work? Not really, let’s create the new archive, and…. do not make the same mistake anymore…. this time ID column was of type BIGINT instead of INT as it used to be in old archive. Hooray! The problem of “Arithmetic overflow error converting IDENTITY to data type int” just got solved.
One could say that it was easy-peasy. But there is always next day, then I heard “we need old archive, can you provide it us, please?”
Combine old archive with new archive
New archive was growing at pretty fast paced. It was already 50 GB in size, while the old archive, right before the backup was performed, was over 700 GB. The good news was that the archive consists of only single table. However, it reached the maximum of INT data type, which is 2140437574 of rows to transfer from old archive to new archive. One more good thing about the situation was that both databases were set to run in SIMPLE recovery mode. If they were set to FULL recovery mode the scenario would be more complicated, but not this time. Lastly, no constraints and no indexes other then Primary Key only existed in the table.
How to tackle the copy large table operation?
Firstly, searched the web to see how others perform similar activities. Found couple articles about options to migrate:
Very first thought was to use the Date column in the filter condition in order to retrieve pieces of data. I could not think about any other approach than to run the migration in batches. Attempt to check the oldest (minimum) date in the source table took 1,5 h to produce the output! It’s crazy, I thought. It will take forever to migrate the data that way. Basically, there was no other choice, to use Primary Key on ID column and its index to filter the data out. Initially, we tried to migrate data in small samples of 10 rows. It was going amazingly fast. Increased the batch size to 100 rows, then went up to 100 K, which was looking pretty good, and most importantly still was going pretty fast.
Data migration from large table
During samples migration around 10 M of rows were transferred successfully. Next step, was to migrate up to 100 M of rows. It took “only” 30 minutes to complete data copy using code below triggered from SQL Management Studio. I was really impressed, especially after seeing how long it took to run the query in order to get the results of the oldest (minimum) date in the table.
DECLARE @cnt INT = 10000000;
WHILE @cnt < 100000000
BEGIN
INSERT INTO Archive..Summary WITH (TABLOCK)
(Date, ..., ...)
SELECT Date, ..., ...
FROM Archive_old..Summary WITH (NOLOCK)
WHERE ID BETWEEN @cnt-100000 AND @cnt
DELETE FROM Archive_old..Summary
WHERE ID BETWEEN @cnt-100000 AND @cnt
SET @cnt = @cnt + 100000;
END;
Checkpoint before going further
While running data samples transfer as well as the main data migration we were able to verify the progress by running the queries as below. Both the minimum ID value from old archive and the maximum ID value from new archive were growing. That checkpoint confirmed, all was going in the right direction.
SELECT MIN (ID) FROM Archive_old..Summary
SELECT MAX (ID) from Archive..Summary
Main data migration in progress
Finally, all was looking good, especially the time it took to migrate initial piece of data was promising. The main migration was run from SQL Management Studio using code as below. It took… 16 h to complete the data transfer, approx. 700 GB in size.
DECLARE @cnt INT = 99900000;
WHILE @cnt < 2140337574
BEGIN
INSERT INTO Archive..Summary WITH (TABLOCK)
(Date, ..., ...)
SELECT Date, ..., ...
FROM Archive_old..Summary WITH (NOLOCK)
WHERE ID BETWEEN @cnt-100000 AND @cnt
DELETE FROM Archive_old..Summary
WHERE ID BETWEEN @cnt-100000 AND @cnt
SET @cnt = @cnt + 100000;
END;
Cleanup
Last but not least, to make sure all data was transferred. After main data migration there were still a few records left in the old archive. They were moved as very last step of the whole migration. Finally, old archive database was deleted. Mission copy large table succeeded.