Thursday, August 28, 2008

MS SQL - Selective BULK INSERT without Format Files

Today I found myself with a simple task that had a small twist. Load a series of text files into an MS SQL database. Normally a quick and easy process using BULK INSERT.

Now BULK INSERT works smoothly when both the target table and input file contain the same number of columns. My problem was I needed to load the data into a table containing an extra identity column. Which meant the files contained fewer columns than the target table. Unfortunately, bulk insert gets confused when you try and selectively insert columns. Using format files is one way to handle this. However, I did not want to go that route because the input file delimiters varied. So I wanted to avoid creating a separate format file for every combination of delimiters.

I kept thinking there had to be some other way to do this. While re-reading the documentation, it dawned on me that the documentation states you can bulk insert into a view, not just tables. Now _that_ had possibilities. I realized I could create a view of my table, that contained only the columns I needed. Then bulk insert into the view, instead of the table. I tried it and it worked perfectly.

Since my input files are just gibberish codes, take this example. Let us say you have a table containing six (6) columns and an input file containing two (2) columns. I deliberately used a significantly different number of columns for demonstration purposes.

First create a view containing only the two columns in the input file: FirstName, LastName.


--- TARGET TABLE
CREATE TABLE ContactStaging
(
RecordID int identity(1,1),
FirstName varchar(100) NOT NULL,
MiddleInit char(1) NULL,
LastName varchar(100) NOT NULL,
Phone varchar(20) NULL,
Fax varchar(20) NULL
)


--- VIEW
CREATE VIEW ContactStagingView
AS
SELECT FirstName, LastName
FROM ContactStaging


Then run the bulk insert command on the view.

BULK INSERT ContactStagingView
FROM 'c:\input.txt'
WITH
(
FIELDTERMINATOR = '|'
, ROWTERMINATOR = '*'
, KEEPNULLS
)


--- INPUT.TXT
Kira|McConnnel*
Alan|Glover*
Thomas|Young*
Stuart|Long*

As you can see from the results, only the selected columns in the view were affected.


Now, this was special case. I would normally use a format file but decided the view approach was more suitable in this particular situation. It is also nice to know there is an alternative to format files, when you need one.

3 comments:

Mahesh,  March 11, 2010 at 1:39 PM  

The other option is to have the identity field at the end of the table. You will face this problem only if the identity field is at the beggining of the table. - Mahesh

cfSearching March 18, 2010 at 2:34 PM  

@Mahesh,

I gave that a quick spin (under 2005) and unfortunately did not have much luck with it..

-Leigh

Anonymous,  March 30, 2010 at 6:23 PM  

This is quite awesome. Thank you.

  © Blogger templates The Professional Template by Ourblogtemplates.com 2008

Header image adapted from atomicjeep