When I first heard about this for SQL Server 2008 I was happy to find that it was now possible to merge tables, updating the existing rows and inserting the new ones based on comparison key(s) like I was doing at this right moment on an Oracle datbase I use as source data.
working with BI at a major TV channel we carry a lot of historical data that stays historical. None the less, because of missing timestamps in some tables it was not possible to see which data has changed since the last time we copied the table data. I need to merge from Oracle –> MS SQL 2008 and from MS SQL 2008/2005 –> MS SQL 2008
So here I was, looking at the syntax of the Merge command and I thought it was very verbose and therefore hard to create and maintain, that is why I created a stored procedure to wrap the Merge command with the following features :
- merge between Oracle and MS SQL Server
- It has to work with temp tables as source
- merge between two tables – with automatic field discovery
- merge between two tables where the left hand SQL select is specified as an input parameter
- avoid the use of cursors – I love the while loops !
This is the signature of the main stored procedure :
@toscreen int= 0,
@arg1IsSQLSelect int= 0
@toscreen indicates if it should ouptut the statement or if it should run the statement.
@arg1IsSQLSelect indicates if @TargetTableName is a select statement instead of the name of a table.
Discover the columns of the source table
This is the first step to the merge and since I want to be able to work with temp tables I need to be able to look in tempdb or in sysobjects .
For that purpose I created a temp table : #ColumnList
where I can store the name of the row and an index number (for my loop).
if LEFT(@TargetTableName,1)='#'INSERTINTO #ColumnList(RowNumber, ColumnName)SELECTrow_number()OVER (Orderby tempdb..syscolumns.colid)as'RowNumber', column_name=tempdb..syscolumns.nameFROM tempdb..sysobjectsJOIN tempdb..syscolumns ON tempdb..sysobjects.id = tempdb..syscolumns.id WHERE tempdb..sysobjects.xtype in('U','V')and tempdb..sysobjects.name = @TargetTableName ORDERBYsysobjects.name,syscolumns.colidELSE BEGIN INSERT INTO #ColumnList(RowNumber, ColumnName)SELECTrow_number()OVER (Orderbysyscolumns.colid)as'RowNumber', column_name=syscolumns.nameFROMsysobjectsJOINsyscolumnsONsysobjects.id =syscolumns.id WHEREsysobjects.xtype in('U','V')andsysobjects.name = @TargetTableNameand colstat & 1 != 1 ORDER BY sysobjects.name,syscolumns.colid END
Now I need to build a list of the column(s) to match in my Merge. Again I use a temp table and the smart method row_number() with the OVER keyword:
CREATE TABLE #ColumnToMatchList(RowNumber int, ColumnMatchName varchar(1000)) INSERT INTO #ColumnToMatchList(RowNumber,ColumnMatchName) ( SELECT row_number() OVER (Orderby ListItem)as'RowNumber', ListItem from [fn_SplitList](',',@ColumnToMatchList) )
One row at a time we can start building the string statements for all the columns with the help of a nicely performat While command (not that it matters here but it’s always nice to perform !)
--retrieve the first row SELECT @iCurrentRowId = RowNumber, @ColumnName =ColumnMatchName FROM #ColumnToMatchListWhere RowNumber = @iNextRowId
After the loop is done we have the first portion of the statement : The matching part on the column names
Now we only need to append the rest of the string.
Building the USING part
Telling the server which column to use in the Select statement.
In the samme While loop we build the Column List, the update statement and the insert statement. Because it is the same columns we need to loop through it makes sense to build it all in the same go.
We loop around exactly the same table then in the first step, because all we need is the name of all the columns either to insert or update:
SET @ColumnList = @ColumnList + @ColumnName +',' SET @ValuedColumnList = @ValuedColumnList +'NewTable.'+ @ColumnName +',' IF @ColumnName != @ColumnToMatch SET @UpdateStatement = @UpdateStatement + @ColumnName +' = '+' NewTable.'+ @ColumnName +','
And we end up with this result :
SET @SQL = @SQL + @ColumnList + ' FROM ' + @SourceTableName + ') as NewTable(' + @ColumnList + ') ON ' + @OnStatement + ' WHEN MATCHED -- THEN ' + @UpdateStatement + ' WHEN NOT MATCHED BY TARGET THEN INSERT (' + @ColumnList + ') VALUES (' + @ValuedColumnList + '); 'Where
@ColumnList is the list of the columns on the Tables to merge
@OnStatement : is the statement we build in the first loop
@UpdateStatement is the SQL statement when the row is matched
@ValueColumnList is the SQL statement for the Insert when the rows are new.
An exmaple of the call to the stored procedure could be :
The fourth argument indicates that we want to print it to the screen in stead of executing it ! I found it very practical when you just want to generate the query and not execute it because you want to add some changes first.
One of the changes I oftest make is removing the Identity column from the SELECT, the UPDATE and the INSERT. I could – of course – have implemented it in the procedure so that the Identity type is not used when building the columns list. It must be for another version.
I’ve made 2 other versions of this proc. One for calling with custom Select intstead of selecting alle the columns from the source table. And one for merging from Oracle using OpenQuery. It works on the same princip as the Custom Select.
Message me if you need a copy of the stored proc !
Happy merging !