Insert and Update Bulk XML Data in SQL Server

Problem
In my last article, I talked about how you can use an FTP task in SSIS to download files from an FTP server. But what if the file you have downloaded is an XML file and you need to import its data into a SQL Server table? How do you parse XML data into SQL Server tables?
Solution
There are different ways to import data from an XML file into a SQL Server table, but I am going to demonstrate one of the easiest.
These are the steps I performed for importing data into SQL Server and then parsing the XML into a relational format.
  • Import XML data from an XML file into SQL Server table using the OPENROWSET function
  • Parse the XML data using the OPENXML function

Importing XML data from XML file using OPENROWSET

I have an XML file downloaded from my FTP location to a local folder. Its contents are reproduced further down, as they appear once loaded into SQL Server.
To import the data from the XML file into a SQL Server table, I am using the OPENROWSET function.
In the script below, I first create a table with a column of data type XML and then read the XML file with OPENROWSET by specifying its location and name:
CREATE DATABASE OPENXMLTesting
GO


USE OPENXMLTesting
GO


CREATE TABLE XMLwithOpenXML
(
Id INT IDENTITY PRIMARY KEY,
XMLData XML,
LoadedDateTime DATETIME
)


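-- SINGLE_BLOB returns the whole file as one varbinary(max) value in a column
-- named BulkColumn; converting it to XML lets the parser honor the encoding
-- declared in the file.
INSERT INTO XMLwithOpenXML(XMLData, LoadedDateTime)
SELECT CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE() 
FROM OPENROWSET(BULK 'D:\OpenXMLTesting.xml', SINGLE_BLOB) AS x;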


SELECT * FROM XMLwithOpenXML
When I query the table into which I imported the XML data, SSMS shows the XMLData column as a hyperlink, because the column is of the XML data type.
Clicking the hyperlink opens another tab within SSMS with the XML data displayed as shown below.
<ROOT>
<Customers>
<Customer CustomerID="C001" CustomerName="Arshad Ali">
<Orders>
<Order OrderID="10248" OrderDate="2012-07-04T00:00:00">
<OrderDetail ProductID="10" Quantity="5" />
<OrderDetail ProductID="11" Quantity="12" />
<OrderDetail ProductID="42" Quantity="10" />
</Order>
</Orders>
<Address> Address line 1, 2, 3</Address>
</Customer>
<Customer CustomerID="C002" CustomerName="Paul Henriot">
<Orders>
<Order OrderID="10245" OrderDate="2011-07-04T00:00:00">
<OrderDetail ProductID="11" Quantity="12" />
<OrderDetail ProductID="42" Quantity="10" />
</Order>
</Orders>
<Address> Address line 5, 6, 7</Address>
</Customer>
<Customer CustomerID="C003" CustomerName="Carlos Gonzlez">
<Orders>
<Order OrderID="10283" OrderDate="2012-08-16T00:00:00">
<OrderDetail ProductID="72" Quantity="3" />
</Order>
</Orders>
<Address> Address line 1, 4, 5</Address>
</Customer>
</Customers>
</ROOT>

Process XML data using OPENXML function

XML data stored in a column of data type XML can be processed either by using the XML functions available in SQL Server or by using the sp_xml_preparedocument stored procedure along with the OPENXML function.
We first call the sp_xml_preparedocument stored procedure with the XML data. It parses the data, stores it in an internal cache, and outputs a handle to the prepared document.
We then pass that handle to the OPENXML function to open the XML data and read it.
Note: because the sp_xml_preparedocument stored procedure stores the XML data in SQL Server's internal cache, it is essential to release it by calling the sp_xml_removedocument stored procedure. Call sp_xml_removedocument as early as possible so that the internal cache is freed for other use.
USE OPENXMLTesting
GO


DECLARE @XML AS XML, @hDoc AS INT


SELECT @XML = XMLData FROM XMLwithOpenXML


EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML


SELECT CustomerID, CustomerName, Address
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer')
WITH 
(
CustomerID [varchar](50) '@CustomerID',
CustomerName [varchar](100) '@CustomerName',
Address [varchar](100) 'Address'
)


EXEC sp_xml_removedocument @hDoc
GO
Here I want to retrieve all the customer information from the XML data, so I navigate to the Customer element and query the CustomerID and CustomerName attributes (note the "@" before each attribute name) and the Address element in the SELECT statement that uses the OPENXML function.
The "WITH" clause defines the structure of the result set: each column's name, data type, and the path it is read from.
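The note above about releasing the document handle can be made more robust by wrapping the query in TRY...CATCH, so that the handle is released even if the SELECT fails. The following is only a sketch, not part of the original walkthrough; it assumes SQL Server 2012 or later because it uses THROW to re-raise the error.

```sql
USE OPENXMLTesting;
GO

DECLARE @XML AS XML, @hDoc AS INT;

SELECT @XML = XMLData FROM XMLwithOpenXML;

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML;

BEGIN TRY
    SELECT CustomerID, CustomerName
    FROM OPENXML(@hDoc, 'ROOT/Customers/Customer')
    WITH
    (
        CustomerID   VARCHAR(50)  '@CustomerID',
        CustomerName VARCHAR(100) '@CustomerName'
    );

    -- Normal path: release the cached document as soon as the query is done.
    EXEC sp_xml_removedocument @hDoc;
    SET @hDoc = NULL;
END TRY
BEGIN CATCH
    -- Error path: release the handle if it is still held, then re-raise.
    IF @hDoc IS NOT NULL
        EXEC sp_xml_removedocument @hDoc;
    THROW;  -- assumption: SQL Server 2012+; use RAISERROR on older versions
END CATCH;
GO
```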
Next, I want to retrieve all the customer information along with the OrderID and OrderDate of each order placed by each customer, so I navigate to the Order element and query its OrderID and OrderDate attributes.
To navigate back up and read data from an ancestor, use "../" for the parent, "../../" for the grandparent, and so on.
USE OPENXMLTesting
GO


DECLARE @XML AS XML, @hDoc AS INT


SELECT @XML = XMLData FROM XMLwithOpenXML


EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML


SELECT CustomerID, CustomerName, Address, OrderID, OrderDate
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer/Orders/Order')
WITH 
(
CustomerID [varchar](50) '../../@CustomerID',
CustomerName [varchar](100) '../../@CustomerName',
Address [varchar](100) '../../Address',
OrderID [varchar](1000) '@OrderID',
OrderDate datetime '@OrderDate'
)


EXEC sp_xml_removedocument @hDoc
GO
The query returns one row per Order element, with the CustomerID, CustomerName and Address of the owning customer repeated on each row. For the sample data, that is three rows, one for each customer's single order.
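To keep the parsed rows rather than only display them, the same OPENXML query can feed an INSERT ... SELECT. The following is a minimal sketch; the target table CustomerOrders is a hypothetical name introduced for illustration and is not part of the original article.

```sql
USE OPENXMLTesting;
GO

-- Hypothetical target table for the shredded XML (illustration only).
CREATE TABLE CustomerOrders
(
    CustomerID   VARCHAR(50),
    CustomerName VARCHAR(100),
    OrderID      VARCHAR(1000),
    OrderDate    DATETIME
);
GO

DECLARE @XML AS XML, @hDoc AS INT;

SELECT @XML = XMLData FROM XMLwithOpenXML;

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML;

-- Same row pattern and column paths as the query above,
-- but the rows are written to a relational table.
INSERT INTO CustomerOrders (CustomerID, CustomerName, OrderID, OrderDate)
SELECT CustomerID, CustomerName, OrderID, OrderDate
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer/Orders/Order')
WITH
(
    CustomerID   VARCHAR(50)   '../../@CustomerID',
    CustomerName VARCHAR(100)  '../../@CustomerName',
    OrderID      VARCHAR(1000) '@OrderID',
    OrderDate    DATETIME      '@OrderDate'
);

EXEC sp_xml_removedocument @hDoc;
GO

SELECT * FROM CustomerOrders;
```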
Now let's go one level deeper. This time I want to retrieve all the customer information and their orders along with the ProductID and Quantity of each order line. To do that, I navigate to the OrderDetail element and read its ProductID and Quantity attributes. I use "../" to reach the parent level for the Order information and "../../../" to reach the great-grandparent level for the Customer information, as shown below:
USE OPENXMLTesting
GO


DECLARE @XML AS XML, @hDoc AS INT


SELECT @XML = XMLData FROM XMLwithOpenXML


EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML


SELECT CustomerID, CustomerName, Address, OrderID, OrderDate, ProductID, Quantity
FROM OPENXML(@hDoc, 'ROOT/Customers/Customer/Orders/Order/OrderDetail')
WITH 
(
CustomerID [varchar](50) '../../../@CustomerID',
CustomerName [varchar](100) '../../../@CustomerName',
Address [varchar](100) '../../../Address',
OrderID [varchar](1000) '../@OrderID',
OrderDate datetime '../@OrderDate',
ProductID [varchar](50) '@ProductID',
Quantity int '@Quantity'
)


EXEC sp_xml_removedocument @hDoc
GO
The query returns one row per OrderDetail element (six rows for the sample data), each carrying the customer information, the order information, and the ProductID and Quantity of that order line.
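As mentioned earlier, the XML functions built into SQL Server are an alternative to OPENXML. As a rough sketch (not from the original walkthrough), the customer and order query could also be written with the nodes() and value() methods of the xml data type, which need no document handle and therefore no cleanup call. The aliases cust, c, ord and o are arbitrary names chosen for this example.

```sql
USE OPENXMLTesting;
GO

SELECT
    cust.c.value('@CustomerID',         'VARCHAR(50)')   AS CustomerID,
    cust.c.value('@CustomerName',       'VARCHAR(100)')  AS CustomerName,
    cust.c.value('(Address/text())[1]', 'VARCHAR(100)')  AS Address,
    ord.o.value('@OrderID',             'VARCHAR(1000)') AS OrderID,
    ord.o.value('@OrderDate',           'DATETIME')      AS OrderDate
FROM XMLwithOpenXML AS t
-- One row per Customer element in the stored document.
CROSS APPLY t.XMLData.nodes('/ROOT/Customers/Customer') AS cust(c)
-- One row per Order element under that customer.
CROSS APPLY cust.c.nodes('Orders/Order') AS ord(o);
GO
```

Because the query reads directly from the table, it processes every row of XMLwithOpenXML, whereas the OPENXML scripts above load a single document into a variable first.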
 
 
===============================================================
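Update Query:
To replace the XML already stored in an existing row, assign the result of OPENROWSET to the XML column in an UPDATE statement.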
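-- Inspect the current contents of the table
SELECT * FROM T;

-- Replace the XML stored in the row where IntCol = 1 with the file contents
UPDATE T
SET XmlCol = (
    SELECT * FROM OPENROWSET(
        BULK 'C:\SampleFolder\SampleData3.txt',
        SINGLE_BLOB
    ) AS x
)
WHERE IntCol = 1;
GO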
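The update above assumes that a table T with an integer column IntCol and an XML column XmlCol already exists and contains a row where IntCol = 1. The table definition is not shown in this article, so the following setup is only an assumed, minimal shape that would make the statement runnable; run it before the update.

```sql
-- Assumed definition of T (not given in the article).
CREATE TABLE T
(
    IntCol INT PRIMARY KEY,
    XmlCol XML
);
GO

-- Seed the row that the UPDATE targets; XmlCol starts out NULL.
INSERT INTO T (IntCol) VALUES (1);
GO
```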
 
 
 

Maximum Capacity Specifications for SQL Server 2008, 2008 R2 and 2012

The following tables specify the maximum sizes and numbers of various objects defined in SQL Server 2012 components, compared against the corresponding limits in SQL Server 2008 and SQL Server 2008 R2.
Database Engine Objects
The following table specifies the maximum sizes and number of various objects defined in SQL Server databases or referenced in Transact-SQL statements.
Maximum Sizes / Numbers, SQL Server (32-bit)
SQL Server Database Engine Object | SQL Server 2008 | SQL Server 2008 R2 | SQL Server 2012
Batch size | 65,536 * Network Packet Size | 65,536 * Network Packet Size | 65,536 * Network Packet Size
Bytes per short string column | 8,000 | 8,000 | 8,000
Bytes per GROUP BY, ORDER BY | 8,060 | 8,060 | 8,060
Bytes per index key | 900 | 900 | 900
Bytes per foreign key | 900 | 900 | 900
Bytes per primary key | 900 | 900 | 900
Bytes per row | 8,060 | 8,060 | 8,060
Bytes in source text of a stored procedure | Lesser of batch size or 250 MB | Lesser of batch size or 250 MB | Lesser of batch size or 250 MB
Bytes per VARCHAR(MAX), VARBINARY(MAX), XML, TEXT, or IMAGE column | 2^31-1 | 2^31-1 | 2^31-1
Characters per NTEXT or NVARCHAR(MAX) column | 2^30-1 | 2^30-1 | 2^30-1
Clustered indexes per table | 1 | 1 | 1
Columns in GROUP BY, ORDER BY | Limited only by number of bytes | Limited only by number of bytes | Limited only by number of bytes
Columns or expressions in a GROUP BY WITH CUBE or WITH ROLLUP statement | 10 | 10 | 10
Columns per index key | 16 | 16 | 16
Columns per foreign key | 16 | 16 | 16
Columns per primary key | 16 | 16 | 16
Columns per nonwide table | 1,024 | 1,024 | 1,024
Columns per wide table | 30,000 | 30,000 | 30,000
Columns per SELECT statement | 4,096 | 4,096 | 4,096
Columns per INSERT statement | 4,096 | 4,096 | 4,096
Connections per client | Maximum value of configured connections | Maximum value of configured connections | Maximum value of configured connections
Database size | 524,272 terabytes | 524,272 terabytes | 524,272 terabytes
Databases per instance of SQL Server | 32,767 | 32,767 | 32,767
Filegroups per database | 32,767 | 32,767 | 32,767
Files per database | 32,767 | 32,767 | 32,767
File size (data) | 16 terabytes | 16 terabytes | 16 terabytes
File size (log) | 2 terabytes | 2 terabytes | 2 terabytes
Foreign key table references per table | 253 | 253 | 253
Identifier length (in characters) | 128 | 128 | 128
Instances per computer | 50 instances on a stand-alone server for all SQL Server editions | 50 instances on a stand-alone server for all SQL Server editions | 50 instances on a stand-alone server for all SQL Server editions
Length of a string containing SQL statements (batch size) | 65,536 * Network packet size | 65,536 * Network packet size | 65,536 * Network packet size
Locks per connection | Maximum locks per server | Maximum locks per server | Maximum locks per server
Locks per instance of SQL Server | Up to 2,147,483,647 | Up to 2,147,483,647 | Up to 2,147,483,647
Nested stored procedure levels | 32 | 32 | 32
Nested subqueries | 32 | 32 | 32
Nested trigger levels | 32 | 32 | 32
Nonclustered indexes per table | 999 | 999 | 999
Number of distinct expressions in the GROUP BY clause when any of the following are present: CUBE, ROLLUP, GROUPING SETS, WITH CUBE, WITH ROLLUP | 32 | 32 | 32
Number of grouping sets generated by operators in the GROUP BY clause | 4,096 | 4,096 | 4,096
Parameters per stored procedure | 2,100 | 2,100 | 2,100
Parameters per user-defined function | 2,100 | 2,100 | 2,100
REFERENCES per table | 253 | 253 | 253
Rows per table | Limited by available storage | Limited by available storage | Limited by available storage
Tables per database | Limited by number of objects in a database | Limited by number of objects in a database | Limited by number of objects in a database
Partitions per partitioned table or index | 1,000 | 1,000 | 15,000
Statistics on non-indexed columns | 30,000 | 30,000 | 30,000
Tables per SELECT statement | Limited only by available resources | Limited only by available resources | Limited only by available resources
Triggers per table | Limited by number of objects in a database | Limited by number of objects in a database | Limited by number of objects in a database
Columns per UPDATE statement (Wide Tables) | 4,096 | 4,096 | 4,096
User connections | 32,767 | 32,767 | 32,767
XML Indexes | 249 | 249 | 249
SQL Server Utility Objects
The following table specifies the maximum sizes and number of various objects that were tested in the SQL Server Utility.
Maximum Sizes / Numbers, SQL Server (32-bit)
SQL Server Utility Object | SQL Server 2008 | SQL Server 2008 R2 | SQL Server 2012
Computers (physical computers or virtual machines) per SQL Server Utility | 100 | 100 | 100
Instances of SQL Server per computer | 5 | 5 | 5
Total number of instances of SQL Server per SQL Server Utility | 200 | 200 | 200
User databases per instance of SQL Server, including data-tier applications | 50 | 50 | 50
Total number of user databases per SQL Server Utility | 1,000 | 1,000 | 1,000
File groups per database | 1 | 1 | 1
Data files per file group | 1 | 1 | 1
Log files per database | 1 | 1 | 1
Volumes per computer | 3 | 3 | 3
SQL Server Data-Tier Application Objects
The following table specifies the maximum sizes and number of various objects that were tested in the SQL Server data-tier applications (DAC).
Maximum Sizes / Numbers, SQL Server (32-bit)
SQL Server DAC Object | SQL Server 2008 | SQL Server 2008 R2 | SQL Server 2012
Databases per DAC | 1 | 1 | 1
Objects per DAC | Limited by the number of objects in a database, or available memory | Limited by the number of objects in a database, or available memory | Limited by the number of objects in a database, or available memory
SQL Server Replication Objects
The following table specifies the maximum sizes and number of various objects defined in SQL Server Replication.
Maximum Sizes / Numbers, SQL Server (32-bit)
SQL Server Replication Object | SQL Server 2008 | SQL Server 2008 R2 | SQL Server 2012
Articles (merge publication) | 256 | 256 | 256
Articles (snapshot or transactional publication) | 32,767 | 32,767 | 32,767
Columns in a table (merge publication) | 246 | 246 | 246
Columns in a table (SQL Server snapshot or transactional publication) | 1,000 | 1,000 | 1,000
Columns in a table (Oracle snapshot or transactional publication) | 995 | 995 | 995
Bytes for a column used in a row filter (merge publication) | 1,024 | 1,024 | 1,024
Bytes for a column used in a row filter (snapshot or transactional publication) | 8,000 | 8,000 | 8,000