.NET Forum / .NET Framework / New Users / March 2008
Batch of Stored Procedures
|
|
Thread rating:  |
mj2736@yahoo.com - 05 Mar 2008 02:27 GMT In our .Net 2.0 web service we need to insert/update data records into SQL Server 2005 that have been uploaded to us in XML documents. There could be anywhere from one to literally thousands of records in each uploaded transaction, and there are several dozen destination tables.
Since performance is critical for this particular app, we were thinking of combining the stored procedure calls into configurable- sized batches using CommandType.Text and multiple concatenated EXEC statements, so we minimize database roundtrips as opposed to calling each SP individually in its own SqlCommand.
Is this a good approach? One concern is that SQL Server would have to compile each and every batch since they will all be different, and the overhead involved in this could become significant. Is this really something to worry about, or not? Wouldn't a few batch compiles per transaction still be better than potentially thousands of roundtrips? Are there other problems with this approach?
Of course we are planning to stress test to see how it actually responds, but I just wanted to get some opinions first in case someone has done this before and can give advice/recommendations/suggestions before we get too far into it.
Thanks! MJ
Mufaka - 05 Mar 2008 05:31 GMT > In our .Net 2.0 web service we need to insert/update data records into > SQL Server 2005 that have been uploaded to us in XML documents. There [quoted text clipped - 21 lines] > Thanks! > MJ For SQL 2005, I would use SQL's built in XML processing. You can pass the xml text in as a parameter to a stored procedure.
Here's a snippet from what I have done with something similar:
/* declare @XmlDoc nvarchar(4000) set @XmlDoc = '<patient> <answer> <PatientID>13303</PatientID> <QuestionID>200</QuestionID> <AppointmentID>105084</AppointmentID> <AnswerValue>217,218</AnswerValue> <SaveTypeID>1</SaveTypeID> </answer> </patient>'
exec up_PatientAnswer_Save_FromXml @XmlDoc = @XmlDoc */ ALTER procedure [dbo].[up_PatientAnswer_Save_FromXml] ( @XmlDoc ntext ) as set nocount on
declare @Answers table ( AnswerID Int identity, PatientID Int, QuestionID Int, AppoIntmentID Int, SaveTypeID Int, AnswerValue nvarchar(3900) null, AnswerValueDate datetime null, AnswerValueInt Int null )
declare @Pointer Int
-- get a pointer to the parsed doc exec sp_xml_preparedocument @Pointer output, @XmlDoc
-- insert Into the table var so we can dispose of the xml doc -- and work with the data easier (better) insert Into @Answers (PatientID, QuestionID, AppointmentID, SaveTypeID, AnswerValue, AnswerValueDate, AnswerValueInt) select PatientID, QuestionID, AppointmentID, SaveTypeID, AnswerValue, AnswerValueDate, AnswerValueInt from openxml(@Pointer, '/patient/answer', 2) with ( PatientID int, QuestionID int, AppointmentID int, SaveTypeID int, AnswerValue nvarchar(3900), AnswerValueDate datetime, AnswerValueInt int )
-- release the xml doc exec sp_xml_removedocument @Pointer
-- processing from here is done with the data in the table variable
More info can be found here:
http://www.perfectxml.com/articles/xml/openxml.asp
If the XML that you receive is usable as is, you can just pass that along. Or you can create new XML that is in a format that you expect.
You may also be able to us SqlBulkCopy to insert the data into a loading table and have a stored procedure that works off of that table.
In SQL 2008, you will be able to pass a DataTable as a parameter to a stored proc.
Mike - 05 Mar 2008 14:34 GMT Thanks for the suggestion. We've actually tried this - passing the data in as XML - but found it to be way too resource intensive on the database server for our purposes, particularly with large XML documents. We observed occasional 'out of memory' errors despite calling sp_xml_removedocument every time when finished. So we have already ruled this out.
I am interested if anyone has any opinions about the idea of concatenating multiple SP calls into a batch and sending the whole batch to the database in one shot as CommandType.Text.
Thanks, MJ
Mufaka - 05 Mar 2008 16:40 GMT > Thanks for the suggestion. We've actually tried this - passing the > data in as XML - but found it to be way too resource intensive on the [quoted text clipped - 9 lines] > Thanks, > MJ I recall problems like you mention in SQL 2000, but have never seen this issue in 2005 when you dispose of the xml document early.
Alex Kuznetsov - 05 Mar 2008 14:47 GMT On Mar 4, 8:27 pm, mj2...@yahoo.com wrote:
> In our .Net 2.0 web service we need to insert/update data records into > SQL Server 2005 that have been uploaded to us in XML documents. There [quoted text clipped - 21 lines] > Thanks! > MJ SQL Server MVP Erland Sommarskog wrote an excellent article "Arrays and Lists in SQL". We use his approach. We pack up to 100K values in an image, that mimics an array. We can use several images. Performs very fast.
Cowboy (Gregory A. Beamer) - 05 Mar 2008 15:32 GMT I still think XML may be your best option, but preload the XML into a single table and then run processes on the data there. This works best if you can create a table to load. You may need multiple load tables if you have different clients. This will still hammer the server for a bit. It should not be an issue for thousands of records, however. After you fill one table, you need to get its primary key values into the load table, especially if using row guids or IDENTITY.
When you get to millions of records, this can still cause issues with the server. In these cases, you may want to preprocess outside of SQL Server. If you opt for this, you essentially create bcp files (or files for BULK_INSERT) outside of SQL. When you get to the step to create IDENTITY values, you can use DBCC commands to increment a "hole" the size of the records being loaded. The customer's unique key and new IDENTITY driven primary key can be loaded into a hashtable. Load time for a million records is about 2-3 seconds with 1 million lookups takign far less than a second. At least this was the stats we worked with when I did a project that preprocessed offline.
If you can load into a single table from XML and then run commands, it is less time consuming to code.
What about batching? It is a possibility, but realize you are just flattening the performance curve. Rather than a single huge hit, you are taking many smaller hits. This could well be the solution you are looking for, but you may find that this approach fails as you grow in size. Of course, you can always move SQL Server to a big box, the data files to a SAN, etc. to alleviate some, if not all, issues. At least until you grow to millions of files being loaded this way.
Another potential option is getting more transactional with the process that creates the XML so single records are sent. If this is a client, it may not be an option. Even internal, there may be valid reasons not to go record by record.
Also, in most of these systems, you have to check for existing records before loading, as inevetably someone sends you the same info. You may have some system of checks and balances here to avoid this. If not, consider it, as a process failure in the middle of a large batch of records can be extremely hard to cure.
 Signature Gregory A. Beamer MVP, MCP: +I, SE, SD, DBA
*************************************************
| Think outside the box! *************************************************
> In our .Net 2.0 web service we need to insert/update data records into > SQL Server 2005 that have been uploaded to us in XML documents. There [quoted text clipped - 21 lines] > Thanks! > MJ Alex Kuznetsov - 05 Mar 2008 16:21 GMT On Mar 5, 9:32 am, "Cowboy \(Gregory A. Beamer\)" <NoSpamMgbwo...@comcast.netNoSpamM> wrote:
> I still think XML may be your best option, but preload the XML into a single > table and then run processes on the data there. This works best if you can [quoted text clipped - 3 lines] > you need to get its primary key values into the load table, especially if > using row guids or IDENTITY. So far this is contrary to our benchmarks - in all our experience XML is inefficient in terms of network bandwidth, storage, and CPU cycles to parse.
> What about batching? It is a possibility, but realize you are just > flattening the performance curve. Rather than a single huge hit, you are > taking many smaller hits. This could well be the solution you are looking I have a different opinion here too - based on years of benchmarking I think there is a performance price you pay for every round trip to the database. Dealing with batches may be way more efficient - in some cases 100 times more performant. I do not see the flattening you are speaking about, I usually observe a dramatic reduction of overall execution costs.
Cowboy (Gregory A. Beamer) - 12 Mar 2008 14:56 GMT > On Mar 5, 9:32 am, "Cowboy \(Gregory A. Beamer\)" > <NoSpamMgbwo...@comcast.netNoSpamM> wrote: [quoted text clipped - 12 lines] > is inefficient in terms of network bandwidth, storage, and CPU cycles > to parse. When consumed as XML, I would concur. For apps with heavy load, running directly out of the in-memory XML data, even in SQL Server, has a perf penalty associated with it. The larger the XML set, the greater the penalty.
As efficient as Microsoft has made their XML representations, you still end up with recursion underneath the hood. If you load it into a temp table, however, the inefficiency is reduced as the main hit is moving the data from XML structures to the temp table. The question then is whether or not ripping the data from the temp table into your normalized structure creates too heavy a hit on your database. But, this can take a toll on tempdb if you have exteremely large data sets, so you have to determine where you can afford to take the hit.
>> What about batching? It is a possibility, but realize you are just >> flattening the performance curve. Rather than a single huge hit, you are [quoted text clipped - 6 lines] > speaking about, I usually observe a dramatic reduction of overall > execution costs. Flattening may not have been the best word to use here, but hindsight is always 20-20. :-)
Overall, I was speaking of prepping the data outside of the database, which can be a huge performance boost. Just as a bit of anectdotal evidence (anectdotal, as it is a single case, not because the evidence is inconclusive), I helped design and build an "offline" system that worked completely outside of the database until batch load. Working with non-XML data, of course, but files up to 25GB in size. The prior system was load and run sprocs, which could take up to 8 days to completely normalize in the database. The new load system originally got the load time down to about 24 hours, or approximately 12.5% of the original time. Actually time "hammering" the database was reduced to minutes of batch upload.
Before I left the company, we had refactored to where it was about 8 hours. I believe they are down to about 1/3rd of that time now. Most of the inefficiencies lay in either a) concatenation of strings (StringBuilder helped tremendously here) and b) working with strings instead of bytes (unavoidable in some situations, but very doable during the first churns through the data). Further efficiency was garnered by sorting the files with Windows ports of some UNIX sort utilities (to anyone ever going this route, it pays to get trials and compare sort times, as sort does not necessarily equal sort. Depending on how you use the sort, you will find that some are more efficient than others in some respects).
 Signature Gregory A. Beamer MVP, MCP: +I, SE, SD, DBA
*************************************************
| Think outside the box! *************************************************
sloan - 06 Mar 2008 03:49 GMT I vote YES for xml.
You need to look at the *non* OPENXML option though, and test it.
http://blogs.sqlservercentral.com/michael_coles/archive/2008/01/20/stop-using-op enxml-please.aspx
which takes the @xml XML datatype in 2005/2008. I have not personally vouched for the claims made in the above URL..just sending you the info/link.
and check here http://www.sommarskog.se/arrays-in-sql-2005.html#XML
> In our .Net 2.0 web service we need to insert/update data records into > SQL Server 2005 that have been uploaded to us in XML documents. There [quoted text clipped - 21 lines] > Thanks! > MJ
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|