
.NET Forum / Languages / C# / December 2007

Multithreading WebRequests, a good and stable approach?

Nightcrawler - 07 Dec 2007 17:58 GMT
I have a webservice that gets data from three websites and puts the
result into a datatable and returns that datatable.

Currently the webservice makes a WebRequest (using parameters in a
querystring) to the first website, adds the data into a datatable then
moves on to the second website, merges the datatable together and
finally gets data from the third website and merges that datatable
with the existing one.

This works fine, but I recently changed the client interface to pull
the information using AJAX/JavaScript. A browser like Firefox will
raise an error (script has stopped responding) if the JavaScript does
not respond within 10 seconds. This puts more pressure on the webservice
to execute and return results within those 10 seconds.

I started looking into multithreading these webRequests and firing the
requests at the same time.

My questions are:

1. Is this a good approach? Are there any risks in multithreading
multiple webRequests like this?
2. Can anyone point me in the right direction as to how to make these
webrequests using multithreading?
3. More importantly, how do I merge the data into one datatable once
all three webRequests are completed?

Any feedback is appreciated.

Thanks
Peter Duniho - 07 Dec 2007 18:42 GMT
> [...]
> My questions are:
>
> 1. Is this a good approach? Are there any risks in multithreading
> multiple webRequests like this?

Multithreading always has risks.  But I don't think there is anything  
unusual in this scenario.  Just the usual concurrency issues.

> 2. Can anyone point me in the right direction as to how to make these
> webrequests using multithreading?

Use the async methods (the ones whose names start with "Begin..." and
"End...").  This avoids tying up a thread for the operation until it
actually completes.  Or at least it should on operating systems that
support I/O completion ports (IOCP).

> 3. More importantly, how do I merge the data into one datatable once
> all three webRequests are completed?

Where's the datatable?  What part of the issue are you having problems  
with?

It sounds as though you're implementing some sort of web components  
application, where there's server-side and client-side parts.  Is the  
question about how to get the data back to the client when everything's  
done?  Or is it simply about how to manage your DataTable object?

If the latter, it should be pretty much the same as however you do it  
synchronously, except you'll have to provide synchronization for the  
DataTable object (using the lock() statement, for example).  If you need  
the data in the DataTable object in some specific order (for example, in  
the order the requests were started), you'll need to impose that order.  
How best to do that depends on how you've defined the order.

If the former, I have no idea.  Sounds like a web applications question,  
and I don't know anything about that.  :)
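
For the DataTable case, here is a minimal sketch of one way to impose start order. Note that it swaps in a different technique from lock(): each worker writes its result into its own slot, indexed by start order, and the slots are merged only after every worker has finished. The names OrderedMergeSketch and FetchPart and the single "Source" column are hypothetical, and plain threads stand in for the web requests.

```csharp
using System;
using System.Data;
using System.Threading;

public static class OrderedMergeSketch
{
    // Hypothetical stand-in for "fetch one site's data and parse it".
    static DataTable FetchPart(int index)
    {
        DataTable part = new DataTable();
        part.Columns.Add("Source", typeof(int));
        part.Rows.Add(index);
        return part;
    }

    public static DataTable Run(int count)
    {
        DataTable[] slots = new DataTable[count];
        Thread[] workers = new Thread[count];

        for (int i = 0; i < count; i++)
        {
            int slot = i; // copy, so each worker captures its own index
            workers[i] = new Thread(() => { slots[slot] = FetchPart(slot); });
            workers[i].Start();
        }

        // Wait for every worker; after Join() the slots are safe to read.
        foreach (Thread worker in workers)
        {
            worker.Join();
        }

        // Merge in start order, regardless of which worker finished first.
        DataTable merged = slots[0];
        for (int i = 1; i < count; i++)
        {
            merged.Merge(slots[i]);
        }
        return merged;
    }
}
```

Because no two workers touch the same slot, no lock is needed here. If the rows were added to one shared table as they arrive, the lock() approach described above would still be required.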

Pete
Nightcrawler - 07 Dec 2007 19:01 GMT
Pete,

This is the way I have it set up. I only included enough code so you
understand the logic. I stripped out a bunch that is not required. So
the part I need to optimize is GetSearchResultsArray(). I
would like to fire the three GetResults at the same time and then be
able to merge the data together into one table (no particular order).

Thanks for your help!

   [WebMethod]
   [System.Web.Script.Services.ScriptMethod(UseHttpGet = true)]
   public result[] GetSearchResultsArray()
   {
       // BuildDataTable() just returns a datatable with the columns
       // specific to this operation (code not included for simplicity).
       DataTable dt = BuildDataTable();

       dt = GetResults("url1", "parameters1");
       dt.Merge(GetResults("url2", "parameters2"));
       dt.Merge(GetResults("url3", "parameters3"));

       // Takes the datatable and converts it to a List (code not
       // included for simplicity).
   }

   private DataTable GetResults(string url, string parameters)
   {
       string result = GetSearchResults(url, parameters);

       // Processes the result in the response string and puts it into
       // a prebuilt datatable (code not included for simplicity).
       DataTable dt = BuildDataTable();
       return dt;
   }

   private string GetSearchResults(string url, string parameters)
   {
       string httpRequest = String.Format("{0}?{1}", url, parameters);

       WebRequest webRequest = WebRequest.Create(httpRequest);

       // Dispose the response and the reader even if reading throws.
       using (WebResponse response = webRequest.GetResponse())
       using (StreamReader responseReader =
           new StreamReader(response.GetResponseStream()))
       {
           return HttpUtility.UrlDecode(responseReader.ReadToEnd());
       }
   }
Peter Duniho - 07 Dec 2007 20:24 GMT
> Pete,
>
[quoted text clipped - 3 lines]
> would like to fire the three GetResults at the same time and then be
> able to merge the data together into one table (no particular order).

Okay, the "no particular order" is helpful.  If the order did matter, that  
could be easily addressed, but it does make the code simpler to not have  
to worry about it.

Let's start with the suggestions and code that Nicholas posted, since his  
basic response is very useful.

Based on that response, I'd offer a couple of observations:

    * First, the difference between his two suggestions -- calling  
EndGetResponse() in sequence for each request, versus setting a waitable  
event -- is not very great, at least not as he demonstrated it.  In either  
case, the code will simply stop before exiting the method that starts all  
three requests, so they have the same effect.

Where setting the event handle might be useful is if you had some code  
_somewhere else_ that would wait on it, in a different thread.  For  
example, let's say you ran the code he posted in the main thread in  
response to something, but had a different thread sitting around waiting  
to process completed data retrievals.  Then that different thread could  
use the waitable event as its signal to do more work.  Of course, in that  
scenario you wouldn't create the waitable event in the code that starts  
the requests.  It'd be stored somewhere more accessible so that the other  
thread could already be waiting on it.

    * Second, his sample provides a very good illustration of the  
synchronization required for the DataTable.  I like to follow Jon's advice  
to not lock using the actual object, but rather to create a separate  
"object" instance for use in locking.  But otherwise, his sample shows  
what I meant when I wrote of the need to address concurrency issues by  
synchronizing access to the DataTable.

    * Finally, I think Nicholas meant to just write "callback" instead of  
"callback1", "callback2", and "callback3" when he calls BeginGetResponse().

Now, how would I adjust his sample to suit the description you've given  
above?

I would get rid of the synchronization at the end of his method, as well  
as the waitable event altogether.  I would also, of course, create a new  
object for locking the DataTable.  Finally, without the waitable event,  
instead I would just call whatever code you have that needs to be called  
when all of the requests have completed.

So, taking Nicholas's code as the starting point, here's what it'd look  
like instead:

public void MyMethod()
{
    // Create the three web requests.
    HttpWebRequest wr1 = ...;
    HttpWebRequest wr2 = ...;
    HttpWebRequest wr3 = ...;

    // This is the number of web requests that still have to complete.
    int requestsToComplete = 3;

    // The data table to return.
    DataTable dt = ...;

    // [an object used to synchronize access to the DataTable -- Pete]
    object objLock = new object();

    // The async callback which will process the data.  You will need
    // separate code for each if they have different routines to
    // populate the data table.
    AsyncCallback callback =
        delegate(IAsyncResult ar)
        {
            // Get the request from the state.
            // [note that I've changed to a straight cast from the "as"
            //  that Nicholas had.  I only use "as" if I've got some code
            //  that will actually deal with a failed cast.  Otherwise,
            //  you just get a delayed exception, and a less-useful one at
            //  that, since the exception is a null reference instead of
            //  the more informative invalid cast that actually describes
            //  what went wrong -- Pete]
            HttpWebRequest request = (HttpWebRequest)ar.AsyncState;

            // Call EndGetResponse.
            using (HttpWebResponse response =
                (HttpWebResponse)request.EndGetResponse(ar))
            {
                // Add to the data table here.  This is the code specific
                // to the request.
                // You have to synchronize access to the table as well.
                lock (objLock)
                {
                    // Process the response here and add the rows you
                    // need to.

                    // [here is where you'd convert the response to a
                    // DataTable and then call DataTable.Merge() with the
                    // results, for example.  Noting, of course, that in
                    // this scenario it might be easier to just add the
                    // data as it's generated from the response to the
                    // original table.  But if that were really true,
                    // maybe you would have done it that way in the
                    // original code too, so I don't really know.  :)
                    // -- Pete]
                }
            }

            // Decrement the count of requests to complete.  If it is
            // zero, then all of the requests have finished.
            if (Interlocked.Decrement(ref requestsToComplete) == 0)
            {
                // [here you'd call whatever method needs executing when
                //  all of the data has been retrieved.  If that method
                //  includes any calls to update things in the UI, you'll
                //  either need to use Control.Invoke() here to call that
                //  method, or in that method use Control.Invoke() to do
                //  the UI-specific stuff -- Pete]
            }
        };

    // Begin the calls here.
    wr1.BeginGetResponse(callback, wr1);
    wr2.BeginGetResponse(callback, wr2);
    wr3.BeginGetResponse(callback, wr3);
}

Hope that helps.

Pete
Nightcrawler - 12 Dec 2007 00:17 GMT
Pete,

I am trying your code but it doesn't seem to work.

I tried Nicholas's code and it worked fine. I then adjusted it to try
yours by removing the ManualResetEvent and modifying it to match your
post, but now it simply returns nothing, almost as if the requests never
happened. I have a feeling I am missing a line of code that prevents
the method from exiting before the requests are done.

Please let me know.

Thanks
Peter Duniho - 12 Dec 2007 00:49 GMT
> Pete,
>
[quoted text clipped - 5 lines]
> happened. I have a feeling I am missing a line of code that prevents
> the method from exiting before the requests are done.

Why do you want to prevent the method from exiting?

I thought the whole point here was that if your code doesn't return, it  
appears unresponsive to the browser, which then cancels your code.

The code Nicholas posted may speed things a bit by parallelizing the  
requests, but ultimately you're still waiting, and if any one request  
takes too long, all of the requests are basically useless.

Presumably, you've got some other code that would be executed after the  
method returns, taking all of the responses in aggregate and doing  
something useful with them.  In the code I posted, you should execute that  
code where I indicated by my comments, once the counter reaches zero.  You  
may want to just put all that code into a method, and then call that  
method where I've indicated.

I don't think there's any reason to prevent the method from exiting, but  
if that's a requirement of yours for some reason then no, the code I  
posted isn't going to work for you.  I wrote it specifically to return as  
soon as it could, rather than waiting around for the asynchronous i/o to  
complete, since that's generally the point of doing asynchronous i/o (as  
Nicholas points out, even initiating the i/o asynchronously is only going  
to allow a limited number of the requests to actually proceed in parallel,  
depending on the system configuration).

Pete
Nightcrawler - 12 Dec 2007 18:57 GMT
Pete,

The method will be called through an AJAX user interface, so it will
be exposed in a webservice.

My current code has 3 different callbacks since each of them has
separate routines specific to each web request. Once the datatable has
been populated through my three callback routines, I do some filtering
of the datatable using a dataview, then finally convert all the data
in the table to a List and return it as an array to the calling
JavaScript, which will display it to the user.

Are you saying I could return portions using your code? So, if
webrequest 1 is done it will return that to the JavaScript, then if
request 3 is done, it will return that, and then finally request 2 (I
am assuming they finish in that order).

Thanks for your input.
Peter Duniho - 12 Dec 2007 19:31 GMT
> Pete,
>
[quoted text clipped - 7 lines]
> in the table to a List and return it as an array to the calling
> javascript, which will display it to the user.

The basic theory is the same as the code that Nicholas and I posted.  In  
the case of my proposal, you can still use three different callbacks, as  
long as each includes some logic as I've suggested at the end to detect  
whether all of the requests have completed.

> Are you saying I could return portions using your code? So, if
> webrequest 1 is done it will return that to the JavaScript, then if
> request 3 is done, it will return that, and then finally request 2 (I
> am assuming they finish in that order).

I have no idea if that would work.  It might, but I have no way of  
knowing.  For one, I don't do much web development, and I don't have any  
idea how .NET interacts with the web client stuff.  For another, I don't  
know enough about your particular implementation and how that would work  
with the web client to know whether returning intermediate results would  
work.

What I do know is that assuming you currently have an implementation that  
returns just the final complete results, and assuming there's some way for  
that implementation to respond to the web client (with or without the  
actual results) for it to not generate some kind of timeout error, then  
there is a simple way (as illustrated in this thread) to asynchronously  
accumulate the responses as well as know when all have completed so that  
you can take some appropriate action.

Beyond that, you'll need someone who knows more about the web client  
aspect of .NET.  I know that I've dealt with web pages that take FAR  
longer than 20 seconds to return their results, both in terms of pages  
that take that long to load as well as pages that appear to load right  
away but then have some sort of deferred processing that updates something  
in the page later.  But I've never bothered to take a look at how those  
are implemented.  All I know is that it can be done.

Pete
Nightcrawler - 12 Dec 2007 19:05 GMT
Pete,

On another note, how can I include a regular method call to a table
adapter? Say I want to fetch data from 3 webrequests plus one request
using a dataadapter and my own database. Could I incorporate that
logic into this as well?

So theoretically, four threads would work at the same time to populate
a datatable (3 webrequests and one dataadapter using SQL Server), which
is then returned through a webservice.

The reason I have to optimize these requests is simply that browsers
like Firefox will show a warning that the JavaScript stopped
working if the webservice call through JavaScript takes longer than 10
seconds. I want to avoid that at all costs.

Thanks
Peter Duniho - 12 Dec 2007 19:44 GMT
> On another note, how can I include a regular method call to a table
> adapter? Say I want to fetch data from 3 webrequests plus one request
> using a dataadapter and my own database. Could I incorporate that
> logic into this as well?

Yes, but since DataAdapter doesn't have an async API, you'll have to  
handle that yourself.  The most straightforward way would be to use a  
BackgroundWorker.  The general idea is the same though: provide a delegate  
(in this case, used as the handler for the BackgroundWorker.DoWork event)  
that does the request and then at the end does the same "am I done with  
all requests yet?" sort of logic that the other async handlers do.

In that case, the BackgroundWorker.RunWorkerAsync() method takes the
place of the BeginGetResponse() method.  You can either put the "am I
done with all requests yet?" logic at the end of the DoWork handler, or
you can create a separate delegate to handle the
BackgroundWorker.RunWorkerCompleted event.  In the latter case, the main
advantage is that the event is raised on the same thread that created
the BackgroundWorker (at least when that thread has a synchronization
context, as a UI thread does), but since you need to deal with thread
synchronization issues anyway (for the other three requests), this may
not be all that useful in your case.
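
A minimal sketch of the DoWork variant, under stated assumptions: the class name WorkerCountdownSketch and the single "Name" column are hypothetical, a hard-coded row stands in for the real DataAdapter.Fill() call, and the counter starts at 1 because only the database request is shown (it would start at 4 alongside the three web requests). The ManualResetEvent is there only so the sketch can hand back the table.

```csharp
using System;
using System.ComponentModel;
using System.Data;
using System.Threading;

public static class WorkerCountdownSketch
{
    public static DataTable Run()
    {
        DataTable dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));
        object tableLock = new object();

        // Number of requests still outstanding (assumption: just this one).
        int requestsToComplete = 1;
        ManualResetEvent allDone = new ManualResetEvent(false);

        BackgroundWorker worker = new BackgroundWorker();
        worker.DoWork += delegate(object sender, DoWorkEventArgs e)
        {
            try
            {
                // Same synchronization as the web request callbacks.
                lock (tableLock)
                {
                    // Hypothetical stand-in for adapter.Fill(dt).
                    dt.Rows.Add("row from database");
                }
            }
            finally
            {
                // The same "am I done with all requests yet?" check.
                if (Interlocked.Decrement(ref requestsToComplete) == 0)
                {
                    allDone.Set();
                }
            }
        };

        // Takes the place of BeginGetResponse(); DoWork runs on a pool thread.
        worker.RunWorkerAsync();

        allDone.WaitOne();
        return dt;
    }
}
```

Putting the countdown in DoWork rather than RunWorkerCompleted keeps the sketch independent of whichever thread the completion event happens to be raised on.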

> So theoretically, four threads would work at the same time to populate
> a datatable (3 webrequests and one dataadapter using sql server) then
[quoted text clipped - 4 lines]
> working if the webservice call through javascript takes longer than 10
> seconds. I want to avoid that at all costs.

Well, as I mentioned in my other reply, I can't really comment very much  
on the exact interaction with the browser.  Whatever the time limit is (10  
seconds, 20 seconds, etc.) it seems to me that any one request _could_  
take longer than that, and so if you are waiting for them all to complete,  
then even if they are all done in parallel you could still wind up hitting  
that limit.

While I don't know how you'd implement this, I think it would be better  
for the code that the browser is waiting on to return immediately, and  
then provide some way to update the page later once the requests have all  
completed (or, if possible, update the page as the intermediate results  
complete as well).

With web browsers being mainly "pull" data models, I don't really know how  
that sort of thing would work.  But I know I've seen what _seems_ to be  
a "push" data presentation in a web browser, so it seems like it  
ought to be doable somehow.
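
One common way to get that effect with a "pull" client is polling: one call starts the work and returns an id immediately, and the client asks again later for whatever has arrived. The sketch below is only an illustration of that idea, not a tested web service design. PollingSketch, Start, and TryGetResults are hypothetical names, string concatenation stands in for the real requests, and the in-memory dictionary is an assumption that ignores cleanup and application restarts.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class PollingSketch
{
    class Job
    {
        public int Pending;
        public List<string> Results = new List<string>();
    }

    static readonly Dictionary<Guid, Job> jobs = new Dictionary<Guid, Job>();
    static readonly object gate = new object();

    // Starts the work and returns at once with an id the client can poll.
    public static Guid Start(string[] sources)
    {
        Job job = new Job();
        job.Pending = sources.Length;
        Guid id = Guid.NewGuid();
        lock (gate) { jobs[id] = job; }

        foreach (string source in sources)
        {
            string current = source; // copy for the anonymous method
            ThreadPool.QueueUserWorkItem(delegate
            {
                // Hypothetical stand-in for one real request.
                string result = "result from " + current;
                lock (gate)
                {
                    job.Results.Add(result);
                    job.Pending--;
                }
            });
        }
        return id;
    }

    // Returns true once every request has finished; 'results' always
    // holds whatever has arrived so far (intermediate results).
    public static bool TryGetResults(Guid id, out string[] results)
    {
        lock (gate)
        {
            Job job;
            if (!jobs.TryGetValue(id, out job))
            {
                results = new string[0];
                return false;
            }
            results = job.Results.ToArray();
            return job.Pending == 0;
        }
    }
}
```

The client-side script would call the second method on a timer until it returns true, so no single call has to outlast the browser's limit.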

Pete
Nicholas Paldino [.NET/C# MVP] - 07 Dec 2007 19:13 GMT
Thomas,

   This is a perfectly fine idea, but it will require a little work.  The
HttpWebRequest/HttpWebResponse classes absolutely support making calls
asynchronously.

   The simplest way would be to set up your three web requests
(HttpWebRequest) instances and then call BeginGetResponse on each of them in
succession, storing the IAsyncResult implementations.

   Then, right after that, you would call EndGetResponse on the instances,
passing the IAsyncResult implementations that correspond to the instances
that returned them on BeginGetResponse.

   At this point, you would have your three results and you could insert
them all into the data table to be returned.

   This works because the total time is basically that of the longest
request rather than the sum of all three, and your successive calls to
EndGetResponse will not block if the call has already completed by the
time it is called.  (This assumes the requests go to different websites;
I believe the HTTP specification has a note in it about how many
concurrent connections can be opened to one website at the same time.)
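
   The simple approach above could be sketched as follows. SequentialEndSketch and FetchAll are hypothetical names, the URLs are whatever the caller supplies, and error handling is left out, so treat it as an outline under those assumptions rather than finished code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

public static class SequentialEndSketch
{
    public static List<string> FetchAll(string[] urls)
    {
        HttpWebRequest[] requests = new HttpWebRequest[urls.Length];
        IAsyncResult[] pending = new IAsyncResult[urls.Length];

        // Start every request first, so they all run at the same time.
        for (int i = 0; i < urls.Length; i++)
        {
            requests[i] = (HttpWebRequest)WebRequest.Create(urls[i]);
            pending[i] = requests[i].BeginGetResponse(null, null);
        }

        // Then collect the responses in order.  Each EndGetResponse call
        // blocks only if that particular request has not finished yet.
        List<string> bodies = new List<string>();
        for (int i = 0; i < urls.Length; i++)
        {
            using (WebResponse response = requests[i].EndGetResponse(pending[i]))
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                bodies.Add(reader.ReadToEnd());
            }
        }
        return bodies;
    }
}
```

   Since everything after the loop runs on the calling thread, no locking is needed when the bodies are later parsed into the data table.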

   However, you can improve on this, if you need to squeeze out more
performance.  You could pass callback routines to the BeginGetResponse
methods, in which you would merge the results with your data set.  You could
then, when they are all complete, indicate to the waiting main thread that
you are done (through an EventWaitHandle of some kind).  That would be a
little more complex, since you don't want to create an individual event
handle for each web request (since you are in a web server, I imagine you
are going to be calling this a lot).

   Anonymous methods can help though.  I would do this:

// This can have any inputs and outputs you like; I'm just using it as an
// example, but it is basically the entry point for your web request.
public DataTable MyMethod()
{
   // Create the three web requests.
   HttpWebRequest wr1 = ...;
   HttpWebRequest wr2 = ...;
   HttpWebRequest wr3 = ...;

   // This is the number of web requests that still have to complete.
   int requestsToComplete = 3;

   // The data table to return.
   DataTable dt = ...;

   // The event which will be set to indicate that processing is done.
   // ("event" is a C# keyword, so the variable needs another name, and
   // the constructor takes the initial signaled state.)
   using (ManualResetEvent doneEvent = new ManualResetEvent(false))
   {
       // The async callback which will process the data.  You will need
       // separate code for each if they have different routines to
       // populate the data table.
       AsyncCallback callback =
           delegate(IAsyncResult ar)
           {
               // Get the request from the state.
               HttpWebRequest request = ar.AsyncState as HttpWebRequest;

               // Call EndGetResponse.
               using (HttpWebResponse response =
                   (HttpWebResponse)request.EndGetResponse(ar))
               {
                   // Add to the data table here.  This is the code
                   // specific to the request.
                   // You have to synchronize access to the table as well.
                   lock (dt)
                   {
                       // Process the response here and add the rows you
                       // need to.
                   }
               }

               // Decrement the count of requests to complete.  If it
               // is zero, then set the event.
               if (Interlocked.Decrement(ref requestsToComplete) == 0)
               {
                   // Set the event.
                   doneEvent.Set();
               }
           };

       // Begin the calls here.
       wr1.BeginGetResponse(callback1, wr1);
       wr2.BeginGetResponse(callback2, wr2);
       wr3.BeginGetResponse(callback3, wr3);

       // Wait on the event here.
       doneEvent.WaitOne();

       // At this point, the data table will be populated, so you can
       // return it.
       return dt;
   }
}

         - Nicholas Paldino [.NET/C# MVP]
         - mvp@spam.guard.caspershouse.com

>I have a webservice that gets data from three websites and puts the
> result into a datatable and returns that datatable.
[quoted text clipped - 26 lines]
>
> Thanks
Nightcrawler - 07 Dec 2007 19:51 GMT
Nicholas,

Many thanks for your input.

Yes, you are right, this is a web environment so it will be called
a lot. Also, yes, each call will have a different routine for how to
work with the data, so I will have to set up three different
AsyncCallback callbacks.

I will dive into this right away.

Thanks a bunch!
Peter Duniho - 07 Dec 2007 20:30 GMT
> Yes, you are right, this is a web environment so it will be called
> a lot. Also, yes, each call will have a different routine for how to
> work with the data, so I will have to set up three different
> AsyncCallback callbacks.

For the record, the previous code you posted illustrating what you're  
doing uses the same method to process all three requests.  This suggests  
that you only need one callback method as well.  If there are specific  
parameters guiding each specific request, those can easily be incorporated  
into the anonymous method (in fact, IMHO it can be easier when using an  
anonymous method than if you had to pass them directly, as long as you  
watch out for variable capturing).
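
To illustrate the capturing pitfall, here is a small sketch (CaptureSketch and its method names are hypothetical). An anonymous method captures the variable itself, not its value at the time the delegate was created, so delegates created in a for loop all share the one loop variable unless each iteration makes its own copy.

```csharp
using System;
using System.Collections.Generic;

public static class CaptureSketch
{
    // Every delegate captures the same loop variable 'i', so by the
    // time they run they all see its final value.
    public static int[] SharedVariable()
    {
        List<Func<int>> readers = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            readers.Add(delegate { return i; });
        }
        return InvokeAll(readers); // { 3, 3, 3 }
    }

    // Copying into a variable declared inside the loop body gives each
    // delegate its own captured variable.
    public static int[] CopiedVariable()
    {
        List<Func<int>> readers = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            readers.Add(delegate { return copy; });
        }
        return InvokeAll(readers); // { 0, 1, 2 }
    }

    static int[] InvokeAll(List<Func<int>> readers)
    {
        int[] values = new int[readers.Count];
        for (int n = 0; n < readers.Count; n++)
        {
            values[n] = readers[n]();
        }
        return values;
    }
}
```

The same applies to per-request parameters: give each callback its own local copy before calling BeginGetResponse().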

Pete
Nightcrawler - 07 Dec 2007 21:42 GMT
Thank you both for your input. I greatly appreciate it.

I will test it out and see what kind of improvement I will be able to
get in my webservice requests in terms of time.

Thanks
