Threading scenario - best approach ?

Netveloper - 16 Jun 2005 09:30 GMT
Hi,

In one of my classes I have a method, let's call it Fetch, which collects
data from various sources and returns the combined result. Each of the
sources can take between 5 and 15 seconds to collect, so I would like to
increase performance by introducing multi-threading for the actual
collection of data. The Fetch method should spawn off the workers and block
until all of them have finished (or failed).

I have done some light reading and would like some feedback on what the best
approach would be to implement this scenario.

SCENARIO #1 - Using Thread

I thought about creating a worker thread which will collect information from
a source and return the result. This worker thread would have to be able to
take two parameters (used to determine what data to get) and return an array
of objects.

The Fetch method could create a new worker object for each data source, pass
it the correct (two) parameters, tell it which callback to use to signal its
completion and pass back the return data, and then start it in a new thread.

Once all of the worker threads are running, the Fetch method would enter
something like this (VB.NET as an example; it could just as well be C#,
since I code in both):

   For Each WorkerThread In Workers
      WorkerThread.Join()
   Next

   Return CombinedResult

Thoughts and/or suggestions? Advantages/Disadvantages?
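
A minimal C# sketch of scenario #1, for illustration only. It assumes each
worker is an object that stores its two parameters in fields and exposes its
result as a field, so no callback is needed: once Join has returned, the main
thread can read the result directly. The names SourceWorker, Fetcher and
CollectFromSource, and the choice of a string and an int as the two
parameters, are hypothetical and not taken from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical worker: carries its two parameters in and its result out.
public sealed class SourceWorker
{
    private readonly string source;
    private readonly int maxRows;

    public object[] Result = new object[0];   // filled in by Run
    public Exception Error;                   // set if the worker failed

    public SourceWorker(string source, int maxRows)
    {
        this.source = source;
        this.maxRows = maxRows;
    }

    public void Run()
    {
        try { Result = CollectFromSource(source, maxRows); }
        catch (Exception ex) { Error = ex; }  // keep failures inside the worker
    }

    // Stand-in for the real 5-15 second call to a web service or database.
    private static object[] CollectFromSource(string source, int maxRows)
    {
        Thread.Sleep(50);
        object[] rows = new object[maxRows];
        for (int i = 0; i < maxRows; i++) rows[i] = source + ":" + i;
        return rows;
    }
}

public static class Fetcher
{
    public static List<object> Fetch(string[] sources, int maxRows)
    {
        List<SourceWorker> workers = new List<SourceWorker>();
        List<Thread> threads = new List<Thread>();

        foreach (string source in sources)
        {
            SourceWorker worker = new SourceWorker(source, maxRows);
            Thread thread = new Thread(worker.Run);
            thread.IsBackground = true;       // do not keep the process alive
            workers.Add(worker);
            threads.Add(thread);
            thread.Start();
        }

        // Blocks until every worker has finished (or failed).
        foreach (Thread thread in threads) thread.Join();

        // Combine in source order, skipping workers that failed.
        List<object> combined = new List<object>();
        foreach (SourceWorker worker in workers)
            if (worker.Error == null) combined.AddRange(worker.Result);
        return combined;
    }
}
```

Calling Fetcher.Fetch(new string[] { "a", "b" }, 3) would return the six
placeholder rows in source order. Whether a failed source should be skipped,
as here, or should fail the whole Fetch is a design decision the sketch does
not settle.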

SCENARIO #2 - ThreadPool

Just like SCENARIO #1, but I would use the ThreadPool instead. How would I
wait for all the threads to finish before returning, i.e. block the Fetch
method until all workers have finished (or failed)?

Thoughts and/or suggestions? Advantages/Disadvantages?
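
One possible way to block on ThreadPool work items, sketched in C#: give each
work item a ManualResetEvent that it sets when it is done, and have Fetch call
WaitHandle.WaitAll on the array of events. PoolFetcher and the collect
delegate are hypothetical names standing in for the real data-collection
code.

```csharp
using System;
using System.Threading;

public static class PoolFetcher
{
    // Runs one job per source on the ThreadPool and blocks until all are done.
    // 'collect' is a hypothetical stand-in for the real data-collection call.
    public static object[][] Fetch(string[] sources, Func<string, object[]> collect)
    {
        object[][] results = new object[sources.Length][];
        ManualResetEvent[] done = new ManualResetEvent[sources.Length];

        for (int i = 0; i < sources.Length; i++)
        {
            int index = i;   // capture a copy, not the loop variable
            done[index] = new ManualResetEvent(false);
            ThreadPool.QueueUserWorkItem(delegate
            {
                try { results[index] = collect(sources[index]); }
                catch (Exception) { results[index] = new object[0]; } // failed source
                finally { done[index].Set(); }   // always signal, even on failure
            });
        }

        // WaitAll needs at least one handle and accepts at most 64.
        WaitHandle.WaitAll(done);
        foreach (ManualResetEvent handle in done) handle.Close();
        return results;
    }
}
```

Setting the event in a finally block matters: if a work item failed without
signalling, WaitAll would never return. WaitAll also has an overload taking a
timeout, which returns false if not every handle was signalled in time.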

SCENARIO #3 - Async Delegates

I create a delegate which takes my worker process as a parameter. The
delegate is then called asynchronously using BeginInvoke, with an
AsyncCallback to gather and combine the worker results. I would probably
build this using the technique posted by Mike Woodring, of DevelopMentor, on
the Advanced-Dotnet mailing list

http://discuss.develop.com/archives/wa.exe?A2=ind0302B&L=ADVANCED-DOTNET&D=0&I=-3&P=2534

to ensure EndInvoke is called, thus avoiding a possible memory leak. If I
went down this road, how would I make the Fetch method block until all of
the async operations had finished (or failed) without having to resort to a
busy-wait?

All thoughts and suggestions on this subject will be appreciated.
Thanks!
Stefan Simek - 16 Jun 2005 11:10 GMT
Hi,

I would recommend using your first scenario, as it is simple and
straightforward.
Using the threadpool would introduce additional complications due to the
maximum limit on threadpool thread count, and would not really improve
performance as we're talking about 5-15 second intervals.
The async delegates are essentially only a wrapper around threadpool, so
it's the same as above.

HTH,
Stefan

Netveloper - 16 Jun 2005 12:00 GMT
Stefan,

Thank you for your thoughts. I'm also leaning towards scenario #1 and have
started writing a small prototype. How would you suggest I wait for the
workers to finish before the Fetch method returns? I could perhaps do as
described below, by calling Join on all worker threads, or perhaps pass an
Auto/ManualResetEvent to each worker, have them signal completion, and call
WaitHandle.WaitAll in the Fetch method.

Stefan Simek - 16 Jun 2005 12:23 GMT
Hi,

I think you can try both, but I guess the Join method will be OK, no
need to introduce another synchronization mechanism. Calling Join() on a
thread that has already finished will return immediately, so the foreach
... Join will do exactly what is expected - finish after all the threads
are done.

But I'm not trying to push you into anything - use the approach you are
most comfortable with.

Stefan

Netveloper - 16 Jun 2005 12:45 GMT
Hi,

No pressure felt :) I've gotten both to work equally well, and I would just
like to understand the difference in approach. I guess there are advantages
and disadvantages to either approach. I don't really like to use code
without understanding exactly what it is doing ;)

Jon Skeet [C# MVP] - 16 Jun 2005 17:51 GMT
> No pressure felt :) I've gotten both to work equally well, and I
> would just like to understand the difference in approach. I guess
> there are advantages and disadvantages to either approach. I don't
> really like to use code without understanding exactly what it is
> doing ;)

Personally I'd use Join - no need to create any events you don't need,
and it does exactly what it says on the tin.

If you want to use a custom threadpool for this, by the way, you could
use the one I've written:
http://www.pobox.com/~skeet/csharp/miscutil

You could subscribe to the event which is fired after a thread job has
finished to synchronize the main thread. (Of course, you wouldn't be
able to use Thread.Join in that scenario.)

Signature

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Andreas Håkansson - 16 Jun 2005 21:34 GMT
Jon,

Thanks for your feedback. I've been thinking about leveraging a timeout so
that the collecting of data won't block indefinitely. I saw that both the
Join and WaitAll methods accept an optional timeout parameter.

However, the functionality they provide isn't interchangeable. With a
timeout of x, calling Join on each thread in turn gives the first thread at
most x to run, the next up to 2*x, the next up to 3*x and so on. With
WaitAll, all threads get the same chance to execute before the method stops
blocking the execution of the main thread.
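
One way to get WaitAll-like behaviour while still using Join is to measure
every Join against a single shared deadline instead of giving each thread a
fresh timeout. A minimal C# sketch, where JoinHelper and JoinAll are
hypothetical names:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class JoinHelper
{
    // Joins all threads against one shared deadline.
    // Returns true if every thread finished within 'timeout' overall.
    public static bool JoinAll(Thread[] threads, TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        foreach (Thread thread in threads)
        {
            // Only the time not yet used up is offered to this thread.
            TimeSpan remaining = timeout - clock.Elapsed;
            if (remaining < TimeSpan.Zero) remaining = TimeSpan.Zero;

            // Join(TimeSpan.Zero) does not block; it only reports
            // whether the thread has already finished.
            if (!thread.Join(remaining)) return false;
        }
        return true;
    }
}
```

With this helper the total wait is bounded by the timeout no matter how many
threads there are. A false return only means the deadline passed; the
unfinished threads keep running.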

The timeout, however, makes me wonder about the leftover worker threads.
They will continue executing in the background until they are finished. Do I
have to clean them up myself, and if so, how? What if, for example, one of
the worker threads calls a web service and for some reason is unable to
establish a connection, leaving it waiting for its own timeout, which could
have been increased beyond the default? This would leave the worker threads
hanging around for a long time even though the main thread timed out and
continued executing.. =/

Jon Skeet [C# MVP] - 16 Jun 2005 21:46 GMT
> Thanks for your feedback. I've been thinking about leveraging a
> timeout so that the collecting of data won't block indefinitely. I saw
[quoted text clipped - 7 lines]
> chance to execute before the method stops blocking the execution of
> the main thread.

True. Note, however, that there is an alternative to using
Auto/ManualResetEvents - you can use Monitor.Wait and Monitor.Pulse.
Personally, I prefer these - they feel more idiomatic .NET somehow,
rather than being Win32 shims. (They also perform very slightly better
if I remember rightly, but the difference isn't significant.)

You could make each worker thread decrement a counter (which is set by
the main thread) and when the last worker thread decrements it to 0, it
could notify the monitor.
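
A minimal C# sketch of the counter idea described above, using Monitor.Wait
and Monitor.PulseAll (the .NET counterparts of Java's wait and notifyAll).
The class and member names are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical helper: a counter guarded by a monitor.
public sealed class WorkerCountdown
{
    private readonly object padlock = new object();
    private int remaining;

    // The main thread sets the counter to the number of workers.
    public WorkerCountdown(int workerCount)
    {
        remaining = workerCount;
    }

    // Each worker calls this once when it finishes, ideally in a finally block.
    public void SignalDone()
    {
        lock (padlock)
        {
            remaining--;
            // The last worker wakes up whoever is waiting on the monitor.
            if (remaining == 0) Monitor.PulseAll(padlock);
        }
    }

    // Main thread: blocks until the counter reaches zero.
    public void WaitForAll()
    {
        lock (padlock)
        {
            // Re-check the condition after every wake-up.
            while (remaining > 0) Monitor.Wait(padlock);
        }
    }
}
```

Because the condition is checked under the lock before waiting, it does not
matter whether the workers finish before or after the main thread reaches
WaitForAll.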

> The timeout, however, makes me wonder about the leftover worker threads.
> They will continue executing in the background until they are
[quoted text clipped - 5 lines]
> for a long time even though the main thread timed out and continued
> executing.. =/

See http://www.pobox.com/~skeet/csharp/threads/shutdown.shtml for
general guidance about stopping tasks in a controlled way.
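
As a rough illustration of stopping a worker in a controlled way, the C#
sketch below has the worker check a "stop requested" flag between short units
of work. It is an assumption about the general technique, not a copy of the
code on the linked page, and all names are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical worker that can be asked to stop between units of work.
public sealed class StoppableWorker
{
    private readonly object stateLock = new object();
    private bool stopRequested;
    private int stepsCompleted;

    // Called by the main thread, e.g. after its own wait has timed out.
    public void RequestStop()
    {
        lock (stateLock) { stopRequested = true; }
    }

    public int StepsCompleted
    {
        get { lock (stateLock) { return stepsCompleted; } }
    }

    private bool StopRequested
    {
        get { lock (stateLock) { return stopRequested; } }
    }

    public void Run()
    {
        // Work is split into short steps so the flag is checked regularly.
        while (!StopRequested)
        {
            Thread.Sleep(10);                       // stand-in for one unit of work
            lock (stateLock) { stepsCompleted++; }
        }
    }
}
```

A flag like this is only seen between steps. It cannot interrupt a worker
that is blocked inside a single long call, such as a web service request, so
that call's own timeout still has to be kept reasonably short.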


Andreas Håkansson - 17 Jun 2005 00:06 GMT
Jon,

How would the timeout be implemented using a Monitor?

Jon Skeet [C# MVP] - 17 Jun 2005 00:12 GMT
> How would the timeout be implemented using a Monitor ?

Using the call to Wait which takes a timeout.
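
For example, a counter guarded by a monitor could offer a timed wait along
these lines. This is a C# sketch with hypothetical names; Monitor.Wait has an
overload taking a TimeSpan, and the loop recomputes the time left so that the
overall wait never exceeds the requested timeout.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: a monitor-guarded counter with a timed wait.
public sealed class TimedCountdown
{
    private readonly object padlock = new object();
    private int remaining;

    public TimedCountdown(int workerCount)
    {
        remaining = workerCount;
    }

    // Each worker calls this once when it finishes.
    public void SignalDone()
    {
        lock (padlock)
        {
            remaining--;
            if (remaining == 0) Monitor.PulseAll(padlock);
        }
    }

    // Returns true if all workers signalled within 'timeout', false otherwise.
    public bool WaitForAll(TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        lock (padlock)
        {
            while (remaining > 0)
            {
                TimeSpan left = timeout - clock.Elapsed;
                if (left <= TimeSpan.Zero) return false;   // deadline passed
                Monitor.Wait(padlock, left);               // releases the lock while waiting
            }
            return true;
        }
    }
}
```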


john conwell - 17 Jun 2005 00:11 GMT
If you go with manually creating your own threads, I'm a bigger fan of using
an AutoResetEvent with a WaitAll() call, rather than Join(). It just seems
more elegant for managing a large collection of threads.

Jon Skeet [C# MVP] - 17 Jun 2005 00:14 GMT
> If you go with manually creating your own threads, I'm a bigger fan of using
> an AutoResetEvent with a WaitAll() call, rather than Join(). It just seems
> more elegant for managing a large collection of threads.

I suspect I'm biased because of my history here. Coming from a Java
background, I'm very familiar and happy with Monitor.Wait/Pulse etc,
but not so happy with *ResetEvents. I know of other developers who've
come from a Win32 background and feel exactly the opposite.

Either will work perfectly well, of course :)


john conwell - 16 Jun 2005 23:39 GMT
First, are the Fetch methods getting data from a different server, or from
the server the app is running on? Does the server that the threads are
running on have multiple processors or just one? If it's just one, then you
are more likely to slow your app down than speed it up. The same amount of
processing has to get done, but now you are tossing thread management and
context switching into the mix. Only do this if your app is distributed or
the server is multi-processor.

As far as which way to go, I'm going to have to disagree. I'd go with
solution 3, async delegates. First, async delegates have a simple way to
wait for all threads to finish: just collect all the returned
IAsyncResult.AsyncWaitHandle values into an array and call
WaitHandle.WaitAll, passing in the array. This will pause the main thread
until all delegates have finished running. It doesn't get much easier.

Also, as far as performance goes, the cost of initializing 5 - 10 new manual
threads is much higher than that of using the threads already initialized in
the thread pool. The thread pool's maximum count shouldn't be an issue
either. If you call ThreadPool.GetMaxThreads you'll see how many threads can
be created in the pool. On my system it's 100 (not sure if this differs per
OS version or not). And if you plan on running more than 100 async tasks you
should rethink this anyway, as it would probably bog down the CPU with all
the processing and context switching. The thread pool can manage multiple
threads quite well, and by the time you are ready to kick off your last
thread, the first thread might be finished. In that case the thread pool
will just reuse an existing thread instead of creating another.
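
To check the limits on a given machine, something like the following C#
sketch could be used. PoolInfo and Describe are hypothetical names; the
numbers reported depend on the runtime version and configuration, so no
particular output is assumed.

```csharp
using System;
using System.Threading;

public static class PoolInfo
{
    // Reports the ThreadPool limits and how many threads are currently free.
    public static string Describe()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);

        return String.Format(
            "worker threads: {0} free of {1} max; I/O threads: {2} free of {3} max",
            freeWorkers, maxWorkers, freeIo, maxIo);
    }
}
```

Writing PoolInfo.Describe() to a log before and during a load test gives a
rough picture of how close the application gets to exhausting the pool.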

Remember creating threads is a fairly significant performance hit.

Netveloper - 17 Jun 2005 08:01 GMT
John,

Thanks for your feedback. Well, let's see. The system is a multi-CPU setup
with an ample amount of memory and a disk system with good throughput. The
data sources are not located on the same machine; all are remote web
services and RDBMSs. When it comes to using async delegates, I really
wouldn't base my decision on your arguments (this is not to say async
delegates wouldn't be a good solution).

The reason is that collecting the wait handles and doing a WaitAll on them
is no different from doing the same when manually spawning your own threads
(with the help of Auto/ManualResetEvent objects), calling Join on each
thread, or, like Jon suggested, using a Monitor.

Also, if you consider my brief description of the data collection, it will
take between 5 and 15 seconds (it could take longer), averaging around 10
seconds. With this time frame in mind, the cost of spawning a new thread,
and of any context switching that might take place every now and then, is
fairly cheap. If you don't consider the context, then sure, thread creation
and context switching are expensive operations.

The default size of the thread pool is 25, and it's defined in the
processModel node of machine.config. The pool itself is the most interesting
argument for using either scenario 2 or 3. There is no denying that using
the pool to recycle threads will boost performance; how much is hard to
tell, since we're speaking in terms relative to the actual collecting of
data. If I have a need to create x threads for each call to Fetch and there
are y calls to Fetch each second/minute, then I might as well funnel them
through the pool.

But... the thread pool wouldn't be exclusive to my Fetch method. It would be
shared by my whole application (which, by the way, is a web application),
and any async operations elsewhere will eat away at the pool, leaving the
possibility that the worker threads of the Fetch method queue up and wait,
resulting in a decrease in performance. Increasing the size of the thread
pool could solve this.

Sorry if I'm not very coherent here, but I only got a couple of hours of
sleep last night and I admit that I'm just ranting whatever thoughts spring
into my head while replying to your post :-)

john conwell - 17 Jun 2005 21:06 GMT
Oh, it's a web app... That really makes a difference. From my experience you
definitely don't want to use the thread pool then, because you would be
stealing threads from your site's request handler, since it also uses the
thread pool to service new HTTP requests.

I've played around with this a lot, and it's hard to find a good mix when
each request could kick off multiple threads. Definitely use a web site load
test tool (such as ACP) to prove whether you actually sped things up or
slowed them down.

I had a site that, for a specific request, needed to get 7 result sets of
data (from a web service). I tried many combinations of threading: one
thread per result set, 2 result sets per thread, thread pool, manual
threads. In the end, with the site under moderate load, the fastest method
was to do it synchronously. These web service calls were pretty short, so in
your situation you would get better results, since each call takes 10 - 15
seconds.

Another thing to consider is to create a custom IHttpHandler to intercept
all calls to this page and kick off the threads in the ProcessRequest()
method. Then forward the request on to the desired page to be processed.
Then, in that page, sync back up with the threads using Join(). This way the
threads can get some extra processing time in before they have to sync back
up.

