
.NET Forum / .NET Framework / CLR / December 2007

interrupting flow of a function and/or yielding control

dB. - 21 Dec 2007 20:41 GMT
I am trying to build a workflow system for database detection that
needs to perform thousands of detections in parallel. Most of the time
the detectors sit waiting on network IO to do something. The actual
detector code is fairly thick, with a third party implementing the
actual detectors.

What I am looking to provide is an engine in which I can create a list
of 1000 detectors, then execute the code in each detector all at the
same time without spawning 1000 threads.

Most of the delay is on the network. If I were writing the detectors
myself, I could do something like this:

BeginDetection()
{
...
send a packet
begin receive a packet (callback on OnReceivePacket)
}

OnReceivePacket(...)
{
... finish detection
}

Rebuilding all detectors this way is cumbersome for our purpose.

The question is: can CLR do something for me in terms of interrupting
the flow of a function, saving the stack and coming back to it? The
detectors could yield control too in the appropriate places
explicitly.

Thx
dB.
Jeroen Mostert - 21 Dec 2007 23:35 GMT
<snip>
> What I am looking to provide is an engine in which I can create a list
> of 1000 detectors, then execute the code in each detector all at the
> same time without spawning 1000 threads.

Well, you obviously can't literally do that -- if you want to execute the
code "all at the same time" then multiple threads are inevitable, otherwise
the code can at best be "not quite at the same time". However, I assume that
you just mean that you'd like to limit the number of active threads by not
dedicating them to waiting on I/O.

> Most of the delay is on the network. If I were writing the detectors
> myself, I could do something like this:
[quoted text clipped - 10 lines]
>  ... finish detection
> }

Indeed, that's the classic implementation of asynchronous requests, which
will use a very efficient mix of thread pooling and completion ports in .NET.
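
For reference, here is a minimal sketch of that callback pattern in C#, using the Begin/End pair that every Stream exposes. The class names (CallbackDetector, CallbackDemo) are made up for illustration, and a MemoryStream stands in for the network stream a real detector would use.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical detector: starts a read and finishes the detection in a callback,
// so no thread is dedicated to waiting for the reply.
class CallbackDetector
{
    private readonly Stream _stream;
    private readonly byte[] _buffer = new byte[256];
    public readonly ManualResetEvent Done = new ManualResetEvent(false);
    public int BytesRead;

    public CallbackDetector(Stream stream) { _stream = stream; }

    // Corresponds to BeginDetection(): start the read and return immediately.
    public void BeginDetection()
    {
        _stream.BeginRead(_buffer, 0, _buffer.Length, OnReceivePacket, null);
    }

    // Corresponds to OnReceivePacket(): invoked when the read completes.
    private void OnReceivePacket(IAsyncResult ar)
    {
        BytesRead = _stream.EndRead(ar); // any I/O exception is rethrown here
        Done.Set();                      // ... finish detection
    }
}

static class CallbackDemo
{
    public static int Run()
    {
        // MemoryStream is only a stand-in for a network stream.
        var detector = new CallbackDetector(new MemoryStream(new byte[] { 1, 2, 3 }));
        detector.BeginDetection();
        detector.Done.WaitOne();
        return detector.BytesRead;
    }
}
```

CallbackDemo.Run() returns 3 here. The cost the OP is worried about is visible even at this size: the detection logic is split across two methods, and any state shared between them has to live in fields.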

> Rebuilding all detectors this way is cumbersome for our purpose.

I nevertheless strongly suggest that you consider it. A rewrite to move to
asynchronous I/O, while costly, is something you only do once (well, for
every codebase) and it continues to pay off. Although it may be
"cumbersome", for the most part it's not hard.

> The question is: can CLR do something for me in terms of interrupting
> the flow of a function, saving the stack and coming back to it? The
> detectors could yield control too in the appropriate places
> explicitly.

What you're asking for is a coroutine, something which is not natively
implemented by the framework. That said, you can implement this using the
unmanaged hosting interfaces (that means leaving the comfortable world of
.NET and entering the harsh environs of C++ and Win32). The CLR associates
managed "tasks" with OS threads, and the CLR host can control this
assignment. Take a look at the IHostTaskManager interface and the ICLRTask
interface, and especially the SwitchIn() and SwitchOut() methods of the
latter. Using these, I suspect you could build a coroutine implementation
fairly straightforwardly, possibly using Win32 fibers to ease some of the
load (though there are many, many "details" to get right).

Even so, what you want is not quite comparable to a pure coroutine scenario.
Even if your detector can yield explicitly, it cannot do so *during* I/O
(because that code is not under your control), so it could at best yield
*between* I/O requests. But if I/O is what you're mostly doing, this is of
little use. Your threads will still be preoccupied with idling on I/O.
Managing threads explicitly will allow you to cut down on the number of
threads, but if those few threads are mostly busy doing nothing you haven't
gained much in terms of scalability. Even an unmanaged host has no way of
detecting when code is "waiting on I/O" to reliably switch out the task.

You can detect when the task is waiting in general, though. Implementing the
IHostSyncManager interface will give you precise control over the
synchronization primitives used by the managed code, and most synchronous
I/O is implemented by eventually using one of these primitives to wait for
completion. However, leveraging this effectively to turn threads into
coroutines without introducing deadlocks is a daunting task, to say the
least. Multithreaded programming is difficult enough without having to worry
about the implementation of the synchronization primitives themselves.

If the above sounds complicated to you, that's because it is. If you really
want to go this route, then pick up Steven Pratschner's "Customizing the
Microsoft .NET Common Language Runtime" (ISBN 9780735619883). This book is
pretty much not optional if you want to sink your teeth into hosting, because
the documentation, while pretty good, does not give you the big picture, let
alone the many pitfalls. It took me a good two weeks to implement a
pretty simple host that uses AppDomains as lightweight, reliable,
restartable processes, and that doesn't even touch the more difficult
aspects of hosting. Something as dramatic as what you're asking for sounds
like a multi-month project for the uninitiated, and that's assuming you're
willing/able to muck about with unmanaged code in the first place.

If all you're concerned about is the amount of code you'll have to write to
make the detectors use asynchronous I/O, then you're probably still better
off hacking together some sort of code generator/translator that will
convert the synchronous calls to asynchronous ones for you, or possibly
doing even more dramatic rewrites of the code. While not a pretty solution,
it's still much less involved than implementing coroutines at a low level.

J.

Dave Farquharson - 22 Dec 2007 00:08 GMT
This is a pretty awesome and well thought out reply, and as someone who just
implemented a CLR hosting component myself I also heartily recommend Steven
Pratschner's book if you're thinking about using CLR hosting. It would have
taken me 4 times as long to get working without that book.

-dave

> <snip>
>> What I am looking to provide is an engine in which I can create a list
[quoted text clipped - 90 lines]
> solution, it's still much less involved than implementing coroutines at a
> low level.
Ben Voigt [C++ MVP] - 24 Dec 2007 19:33 GMT
> <snip>
>> What I am looking to provide is an engine in which I can create a list
[quoted text clipped - 40 lines]
> What you're asking for is a coroutine, something which is not natively
> implemented by the framework. That said, you can implement this using the

Actually, the C# yield return statement implements coroutines.

You're correct that it doesn't help with I/O particularly.... unless....
oooh I have an idea.

Make an iterator (an IEnumerator) that yield returns some I/O descriptor,
which a main loop will start asynchronously, providing an appropriate callback.
The callback will invoke IEnumerator.MoveNext() on the associated detector,
starting the next I/O asynchronously.
Jeroen Mostert - 24 Dec 2007 22:58 GMT
<snip>
>> What you're asking for is a coroutine, something which is not natively
>> implemented by the framework. That said, you can implement this using the
>
> Actually, the C# yield return statement implements coroutines.

Well, sort of. It's not intended as a general coroutine construct, since the
two ends aren't equals (the caller of MoveNext() decides which iterator to
resume, but the iterator doesn't decide which routine to yield control to).
"Iterators with closures" is more like it. But yeah, with a prearranged
control type for a return value, you could probably implement every coroutine
scenario. If you don't mind some pretty unintuitive code.
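
To make the asymmetry concrete, here is a small sketch of cooperative scheduling built on iterators. The names (RoundRobin, Worker) are invented for the example. Each routine can give up control with yield return, but it is the scheduler loop, not the routine, that picks who runs next.

```csharp
using System;
using System.Collections.Generic;

static class RoundRobin
{
    // Hypothetical routine: every "yield return" hands control back to the scheduler.
    static IEnumerator<string> Worker(string name, int steps, List<string> log)
    {
        for (int i = 1; i <= steps; i++)
        {
            log.Add(name + i);   // one slice of work
            yield return name;   // the yielded value is ignored by this scheduler
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var ready = new Queue<IEnumerator<string>>();
        ready.Enqueue(Worker("A", 2, log));
        ready.Enqueue(Worker("B", 3, log));

        // The scheduler decides which routine to resume; here it is plain round-robin.
        while (ready.Count > 0)
        {
            IEnumerator<string> routine = ready.Dequeue();
            if (routine.MoveNext())
                ready.Enqueue(routine);  // not finished yet: back of the queue
            else
                routine.Dispose();       // ran to completion
        }
        return log;
    }
}
```

Run() produces the log A1, B1, A2, B2, B3. A "prearranged control type" would replace the ignored string with something the scheduler acts on, for instance the identity of the routine to resume next.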

> You're correct that it doesn't help with I/O particularly.... unless....
> oooh I have an idea.
[quoted text clipped - 3 lines]
> The callback will invoke IEnumerator.MoveNext() on the associated detector,
> starting the next I/O asynchronously.

This sounds like much more work than the OP was gunning for. (Of course, my
hosting suggestion is even *more* work, but hey.)

There are still issues with scalability in this approach, only now you've
shifted the threading issues to the thread pool (assuming this is what we'll
use to kick off the asynchronous requests). The main problem here is that,
whichever way you slice it, the I/O will be done synchronously, so it'll tie
up a thread. If you care about the number of threads involved, you just
can't have 1,000 detectors going simultaneously, because it's going to take
1,000 threads. The only real solution is to rewrite the I/O to be asynchronous.

Using the thread pool does have the benefit of not blowing up the system,
since there's a limit to the number of requests it can have in flight. If
you *must* do lots of synchronous things with as much parallelism as
possible, the thread pool is probably the best way to go.
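
A sketch of that fallback, assuming the detectors stay synchronous: queue every detection on the thread pool and let the pool decide how many run at once. PooledDetectors is a made-up name, and Thread.Sleep is a placeholder for the blocking network call.

```csharp
using System;
using System.Threading;

static class PooledDetectors
{
    // Queue 'count' blocking work items and wait until all of them have finished.
    public static int RunAll(int count)
    {
        int finished = 0;
        using (var allDone = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Thread.Sleep(10);                    // placeholder for blocking network I/O
                    Interlocked.Increment(ref finished); // record the completed detection
                    allDone.Signal();
                });
            }
            allDone.Wait(); // the pool, not this code, chooses the degree of parallelism
        }
        return finished;
    }
}
```

RunAll(50) returns 50. Every item still occupies a pool thread for as long as it blocks, so this bounds the damage rather than removing the cost.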

Actually, the detectors *might* just be doing lots of small I/O requests
that could profitably be broken up with a coroutine pattern, but the OP
hasn't really made it clear whether this is the case. Nor is it clear how
much rewriting of the detector code is acceptable, really.

J.

Barry Kelly - 25 Dec 2007 00:30 GMT
> > You're correct that it doesn't help with I/O particularly.... unless....
> > oooh I have an idea.
[quoted text clipped - 3 lines]
> > The callback will invoke IEnumerator.MoveNext() on the associated detector,
> > starting the next I/O asynchronously.

> There are still issues with scalability in this approach, only now you've
> shifted the threading issues to the thread pool (assuming this is what we'll
[quoted text clipped - 3 lines]
> can't have 1,000 detectors going simultaneously, because it's going to take
> 1,000 threads. The only real solution is to rewrite the I/O to be asynchronous.

If the I/O descriptor that the iterator yields to the main loop is a
delegate which performs the operation asynchronously (and the main loop
calls the delegate directly, rather than on a threadpool), then it could
work properly (assuming the return value, and any exceptions that occur on
the End* call, are communicated back to the iterator). It all depends on the
I/O descriptor.
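
A rough sketch of that arrangement follows. All names are hypothetical (AsyncOp, IteratorDriver, DetectorDemo, ReadResult), and a MemoryStream again stands in for a socket. The detector is written sequentially as an iterator; each yield hands the driver a delegate that starts one asynchronous operation, and the driver resumes the iterator from the completion callback.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// The "I/O descriptor": given a continuation, start one operation asynchronously.
delegate void AsyncOp(Action continuation);

static class IteratorDriver
{
    // Run the detector up to its next yield, start the yielded operation directly
    // (no extra thread), and come back here when that operation completes.
    public static void Step(IEnumerator<AsyncOp> detector, Action onFinished)
    {
        if (detector.MoveNext())
        {
            AsyncOp op = detector.Current;
            op(() => Step(detector, onFinished));
        }
        else
        {
            detector.Dispose();
            onFinished();
        }
    }
}

class ReadResult { public int Bytes; }

static class DetectorDemo
{
    // Hypothetical detector: reads the stream to the end, two bytes at a time.
    static IEnumerator<AsyncOp> Detect(Stream s, ReadResult result)
    {
        var buffer = new byte[2];
        IAsyncResult completed = null;
        int n;
        do
        {
            // Hand the driver a delegate that starts the read. The result is stored
            // before resuming, so EndRead below always sees a completed operation.
            yield return resume =>
                s.BeginRead(buffer, 0, buffer.Length, ar => { completed = ar; resume(); }, null);
            n = s.EndRead(completed); // result and exceptions surface here, in sequence
            result.Bytes += n;
        } while (n > 0);
    }

    public static int Run()
    {
        var result = new ReadResult();
        var done = new ManualResetEvent(false);
        IteratorDriver.Step(Detect(new MemoryStream(new byte[5]), result), () => done.Set());
        done.WaitOne();
        return result.Bytes;
    }
}
```

Run() returns 5. Because the driver only starts an operation after MoveNext() has returned, a callback that fires immediately cannot re-enter an iterator that is still running. What the sketch leaves out is error handling: an exception thrown inside the iterator surfaces on whichever thread ran the callback, so a real driver would need to catch it in Step and route it somewhere useful.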

Rather better, to my mind, would be to code in a continuation passing
style, passing a delegate as the AsyncCallback. Transformation of normal
C# code to CPS is actually mechanical. Compilers could and probably
should be able to do this automatically (see my blog for details - I
talked about this over a year ago, and something in Volta is vaguely
similar, so the latest post has a link back to it).
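
As an illustration of how mechanical the transformation is, here is a hand-converted sketch; the names CpsDemo and ReadTwice are made up and this is not code from the blog mentioned above. Everything that followed a blocking read in the direct-style version becomes the body of the AsyncCallback, and "return" becomes a call to the continuation k.

```csharp
using System;
using System.IO;
using System.Threading;

static class CpsDemo
{
    // Direct style, for comparison:
    //   int a = s.Read(buf, 0, 2);
    //   int b = s.Read(buf, 2, 2);
    //   return a + b;
    //
    // Continuation-passing style: the rest of the method after each read
    // moves into the callback passed to BeginRead.
    public static void ReadTwice(Stream s, byte[] buf, Action<int> k)
    {
        s.BeginRead(buf, 0, 2, ar1 =>
        {
            int a = s.EndRead(ar1);
            s.BeginRead(buf, 2, 2, ar2 =>
            {
                int b = s.EndRead(ar2);
                k(a + b); // "return a + b" becomes a call to the continuation
            }, null);
        }, null);
    }

    public static int Run()
    {
        int total = -1;
        var done = new ManualResetEvent(false);
        var source = new MemoryStream(new byte[] { 1, 2, 3 }); // stand-in for a socket
        ReadTwice(source, new byte[4], n => { total = n; done.Set(); });
        done.WaitOne();
        return total;
    }
}
```

Run() returns 3 (two bytes from the first read, one from the second). Loops and try/catch blocks are where the hand conversion gets painful, which is the argument for having a compiler do it.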

MS seems to have had a penchant for this weird stateful event
subscription style ("On*" event handlers) of async lately, which I don't
understand, because it's harder to use, IMHO - it requires subscribing
and unsubscribing before and after calls to the asynchronous method,
with even more jiggery pokery than async I/O already requires.

-- Barry

http://barrkel.blogspot.com/

Ben Voigt [C++ MVP] - 26 Dec 2007 20:35 GMT
>> > You're correct that it doesn't help with I/O particularly....
>> > unless....
[quoted text clipped - 15 lines]
>> whichever way you slice it, the I/O will be done synchronously, so it'll
>> tie

I specifically called for the I/O to be started asynchronously.

>> up a thread. If you care about the number of threads involved, you just
>> can't have 1,000 detectors going simultaneously, because it's going to
[quoted text clipped - 8 lines]
> exceptions etc. that occur on the End* call). It all depends on the I/O
> descriptor.

If only it were possible to assign a completion handler using the
IAsyncResult then that would be an ideal candidate for the iterator
implementation to return.  The iterator implementation would call yield
return BeginRead, then EndRead and process the result and exception locally.
The main loop would need to attach a completion routine that called
IEnumerator<IAsyncResult>.MoveNext() on the object.  Of course, if the
asyncState parameter is consistently set to the IEnumerator<IAsyncResult>
object then a single completion routine could be reused everywhere.

Something like:

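// One completion routine can serve every detector: it resumes the iterator
// that was passed as asyncState. (The IEnumerator method is MoveNext().)
void theCallback(IAsyncResult r) {
    ((IEnumerator<IAsyncResult>)r.AsyncState).MoveNext();
}

...
// Inside the iterator. Caveat: in a compiler-generated iterator block `this`
// is the enclosing object, so that object would need to implement
// IEnumerator<IAsyncResult> itself for the cast above to succeed.
IAsyncResult iar = s.BeginRead(..., theCallback, this);
yield return iar;
int bytesRead = s.EndRead(iar);
...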

a very straightforward conversion from synchronous to coroutine-based I/O.

> Rather better, to my mind, would be to code in a continuation passing
> style, passing a delegate as the AsyncCallback. Transformation of normal
[quoted text clipped - 8 lines]
> and unsubscribing before and after calls to the asynchronous method,
> with even more jiggery pokery than async I/O already requires.

It also forces you to think in terms of event-driven finite state machines
instead of sequential tasks, which makes it a lot easier to centralize error
handling and makes it more likely that exceptional states are handled correctly
(e.g. the operation never completed, something happened out of order, etc.).

> -- Barry
