.NET Forum / Languages / C# / January 2008

How do you kill a completely locked up thread?

TheSilverHammer - 16 Jan 2008 14:20 GMT
Because C# has no native SSH class, I am using SharpSSH.    Sometimes, for
reasons I do not know, a Connect call will totally lock up the thread and
never return.  I am sure it has something to do with weirdness going on with
the server I am talking to.   Anyhow, this locked up state happens once in a
while (maybe once per day) and I can't figure out how to deal with the locked
up thread.

If I issue a Thread.Abort()  the exception never gets thrown in the thread
because it is locked up.   This seems to be the only C# method I know of to
kill a thread.  Is there some other way to kill off a thread?

One way to simulate this yourself is to create a thread that connects to
a server where the connection takes some time, like 10 to 20 seconds.  While
the thread is doing this connect (it will happen with even a simple TCP/IP
socket connect), issue a Thread.Abort() from another thread (the one that made
the Thread object) and you will see that the ThreadAbortException will NOT be
thrown until the Connect call returns.

Another way to reproduce this is after the connect call is finished and you
start to talk to a server: if you are in a receive data call and the server
stops sending data but never closes the connection, it will block forever.
You will once again not be able to get Thread.Abort() to kill the locked up
thread.

Is there anyone, especially a MSVP who can answer this?
Peter Bromberg [C# MVP] - 16 Jan 2008 14:35 GMT
Here is an article with an approach that lets you make any method call
"time-outable":
http://www.eggheadcafe.com/tutorials/aspnet/847c94bf-4b8d-4a66-9ae5-5b61f049019f/basics-make-any-method-c.aspx

-- Peter
Site: http://www.eggheadcafe.com
UnBlog: http://petesbloggerama.blogspot.com
MetaFinder: http://www.blogmetafinder.com 

> Because C# has no native SSH class, I am using SharpSSH.    Sometimes, for
> reasons I do not know, a Connect call will totally lock up the thread and
[quoted text clipped - 21 lines]
>
> Is there anyone, especially a MSVP who can answer this?
TheSilverHammer - 16 Jan 2008 16:57 GMT
Here is a code snippet from an async callback that I am sure is one of the
causes of my thread being locked up when the SharpSSH shell object dies.

ReadDataCallback = new AsyncCallback(OnReadData);
shell.IO.BeginRead(RecvBuff, 0, RecvBuff.Length, ReadDataCallback, null);

while (true == shell.ShellOpened)
{
    // See if we have data to send
    lock (SendBuffer)
    {
        if (0 != SendBuffer.Length)
        {
            shell.Write(SendBuffer);
            SendBuffer = string.Empty;
        }
    }

    Thread.Sleep(50);
}

I am not sure how to set the ReadDataCallback up so that it can recover from
a hard lockup from the shell object.   The method on the egghead cafe page
doesn't seem to fit this very well.

> Here is an article with an approach that allows to make any method call
> "time-outable":
[quoted text clipped - 29 lines]
> >
> > Is there anyone, especially a MSVP who can answer this?
Ben Voigt [C++ MVP] - 16 Jan 2008 15:37 GMT
> Because C# has no native SSH class, I am using SharpSSH.    Sometimes, for
> reasons I do not know, a Connect call will totally lock up the thread and
[quoted text clipped - 10 lines]
> to
> kill a thread.  Is there some other way to kill off a thread?

If you need to terminate a thread while it's running native code, especially
inside a kernel call, you have no way of knowing what state it is modifying
and keeping it coherent.  You have to assume the whole process is corrupted.

The only safe way to forcibly end a failed thread like you have is to end
the process containing it.

Do you have access to the socket handle for the connection?  If you shut down
(non-gracefully by setting SO_DONTLINGER) the socket from a different thread
then that will probably cause the stuck operation to complete immediately.
TheSilverHammer - 16 Jan 2008 16:43 GMT
Wow, these are some fast replies.  Normally I can go several days without
one.  

Anyway, I am using the SharpSSH class which I did not write, however, I do
have the source code and I suppose I could dig through it to find the socket
calls.  

However, I am not sure where all the deadlocks are happening in it, so it
would be very hard to catch all the problems.  In some cases I think the
connect succeeds and a lockup may occur in a receive data callback which runs
in the main thread of that instance of my object (as opposed to the SSH object).
 

I'll look at the eggheadcafe solution, but I am not sure how applicable it
can be to all the instances of a lockup.   For example, if an event handler
in your class has been called by another object (e.g., the SharpSSH async
callback for data received), you can't wrap that in a method call that can
time out, can you?

> > Because C# has no native SSH class, I am using SharpSSH.    Sometimes, for
> > reasons I do not know, a Connect call will totally lock up the thread and
[quoted text clipped - 21 lines]
> (non-gracefully by setting SO_DONTLINGER) the socket from a different thread
> then that will probably cause the stuck operation to complete immediately.
TheSilverHammer - 16 Jan 2008 17:25 GMT
> If you need to terminate a thread while it's running native code, especially
> inside a kernel call, you have no way of knowing what state it is modifying
> and keeping it coherent.  You have to assume the whole process is corrupted.
>
> The only safe way to forcibly end a failed thread like you have is to end
> the process containing it.

Is there an unsafe way to kill it?  I know it can be done; tools such as
Process Explorer let me select a single thread of my app and kill it.
Jeroen Mostert - 16 Jan 2008 19:26 GMT
> Because C# has no native SSH class, I am using SharpSSH.    Sometimes, for
> reasons I do not know, a Connect call will totally lock up the thread and
[quoted text clipped - 6 lines]
> because it is locked up.   This seems to be the only C# method I know of to
> kill a thread.  Is there some other way to kill off a thread?

Yes, the unmanaged TerminateThread(). However, this doesn't work, in that it
will kill off the thread, but leave approximately zero chance for your
application to continue running successfully. You are guaranteed to corrupt
internal state with this, especially since the CLR gets no chance to cleanly
release resources associated with that thread. Seriously, don't do this.
Your application will probably just deadlock later on the locks the
terminated thread was holding, if it doesn't just crash on corrupted state.

Also, there's no obvious way to find the thread that's blocking. For one
thing, you can kill off the thread corresponding to the Thread object, but
this is not guaranteed to be the thread doing the actual blocking I/O, it
might just be waiting on another thread. As a result, you've just leaked a
thread that's still busy blocking, and worse, the actual I/O is still in
progress, so the socket is unusable. You don't want to repeat this exercise,
as it's a good way to run out of resources fast.

> A way you can simulate this yourself, is create any thread that connects to
> a server where the connection takes some time, like 10 to 20 seconds.    When
> the thread is doing this connect (it will happen with even a simple TCP/IP
> socket connect) issue a Thread.Abort() from another thread (the one that made
> the Thread Object) and you will see that the ThreadAbortException will NOT be
> thrown until the Connect call returns.

Correct. The thread is blocking on I/O, in unmanaged code. You can't end it,
and this is more or less by design. But you shouldn't be too dismayed,
because Thread.Abort() is a bad idea for the same reasons TerminateThread()
is. If a thread needs to end, it should be designed to have exit points
where the application state is known, and it can check a flag or issue a
wait on a user object at those points. Raising an exception in the middle of
anywhere is a good way of corrupting global state.
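
A minimal sketch of the "exit points plus a flag or wait object" design
described above. The class name StoppableWorker and its members are
hypothetical, and the counter increment stands in for one short, bounded unit
of real work; it only helps when the work itself never blocks indefinitely.

```csharp
using System;
using System.Threading;

// Hypothetical worker that stops only at a known-safe exit point.
public sealed class StoppableWorker
{
    private readonly ManualResetEvent stopRequested = new ManualResetEvent(false);
    private readonly Thread thread;
    private int iterations;

    public StoppableWorker()
    {
        thread = new Thread(Run);
        thread.IsBackground = true;
    }

    // Number of completed units of work so far.
    public int Iterations
    {
        get { return Interlocked.CompareExchange(ref iterations, 0, 0); }
    }

    public void Start()
    {
        thread.Start();
    }

    // Asks the worker to stop and waits up to timeoutMs for it to do so.
    // Returns false if the worker did not reach an exit point in time.
    public bool Stop(int timeoutMs)
    {
        stopRequested.Set();
        return thread.Join(timeoutMs);
    }

    private void Run()
    {
        // The wait doubles as the pause between units of work and the exit check.
        while (!stopRequested.WaitOne(50))
        {
            Interlocked.Increment(ref iterations); // stand-in for one bounded unit of work
        }
        // Exit point: application state is known here, so cleanup can run normally.
    }
}
```

The owner calls Start() once and Stop(timeoutMs) when it wants the thread gone;
a false return from Stop means the worker is stuck somewhere between exit
points, which is exactly the situation the rest of this thread is about.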

> Another way you can do this is after the connect call is finished and you
> start to talk to a server, if  you are on a receive data call and the server
> stops sending data but never closes the connection, it will block forever.  
> You will once again not be able to get Thread.Abort() to kill the locked up
> thread.

Same thing.

> Is there anyone, especially a MSVP who can answer this?

I'm not an MSVP but I've seen this so many times in our codebase that it's
not funny anymore. The one way to cancel pending I/O on a socket and unwedge
threads blocking on that is to close the socket from another thread and
handle the resulting exceptions. Nothing else will do, at least nothing that
can be called reliable. Of course, this means tearing down the connection,
but that's still a whole lot better than tearing down your process.
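
A rough sketch of the close-from-another-thread technique, using a plain
System.Net.Sockets.Socket rather than SharpSSH (whose internals are not shown
in this thread). The class SocketUnblockDemo and the loopback "silent server"
are illustrative assumptions. Which exception the blocked call receives can
differ between platforms and runtime versions, so the sketch catches both
likely candidates and does not rely on a specific error code.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

public static class SocketUnblockDemo
{
    // Returns a description of how the blocked Receive call ended.
    public static string Run(int closeAfterMs)
    {
        // Loopback server that accepts a connection and then never sends anything.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, port);
        Socket silentPeer = listener.AcceptSocket();

        string outcome = "still blocked";
        Thread reader = new Thread(() =>
        {
            try
            {
                byte[] buffer = new byte[256];
                int n = client.Receive(buffer); // blocks: the peer never sends
                outcome = "returned " + n;
            }
            catch (SocketException ex)
            {
                outcome = "SocketException: " + ex.SocketErrorCode;
            }
            catch (ObjectDisposedException)
            {
                outcome = "ObjectDisposedException";
            }
        });
        reader.IsBackground = true;
        reader.Start();

        Thread.Sleep(closeAfterMs); // give the reader time to block in Receive
        client.Close();             // close the socket from a different thread
        reader.Join(5000);          // the reader should now unwind through a catch block

        silentPeer.Close();
        listener.Stop();
        return outcome;
    }
}
```

Calling SocketUnblockDemo.Run(500) should return something other than
"still blocked". The connection is gone afterwards, as noted above, but the
thread that was waiting on it exits through ordinary exception handling.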

The other alternative, which is less straightforward but suits some designs
better, is to make sure that threads never issue I/O which can take
"forever". Almost every I/O call has a timeout parameter, and for those that
don't there's always asynchronous I/O and
ThreadPool.RegisterWaitForSingleObject(). When the call returns with a
timeout, either poll, decide to wait some more or give up and close the
socket, which you can then do from the same thread that owns the socket,
simplifying error handling.
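
A sketch of the timeout-based alternative for plain sockets, assuming you
control the socket calls (which is not the case with SharpSSH unless you modify
its source). TimedSocket, ConnectWithTimeout and ReceiveOrTimeout are
hypothetical helper names; BeginConnect/EndConnect, ReceiveTimeout and
SendTimeout are standard System.Net.Sockets members.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class TimedSocket
{
    // Connects with an upper bound on the wait.
    // Returns a connected socket, or null on timeout or connection failure.
    public static Socket ConnectWithTimeout(IPAddress address, int port, int timeoutMs)
    {
        Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        IAsyncResult ar = socket.BeginConnect(address, port, null, null);

        if (!ar.AsyncWaitHandle.WaitOne(timeoutMs))
        {
            socket.Close(); // gives up on the pending connect
        }

        try
        {
            socket.EndConnect(ar); // pairs with BeginConnect; throws if closed or failed
            socket.ReceiveTimeout = timeoutMs;
            socket.SendTimeout = timeoutMs;
            return socket;
        }
        catch (SocketException) { }
        catch (ObjectDisposedException) { }

        socket.Close();
        return null;
    }

    // Returns the number of bytes read, 0 if the peer closed the connection,
    // or -1 if nothing arrived within the socket's ReceiveTimeout.
    public static int ReceiveOrTimeout(Socket socket, byte[] buffer)
    {
        try
        {
            return socket.Receive(buffer);
        }
        catch (SocketException ex)
        {
            if (ex.SocketErrorCode == SocketError.TimedOut)
            {
                return -1;
            }
            throw;
        }
    }
}
```

On a -1 result the owning thread can poll again, wait longer, or close the
socket itself. It is probably safest to treat a socket whose receive has timed
out as suspect and close it rather than keep reusing it.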

I understand it's not your code, but trust me: you'll want to rewrite it
anyway, unless you can afford restarting your application every so often.

J.

TheSilverHammer - 16 Jan 2008 20:43 GMT
> I'm not an MSVP but I've seen this so many times in our codebase that it's
> not funny anymore. The one way to cancel pending I/O on a socket and unwedge
[quoted text clipped - 14 lines]
> I understand it's not your code, but trust me: you'll want to rewrite it
> anyway, unless you can afford restarting your application every so often.

Grrr..  Damn post thing asked me to log in again and ate my post...  Anyway...

The SharpSSH code base has a bunch of classes and would take a major effort
to re-write.  It is clear that it is unfinished from looking at it.   I am
not sure my company wants to fund me re-writing this code set.

However, the solution Peter Bromberg gave on his web site looks good except
for what appears to me to be a big hole or leak.  I do not understand how C#
handles this, so maybe it is a non issue.  The following is the code segment
from his web site, I hope he doesn't mind me posting it:

public ArrayList DoWorkNeedsTimeout(ArrayList alin, int secondsToWait)
{
    ArrayList alOut = new ArrayList();

    // Create an instance of our delegate, pointing to the helper method:
    DoWorkNeedsTimeoutDelegate deleg =
        new DoWorkNeedsTimeoutDelegate(DoWorkWithTimeout);

    // Call BeginInvoke on delegate.
    // Note on last two parameters of Delegate BeginInvoke Method:
    // 1) callback: not used here, we can pass null
    // 2) state: not used, pass an instance of object in the required
    //    parameter location
    // Invoke the delegate passing the parameters and get the
    // IAsyncResult object in "ar":
    IAsyncResult ar = deleg.BeginInvoke(alin, secondsToWait, null, new object());

    // if the WaitOne method times out before we get a result, it will be false:
    if (!ar.AsyncWaitHandle.WaitOne(5000, false))
    {
        // handle timeout logging / notification here - Syslog,
        // Database, Email - whatever you need
        alOut.Add("TIMED OUT!");
    }
    else  // we didn't time out:
    {
        // get the result of the method call here
        alOut = deleg.EndInvoke(ar);
    }

    return alOut;
}

What he is doing is making a delegate to call BeginInvoke with and then
using the IAsyncResult to wait for a time period.  If the time period
expires, then his thread continues on.  If it doesn't expire, he calls
EndInvoke().   This looks good except for the issue of dealing with a truly
locked-up thread.

BeginInvoke() uses a thread from the thread pool, right?  So what happens if
that thread never returns so you can call EndInvoke?  Is it gone from the
thread pool forever?  If you repeat this loop 1000s of times and even if 1%
of the time you get a locked up thread, won't you run out of threads?

The only way this can work indefinitely, which it may, is if the garbage
collector will reclaim the thread once the delegate and other related objects
are out of scope.  Is this how it works?
Jeroen Mostert - 16 Jan 2008 22:26 GMT
>> I'm not an MSVP but I've seen this so many times in our codebase that it's
>> not funny anymore. The one way to cancel pending I/O on a socket and unwedge
[quoted text clipped - 20 lines]
> to re-write.  It is clear that it is unfinished from looking at it.   I am
> not sure my company wants to fund me re-writing this code set.

SSH is widely implemented, though, and you will probably want a proven
implementation, given the security concerns. Delegating to a good unmanaged
library (if the interface isn't too horrible to P/Invoke to) may be a better
option. You can also consider using an ActiveX control: there's good support
for this in .NET, and standalone components were all the rage in the VB days
for a reason. Alternatively, use a standalone SSH application and pull its
strings from the managed application, which is an ugly but venerable hack.
Last but certainly not least -- write it in another language where you do
have a mature library at your beck and call.
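
A sketch of the "standalone application" option, assuming some command-line
SSH client is available; no particular program is implied, and the helper
names ExternalTool and RunWithTimeout are hypothetical. The point is only that
a child process, unlike a thread, can be killed without touching the state of
the managed application.

```csharp
using System;
using System.Diagnostics;

public static class ExternalTool
{
    // Runs a program and kills it if it does not exit within timeoutMs.
    // Returns the exit code, or null if the process had to be killed.
    public static int? RunWithTimeout(string fileName, string arguments, int timeoutMs)
    {
        ProcessStartInfo info = new ProcessStartInfo(fileName, arguments);
        info.UseShellExecute = false;
        info.CreateNoWindow = true;

        using (Process process = Process.Start(info))
        {
            if (process.WaitForExit(timeoutMs))
            {
                return process.ExitCode;
            }

            try
            {
                process.Kill(); // the OS reclaims the child's handles and sockets
            }
            catch (InvalidOperationException)
            {
                // The process exited on its own between the wait and the kill.
            }

            process.WaitForExit(); // wait until the process is really gone
            return null;
        }
    }
}
```

A null result means the tool hung and was killed. Whatever the tool was in the
middle of writing may be incomplete, so the caller still has to treat its
output as suspect, but the caller's own threads and locks are unaffected.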

.NET still suffers from the "everything old is new again" syndrome where
everyone is reinventing the wheel in the new languages, which under some
circumstances can be a big waste of time and money. Just because you're now
using C# doesn't mean all your libraries have to be. I see my colleagues
falling into the same trap; one of them tried to "leverage" a Java library
by automatically converting it to C# and then ignoring the warnings. The
results were, as you can imagine, not pretty, and guess who got to fix the
crashes? Meanwhile, the Java applications continued to run just fine with
their "old" library and "legacy" code.

> However, the solution Peter Bromberg gave on his web site looks good except
> for what appears to me to be a big hole or leak.  I do not understand how C#
[quoted text clipped - 38 lines]
>                alOut = deleg.EndInvoke(ar);
>            }

This is wrong. Every call to .BeginInvoke() must have a corresponding call
to .EndInvoke(), to free up any resources that .BeginInvoke() set up. This
is irrespective of whether you've happened to hit a timeout waiting on the
async handle. People violate this rule all over the place, though, because
it seems to work, but even when it actually does work (because
.BeginInvoke() happens not to claim any additional resources) it's a bad
habit to get into. Don't believe me, believe the MSDN:
http://msdn2.microsoft.com/en-us/library/2e08f6yc(VS.80).aspx

Of course, our hands are forced here because .EndInvoke() would block until
the underlying method actually completed, but this just demonstrates why
this can't actually work. You're leaving a delegate call up in the air, but
forgetting about it isn't going to make it go away. (In this case, we can
easily fix things by passing a callback to the .BeginInvoke() that will call
.EndInvoke(), but it's all irrelevant anyway if the delegate never completes.)

>            return alOut;
>
[quoted text clipped - 5 lines]
> EndInvoke().   This looks good except for the issue of dealing with a truly
> locked-up thread.

Yes, exactly. Wrapping everything in another asynchronous invocation does
*nothing* for the blocking problem. What you create here is just a wrapper
that can indeed be abandoned at will, but this doesn't cancel the underlying
blocking method, it just tosses aside the delegate invocation.

> BeginInvoke() uses a thread from the thread-pool right?

Yes.

> So what happens if that thread never returns so you can call EndInvoke?

You can always call .EndInvoke(). It will just block until the delegate
completes.

> Is it gone from the thread-pool forever?

Well, it's still a part of the thread pool, it just never becomes available
for other tasks again. So the number of available TP threads will steadily
decrease.

> If you repeat this loop 1000s of times and even if 1% of the time you get
> a locked up thread, won't you run out of threads?

That's exactly what will happen, and it's easy to test. Use the above code
with a delegate that just does "for (;;) Thread.Sleep(10);" and observe.
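
A variation on that experiment that can be run in one shot. This sketch queues
work items directly with ThreadPool.QueueUserWorkItem instead of going through
a delegate's BeginInvoke, and parks them on an event rather than a sleep loop
so the demo can release them at the end; PoolStarvationDemo is a hypothetical
name. The reported difference assumes nothing else is using the pool at the
same time.

```csharp
using System;
using System.Threading;

public static class PoolStarvationDemo
{
    // Parks `count` pool threads in a wait that never completes on its own and
    // returns how many fewer worker threads the pool reports as available.
    public static int Run(int count)
    {
        int before, after, ioThreads;
        ThreadPool.GetAvailableThreads(out before, out ioThreads);

        ManualResetEvent release = new ManualResetEvent(false);
        CountdownEvent started = new CountdownEvent(count);
        CountdownEvent finished = new CountdownEvent(count);

        for (int i = 0; i < count; i++)
        {
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                started.Signal();
                release.WaitOne(); // stand-in for a call that never returns
                finished.Signal();
            });
        }

        started.Wait(); // all work items are now occupying pool threads
        ThreadPool.GetAvailableThreads(out after, out ioThreads);

        release.Set();   // a genuinely stuck call offers no such way out
        finished.Wait();
        return before - after;
    }
}
```

Each parked work item holds one pool thread for as long as it stays parked.
With a call that truly never returns, that count only ever grows until the
pool's maximum is reached.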

This approach is only useful if you don't care that you can't abort an
action that goes on longer than your timeout, but you just need to log when
it does. It doesn't give you any magical ability to abort the action. The
action still needs to complete on its own eventually if you don't want to
run out of resources.

> The only way this can work indefinitely, which it may, is if the garbage
> collector will reclaim the thread once the delegate and other related objects
> are out of scope.  Is this how it works?

No. If it worked that way, you could never have background threads unless
they were referenced by other threads. In a sense, a Thread object is always
"referenced" by the underlying thread. They're not collected until the
underlying thread exits, and if the underlying thread never exits, well,
that's too bad.

J.

TheSilverHammer - 17 Jan 2008 18:33 GMT
So the basic lesson here is that a locked up thread is unrecoverable.   The
only thing you can do about it is abandon the thread and move on.  If you
have an application which is supposed to run persistently for days or weeks
at a time, it will have to be restarted to reclaim the resources.

In my case, unless I do major repairs on the SharpSSH class, I will have the
occasional unrecoverable threads.

This kind of stinks.  I wonder if there was a way that MS could write a
thread that could be terminated safely.    If you can do that with a process,
why can't you do it with a thread?   Is there a way to create a process as a
thread that can be killed?
Peter Duniho - 17 Jan 2008 19:09 GMT
> [...]
> In my case, unless I do major repairs on the SharpSSH class, I will have  
> the
> occasional unrecoverable threads.

Yup.  One of the risks of using third-party code is that if the code sucks  
(whether because it's poorly designed or just a work in progress), there's  
not much you can do about it.  At least in this case, it sounds like you  
_could_ try to fix the library (I don't know anything about the library,  
so I'm just taking that from your comments).

> This kind of stinks.  I wonder if there was a way that MS could write a
> thread that could be terminated safely.    If you can do that with a  
> process,
> why can't you do it with a thread?

You can't really do it with a process either.

This isn't something that Microsoft can really solve.  The lack of safety  
has to do with what the code executing in the thread or process is doing,  
and in particular the inability for someone outside the code to know for  
sure what that is.  It is possible to write code that, if interrupted  
unexpectedly, leaves things in an indeterminate state.

If you are the one writing the code executing on a thread, there are some  
situations in which you could know that aborting the code is safe.  But if  
you're the one writing the code, there's no need to do so.  You can just  
design the code correctly, so that it's abortable in a well-defined way  
instead.

If you're not the one writing the code, then you don't know whether the  
nature of the code is such that it's safe to abort at some arbitrary point  
of execution.  Thus, it's not safe to do.  But there's not really any  
practical way for Microsoft to change that.  It's not about how the OS  
manages the thread, it's about the fact that code executing in a thread  
could be doing _anything_.

Pete
Jeroen Mostert - 17 Jan 2008 19:36 GMT
> So the basic lesson here is that a locked up thread is unrecoverable.   The
> only thing you can do about it is abandon the thread and move on.

Well, I'd phrase it differently: threads must never lock up *because*
there's no acceptable way to deal with them. If you've got a thread that
could block forever, you've got a bug, simple as that. You have to get an
answer if you ask "so what guarantees that this wait here will be satisfied
eventually?" and if the answer is "the kindness of strangers", you lose.

> If you have an application which is supposed to run persistently for days
> or weeks at a time, it will have to be restarted to reclaim the
> resources.

And that's assuming the application will clean up everything when it stops.
The OS will guarantee that most resources are released, but that's not the
same thing as exiting cleanly (an open file will be closed, for example, but
what's *in* the file when it is?)

> This kind of stinks.  I wonder if there was a way that MS could write a
> thread that could be terminated safely.    If you can do that with a process,
> why can't you do it with a thread?   Is there a way to create a process as a
> thread that can be killed?

You can't terminate a process safely either! The keyword here is "safely".
The best thing that happens when you kill off a process is that the OS will
reclaim the resources it associated with that process -- forcibly. For
memory, this doesn't matter; for a socket, this means a connection reset;
for a file, it's probably data loss. This is nothing to get enthusiastic
about, even if it's a good step up from crashing the computer.

Terminating a thread means your application state is hosed. There's nothing
the OS can do to make this "safe", since it knows diddly about your
application's internal state. It can't even track OS resources for every
thread to release them, because there's no notion of ownership beyond the
process. Threads share the process state, including any resources, so just
releasing anything a terminated thread allocated would be wrong.

J.

TheSilverHammer - 17 Jan 2008 22:58 GMT
If they can do it with processes, why can't they do it with threads?

I am sure they can't guarantee that everything will be fine if my code
doesn't anticipate resources disappearing, but if I do, I should be able to
do it safely.

For example:

I have a MyThread and then I have the thread procedure which opens a bunch
of files, sockets, and all that.   If MyThread is killed, the OS can recover
all that stuff.  If MyApplication is the one calling the ThreadKill, then
windows should say, "OK, well you made it, so if you want to kill it, you
must know what you are doing."

If in my thread I do something like:

MyList = new List<string>();

And then when I kill the thread, Windows says the List was created in the
thread and therefore will be nuked; it is my problem.  I could write my app in
such a way that I know where stuff was allocated so that I could expect
MyList to go away.  The CLR could go as far as making any references to
MyList null or just throwing an exception if I try to use it (besides
assigning it a new value).

All a thread has to be is a bag of 'stuff' and if it goes bad, toss it all
out, and as long as there are 'rules' which I can expect to follow, I could
deal with it.  They only need one simple rule:  If it was opened, allocated,
or created in a thread, then when the thread is killed (not exits) it would be
closed, freed, destroyed, etc...

Having said all that, I understand the sentiment about writing good code and
how none of this is necessary.  Unfortunately, that is an 'if the world were
perfect...' point of view in an imperfect world.

In this particular case, I need SSH, which for some reason Microsoft doesn't
seem to see fit as being a core protocol for C# (or .NET in general).   I
suggested this on the community sites, and got a 'resolved' and 'won't fix'
with no reasons supplied.  The only valid reason I can think of is because
SSH support is in the works, however after much googling I can't find any
hint about official MS SSH support.  With their big security push, and SSH
being a cornerstone in network security management, this makes absolutely no
sense.  Maybe they are waiting until the security crowd starts beating them
with a stick and hail it as yet another reason to use Linux.  How long would
it take a few of MS well trained developers to put out a great SSH suite for
.NET?  Ignoring the bureaucracy, it should only take a few actual weeks of
development time.

This leaves me with a choice of writing my own implementation or using some
other library.   My employer is not going to want me to spend several weeks
to write my own or fix this SharpSSH library.  Personally, I wouldn't mind,
but really, I have a lot to do.

Considering we are living in an imperfect world, we should try to be
accommodating.  Yes, the right thing is to NOT screw something up, but it
WILL happen.   The proper thing isn't to stand around and talk about how it
should have been done right, and if it was all your problems would go away.

Microsoft's job on this kind of issue is to make life as a programmer as
easy as possible.  I will grant you that compared to OS X and Linux stuff,
Microsoft is a rock-star, but in a more absolute sense there is a lot they
could do much, much better.

For example, the current issue, Locked up threads.   Granted a good program
will never have this problem, but a realistic response outlook would be that
we have to deal with 'bad' things.    A better approach would be for MS to
figure out a way to create a thread and provide some kind of emergency
recovery system.  You could make it a special kind of thread used to run
unsafe stuff and the architecture will save you from what is in the thread if
worst comes to worst.  It would be like a container for uranium.   You have
to use it, and you hope nothing goes wrong, but if it does, it is contained.

Another way (not to drag this rant any longer) to look at this is to look
back in the days where there was no memory protection for applications.  One
rogue application could bring the entire system down.   To take today's
outlook on threads and apply it to that, it would be the same thing as simply
saying, "Clearly the solution to rogue applications is to not run rogue
applications."  Ignoring the fact that AwsomeApp.exe is the ONLY app that
does what you need.

No, I do not expect anyone here to be able to do anything about this.   I do
not know, and would doubt, that any MS big-wigs (ones with enough power to
actually do something) read this kind of stuff and would care enough to do
anything about it.

Having said all that, the squeaky wheel gets the kick, so griping about
issues like this might instill even more griping until "The powers that be"
at MS can't stand it anymore and decide to do something.

Anyway, to all who have helped me, thanks.   I would like to suggest to Peter
Bromberg that he put a warning on the solution he proposed, or in fact
remove it.   The solution leaves bound up threads and resources, and if an
application repeats that more than 50 times, it will cease working until it
is restarted.  It is OK for a program that isn't going to iterate over that
more than a few times, but it is a death trap for anything that does.
Peter Duniho - 17 Jan 2008 23:42 GMT
> If they can do it with processes, why can't they do it with threads?

Can do what with processes?  We've already explained that you can't safely  
terminate a process any more than you can safely abort a thread.

> I am sure they can't guarantee that everything will be fine if my code
> doesn't anticipate resources disappearing, but if I do, I should be  
> able to
> do it safely.

It's not an issue of resources "disappearing".  It's an issue of them  
being left in an inconsistent state.

There is no way for the _operating system_ to ensure that things are left  
in a consistent state.  Implementors of various data structures can do  
things to make sure they are always in a consistent state (e.g. see  
"journaled" or "transaction-based"), but that's up to the implementor.  
The OS has no way to do this (though it might provide APIs to help an  
implementor do it).

> For example:
>
[quoted text clipped - 3 lines]
> recover
> all that stuff.

No, it can't.  All data within a process is owned by the _process_, unless  
it's been specifically marked as thread data (*).  The OS has no way to  
know whether killing a thread allows that data to be cleaned up or not.

(*) (I'm not sure whether .NET supports this, but it is supported in the  
unmanaged Windows API...I'm seeing a Thread.AllocateDataSlot() method, and  
I suspect this addresses the same issue in managed code.  In any case,  
note that it only addresses specific thread-local storage, not the OS  
objects that might be referenced by that storage, as those are still  
per-process and cannot be released when the thread terminates).

But even if it did have a way to know what data could be cleaned up,  
_that's not the problem_.  Cleaning things up is the least of the  
worries.  It's the fact that software _does_ stuff, and if it's  
interrupted in the middle of _doing_ that stuff, whatever data the  
software is operating on could be in an inconsistent state.

Most of your rant seems to be about this question of cleaning up, but  
that's not the main problem.  That's not what makes killing threads or  
processes unsafe, and coming up with a paradigm in which you can ensure  
things are cleaned up would _not_ make killing threads or processes a safe  
operation.

As far as your specific problem goes, there's no point in complaining that  
SSH isn't supported in .NET (assuming it's not...I know .NET does have a  
lot of crypto stuff in it, and it's possible that you could easily write  
an SSH implementation just by combining that with the usual network i/o  
stuff).  .NET can't possibly implement _everything_, even as with each  
iteration it does support more and more.

If a specific library isn't doing what you need or want, you can either  
find a different library or write it yourself.  Programmers all over the  
world make these kinds of decisions every day, and it's just not a big  
deal.  Note that you are not limited to using a managed code library.  
With p/invoke you should be able to use pretty much whatever library you  
find useful.

I will point out that your assertion that Microsoft could publish an SSH  
library "in a few weeks time" is absurd.  No reputable software publishing  
company does _anything_ "in a few weeks time".  It would take _way_ more  
than a few weeks just to properly _test_ such a library, never mind  
implement it correctly.  Granted I have very little specific knowledge of  
SSH, but I would guess that it would take at least three staff members  
(programmer, tester, and a program manager to manage the specification for  
the feature) something like 6-12 months, for a potential cost of up to  
three man-years.

Even if it _were_ just a few weeks' worth of work, it boggles my mind that  
you would on the one hand say that Microsoft should do this work, and on  
the other hand write "My employer is not going to want me to spend several  
weeks to write my own".  Don't you think Microsoft already has their own  
things they are trying to get done?  Surely if this is an important enough  
feature for your need to justify them implementing it, it's important  
enough to justify _you_ doing whatever work is needed on your own to get  
it into your product.

Maybe it will get into .NET eventually, maybe it won't.  But making  
fanciful claims about how easy it would be to implement doesn't help your  
case any.  If it's really that easy, write it yourself.

And please keep in mind that designing and implementing an operating  
system is a lot harder than you seem to think it is.  I think it's safe to  
say that if dealing with hung threads were really as easy as you claim it  
is, Windows and every other OS would already do it.  But there's not a  
single OS I can think of off the top of my head that can allow a thread or  
process to be safely terminated without the risk of causing data integrity  
problems.

Pete
Ben Voigt [C++ MVP] - 18 Jan 2008 13:10 GMT
>> If they can do it with processes, why can't they do it with threads?
>
> Can do what with processes?  We've already explained that you can't safely
> terminate a process any more than you can safely abort a thread.

Sure you can.  Ok, maybe not an arbitrary process, but it's fairly easy
(depending on what resources are required by your requirements) to design a
process that can be terminated at any point in time.  It's even easier to
manage exiting your own process, even with hung threads.  Theoretically you
can also create a thread that can be safely terminated, but... not with
.NET.  .NET holds internal state and accesses it willy-nilly from any
threads in a way that's threadsafe but not abort safe.  However, .NET
doesn't implement any external state on its own, only what you ask it to, so
you can manage your external resources in such a way that it's ok for the
process to be interrupted (for example, instead of writing data files that
could be left inconsistent, store your data in an ACID database using
transactions).

>> I am sure they can't guarantee that everything will be fine if my code
>> doesn't anticipate a resources disappearing, but if I do, I should be
[quoted text clipped - 10 lines]
> The OS has no way to do this (though it might provide APIs to help an
> implementor do it).

Yup, and the problem is that the .NET implementation uses hidden
process-local resources without doing any of this, so no matter what code
you tag on top, calling TerminateThread is gonna crash the process.

>> For example:
>>
[quoted text clipped - 26 lines]
> things are cleaned up would _not_ make killing threads or processes a safe
> operation.

One reasonable approach, as long as this SharpSSH library doesn't use any
external resources except sockets, would be to put that component in a
separate process, communicate back and forth with Remoting, and provide at
least one call that causes said process to free any shared resources and
then call ExitProcess (.NET Application.Exit?) to free the hung thread(s).
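
A minimal sketch of the process-isolation idea above, assuming the risky work has been moved into a separate helper executable. The class and method names are hypothetical, and the communication channel (Remoting or otherwise) is left out; only the "give it a deadline, then kill the whole process" part is shown.

```csharp
using System;
using System.Diagnostics;

static class IsolatedRunner
{
    // Hypothetical helper: starts the given executable and waits up to
    // timeoutMs for it to exit. Returns true if it exited on its own;
    // otherwise kills the process and returns false.
    public static bool RunWithDeadline(string fileName, string arguments, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process helper = Process.Start(startInfo))
        {
            if (helper.WaitForExit(timeoutMs))
                return true;

            // The helper is presumed hung. Killing the whole process lets the
            // OS reclaim its threads, handles and sockets in one step, which
            // is exactly what cannot be done for a single thread.
            helper.Kill();
            helper.WaitForExit();
            return false;
        }
    }
}
```

The caller's own state is untouched when the helper is killed, so this only works if anything the helper shares with the outside world (files, database rows) is safe to abandon halfway, as described above.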
Willy Denoyette [MVP] - 18 Jan 2008 14:05 GMT
>>> If they can do it with processes, why can't they do it with threads?
>>
[quoted text clipped - 7 lines]
> you can also create a thread that can be safely terminated, but... not
> with .NET.

Terminating a thread using TerminateThread is safe only as long as you know
exactly what the thread is doing at the moment the OS kills it, and that is
exactly what's impossible to know when calling into arbitrary code.
Whenever your thread runs arbitrary code (third party or not) you can't
safely terminate the thread, because you have no idea what the thread
is doing. This has nothing to do with .NET; this is about Windows.
Run a simple native code program and terminate a thread (using the
TerminateThread Win32 API) while it's allocating memory from the heap, and
all subsequent heap allocations or releases from other threads will block
forever.
Or terminate a thread while it's executing in a critical section: this CS
will never get released (well, not until the process terminates), and
another thread that tries to enter the CS will deadlock....

Willy.
Ben Voigt [C++ MVP] - 18 Jan 2008 16:07 GMT
>>>> If they can do it with processes, why can't they do it with threads?
>>>
[quoted text clipped - 19 lines]
> successive heap alloc's or heap releases from other threads will now block
> forever.

Only if it's using a shared heap...

> Or terminate a thread while he's executing in a critical section, this CS
> will never get released (well, actually when the process terminates),
> another thread that tries to enter the CS will deadlock....

But you can use a kernel mutex instead, then it'll be marked as abandoned
and you can recover.
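
A small sketch of what "marked as abandoned" looks like from C#, assuming the managed System.Threading.Mutex stands in for the kernel mutex. The class and method names are made up for illustration, and the owner here simply exits without releasing rather than being killed with TerminateThread.

```csharp
using System;
using System.Threading;

static class AbandonedMutexDemo
{
    // Returns true if the waiter was told that the previous owner went away
    // while still holding the mutex.
    public static bool AcquireAfterOwnerDied()
    {
        using (var mutex = new Mutex())
        {
            // The worker takes the mutex and exits without releasing it,
            // standing in for a thread that died inside the protected region.
            var worker = new Thread(() => mutex.WaitOne());
            worker.Start();
            worker.Join();

            try
            {
                mutex.WaitOne();
                mutex.ReleaseMutex();
                return false;
            }
            catch (AbandonedMutexException)
            {
                // The wait still succeeded and this thread now owns the mutex,
                // but whatever data it protects must be treated as suspect.
                mutex.ReleaseMutex();
                return true;
            }
        }
    }
}
```

Unlike a critical section, the waiter is not left blocked forever; it is told what happened and can decide how to repair or discard the protected state.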

My point was that .NET in particular does a bunch of stuff that is not abort
safe.  This is far from saying that .NET is the only library that isn't
abort safe, but there is nothing inherently unsafe about Win32 itself.

> Willy.
Willy Denoyette [MVP] - 18 Jan 2008 16:46 GMT
>>>>> If they can do it with processes, why can't they do it with threads?
>>>>
[quoted text clipped - 21 lines]
>
> Only if it's using a shared heap...

I'm talking about real-world applications (.NET or not) calling into
arbitrary code. How would a caller know what allocator is being used? I'm
talking about Windows applications calling into library code that allocates
from the process heap, CRT heap or from the COM heap, that is, allocates
from the heap manager (ntdll). You can't safely kill threads that are
executing in these libraries.

>> Or terminate a thread while he's executing in a critical section, this CS
>> will never get released (well, actually when the process terminates),
>> another thread that tries to enter the CS will deadlock....
>
> But you can use a kernel mutex instead, then it'll be marked as abandoned
> and you can recover.

Again, I'm calling into arbitrary code. Say I'm calling into the Winsock
library like the OP is doing.... and this library is using critical sections
all the way down.

> My point was that .NET in particular does a bunch of stuff that is not
> abort safe.  This is far from saying that .NET is the only library that
> isn't abort safe, but there is nothing inherently unsafe about Win32
> itself.

What do you call Win32?
A thread that executes arbitrary code (native code libraries, whatever)
cannot safely be killed (using TerminateThread). That's why the CLR
refuses to kill a thread (using TerminateThread) that is currently running
"unmanaged" code: the CLR waits for the thread to return to managed code,
then checks whether a thread abort has been issued, and gracefully aborts
the thread if so (not using TerminateThread!).
Again, it's unsafe to call Win32's TerminateThread, unless you know it's
not.

Willy.
Ben Voigt [C++ MVP] - 18 Jan 2008 19:10 GMT
>>>>>> If they can do it with processes, why can't they do it with threads?
>>>>>
[quoted text clipped - 30 lines]
> allocates from the heap manager (ntdll). You can't safely kill threads
> that are executing in these libraries.

I wasn't talking about arbitrary code.  I was challenging your statement
"this has nothing to do with .NET, this is about Windows".  It is not
possible to write abort safe code in .NET.  Of course not all code written
without .NET is abort safe, but the act of using .NET prevents being abort
safe, whereas the act of using Windows does not.  So this isn't exclusive to
.NET, but it isn't applicable to Windows in general like it is to .NET.

BTW you can use the heap manager (ntdll) with private heaps.  You could not
safely free such a heap after a thread was aborted while allocating from it,
but it would not block other threads either.
Willy Denoyette [MVP] - 18 Jan 2008 20:52 GMT
>>>>>>> If they can do it with processes, why can't they do it with threads?
>>>>>>
[quoted text clipped - 41 lines]
> not safely free such a heap after a thread was aborted while allocating
> from it, but it would not block other threads either.

I was talking about arbitrary code, and by that I mean calling arbitrary code
from whatever environment you see fit (.NET or other) on Windows. That's why
I keep saying that it has nothing to do with .NET.

Whenever you call *TerminateThread* to kill a thread that is running code
you didn't *completely* implement yourself, you are in danger. That's why
MSDN says that "TerminateThread" is a dangerous API: you should never call it
unless you know exactly what the target thread is doing, which is
exactly the point of this whole thread. You *never* know what the thread is
doing when it is running arbitrary code. (Some say that this service, invoked
by TerminateThread, should never have been exposed to user code.)

Note also that "private" heaps based on the heap manager (ntdll) have the same
issue: the heap manager protects its internal structures with critical
sections when you are allocating/de-allocating, and you don't control the heap
manager, do you?
Killing a thread while the heap manager is inside a critical section will most
likely corrupt the heap and deadlock whenever another thread tries to
allocate/de-allocate from the same heap.

Simple things like statics and global variables, TLS, FLS etc... are
allocated on the heap. Creating a thread (from kernel32) in Windows calls
the heap manager (ntdll) several hundred times to allocate from the
process heap, and dynamic module loading/unloading allocates from the process
heap. kernel32 and ntdll are the first modules loaded by the OS loader when
you create a process; no single Win32 process can live without them.
You aren't going to rewrite all these Win32 DLLs and runtime libraries so
that they use your own heap manager, are you?

You could build your own private heap manager on top of the VM Manager (like
the CLR's memory allocator), but just like the "Heap Manager" you'll have
to protect your internal structures with a CS if you want to handle
allocations from multiple threads. So you are back at square one, and it
won't solve the other possible issues related to TerminateThread either.

Willy.
Jeroen Mostert - 18 Jan 2008 00:51 GMT
> If they can do it with processes, why can't they do it with threads?

It's more a case of "THREADS DON'T WORK THAT WAY!" rather than "can't be done".

A thread's supposed to be lightweight; a simple means of achieving
multiprocessing. If you follow the reliability angle through and add
resource tracking and whatnot you end up with a thread that's basically just
as fat as a process. Threads aren't supposed to be isolated from anything;
that's not their purpose.

What you're looking for actually has less to do with threads and more with
isolating components (which may or may not be using separate threads) from
each other's failures. But here "failure" has to be defined so generally as
to make any form of isolation lower than process level well nigh useless.

> If in my thread I do something like:
>
[quoted text clipped - 6 lines]
> MyList null or just throwing an exception of I try and use it (besides
> assigning it a new value).

But what's the point?

If you are in a position to terminate the thread properly, you're also in a
position to know what resources should be thrown away. So why don't you do
that, instead of demanding that the CLR save your bacon at a considerable
(and in 99% of the cases, unnecessary) overhead?

Now, if you're using someone else's component, you don't know what resources
they're squirreling away, so you could say that's an argument in favor of
CLR tracking. But hang on a moment -- how do you know what threads the
misbehaving component is using, and how do you select the one that's
blocking in a way you don't want it to for termination? If you can dig deep
enough to figure that out, can't you also figure out what resources it's
abusing and dispose of them?

Indefinitely blocking threads are such a huge pain in the a.s because
recognizing when a thread is never going to do something meaningful again is
in theory equivalent to the halting problem and in practice not actually
that much easier. It's like asking the OS for an infinite loop detector. It
could try, but it'd run into unsolvable cases pretty soon.
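
Since a general "is this thread hung?" detector is out of reach, the usual practical stand-in is a deadline chosen by the caller. The sketch below is one hedged way to express that in C#; the `Watchdog` class and `TryRun` method are hypothetical names, and on timeout the worker is merely abandoned, not aborted or cleaned up.

```csharp
using System;
using System.Threading;

static class Watchdog
{
    // Runs 'work' on a background thread and reports whether it finished
    // within 'timeout'. A false result does not prove the worker is hung,
    // only that it has not finished yet; the worker keeps running (or
    // blocking), and whatever it holds stays held.
    public static bool TryRun(Action work, TimeSpan timeout)
    {
        // IsBackground = true so an abandoned worker cannot keep the
        // process alive once the main thread exits.
        var worker = new Thread(() => work()) { IsBackground = true };
        worker.Start();
        return worker.Join(timeout);
    }
}
```

This only moves the decision to the caller: picking the timeout is a guess, and a worker that was merely slow is indistinguishable from one that will never return.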

> Having said all that, I understand the sentiment about writing good code and
> how none of this is necessary.  Unfortunately, that is a 'if the world were
> perfect...' point of view in an imperfect world.

If the world were perfect, the operating system and the runtime would join
hands to ensure that nothing you ever did could cause state corruption, and
every error condition was recoverable. But since that's a theoretical
impossibility, they have to settle somewhere before that. Threads were never
meant to be an aid in this. They're actually more like aggravating factors.

The process is the one edge where they can reasonably isolate the rest of
the system from most of the impact of failure. And even that fails when
processes are cooperating to get something done. Try killing off "csrss.exe"
sometime. If you succeed, it's rebooting time, baby. Your other processes
will be just as doomed.

> In this particular case, I need SSH, which for some reason Microsoft doesn't
> seem to see fit as being a core protocol for C# (or .NET in general).

Hey, they have to give third-party developers *some* chance at a living,
don't they? :-)

> I suggested this on the community sites, and got a 'resolved' and 'won't
> fix' with no reasons supplied.  The only valid reason I can think of is
> because SSH support is in the works, however after much googling I can't
> find any hint about official MS SSH support.  With their big security
> push, and SSH being a cornerstone in network security management, this
> makes absolutely no sense.

Windows has no native (read: Microsoft-supplied) SSH services. That's the
most obvious reason I can think of. .NET heavily focuses on making all of
Windows available through the managed API, but it doesn't go out of its way
to support stuff that isn't ubiquitous on Windows already. And SSH isn't
ubiquitous on Windows -- RDP over VPN is much more common. I say this
without offering judgement on how things are or should be.

> Maybe they are waiting until the security crowd starts beating them with
> a stick and hail it as yet another reason to use Linux. How long would it
> take a few of MS well trained developers to put out a great SSH suite for
>  .NET? Ignoring the bureaucracy, it should only take a few actual weeks
> of development time.

It's not a case of "MS has so many resources, they could do this", because
every developer and his janitor has a feature they clamor for this way ("why
isn't this just in the base classes so I don't have to think about it
anymore?"). It's a big win for the developers, but it has to be a win for
Microsoft too. If there's not enough business incentive for Microsoft to
develop, distribute and support it then they won't do it. Simple as that.

It's weird how in the Unix world everyone cheers when a third-party
developer brings out Yet Another implementation of a well-known protocol,
but how in the Windows world the developers are looking over at Microsoft
expectantly to build everything they need and give it to them. It's true
that Microsoft plays a big role in encouraging this attitude, but still.

> This leaves me with a choice of writing my own implementation or using some
> other library.   My employer is not going to want me to spend several weeks
> to write my own or fix this SharpSSH library.  Personally, I wouldn't mind,
> but really, I have a lot to do.

I just googled ".NET SSH". You don't want to know how many hits I got (and
some of them relevant, even!) What made SharpSSH the monopolist? What about
my suggestion of using an ActiveX control? Is it just a case of not wanting
or being able to spend any money? You get what you pay for...

If you're waiting for MS to turn into a charity and do the things your
company doesn't have the time or money for, then don't forget to pick up a
lottery ticket every day, because you're sure to win in the meantime. Say hi
to your competitors for me.

> Considering we are living in an imperfect world, we should try to be
> accommodating.  Yes, the right thing is to NOT screw something up, but it
> WILL happen.   The proper thing isn't to stand around and talk about how it
> should have been done right, and if it was all your problems would go away.

You're absolutely right. The proper thing is not to stand around and talk
about it but to *do* things right. There has to be a point, somewhere, where
you have to stop talking about general stopgaps and have to get down to
where the actual problem is, because stopgaps only go so far. The OS can't
fix problems with hung threads for you. It already allows you to kill them
off Completely Dead through TerminateThread() if you really think you know
what you're doing. (You probably don't, which is why it's so dangerous.)
That is not fixing the problems, though. And releasing all resources we
somehow deem "belonging" to that thread still isn't fixing the problems.

Tacking on a tracking system for releasing resources is just not a
cost-effective tradeoff. For most applications, the problem will *not* be in
releasing the resources, it's in the fact that whatever they're doing is
going completely wrong. Some applications might just be able to continue
without any problem if the particular action the thread was working on fails
spectacularly, but most will not. They're more likely to grind to a halt. If
you're killing off a thread, you'll probably be killing off your process soon.

> Microsoft's job on this kind of issue is to make life as a programmer as
> easy as possible.  I will grant you that compared to OS X and Linux stuff,
> Microsoft is a rock-star, but in a more absolute sense there is a lot they
> could do much, much better.

I really have to disagree, at least on this particular issue. You're asking
for the impossible. They can give you the Big Red Emergency Button, and it's
already present in the form of .Abort, and if that doesn't work
TerminateThread(). But you want that button to magically keep your
application in serviceable condition as it's killing off an integral part of
it, and that can't be done.

> For example, the current issue, Locked up threads.   Granted a good program
> will never have this problem, but a realistic response outlook would be that
> we have to deal with 'bad' things.    A better approach would be for MS to
> figure out a way to create a thread and provide some kind of emergency
> recovery system.

TerminateThread() *will* get rid of the thread. But the only one who can
"recover" is you. And if the component that failed you is a black box to
you, you're just as sunk as the OS would be.

> You could make it a special kind of thread used to run unsafe stuff and
> the architecture will save you from what is in the thread if worst comes
> to worst.  It would be like a container for uranium.   You have to use
> it, and you hope nothing goes wrong, but if it does, it is contained.

Uranium is easy. That's just radiation. Threads can do *anything*. And most
of the time they're *cooperating* with other threads to get things done.
Good luck automagically containing things.

> Another way (not to drag this rant any longer) to look at this is to look
> back in the days where there was no memory protection for applications.
[quoted text clipped - 3 lines]
> rogue applications." Ignoring the fact that AwsomeApp.exe is the ONLY app
> that does what you need.

See above for the whole "the buck stops somewhere" point. If you want this
protection (and it's indeed a good thing the OS has this), then by all
means, isolate the failing component in a process. The OS can guarantee that
it will at least keep your main process safe from wrongdoings as far as
internal state goes (the failing app might still have corrupted your drive
or something annoying like that, but you stand a good chance).

But that's the thing: that's what *processes* are for. Processes only
started working that way when the OS said they did: before that, processes
could exchange memory directly, as ugly and error-prone as that was. Then
the OS said: "No, stop that -- processes are isolated, and if you want to
cooperate, do it explicitly". But threads are not for isolation and they
never were, they're for integration! They're "lightweight processes", where
"lightweight" means "fast because I do the least amount of work possible to
manage them, they're all yours".

Your argument simply doesn't hold water for threads: it's impossible for
thread X to be "the only thread that does what you need". The thread is just
a way to achieve parallel execution! It's not some sort of isolation box for
computations that aren't under your control. What you want is to isolate
*components*, not threads. Unfortunately, most components can't meaningfully
be isolated, since they have to be able to do anything.

J.

TheSilverHammer - 18 Jan 2008 14:46 GMT
Maybe you are all right that making a safe thread, one that can be killed the
way processes can be to recover resources, isn't possible.   If you are right,
I have no idea why, beyond it being 'logistically' impossible, not actually
impossible.

BTW you can't use TerminateThread() to kill a C# thread (be it from the thread
pool or Thread class) because there is no way to match the Thread ID with the
OS Thread ID.  The documentation also says that a thread created with the
Thread Class might be used for multiple things behind the scenes.

So I have  been putting as much Duct Tape on SharpSSH as I can and hoping to
catch the lockups, which is very hard since I can't reproduce them easily.    
As far as googling SSH for .NET, I am sure you did find quite a few
solutions.   Expensive, commercial solutions.  Maybe large companies do not
have an issue paying for such things, but the smaller ones I work at are very
cheap.   Do you know how long it took me to get them to upgrade just TWO
machines from VS 6.0 to VS 2005?  It was like a 2 year long campaign of
pestering.   Eventually, with Vista on the horizon, I had to invent an
unresolvable problem that forced the issue.   So yeah, there are other C# SSH
solutions.   Really the point wasn't so much about that, but locked threads.

The simple answer with regard to recovering a locked thread is:  You can't.  
Not, "You can't safely".    No, you simply can't.  End of Story.  Game Over.
Thank you for playing.

Clearly the big issue is / was figuring out why a thread was locking.   Even
that was very difficult because the lockup would only occur sometime at night
when no one was around, and in the morning when my App was seized up, even
Dev Studio could not 'break' the App so I could see what was going on with
the threads.   If I did try and 'break' it, Dev Studio would lock up until I
used Task Manager to kill my app, and then Dev studio would say it could not
interrupt the Application.

Whoever suggested I use another thread to close the Shell object, I would
like to thank you.   That works, although it causes a lot of exceptions and
crashes.   At least I have a working point and the thread is no longer seized
up.
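
For anyone finding this later, here is a rough sketch of that "close it from another thread" trick using a plain System.Net.Sockets.Socket instead of SharpSSH (whose internals are not shown in this thread). The class and method names are made up, and the loopback listener only exists to simulate a server that accepts the connection and then never sends anything.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

static class UnblockByClosing
{
    // Returns true if a Receive that was blocked on a silent server was
    // released by closing the socket from another thread.
    public static bool CloseReleasesBlockedReceive()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, port);
        Socket server = listener.AcceptSocket(); // accepts, then stays silent

        bool released = false;
        var worker = new Thread(() =>
        {
            try
            {
                client.Receive(new byte[1]); // blocks: nothing is ever sent
            }
            catch (SocketException) { released = true; }
            catch (ObjectDisposedException) { released = true; }
        });
        worker.Start();

        Thread.Sleep(200);   // give the worker time to block in Receive
        client.Close();      // closing from this thread fails the pending call
        bool finished = worker.Join(5000);

        server.Close();
        listener.Stop();
        return finished && released;
    }
}
```

The blocked call comes back as an exception rather than a clean return, which matches the "lot of exceptions" I am seeing; the worker has to be written to expect that.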
Ben Voigt [C++ MVP] - 18 Jan 2008 16:11 GMT
> Maybe you are all right about making a safe thread that can be killed the
> way
[quoted text clipped - 30 lines]
> Over.
> Thank you for playing.

Ah, well, you asked a slightly different question.

How do you kill a locked thread?  You can't safely.
How do you recover a locked thread in .NET?  You can't, period.

> Clearly the big issue is / was figuring out why a thread was locking.
> Even
[quoted text clipped - 13 lines]
> seized
> up.

You're welcome.  Win32 APIs are designed not to force you into a totally
unrecoverable state.

I suspect if you had used "native-only debugging" you might have had fewer
problems attaching with the debugger.
