.NET Forum / Languages / C# / March 2008
Truncated data using .NET 3.5 sockets VS 2008
|
|
Thread rating:  |
Kenny D - 24 Mar 2008 16:55 GMT I am having trouble sending streams of data from one socket to another using the following code.
When the transmitter is running on the same machine as the receiver, the data transmits perfectly. When the transmitter runs on another machine, the data gets truncated at semi random places.
The data being transmitted is a unicode XML string. There is nothing special about it other than being unicode.
The message stream is terminated with "</mos>\r\n". I read the stream 1 byte at a time in the code below. I have tried various other "block read" methods with the same results.
This used to work when I was using .Net 2.0 under VS 2005. Since upgrading to VS 2008, this no longer works correctly.
Is there anything I am doing wrong based on the code below or is there something I have overlooked vis-a-vis VS 2008? Is there something wrong with my network set up that would cause this - a service pack release or something? I have tested this with the Windows Firewall on and off - makes no difference.
Any help appreciated.
// The code that writes to the socket... public void Write(NetworkStream OutputStream, string Msg) { Msg += "\r\n"; while (Msg.Contains("\r\n\r\n")) Msg = Msg.Replace("\r\n\r\n", "\r\n");
byte[] OutputBuffer = Encoding.BigEndianUnicode.GetBytes(Msg); OutputStream.Write(OutputBuffer, 0, OutputBuffer.GetLength(0)); OutputStream.Flush(); }
// The code that reads the socket... public string Read(NetworkStream InputStream, string EOM) { StreamReader Reader = new StreamReader(InputStream, Encoding.BigEndianUnicode); StringBuilder MsgText = new StringBuilder();
Int32 InChar = -1;
while (true) {
// If the next character == -1, we have been cut off and disconnected or there was an error // in the message being sent to the reader. Either way, stop reading. if ((InChar = Reader.Peek()) == -1) {
if (!MsgText.ToString().EndsWith(EOM)) { string ErrorMessage = string.Format("Error: Incomplete Message\r\n\r\n{0}\r\n[PREMATURE END OF MESSAGE]", MsgText.ToString());
MOSNetTraceSource.TraceData(TraceEventType.Error, ErrorMessage); LogWriter.WriteError(ErrorMessage); }
return (null); }
InChar = Reader.Read();
MsgText.Append((char)InChar);
if (MsgText.Length > 10 && (char)InChar == '\n') { if (MsgText.ToString().EndsWith(EOM)) { break; } }
return(MsgText.ToString()); }
Jon Skeet [C# MVP] - 24 Mar 2008 17:08 GMT > I am having trouble sending streams of data from one socket to another > using the following code. [quoted text clipped - 18 lines] > release or something? I have tested this with the Windows Firewall on > and off - makes no difference. I'd say the first thing to find out is what's actually going on the wire. Try using WireShark (Google for the site) to sniff the data, preferably on both sides.
It's not a hugely efficient way of reading, but I don't think it should actually be causing any issues.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Kenny D - 24 Mar 2008 17:31 GMT Thanks, Jon.
I will check out the utility you mentioned.
I know that one character at a time is inefficient, it's just happened to be the way I did it when I decided to seek help here. Not shown in the code is the blcok-read method which results in the same errors.
I am pulling what's left of my hair out over this. All this stuff worked fine a couple of weeks ago.
Peter Duniho - 24 Mar 2008 18:05 GMT > I am having trouble sending streams of data from one socket to another > using the following code. > > When the transmitter is running on the same machine as the receiver, > the data transmits perfectly. When the transmitter runs on another > machine, the data gets truncated at semi random places. You are using StreamReader.Peek() to decide whether you've reached the end of the stream. Except that, as I read the docs for StreamReader.Peek(), that's not what it's going to tell you. It will return -1 any time that there is no character _currently_ available. It doesn't tell you anything about whether a character _may_ become available in the future.
I recommend changing your code to read bytes, properly checking for the end of the stream by watching for a zero-byte read from the stream. When that happens, you will know that the connection has been closed, indicating the end of the stream, at which point you can decode the bytes you've received all at once.
I admit, I don't use StreamReader.Peek() and thus don't have first-hand experience with respect to what it really does. But the documentation certainly implies to me that you're using it wrong. And if that implication is correct, it would absolutely explain the behavior you're seeing.
Pete
Jon Skeet [C# MVP] - 24 Mar 2008 18:20 GMT <snip>
> I admit, I don't use StreamReader.Peek() and thus don't have first-hand > experience with respect to what it really does. Ditto.
> But the documentation > certainly implies to me that you're using it wrong. And if that > implication is correct, it would absolutely explain the behavior you're > seeing. The documentation seems pretty unclear to me. I'd read it as a blocking call, but on a reread I think you're probably right. The docs should definitely be improved...
I'd keep using ReadLine until that returned null, or Read/BeginRead until they returned 0.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Peter Duniho - 24 Mar 2008 18:29 GMT > [...] > The documentation seems pretty unclear to me. I'd read it as a blocking > call, but on a reread I think you're probably right. The docs should > definitely be improved... I've never seen a method/function named "peek" that was blocking. :)
> I'd keep using ReadLine until that returned null, or Read/BeginRead > until they returned 0. Right...in my reply I recommended the latter, but actually it seems to me that StreamReader should work correctly doing the former, and it's probably more in keeping with the original code to do it that way. So, I'll switch my recommendation to agree with yours. :)
Pete
Kenny D - 24 Mar 2008 19:55 GMT Read(), as I have used it in the original example is a blocking call. Peek is a blocking call as well.
I am no network expert and this code worked like a charm before I upgraded to VS2008 and .NET 3.5. I am not emphatically stating that they are the culprits but the coincidence is questionable at least.
The problem with using the... BytesRead = InputStream.Read(Buffer, 0, ReadBufferSize); ... method is that the results are the same - truncated data.
Additionally, I used WireShark (WS) and it indicates a bad CRC on some of the packets. The interesting thing is that it looks like the data is getting to the target machine ok. I can look at the data in WireShark and see all the proper tags in the XML, including the message end tag. I don't see any strange data embedded in the XML in WS.
Any other ideas? All help appreciated.
Thanks, so far.
Peter Duniho - 24 Mar 2008 20:26 GMT > Read(), as I have used it in the original example is a blocking call. > Peek is a blocking call as well. Why do you say that? Given that the docs for StreamReader.Peek() specifically say "The returned value is -1 if no more characters are currently available", under what condition would Peek() block?
I'm not saying you must be wrong, but I don't see anything in the docs that suggest it's a blocking call, nor is it clear that it _could_ be a blocking call. It would also be contrary to the usual semantics of "peek" methods for it to be a blocking call.
> I am no network expert and this code worked like a charm before I > upgraded to VS2008 and .NET 3.5. I am not emphatically stating that > they are the culprits but the coincidence is questionable at least. It is not uncommon at all for buggy code to work under certain circumstances, only to have the bugs exposed later on when someone external changes.
For example, with TCP there is the common misunderstanding among beginners that they can send what they perceive to be a message at one end, and have that show up as a complete message at the other end. In fact, with TCP the only guarantee is that the bytes will be received in the same order as they are sent, and things that are "messages" may be grouped differently when received, requiring the receiver to figure out where the "messages" actually start and end.
The beginner will write code that makes this invalid assumption, and it will work just fine on a LAN, and sometimes even over the Internet. But occasionally, for any of a variety of reasons, the network interface will do the perfectly reasonable and allowable act of breaking the data up in some other way than the messages that were originally sent. At that point, the buggy code will fail.
It is _extremely_ unlikely that any of the classes you're using are broken. They are widely used, and the behavior you're seeing would be disastrous if it was actually being caused by .NET. All sorts of applications would be breaking left and right, causing a wide variety of discussion in the .NET community that would easily be found with a search on Google or the MSDN web site.
> The problem with using the... > BytesRead = InputStream.Read(Buffer, 0, ReadBufferSize); > ... method is that the results are the same - truncated data. How are you using that? Showing just that one statement doesn't prove that you're using the method correctly. If you are still getting the same problem using that statement, then all that says to me is that it's likely to not be used correctly.
> Additionally, I used WireShark (WS) and it indicates a bad CRC on some > of the packets. The interesting thing is that it looks like the data > is getting to the target machine ok. I can look at the data in > WireShark and see all the proper tags in the XML, including the > message end tag. I don't see any strange data embedded in the XML in > WS. Well, TCP does do error detection and correction. In fact, this is one of those situations that I was talking about that could cause the re-grouping of data that has been sent. And if your code has a bug in it that's sensitive to that regrouping, it would fail when an error occurred, even though TCP is perfectly capable of detecting and correcting the error.
This is, in fact, consistent with what you've seen using Wireshark: it shows you that errors are happening, but it also shows you that the data ultimate received is fine. That's what it looks like when errors are detected and corrected.
> Any other ideas? All help appreciated. IMHO, the original suggestions are still your best bet.
What happens if you replace your receiving code with this? :
public string Read(NetworkStream InputStream, string EOM) { StreamReader Reader = new StreamReader(InputStream, Encoding.BigEndianUnicode); StringBuilder MsgText = new StringBuilder(); string strLine; bool fTerminated = false;
while ((strLine = Reader.ReadLine()) != null) { MsgText.Append(strLine); if (strLine.EndsWith(EOM)) { fTerminated = true; break; } }
if (MsgText.Length > 0 && !fTerminated) { string ErrorMessage = string.Format("Error: Incomplete Message\r\n\r\n{0}\r\n[PREMATURE END OF MESSAGE]", MsgText.ToString());
MOSNetTraceSource.TraceData(TraceEventType.Error, ErrorMessage); LogWriter.WriteError(ErrorMessage);
return null; }
return(MsgText.ToString()); }
This version will only return null if the input was unexpectedly terminated. An empty string will be returned if the input was terminated in a graceful way (basically, the previous message was correctly terminated, and so when it tries to read a new message, it finds the stream is already at the end).
Also, keep in mind that the EndsWith() comparison is culture-sensitive. For networking, you usually don't want each end to be dependent on culture, and so you probably really want to use the overload for EndsWidth() that allows you to specify a specific or invariant culture.
Pete
Jon Skeet [C# MVP] - 24 Mar 2008 20:28 GMT > Read(), as I have used it in the original example is a blocking call. > Peek is a blocking call as well. Do you have concrete evidence of this? A look in Reflector suggests that it's not really blocking - that it will return -1 if the last call to fill the internal buffer didn't read as many bytes as it asked for.
This sort of question occurs every so often - I think I'm going to write a stream class which one can "program" to behave in particular ways (e.g. waiting a while before returning data, returning a certain amount of data at a time etc). That probably won't happen tonight though.
> I am no network expert and this code worked like a charm before I > upgraded to VS2008 and .NET 3.5. I am not emphatically stating that > they are the culprits but the coincidence is questionable at least. It's quite possible that undocumented behaviour has changed - that you were relying on something that you shouldn't have been, but that has changed. Of course, it could be a real bug - or a change somewhere else.
Sorry, that isn't terribly helpful really, is it?
> The problem with using the... > BytesRead = InputStream.Read(Buffer, 0, ReadBufferSize); > ... method is that the results are the same - truncated data. Could you post that code as well, so we can check the logic?
> Additionally, I used WireShark (WS) and it indicates a bad CRC on some > of the packets. The interesting thing is that it looks like the data [quoted text clipped - 4 lines] > > Any other ideas? All help appreciated. It would be helpful to write a pair of really simple console apps to demonstrate it - one which just listened for a connection and then spat out the data, and one which connected and then read it.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Kenny D - 24 Mar 2008 20:45 GMT Blocking Peek(): I can place a break point after the call to peek. The break point never gets hit until data is in the pipe waiting to be read.
I will code up the console applications. They won't be ready for a couple of hours as I have to step out. This will help out a lot because I have not had anyone else try this on their machines.
Please keep an eye on this thread. I really need to get to the bottom of this.
Thanks,
K
Code for the other method...
Int32 BytesRead = 0; do { byte[] Buffer = new byte[ReadBufferSize]; BytesRead = InputStream.Read(Buffer, 0, ReadBufferSize);
if (BytesRead > 0) { MsgText.Append(Encoding.BigEndianUnicode.GetString(Buffer)); }
string Text = MsgText.ToString(); Int32 Pos = Text.IndexOf(EOM);
if (Pos > 0) { Pos += EOM.Length; MsgText.Length = Pos;
break; }
} while (InputStream.DataAvailable);
Kenny D - 24 Mar 2008 20:48 GMT Also, the other code snippet was thrown together hastily just to test for the truncation. It has not been thoroughly tested. Neither is it very elegant.
Jon Skeet [C# MVP] - 24 Mar 2008 21:06 GMT > Blocking Peek(): I can place a break point after the call to peek. > The break point never gets hit until data is in the pipe waiting to be > read. It may wait for the *first* time, but possibly not afterwards. I'll try to prove it either way though.
> I will code up the console applications. They won't be ready for a > couple of hours as I have to step out. This will help out a lot > because I have not had anyone else try this on their machines. Happy to help :)
> Please keep an eye on this thread. I really need to get to the bottom > of this. <snip>
> } while (InputStream.DataAvailable); That one's a problem to start with - it will only keep going while there is data available *now*, rather than until the stream is closed at the server end.
You should really just read until the call to Read returns 0.
I assume the server *is* closing its end of the stream, by the way?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Peter Duniho - 24 Mar 2008 21:07 GMT > Blocking Peek(): I can place a break point after the call to peek. > The break point never gets hit until data is in the pipe waiting to be > read. The next statement after the call to Peek() only executes when Peek() returns something other than -1. Are you sure that it's not just that Peek() keeps returning -1?
And I reiterate: given that the docs say that Peek() will return -1 if no data is available, under what condition would Peek() block? Conversely, feel free to describe a condition under which Peek() would return -1.
> I will code up the console applications. They won't be ready for a > couple of hours as I have to step out. This will help out a lot > because I have not had anyone else try this on their machines. > > Please keep an eye on this thread. I really need to get to the bottom > of this. For better or worse, your question is likely to get exactly the same attention anyone else's does. If you have an urgent situation, you may want to consider paid developer support from Microsoft. You'll find that many of us are reasonably responsive, but there's not likely to be any point in expressing your urgency. And at worst, it implies that we are are your beck and call, which is surely not the implication you intended.
> Thanks, You're welcome. :)
> K > > Code for the other method... I see at least four problems in that code, at least three of which I believe are significant.
Specifically:
> Int32 BytesRead = 0; > do [quoted text clipped - 6 lines] > MsgText.Append(Encoding.BigEndianUnicode.GetString(Buffer)); > } At the very least, you have a problem any time that a Unicode character gets split in two during network transmission. You also aren't taking into account the actual count of BytesRead when you convert the buffer to a string, but since the buffer is zero-filled and you allocate it anew each time (inefficient, but workable) I think this isn't as likely to cause a problem.
> string Text = MsgText.ToString(); > Int32 Pos = Text.IndexOf(EOM); [quoted text clipped - 6 lines] > break; > } What happens when you receive more than one message in a single read? You're just discarding whatever text was read after the current message's EOM.
> } while (InputStream.DataAvailable); Like Peek(), the DataAvailable property doesn't tell you anything about the future condition of the stream. So if you stop here, then as soon as the stream gets temporarily interrupted you incorrectly will detect that as the end of the message. Of course, if the actual EOM text hasn't arrived yet, your code breaks.
Pete
Peter Duniho - 25 Mar 2008 04:19 GMT > Blocking Peek(): I can place a break point after the call to peek. > The break point never gets hit until data is in the pipe waiting to be > read. [...] It turns out, Peek() does block, sort of. I wrote a little Stream implementation (attached below) that's just a wrapper for an actual Stream, but which introduces network-like behavior: you can configure it so that it returns something less than what was asked for, and so that the completion of the operation is delayed by some amount (with these optionally being randomized).
I used this implementation to empirically look at what StreamReader does. In fact, I was able to see that the first time you call Peek(), it actually reads what it can. If it can put anything into its internal buffer, Peek() will return that data. It will continue to do so until it exhausts its internal buffer.
Here's the very important part: once it's exhausted its internal buffer, Peek() DOES NOT attempt to read again. It will return -1, even if a subsequent attempt to read would succeed. In fact by adjusting my test code slightly I showed that you can, after receiving -1 from Peek(), call Read() and get valid data. You can also, after receiving -1 from Peek(), call Read() and block until more data is available. And of course finally you can, after receiving -1 from Peek(), call Read() and receive -1 as the return value, indicating the end of the stream.
In other words, Peek() doesn't really tell you very much. For network i/o, it is practically never a good idea to try to look ahead at the data. You should receive the data you're ready to deal with, and not try to look at any data beyond that. StreamReader exposes Peek() for some reason, and while I'm of the opinion that it's of limited utility even with non-network streams, it's obvious to me that it inherits much of the same problematic issues that network streams have with looking ahead generally.
If you are going to use Peek(), you must take into account its exact behavior, and based on my experiment it's clear from looking at the code you posted that you're not doing so.
Personally, I would avoid using Peek() altogether. It's a poor technique even when it works properly, and in this case the only way to get it to work right is to add enough code that the call to Peek() becomes irrelevant anyway. With a network stream, if Peek() returns -1, the only safe thing to do at that point is call Read() and find out what happens. Given that, you might as well just call Read() in the first place and forget about Peek().
I will recommend once again that you try using the code I posted earlier. I suspect that it will fix the problem you're dealing with, and as an added benefit it's a lot simpler than what you've posted so far (in your original message as well as later).
Pete
p.s. Another interesting thing I noticed with my test code is that the StreamReader tries to read from the stream any time you read from the StreamReader, even if it's already reached the end of the stream and buffered all of the remaining data. I still haven't figured out if this is a bug or a feature. Seems like the only time it'd matter is when the stream lies to the reader about being at the end.
Anyway, here's my test code (with Program class to execute it) (my apologies if there's a minor compilation error; my actual project is two different files, and I combined them while composing this post, so the exact posted code hasn't actually been through a compiler):
using System; using System.Collections.Generic; using System.Text; using System.IO; using System.Threading;
namespace TestPseudoNetStream { class Program { static void Main(string[] args) { string strData = "Now is the time for all good men to come to the aid of their country.\r\n" + "The quick brown fox jumped over the lazy dog's back.\r\n" + "We are now at the end of our data."; MemoryStream streamMemory = new MemoryStream(Encoding.Unicode.GetBytes(strData)); PseudoNetStream streamNet = new PseudoNetStream(streamMemory, 0.2f, new TimeSpan(), new TimeSpan(0, 0, 3)); StreamReader reader = new StreamReader(streamNet,Encoding.Unicode); string strLine;
Console.WriteLine("Using ReadLine():"); while ((strLine = reader.ReadLine()) != null) { Console.WriteLine(strLine); }
streamNet.Position = 0; reader.DiscardBufferedData();
StringBuilder sb = new StringBuilder();
Console.WriteLine("Using Peek():"); while (true) { int ch = reader.Peek();
if (ch == -1) { // Uncomment the next line to see how reading after // Peek() fails causes more data to be read // ch = reader.Read();
if (ch == -1) { Console.WriteLine("Peek() returned -1"); break; }
sb.Append((char)ch); }
sb.Append((char)reader.Read()); }
Console.WriteLine(sb.ToString());
Console.WriteLine("done!"); Console.ReadLine(); } }
class PseudoNetStream : Stream { private Stream _stream; private float _fractionIo; private bool _fSleep; private TimeSpan _tsDelayMin; private TimeSpan _tsLengthMax; private Random _rnd = new Random();
public PseudoNetStream(Stream streamBase, float fractionIo, TimeSpan tsDelayMin, TimeSpan tsDelayMax) { if (streamBase == null) { throw new ArgumentNullException("streamBase must be non-null"); }
if (fractionIo < 0 || fractionIo > 1.0f) { throw new ArgumentOutOfRangeException("fractionIo must be between 0 and 1, inclusive"); }
_stream = streamBase; _fractionIo = fractionIo; _tsDelayMin = tsDelayMin; _tsLengthMax = tsDelayMax - tsDelayMin;
_fSleep = _tsDelayMin.Ticks > 0 || _tsLengthMax.Ticks > 0;
try { Console.WriteLine("input stream has " + streamBase.Length.ToString() + " bytes."); } catch (NotSupportedException) { } }
public override bool CanRead { get { return _stream.CanRead; } }
public override bool CanSeek { get { return _stream.CanSeek; } }
public override bool CanWrite { get { return _stream.CanWrite; } }
public override void Flush() { _stream.Flush(); }
public override int Read(byte[] buffer, int offset, int count) { int cbRead;
if (_fractionIo > 0) { cbRead = Math.Max(1, (int)(count * _fractionIo)); } else { cbRead = Math.Max(1, (int)(_rnd.NextDouble() * count)); }
if (_fSleep) { TimeSpan tsDelay = _tsDelayMin + new TimeSpan((long)(_rnd.NextDouble() * _tsLengthMax.Ticks));
Console.WriteLine("--> Delaying read of " + cbRead.ToString() + " bytes (originally " + count.ToString() + " bytes), by " + tsDelay.ToString() + " <--");
Thread.Sleep(tsDelay); }
cbRead = _stream.Read(buffer, offset, cbRead);
Console.WriteLine("--> actual read length: " + cbRead.ToString() + " bytes <--");
return cbRead; }
public override void Write(byte[] buffer, int offset, int count) { _stream.Write(buffer, offset, count); }
public override long Position { get { return _stream.Position; } set { _stream.Position = value; } }
public override long Seek(long offset, SeekOrigin origin) { return _stream.Seek(offset, origin); }
public override long Length { get { return _stream.Length; } }
public override void SetLength(long value) { _stream.SetLength(value); } } }
Kenny D - 25 Mar 2008 16:54 GMT Pete,
Thanks for the reply and, no, it is not my intention to have everyone on the internet at my beck and call. You guys have been very helpful and I appreciate it.
As to the questions:
Peek will return -1 if the transmitting application terminates the connection for whatever reason. So will Read(); If the connection gets broken, I want the code on the receiver to clean itself up, dispose of the socket and associated resources and wait for the transmitter to open another connection. This all worked at one point.
The messaging protocol we are using calls for an acknowledgement to be sent by the receiver before the transmitter is supposed to send another message. So, in effect, the business rules call for only one message at a time to be "in the pipe." There is a timeout on the transmitting side that waits a configurable amount of time to receive the ACK message. If the ACK is not received in that time, it will disconnect, open a new connection and try again.
It is my desire to hold the connection open for the length of the communication between transmitter and receiver. If I close the NetworkStream on the transmitter after sending the message, doesn't that close the socket and terminate the connection? If so, this is not the behavior I desire.
Before I finish coding the console applications, I am going to give Pete's suggestions a try. I have also removed the call to Peek() and replaced it with Read() and a couple of minor logic changes to make sure that I get the behavior I am looking for. Once I can determine the cause of the truncation, I will fix up the code for efficiency.
Thanks again. I will keep you posted.
Everyone's hep is VERY MUCH APPRECIATED.
Cheers,
K
Kenny D - 25 Mar 2008 17:53 GMT Gentlemen,
THANK YOU, THANK YOU, THANK YOU!!!
Did I say THANK YOU?
It was the call to Peek() that was causing the problem! I replaced the original call to Peek() with a call to Read() and a minor change to the logic to just append the character read to the string.
This was a real PITA as I don't think it's clear from the documentation that Peek() behaves as you all described it. Add to this that I have used this code in .NET 2.0 and it worked perfectly on every system I tested on.
Both my original code and Pete's code work fine now. Pete's code is, of course, more elegent than my one character at a time method, so that is what I will stick with. Thanks, Pete!
The only changes I had to make to Pete's code to get it to work correctly was to place the "\r\n" back at the end of each string read in. This is an interesting problem because while the communication protocol does not require CRLF pairs be after closing XML tags, the system I must interface to includes them, and recommends that our system support it. It does make for XML that is easier to read and debug.
My goal is to preserve whatever formatting was present in the original message. Our system will send nicely formatted, human readable XML in case a human has to look at it.
Jon, I have been to your web site and retrieved several articles from there for my teammates to read.
Pete thanks for taking the time to rewrite the code!
Thanks, guys! This is much appreciated! It's great to know that there are resources such as you folks on the internet!
Cheers,
K
Peter Duniho - 25 Mar 2008 18:36 GMT > [...] > Both my original code and Pete's code work fine now. Pete's code is, > of course, more elegent than my one character at a time method, so > that is what I will stick with. Thanks, Pete! Glad we could help. And yes, as I noted in my other reply, I agree that the docs could be more clear, and that I think the actual behavior of Peek() isn't very intuitive anyway (making it that much more important that it be documented better).
That said, as I think I mentioned, the whole "look ahead" technique is generally a bad idea anyway. It's just not a reliable way to deal with network i/o, because network i/o is all about changing futures. It doesn't buy you much, if anything at all, to know what things are like "now", because they can easily change between "now" and "now plus a tiny fraction of a moment". And it can lead to genuine problems.
It's unfortunate that Peek() didn't behavior in a more consistent way, either never reading or always reading. Had it been more consistent, you would have run into this issue earlier and it probably would have been more obvious. :) But at least now you know: don't try to peek. Santa doesn't like it, and neither does network i/o. :)
> The only changes I had to make to Pete's code to get it to work > correctly was to place the "\r\n" back at the end of each string read [quoted text clipped - 3 lines] > message. Our system will send nicely formatted, human readable XML in > case a human has to look at it. I think that the goal of preserving formatting is a good one. However, given that that's a goal, I think you might be better off using StreamReader(char[], int, int). It doesn't sound like you really need line-by-line access to the data; the main benefit of the StreamReader for your code is the text encoding. So just read the characters in chunks until you reach the end of the stream.
That way, since you never wind up stripping out the line-breaks in the first place, you don't have to replace them later. It also ensures that the line breaks are the original ones, rather than your explicitly stated "\r\n" breaks (it's theoretically possible for "\n" to be used alone, for example).
Pete
Kenny D - 25 Mar 2008 19:29 GMT >I admit, I'm not really clear on why you desire that behavior. If all >you're going to do is create another NetworkStream to start reading from >the Socket again, why not just reuse the same NetworkStream rather than >closing it? Actually, I am not creating a new NetworkStream to read from the socket again. The messaging works roughly like this:
====================================
start:
Transmitter opens a connection to the receiver.
while(true) { Transmitter sends an XML message to the receiver and waits for an ACK message for "nn" seconds. if (ACK received before timeout) continue else // connection broken break; }
goto start;
====================================
In the case of successful transmissions and successful ACK messages, neither the NS or the socket on either side is closed. The connection remains a two-way communication over one socket. Only in the case of a failure (for whatever reason) is the connection dropped and opened again.
>I think that the goal of preserving formatting is a good one. However, >given that that's a goal, I think you might be better off using >StreamReader(char[], int, int). It doesn't sound like you really need >line-by-line access to the data; the main benefit of the StreamReader for >your code is the text encoding. So just read the characters in chunks >until you reach the end of the stream. Great point. I do not need line by line access and I am aware of the "\n" possibility. In fact, when first ingesting data from the present system, that was the first thing I checked for when looking at the raw data coming in.
In order to reach the end of the stream on the receiver, must I close the NetworkStream on the transmitting side (without closing the socket)? I am a little hazy on that.
Thanks again,
K
Peter Duniho - 25 Mar 2008 19:55 GMT > [...] > In the case of successful transmissions and successful ACK messages, > neither the NS or the socket on either side is closed. The connection > remains a two-way communication over one socket. Only in the case of > a failure (for whatever reason) is the connection dropped and opened > again. Well, if you're actually closing the connection, you're going to need a new NetworkStream.
I admit, I'm a bit confused as to what the question actually is. Your previous message seemed to say that you want to be able to close the NetworkStream without closing the Socket (and thus the connection). Is that actually what you want to be able to do? If so, can you explain more precisely under what circumstances you'd want to do that and why? And is that related to this next quoted question?
> [...] > In order to reach the end of the stream on the receiver, must I close > the NetworkStream on the transmitting side (without closing the > socket)? I am a little hazy on that. The "end of the stream" occurs when the underlying network connection (i.e. the Socket) is closed. Closing the NetworkStream without closing the Socket itself will have no effect on the network connection. Only when the Socket itself is shutdown and closed will the receiver be able to detect an "end of stream" condition.
Pete
Kenny D - 25 Mar 2008 21:44 GMT Ok, I think we got mixed up in the nature of brief written communication.
The only time I want to reestablish the connection is if the network fails for some reason. At that time, the transmitter will get an error either when attempting to transmit a message or when listening on the same socket for an acknowledgement. That's it. At no time will either side purposefully disconnect. The NetworkStream and Socket would continue to be used until some error caused us to reconnect.
The listener, upon a network error would dispose of the socket and all associated resources and wait for another connection attempt.
In the above case, reading to the end of the stream then becomes problematic as (according to your last post) there would be no end of stream per se. I would only read until the end-of-message tag ("</mos> \r\n").
Make sense now?
If there is something inherently wrong with this, please feel free to enlighten me. The help is much appreciated.
Thanks again!
Cheers,
K
Peter Duniho - 25 Mar 2008 21:54 GMT > Ok, I think we got mixed up in the nature of brief written > communication. [quoted text clipped - 3 lines] > error either when attempting to transmit a message or when listening > on the same socket for an acknowledgement. Important caveat: the transmitter may not in fact get an error when waiting for an acknowledgement.
With TCP, the general rule is: an error is only detected when an attempt to send data fails. Inasmuch as receiving data occasionally involves sending something at a lower level in the network stack that needs acknowledgement, there's a remote chance of detecting failure when simply receiving. But typically, you'll only notice an error when trying to send.
For most applications, this is not a problem. They implement a timeout, or they don't care. I gather that you've implemented a timeout, and that when the timeout occurs, you'll try to resend, at which point you will detect an error in the connection. So I believe, based on my interpretation of what you've written, that this caveat is not a problem in your case.
But it's important to understand, to avoid problems in other scenarios where it might apply.
> That's it. At no time > will either side purposefully disconnect. The NetworkStream and [quoted text clipped - 8 lines] > stream per se. I would only read until the end-of-message tag ("</mos> > \r\n"). I agree, it's possible that your receiver may never detect the broken connection and thus receive the "end of stream". It all depends on how and exactly when the connection is broken. If this is important, you will probably need to include some logic that detects a reconnection attempt and cleans up the old one when that happens.
Pete
Peter Duniho - 25 Mar 2008 18:12 GMT > As to the questions: > > Peek will return -1 if the transmitting application terminates the > connection for whatever reason. Except that's not actually true. Peek() will not return -1 if the StreamReader still has data in its own buffer. Furthermore (and more importantly for your situation) it _will_ return -1 if you have exhausted its own buffer but not tried to read any more data from the reader (e.g. called Read() _after_ receiving -1 from Peek()).
In other words, using the code you posted, the only time that the StreamReader will actually attempt to read data from the underlying stream is the first time you call Peek(). If it's able to get the entire message in that one read, everything's fine. But if it isn't able to (and that's a perfectly reasonable thing to happen when using TCP), then you will incorrectly detect the message as truncated, because you don't do anything that would cause the StreamReader to try to read again from the stream.
I think it's reasonably valid to argue that this is actually defective behavior from Peek(). I think it's inconsistent that it _does_ read from the stream if the StreamReader hasn't read anything yet, but that it does _not_ read from the stream once the StreamReader is out of data in its internal buffer. I think when the StreamReader's internal buffer has no more unread data, Peek() ought to never read, or it should always read. For it to do one some times and the other other times is not a good design.
But you're stuck with that behavior, and it's not even technically inconsistent with what the docs say (given that they are relatively vague on the precise behavior you should expect). You need to code to what Peek() actually does.
> [...] > It is my desire to hold the connection open for the length of the > communication between transmitter and receiver. If I close the > NetworkStream on the transmitter after sending the message, doesn't > that close the socket and terminate the connection? If so, this is > not the behavior I desire. You can disable the default behavior by creating the NetworkStream with the constructor overload that takes a boolean indicating whether the NetworkStream takes ownership of the Socket instance. If you pass "false" for that parameter, then the NetworkStream will not close the Socket when you call Close() on the NetworkStream.
I admit, I'm not really clear on why you desire that behavior. If all you're going to do is create another NetworkStream to start reading from the Socket again, why not just reuse the same NetworkStream rather than closing it?
But if you really need to be able to close the NetworkStream without closing the Socket, it can be done.
> Before I finish coding the console applications, I am going to give > Pete's suggestions a try. I have also removed the call to Peek() and > replaced it with Read() and a couple of minor logic changes to make > sure that I get the behavior I am looking for. Once I can determine > the cause of the truncation, I will fix up the code for efficiency. I would expect that replacing Peek() with Read() should fix the issue. As Jon indicated much earlier, it's not necessarily the most efficient way to read the data, but at least you're not reading from the Socket one byte at a time. I'd guess the performance overhead reading the data from the StreamReader, which is buffering the network i/o for you, would not be too terrible.
Pete
Jeroen Mostert - 24 Mar 2008 21:53 GMT >> Read(), as I have used it in the original example is a blocking call. >> Peek is a blocking call as well. [quoted text clipped - 8 lines] > amount of data at a time etc). That probably won't happen tonight > though. There's not as much use for that as you'd expect. The network stack in particular will make sure at some point that the conditions your stream is trying to guarantee can't be met, so you'll either need a way to communicate how many bytes were actually read (and we already have that) or it's exception-throwing time (and that's really only useful if you've got a strict-length protocol).
It is true that NetworkStream's behavior tends to be rather unintuitive, especially when you're used to other streams. It's why I usually don't bother with the abstraction it tries to provide and use Socket directly (with my own higher-level abstraction on top of it where relevant).
I think most of the problems could be solved if NetworkStream's behavior was tightened up and documented better. At present it feels like you have to know both how sockets work and how NetworkStream abstracts from them, and that really doesn't make it much more convenient. NetworkStream wraps a Socket for if you absolutely, positively must pass a Stream instance to someone.
 Signature J.
Jon Skeet [C# MVP] - 24 Mar 2008 22:46 GMT > > This sort of question occurs every so often - I think I'm going to > > write a stream class which one can "program" to behave in particular [quoted text clipped - 6 lines] > trying to guarantee can't be met, so you'll either need a way to communicate > how many bytes were actually read (and we already have that) We already have that in Read - it's a case of creating a MemoryStream-like type which lets you state how the stream should respond in various cases.
I should stress that this class wouldn't be useful as a *production* class - it would be a class to help *test* code. Basically a stub, or at least somewhere between a stub and a mock.
> or it's exception-throwing time (and that's really only useful if > you've got a strict-length protocol). Well, it's handy to be able to check whether a program behaves correctly in the face of IO errors.
> It is true that NetworkStream's behavior tends to be rather > unintuitive, especially when you're used to other streams. It's why I [quoted text clipped - 8 lines] > convenient. NetworkStream wraps a Socket for if you absolutely, > positively must pass a Stream instance to someone. I don't know - I'm quite happy with its behaviour for most of the time. I always assume that *any* stream may return less bytes than have been requested, even if there's more data on its way.
What other problems do you have with NetworkStream?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk
Jeroen Mostert - 24 Mar 2008 23:39 GMT >> It is true that NetworkStream's behavior tends to be rather >> unintuitive, especially when you're used to other streams. It's why I [quoted text clipped - 12 lines] > I always assume that *any* stream may return less bytes than have been > requested, even if there's more data on its way. And this is indeed the proper way to use any stream, but unfortunately there's plenty of code out there assuming that all the world's basically a file and that pulling in some more data will always work.
> What other problems do you have with NetworkStream? OK, let me rephrase it slightly -- NetworkStream isn't a problem if you know it's a NetworkStream. When passing it to clients who use a Stream, it can be quite a problem not to know when NetworkStream is going to block (possibly indefinitely, until you close the socket, at which point you'll get an IOException and anything goes).
Wrapping a NetworkStream in other streams, buffers or readers (which some clients will do) is almost sure to cause even more unexpected blocking, because a lot of those wrappers don't even have proper asynchronous operations. The blocking wouldn't be such a problem if it didn't have the potential to be *indefinite* -- if you're expecting this, it's fine, but a lot of clients are not expecting this and it's hard to make them work well with this scenario.
Then there's timeouts. NetworkStream doesn't document that setting the read or write timeouts and getting an exception on timeout means that the stream is now useless and should be closed (as should the underlying socket, regardless of whether the stream owns the socket). This is documented in setsockopt(), but you'd have to know that that's being used under the covers. True, you ideally only get bitten by this once, but it's a pretty leaky abstraction.
Now, most of this isn't actually NetworkStream's fault, but it does mean you have to take care when passing a NetworkStream around. For streams with a clear beginning and end (like HTTP traffic) it works out quite well; for other streams, not so much.
 Signature J.
Lasse Vågsæther Karlsen - 25 Mar 2008 00:05 GMT >>> It is true that NetworkStream's behavior tends to be rather >>> unintuitive, especially when you're used to other streams. It's why I [quoted text clipped - 32 lines] > this, it's fine, but a lot of clients are not expecting this and it's > hard to make them work well with this scenario. I think this is what Joel Spolsky would call a leaky abstraction.
You can abstract away all the details and try to hide the fact that you're talking to something that may, or may not, get more data, at some uncertain point in the future, but at some point the code that talks to this abstracted class needs to know what can happen.
If everything's a stream, there has to be some common low-level ground that they can all share, and if reading from the stream always blocks, then it always blocks, otherwise it always completes as soon as possible.
If what happens depends on the underlying class, it's not really all that abstract and the calling code would definitely have to been written to handle this case.
> Then there's timeouts. NetworkStream doesn't document that setting the > read or write timeouts and getting an exception on timeout means that [quoted text clipped - 3 lines] > used under the covers. True, you ideally only get bitten by this once, > but it's a pretty leaky abstraction. And here you say that yourself. Teaches me to read the whole post I guess.
Well, now that I put so much work and energy into writing it, I'll just go ahead and post it anyway :)
But just to reiterate my point, I don't think it's possible to treat all streams as streams, as the documentation for how a "Stream" should behave is lose enough that most, if not anything, goes, and thus the code using the stream must probably be aware of what kind of stream it is, or rather, limit itself to handling streams it know how to handle.
Or... you could build a layer that behaves like a file stream, warts and all, around the networked stream, since most code that deals with streams assumes there is a file involved.
 Signature Lasse Vågsæther Karlsen mailto:lasse@vkarlsen.no http://presentationmode.blogspot.com/ PGP KeyID: 0xBCDEA2E3
Jeroen Mostert - 25 Mar 2008 00:37 GMT > Or... you could build a layer that behaves like a file stream, warts and > all, around the networked stream, since most code that deals with > streams assumes there is a file involved. But that's just the point -- you can't do that. A NetworkStream just *isn't* a file and all the wrapping in the world isn't going to make it one. You can't magic away things like indefinite blocking and partial reads. Clients just have to handle this properly, or else you have to store *all* the data in an actual file before processing it (or less conservatively a MemoryStream). Anything inbetween won't solve the problem.
Now, if all clients treated a Stream as if it *could* be a NetworkStream, things would have a better chance of working... since file streams can certainly be seen as extraordinarily predictable, finite network streams. Unfortunately a lot of them aren't as conservative as they should/could be.
 Signature J.
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|