
.NET Forum / Languages / C# / July 2007


Reading TCP data stream and finding an End of line

Tom - 20 Jul 2007 19:24 GMT
This may be more theory than code,

I am currently using a 3rd party TCP client socket tool to read in
data from a connection.  The tool supports an End of Line (EOL)
character.  The problem is that the data that is coming in does not
have an EOL character or set of characters.  The first 4 bytes of the
data contain a set of defined characters "ABCD".  The next 4 bytes
give the size of the whole packet, "0012". After that, there is a
random length of data.

Of course, the TCP tool's DataIn function fires for partial messages,
which I have to assemble in a byte[] buffer, while waiting for the
rest of the message.  Well I only have the length of the message to go
on.  If I use the starting characters "ABCD" as the EOL, the last
message in will not get processed until another arrives.  This is to be
expected.  I could count the number of characters until the message
length, copy out the message, but would then have to shift the entire
buffer to the left in case part of another message was in the previous
transmission.  Or is there a natural break so that one event can not
have two separate partial TCP packets?  I know that this is probably a
3rd party question.

What is a good way to deal with something like this?

Thanks,
Tom
Peter Duniho - 20 Jul 2007 20:07 GMT
> [...]
> Of course, the TCP tools DataIn function fires for partial messages,
> which I have to assemble in a byte[] buffer, while waiting for the
> rest of the message.  Well I only have the length of the message to go
> on.

Then you need to use that.

> If I use the starting characters "ABCD" as the EOL, the last
> message in will not get processed until another.  This is to be
> expected.  I could count the number of characters until the message
> length, copy out the message, but would then have to shift the entire
> buffer to the left in case part of another message was in the previous
> transmission.

Yes, shifting the data is a waste of time, and there are better ways to  
deal with it.  For example, use multiple buffers, put into a queue.  Keep  
track of the next byte offset to be read from the current buffer, updating  
that each time you copy out any data.  Discard a buffer once you've read  
all of the bytes from it.
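
To illustrate the queue-of-buffers idea, here is a minimal sketch. The class name ChunkQueue and its members are hypothetical, not part of .NET or of any third-party tool. It keeps each received chunk as-is, tracks a read offset into the chunk at the head of the queue, and drops a chunk once every byte in it has been copied out, so nothing is ever shifted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a FIFO of received chunks plus a read offset into
// the chunk at the head, so consumed bytes never have to be shifted.
public sealed class ChunkQueue
{
    private readonly Queue<byte[]> _chunks = new Queue<byte[]>();
    private int _offset;   // next unread byte in the head chunk
    private int _count;    // total unread bytes across all chunks

    public int Count { get { return _count; } }

    // Call this with a private copy of the bytes just received.
    public void Enqueue(byte[] chunk)
    {
        if (chunk.Length == 0) return;
        _chunks.Enqueue(chunk);
        _count += chunk.Length;
    }

    // Copies exactly 'length' bytes out of the queue. Returns false, and
    // consumes nothing, if that many bytes have not arrived yet.
    public bool TryDequeue(int length, out byte[] result)
    {
        result = Array.Empty<byte>();
        if (length < 0 || length > _count) return false;

        result = new byte[length];
        int copied = 0;
        while (copied < length)
        {
            byte[] head = _chunks.Peek();
            int n = Math.Min(length - copied, head.Length - _offset);
            Buffer.BlockCopy(head, _offset, result, copied, n);
            copied += n;
            _offset += n;
            if (_offset == head.Length)
            {
                // Every byte of this chunk has been read; discard it.
                _chunks.Dequeue();
                _offset = 0;
            }
        }
        _count -= length;
        return true;
    }
}
```

The caller asks for 8 bytes to get the header, then asks for the rest of the message once the length is known. A failed TryDequeue simply means "wait for more data".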

> Or is there a natural break so that one event can not
> have two separate partial TCP packets?

There's no such thing as a "TCP packet", and there is no "natural break"  
in the stream of TCP data.  With TCP, the data may be grouped in any  
arbitrary grouping.  The only guarantee is that the bytes will be received  
in the same order in which they are sent, assuming they are received at  
all.

> What is a good way to deal with something like this?

The first step is to stop thinking of data you receive over TCP as  
"packets".  Even if the application protocol defines "packets" or  
"messages" or whatever you want to call it, TCP doesn't know anything  
about that.  IMHO, if you can conceptually (that is, in your own mind)  
separate the TCP communications _completely_ from your application  
protocol, this leads to better solutions.

Every time you write or think something like "partial TCP packets", you  
are leading your own mental concept in the wrong direction, whether you  
realize it or not.  IMHO, this makes it harder to discover the correct  
solution.

IMHO, the second step is probably to stop using the third party "TCP  
client socket tool".  .NET provides a very nice Socket class, as well as a  
TcpClient class that encapsulates some of the more basic things you'd want  
to do with a TCP connection.  Part of your problem here is that because  
the third party tool offers a feature (supporting the idea of a "end of  
line" marker) you are getting bogged down thinking that you somehow need  
to use that feature.

As you've noted, your data does not have an "end of line" marker.  You  
can't use any concept of "end of line" to handle your data.  Even if you  
did have an "end of line" marker, it's not hard to handle this explicitly  
rather than using some third party library to do it for you.

Especially since the third party tool does not appear to be taking over  
any of the work to actually build up application-level messages anyway,  
it's hard for me to see what value it offers.  It doesn't seem like it  
could be offering much.

As far as the specific solution goes...

There are a variety of ways to address this.  The simplest is to keep  
track of how many bytes you are capable of processing, and only attempt to  
receive that many at once.  So, in your example, you would start out  
receiving 8 bytes, which would be the signature (4 bytes) and the message  
length (4 bytes).  Then, once you know how long the message is, receive  
only the number of bytes that will compose that message.  You control the  
number of bytes you receive via the length of the buffer you pass to the  
receive method, of course.  This is VERY inefficient, but if your  
communications do not involve a lot of data, that should not be a problem.
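
A rough sketch of that simple approach, reading from any Stream (for a real connection this would be the NetworkStream returned by TcpClient.GetStream()). The names SimpleReceiver, ReadExactly and ReadMessage are made up for illustration. Two things are assumptions about the protocol, based only on the original description: the length field is four ASCII decimal digits, and it counts the whole message including the 8-byte header.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class SimpleReceiver
{
    private const int HeaderSize = 8; // 4-byte signature + 4-byte length

    // A single Read call may return fewer bytes than requested, so loop
    // until 'count' bytes have arrived or the stream ends.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, total, count - total);
            if (n == 0)
                throw new EndOfStreamException("Connection closed mid-message.");
            total += n;
        }
        return buffer;
    }

    // Reads one message and returns its payload (the bytes after the header).
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] header = ReadExactly(stream, HeaderSize);

        string signature = Encoding.ASCII.GetString(header, 0, 4);
        if (signature != "ABCD")
            throw new InvalidDataException("Unexpected signature: " + signature);

        // Assumption: ASCII digits giving the size of the whole message.
        int size = int.Parse(Encoding.ASCII.GetString(header, 4, 4),
                             NumberStyles.None, CultureInfo.InvariantCulture);
        if (size < HeaderSize)
            throw new InvalidDataException("Length smaller than header: " + size);

        return ReadExactly(stream, size - HeaderSize);
    }
}
```

Because each receive asks for no more than the current message needs, bytes belonging to the next message are never pulled into the buffer, at the cost of many small reads.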

A better way is to disconnect the network i/o from the data handling  
altogether.  This is where not thinking about the TCP stream as having  
"packets" is useful.  You need a layer between your application and your  
network i/o that handles "packetizing" the TCP stream.  There are a  
variety of ways to do this, and the "best" way depends on your own  
application to some degree, but the basic idea is to design the network  
i/o to be efficient relative to how network i/o works, and to design the  
application data handling to be efficient relative to how the application  
works.  The layer in between maps the efficient network i/o to the  
efficient application i/o.
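
One possible shape for such an in-between layer, as a sketch only. MessageFramer and Append are hypothetical names. As before, it assumes the length field is four ASCII digits that count the whole message including the 8-byte header. The network side hands over whatever bytes arrived, in whatever grouping, and the framer returns every message that is now complete.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public sealed class MessageFramer
{
    private const int HeaderSize = 8; // "ABCD" + 4 ASCII length digits
    private byte[] _buffer = new byte[4096];
    private int _start; // first unconsumed byte
    private int _end;   // one past the last received byte

    // Feed in received bytes; get back the payload of each completed message.
    public List<byte[]> Append(byte[] data, int offset, int count)
    {
        EnsureSpace(count);
        Buffer.BlockCopy(data, offset, _buffer, _end, count);
        _end += count;

        List<byte[]> messages = new List<byte[]>();
        while (_end - _start >= HeaderSize)
        {
            string signature = Encoding.ASCII.GetString(_buffer, _start, 4);
            if (signature != "ABCD")
                throw new FormatException("Unexpected signature: " + signature);

            // Assumption: the length covers the whole message, header included.
            int size = int.Parse(Encoding.ASCII.GetString(_buffer, _start + 4, 4),
                                 NumberStyles.None, CultureInfo.InvariantCulture);
            if (size < HeaderSize)
                throw new FormatException("Length smaller than header: " + size);
            if (_end - _start < size) break; // rest of message not here yet

            byte[] payload = new byte[size - HeaderSize];
            Buffer.BlockCopy(_buffer, _start + HeaderSize, payload, 0, payload.Length);
            messages.Add(payload);
            _start += size; // advance the read position instead of shifting
        }
        if (_start == _end) { _start = 0; _end = 0; } // empty: rewind for free
        return messages;
    }

    // Compacts (or grows) the buffer only when the incoming bytes would not fit.
    private void EnsureSpace(int count)
    {
        if (_end + count <= _buffer.Length) return;
        int used = _end - _start;
        byte[] target = _buffer;
        if (used + count > _buffer.Length)
            target = new byte[Math.Max(used + count, _buffer.Length * 2)];
        Buffer.BlockCopy(_buffer, _start, target, 0, used);
        _buffer = target;
        _start = 0;
        _end = used;
    }
}
```

With this in place the receive code can use one large buffer and call Append with however many bytes each receive returned. A message is delivered as soon as its last byte arrives, so the final message is not held back waiting for the next "ABCD".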

IMHO, the goal here would be to design something that conceptually makes  
sense to you, without worrying too much about the efficiency.  Obviously,  
you should worry a little bit...otherwise, it'd be better to keep it  
simple and do the inefficient thing I mentioned above.  But other than  
that, don't waste time worrying too much about efficiency when the first  
goal should be to get it to work.

Pete
Tom - 23 Jul 2007 15:41 GMT
Thank you Pete,

That was very helpful and informative.  I am going to try and separate
the TCP from the app, by trying to use a queue.  I will just push the
data from the TCP connection onto a queue.  The queue will handle the
parsing of the beginning and end, and will then handle the
distribution to the client app.  I will see if I can get that working.

Thanks,
Tom
