Hi folks,
I am sending a string over a socket from a C# app to a native C++ app.
I embed the unicode symbol for the pound sign in the string being
sent. When I read data from the socket, I find that a 0xC2 has been
added into the input stream right before the pound symbol.
C# fragment :
TcpClient skt = null;
StreamWriter wrt = null;
try {
    skt = local.AcceptTcpClient();
    skt.NoDelay = true; //no buffering!
    NetworkStream ns = skt.GetStream();
    string result = "STATUS|\u00A37.89 (GBP)|sess";
    wrt = new StreamWriter(ns);
    //results are EOL delimited
    wrt.WriteLine(result);
    wrt.Flush();
} catch (Exception e) {
    Console.WriteLine(e.ToString());
} finally {
    if (wrt != null) wrt.Dispose();
    if (skt != null) skt.Close();
}
C++ fragment :
char inbuf[1024];
// read at most 1023 bytes so the '\0' written below stays inside inbuf
iResult = recv(sock, inbuf, sizeof(inbuf) - 1, 0);
if (iResult != SOCKET_ERROR && iResult > 0) {
    Log::Debug(__WFILE__, __LINE__, L"recvBuf has %d bytes\n", iResult);
    if ( iResult > 1024 ) {
        Log::Debug(__WFILE__, __LINE__, L"Received way too many bytes in response - ignoring\n");
        *netStatus = 6;
    } else {
--->    inbuf[iResult] = '\0';
        ::MultiByteToWideChar(CP_ACP, 0, inbuf, iResult+1, outStr, 1024);
        Log::Debug(__WFILE__, __LINE__, L"Login response : %s\n", inbuf);
        Log::Debug(__WFILE__, __LINE__, L"Login response : %s\n", outStr);
        *outBytes = wcslen(outStr);
        *netStatus = 0;
        Log::Debug(__WFILE__, __LINE__, L"<--SendServiceMessage received : %s\n", outStr);
    }
}
The socket communication works fine, but the problem is on the C++
end: if I check inbuf right after the socket read (i.e. at the --->),
it has an extra character in it.
Sent from C# : STATUS|£7.89 (GBP)|sess
Received by C++ : STATUS|Â£7.89 (GBP)|sess
Does anyone have any suggestions or theories regarding the extra
character? I am stumped.
Peter Duniho - 19 May 2008 19:17 GMT
> [...]
> Does anyone have any suggestions or theories regarding the extra
> character? I am stumped.
It's very difficult to say without a complete code sample.
However, the first thing I'd try is changing "CP_ACP" to "CP_UTF8" in your
call to MultiByteToWideChar().
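To illustrate why the code page matters, here is a minimal, portable
sketch. It is not the Win32 implementation: decode_utf8 and
decode_single_byte are hypothetical helpers written for this example,
and Latin-1 stands in for a single-byte ANSI code page. Given the bytes
0xC2 0xA3, the single-byte decoder produces two characters (Â, then £),
while the UTF-8 decoder produces one (£).

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into Unicode code points.
// Sketch only: assumes well-formed input and does no error checking.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra && i < bytes.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}

// Decode one byte per character, as a single-byte code page would.
// Latin-1 is used as a stand-in because its byte values map directly
// to code points; Windows-1252 agrees with it for 0xA0 through 0xFF.
std::u32string decode_single_byte(const std::string& bytes) {
    std::u32string out;
    for (char c : bytes) {
        out.push_back(static_cast<unsigned char>(c));
    }
    return out;
}
```

With the Win32 API, the equivalent choice is the first argument to
MultiByteToWideChar(): CP_UTF8 rather than CP_ACP.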
Also, you should really reconsider disabling Nagle (TcpClient.NoDelay).
It's unlikely that data coalescing is causing any significant performance
issue, and disabling it can actually cause worse performance in general,
depending on what your network i/o looks like.
Pete
Barry Kelly - 19 May 2008 19:25 GMT
> I am sending a string over a socket from a C# app to a native C++ app.
Sockets deal with binary data. Strings are text data. To convert from
text to binary you need an encoder. If you don't supply an encoder, a
default will be chosen.
> I embed the unicode symbol for the pound sign in the string being
The "pound sign" is ambiguous; to a lot of Americans, it means #.
I believe from your source you mean '£', unicode 0x00A3.
> sent. When I read data from the socket, I find that a 0xC2 has been
> added into the input stream right before the pound symbol.
This is because of the encoding. When you use a StreamWriter constructor
overload that doesn't specify an encoding, you get UTF-8 (without a byte
order mark), not the system default ANSI codepage. UTF-8 encodes U+00A3
as the two bytes 0xC2 0xA3, which is the extra byte you are seeing.
What you need to do is explain what you expect to see on the wire. What
encoding is the C++ side expecting? A specific ANSI codepage? UTF-8?
UTF-16? ASCII (which doesn't include 0x00A3)? Both sides need to agree
on the encoding.
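As a rough illustration of what "on the wire" means here, the sketch
below encodes a single code point as UTF-8. encode_utf8 is a
hypothetical helper, not part of any library mentioned in this thread,
and it skips validation. For U+00A3 it yields the bytes 0xC2 0xA3; a
single-byte code page such as Windows-1252 would send just 0xA3.

```cpp
#include <cstdint>
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
// Sketch only: no validation of surrogates or out-of-range values.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Any code point from U+0080 to U+07FF takes two bytes in UTF-8, so the
receiver sees one more byte than the sender had characters.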
> C# fragment :
> wrt = new StreamWriter(ns);
Here, you are not specifying the encoding. To get the desired result,
you need to use the right encoding, and use the StreamWriter(Stream,
Encoding) overload, and be sure the receiving side is expecting the same
encoding.
> C++ fragment :
> ---> inbuf[iResult] = '\0';
> ::MultiByteToWideChar(CP_ACP, 0, inbuf, iResult+1, outStr, 1024);
Is the default ANSI code page on the C++ system the same as on the
sending system? You shouldn't rely on this default for an over the wire
protocol, IMO.
> Log::Debug(__WFILE__, __LINE__, L"Login response : %s\n", inbuf);
> Log::Debug(__WFILE__, __LINE__, L"Login response : %s\n", outStr);
[quoted text clipped - 9 lines]
> end, if I check inbuf right after the socket read (i.e. at the --->),
> it has an extra character in it.
Yes, that's before you've decoded (Multibyte ANSI->Unicode) the binary
data.
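If you want to see exactly which bytes arrived before any decoding,
logging them in hex is less ambiguous than printing them as text. A
minimal sketch, assuming a hypothetical helper named hex_dump (it is
not part of your Log class or of Winsock):

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Format a received buffer as space-separated hex bytes for logging.
std::string hex_dump(const char* buf, std::size_t len) {
    std::string out;
    char tmp[4];
    for (std::size_t i = 0; i < len; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned value = static_cast<unsigned char>(buf[i]);
        std::snprintf(tmp, sizeof(tmp), "%02X", value);
        if (i != 0) out += ' ';
        out += tmp;
    }
    return out;
}
```

Calling hex_dump(inbuf, iResult) right after recv() would show the
pound sign as the pair C2 A3 if the sender used UTF-8.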
-- Barry

http://barrkel.blogspot.com/