.NET Forum / Languages / Managed C++ / October 2005

How do I copy the content of a string to a char array?

Kueishiong Tu - 16 Oct 2005 16:23 GMT
How do I copy the content of a string in one encoding (in my case big5) to a
char array (unmanaged) of the same encoding?

I tried the following:

   String *line = S"123水泥";
   char buffer[200];

   for(int i=0; i<line->get_Length(); i++)
   {
       buffer[i] = (char) line->Chars[i];
   }

It works fine for the first 3 ASCII characters, but gets messed up for the
next 2 Chinese characters. What is wrong here?
Jochen Kalmbach [MVP] - 16 Oct 2005 16:46 GMT
Hi Kueishiong!
> How do I copy the content of a string in one encoding (in my case big5) to a
> char array (unmanaged) of the same encoding?

>     String *line = S"123水泥";

.NET strings have no special encoding!!! They are always stored in UTF-16.

>     char buffer[200];

You need to convert the UTF-16 string to the "big5" string!

>     for(int i=0; i<line->get_Length(); i++)
>     {
[quoted text clipped - 3 lines]
> It works fine for the first 3 ASCII characters, but gets messed up for the
> next 2 Chinese characters. What is wrong here?

You can use the following:

  System::Text::Encoding *e = System::Text::Encoding::GetEncoding("big5");
  System::Byte big5 __gc[] = e->GetBytes(S"123水泥");
  char *szBig5 = new char[big5->get_Count()+1];
  System::Byte __pin *b = &big5[0];
  strncpy(szBig5, (char*) b, big5->get_Count());
  szBig5[big5->get_Count()] = 0;
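
To see why the original cast loop breaks, here is a minimal sketch in standard C++ (not Managed C++, so it compiles without the CLR). std::u16string stands in for the .NET String, and the helper name narrow_by_cast is made up for this illustration. It keeps only the low byte of each UTF-16 code unit, which is what the (char) cast does in practice, so 水 (U+6C34) and 泥 (U+6CE5) each collapse to a single unrelated byte instead of a two-byte big5 sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies UTF-16 code units into a char string by plain
// narrowing, mirroring the (char) cast in the original loop. Only the low
// 8 bits of each code unit are kept.
std::string narrow_by_cast(const std::u16string& text) {
    std::string out;
    out.reserve(text.size());
    for (char16_t unit : text) {
        out.push_back(static_cast<char>(unit & 0xFF));
    }
    // Always one byte per code unit, even where big5 would need two bytes.
    assert(out.size() == text.size());
    return out;
}

// Reports whether every code unit is 7-bit ASCII, i.e. whether the
// narrowing above would be lossless.
bool is_ascii_only(const std::u16string& text) {
    for (char16_t unit : text) {
        if (unit > 0x7F) return false;
    }
    return true;
}
```

For u"123\u6C34\u6CE5" the first three bytes survive as "123", while the last two become 0x34 and 0xE5. That is neither big5 nor any other encoding of the two Chinese characters, which is why a real conversion such as Encoding::GetBytes is needed.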

Greetings
  Jochen

   My blog about Win32 and .NET
   http://blog.kalmbachnet.de/

Carl Daniel [VC++ MVP] - 16 Oct 2005 17:25 GMT
> Hi Kueishiong!
>> How do I copy the content of a string in one encoding (in my case
[quoted text clipped - 4 lines]
> .NET strings have no special encoding!!! They are always stored in
> UTF-16.

Actually, I believe it's UCS2.  It's not UTF16 since there's no multi-word
characters in the .NET representation and code points above 0xffff are
simply not representable.

>>     char buffer[200];
>
> You need to convert the UTF-16 string to the "big5" string!

Or store it in wchar_t buffer[200] instead of char to preserve the UCS2
format.

-cd
Kueishiong Tu - 16 Oct 2005 18:03 GMT
Thank you very much for replying. I changed buffer to wchar_t and the copying
works fine.
However, what I ultimately need is a char array, because the code that
follows requires it. How do I convert a wchar_t array to a char array? From my
experience, I know a char array can store both one-byte ASCII characters and
two-byte Chinese characters.

Kueishiong Tu
Jochen Kalmbach [MVP] - 16 Oct 2005 18:20 GMT
Hi Carl!

> Actually, I believe it's UCS2.  It's not UTF16 since there's no multi-word

In fact there is no multi-word, but there are high/low surrogates...
And this _is_ UTF-16 (everything in Windows is using UTF-16).

See: http://www.unicode.org/notes/tn12/
<quote>
Most major software with good Unicode support uses UTF-16 (or 16-bit
Unicode strings). Note that much of the software listed below runs on
Unix/Linux systems as well as Windows and others.

- Everything Microsoft — Windows (including Pocket PC) and application
</quote>

> characters in the .NET representation and code points above 0xffff are
> simply not representable.

This would be very bad; then .NET would not support Unicode!!!
(and by the way: .NET *is* fully Unicode enabled).

At least with .NET 2.0, they added some classes to query all the
necessary information...

See: StringInfo Class
http://msdn2.microsoft.com/en-us/library/c4hkht93(en-us,VS.80).aspx

See: StringInfo.ParseCombiningCharacters
http://msdn2.microsoft.com/en-us/library/2wayc3ak(en-us,vs.80).aspx
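
As a rough illustration of the surrogate mechanism, here is a sketch in standard C++ (not the .NET API) that encodes one code point as UTF-16 code units. The function name encode_utf16 is an assumption made for this example. Code points up to 0xFFFF take one 16-bit unit; anything above is split into a high and a low surrogate, which is what makes it UTF-16 rather than UCS-2.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encodes a single Unicode code point as UTF-16.
std::u16string encode_utf16(std::uint32_t code_point) {
    // Valid code points end at U+10FFFF and exclude the surrogate range.
    assert(code_point <= 0x10FFFF);
    assert(code_point < 0xD800 || code_point > 0xDFFF);

    std::u16string out;
    if (code_point < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit code unit.
        out.push_back(static_cast<char16_t>(code_point));
    } else {
        // Supplementary planes: 20 bits split across a surrogate pair.
        std::uint32_t v = code_point - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}
```

With this sketch, U+6C34 stays a single unit, while U+20000 becomes the pair 0xD840 0xDC00.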

Carl Daniel [VC++ MVP] - 16 Oct 2005 18:55 GMT
> Hi Carl!
>
[quoted text clipped - 3 lines]
> In fact there is no multi-word, but there are high/low surrogates...
> And this _is_ UTF-16 (everything in windows is using UTF-16).

Consider myself educated :)  I didn't realize that support for code points
above 0xffff was in fact included in .NET.  I'm sure I've missed something,
but I don't recall any very useful character sets in the code points at
10000 and above (e.g. Klingon, Elvish), but I'm happy to see that they're
representable.

-cd
Jochen Kalmbach [MVP] - 16 Oct 2005 19:02 GMT
Hi Carl!

> Consider myself educated :)  I didn't realize that support for code points
> above 0xffff was in fact included in .NET.  I'm sure I've missed something,
> but I don't recall any very useful character sets in the code points at
> 10000 and above (e.g. Klingon, Elvish), but I'm happy to see that they're
> representable.

Some might be useful (but you are right: most of them will never be used):

10000..1007F; Linear B Syllabary
10080..100FF; Linear B Ideograms
10100..1013F; Aegean Numbers
10140..1018F; Ancient Greek Numbers
10300..1032F; Old Italic
10330..1034F; Gothic
10380..1039F; Ugaritic
103A0..103DF; Old Persian
10400..1044F; Deseret
10450..1047F; Shavian
10480..104AF; Osmanya
10800..1083F; Cypriot Syllabary
10A00..10A5F; Kharoshthi
1D000..1D0FF; Byzantine Musical Symbols
1D100..1D1FF; *Musical Symbols*
1D200..1D24F; Ancient Greek Musical Notation
1D300..1D35F; Tai Xuan Jing Symbols
1D400..1D7FF; *Mathematical Alphanumeric Symbols*
20000..2A6DF; *CJK Unified Ideographs Extension B*
2F800..2FA1F; *CJK Compatibility Ideographs Supplement*
E0000..E007F; Tags
E0100..E01EF; Variation Selectors Supplement
F0000..FFFFF; Supplementary Private Use Area-A
100000..10FFFF; Supplementary Private Use Area-B
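
Every block in the list above lies beyond 0xFFFF, so each of its characters occupies two UTF-16 code units. A practical consequence is that the length of a string in code units can differ from its length in characters. The following standard C++ sketch (the name count_code_points is made up here, and this is not how StringInfo is implemented) counts code points by treating a high surrogate followed by a low surrogate as one character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-16 string.
// An unpaired surrogate is counted as one (malformed) character.
std::size_t count_code_points(const std::u16string& text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool next_is_low = i + 1 < text.size() &&
                                 text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF;
        if (is_high && next_is_low) {
            ++i; // skip the low surrogate that completes this pair
        }
        ++count;
    }
    // There can never be more code points than code units.
    assert(count <= text.size());
    return count;
}
```

For a string holding "1", one CJK Extension B character (U+20000), and 水, the code unit count is 4 but the code point count is 3.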

Kueishiong Tu - 16 Oct 2005 17:37 GMT
Thank you very much for replying.

>    System::Text::Encoding *e = System::Text::Encoding::GetEncoding("big5");
>    System::Byte big5 __gc[] = e->GetBytes(S"123水泥");

However the source is something I read from a text file which is in a String.

   FileStream* fs = new FileStream(path, FileMode::Open);
   StreamReader* sr = new StreamReader(fs, Encoding::GetEncoding("big5"));
   String *line = sr->ReadLine();

   Following your suggestion, I have to convert a String to a Byte array.
   How do I do that?

>    char *szBig5 = new char[big5->get_Count()+1];
>    System::Byte __pin *b = &big5[0];
>    strncpy(szBig5, (char*) b, big5->get_Count());
>    szBig5[big5->get_Count()] = 0;

Kueishiong Tu
Jochen Kalmbach [MVP] - 16 Oct 2005 18:22 GMT
Hi Kueishiong!

> However the source is something I read from a text file which is in a String.
>
>     FileStream* fs = new FileStream(path, FileMode::Open);
>     StreamReader* sr = new StreamReader(fs, Encoding::GetEncoding("big5"));
>     String *line = sr->ReadLine();

This does _not_ matter!!!
If you have a "string", then it _is_ Unicode. The encoding was only used
while reading the file (translating the big5 encoding to Unicode).

>     Following your suggestion, I have to convert a String to a Byte array.
>     How do I do that?
>>    char *szBig5 = new char[big5->get_Count()+1];

My example works very well. What is your problem?

Kueishiong Tu - 17 Oct 2005 15:27 GMT
In your example

> System::Text::Encoding *e = System::Text::Encoding::GetEncoding("big5");
> System::Byte big5 __gc[] = e->GetBytes(S"123水泥");
> char *szBig5 = new char[big5->get_Count()+1];
> System::Byte __pin *b = &big5[0];
> strncpy(szBig5, (char*) b, big5->get_Count());
> szBig5[big5->get_Count()] = 0;

you copy the content of the Byte array pointed to by b into a char array szBig5.
However, what I need is to copy the content of a String to a char array
(say, from String *b = S"123水泥" to szBig5).

> Hi Kueishiong!
>
[quoted text clipped - 13 lines]
>
> My example works very well. What is your problem?
Jochen Kalmbach [MVP] - 17 Oct 2005 17:19 GMT
Hi Kueishiong!

>>System::Text::Encoding *e = System::Text::Encoding::GetEncoding("big5");
>>System::Byte big5 __gc[] = e->GetBytes(S"123水泥");
[quoted text clipped - 6 lines]
> However, what I need is to copy the content of a String to a char array
> (say, from String *b = S"123水泥" to szBig5).

Maybe we are talking about different things...

I thought you wanted a char array in big5 encoding? Isn't this what you
wanted???

And that is exactly what my example does...
It converts a "string" into a char array which is encoded in "big5".

Kueishiong Tu - 17 Oct 2005 18:01 GMT
Hi Jochen!

What I want is to copy the content of a String

    (
    as the source is  read from a text file using the following StreamReader
    sr->ReadLine() call and stored in the String class *line
     FileStream* fs = new FileStream(path, FileMode::Open);
     StreamReader* sr = new StreamReader(fs, Encoding::GetEncoding("big5"));
     String *line = sr->ReadLine();
     )

to a char array (say, buffer declared as char buffer[200]), i.e.

    move the contents in *line to buffer[].

> Hi Kueishiong!
>
[quoted text clipped - 16 lines]
> And that is exactly what my example does...
> It converts a "string" into a char array which is encoded in "big5".
Jochen Kalmbach [MVP] - 17 Oct 2005 18:15 GMT
Hi Kueishiong!

> What I want is to copy the content of a String
>
[quoted text clipped - 7 lines]
>
> to a char array (say, buffer declared as char buffer[200]), i.e.

What is "char" ? 8-bit?

>      move the contents in *line to buffer[].

There is no difference between buffer[] and *buffer

System::String *line = S"123水泥";

System::Text::Encoding *e = System::Text::Encoding::GetEncoding("big5");
System::Byte big5 __gc[] = e->GetBytes(line);
char *buffer = new char[big5->get_Count()+1];
System::Byte __pin *b = &big5[0];
strncpy(buffer, (char*) b, big5->get_Count());
buffer[big5->get_Count()] = 0;

// now the buffer contains the char-array encoded in "big5"
// after you have used the buffer, you need to destroy it...

delete [] buffer;
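
If the destination really is a fixed array such as char buffer[200] rather than a heap allocation, the copy also needs a length check. Below is a minimal sketch in standard C++, where std::vector<unsigned char> stands in for the managed Byte array returned by GetBytes and the name copy_to_c_string is an assumption for this example. It uses memcpy, which copies an exact byte count and does not stop early at a zero byte the way strncpy does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies already-encoded bytes into a caller-supplied
// char buffer and null-terminates the result.
// Returns false (leaving an empty string in buffer) if the bytes plus the
// terminator do not fit into capacity.
bool copy_to_c_string(const std::vector<unsigned char>& bytes,
                      char* buffer, std::size_t capacity) {
    assert(buffer != nullptr && capacity > 0);
    if (bytes.size() + 1 > capacity) {
        buffer[0] = '\0';
        return false;
    }
    if (!bytes.empty()) {
        std::memcpy(buffer, bytes.data(), bytes.size());
    }
    buffer[bytes.size()] = '\0';
    return true;
}
```

A call would look like copy_to_c_string(bytes, buffer, sizeof buffer), with the return value checked before buffer is handed to the code that expects the big5 char array.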

(and this was exactly my 1st reply...)

Kueishiong Tu - 17 Oct 2005 18:58 GMT
Dear Jochen:

> System::Byte big5 __gc[] = e->GetBytes(line);

The line above, which converts a String to a Byte array, is what I needed.
I put it in, and the whole program works fine. Thank you very much for your help.

Kueishiong Tu
Norman Diamond - 17 Oct 2005 01:33 GMT
> Thank you very much for replying.
>
[quoted text clipped - 11 lines]
>    Following your suggestion, I have to convert a String to a Byte array.
>    How do I do that?

If the file contains a Byte array (ANSI string) and you need to pass the
same byte array to another routine, then don't read a String (Unicode
string).  Read a byte array in the first place.
Kueishiong Tu - 17 Oct 2005 15:36 GMT
Thank you very much for replying.

> If the file contains a Byte array (ANSI string) and you need to pass the
> same byte array to another routine, then don't read a String (Unicode
> string).  Read a byte array in the first place.

How do I read the content of a text file as a Byte array instead of the
String that StreamReader's sr->ReadLine() returns?
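
One way to picture the "read a byte array in the first place" suggestion is the sketch below. It is plain standard C++ rather than the .NET classes used in this thread, and the name read_file_bytes is made up for the example. The file is opened in binary mode and its bytes are copied as-is, so big5 data is never decoded to Unicode and re-encoded. On the .NET side, the same idea would be to read from the FileStream directly instead of wrapping it in a StreamReader.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file as raw bytes without any decoding.
// Returns false if the file cannot be opened; out is left untouched then.
bool read_file_bytes(const std::string& path, std::vector<char>& out) {
    assert(!path.empty());
    // Binary mode prevents newline translation on Windows.
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return true;
}
```

Note that this returns the whole file, not one line. Splitting on '\n' can then be done on the byte level, which is safe for big5 because its lead and trail bytes never take the value 0x0A.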
