.NET Forum / Languages / C# / June 2007

Changing garbled text ("¾î´À ¸ÚÁø³¯") to readable text ("어느 멋진날")

JC - 27 Jun 2007 07:02 GMT
I have been trying to implement a method to convert garbled text such
as "¾î´À ¸ÚÁø³¯" to its proper form, "어느 멋진날" (which you'll see if
you have the Korean language pack).

Someone recommended using Encoding._properEncoding (i.e., UTF8 or
Unicode).  As far as I can tell, the issue isn't simply an encoding
problem; it seems to be a character set issue as well.  If I place the
raw text "¾î´À ¸ÚÁø³¯" into the body of an HTML file, open it, and
change the character encoding to Korean or set "charset=EUC-KR", I get
the proper form of the text.  However, the Unicode code points of the
garbled text and the proper text do not match, yet somehow a web
browser is able to convert the garbled text.  I have scoured the
internet for help, but to no avail.

This is the code I used to see the byte values, and attempt to solve
the issue.

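   using System;
   using System.Text;

   public class Program
   {
       public static void Main()
       {
           string test1 = "¾î´À ¸ÚÁø³¯";   // garbled text
           string test2 = "어느 멋진날";     // expected text

           // Encoding.Unicode is UTF-16 little-endian: two bytes per char here.
           Encoding unicode = Encoding.Unicode;

           // Swap in test2 to dump the bytes of the expected text instead.
           byte[] unicodeBytes = unicode.GetBytes(test1);
           //string correct = Encoding.UTF8.GetString(unicodeBytes);

           // Print each byte value in decimal.
           StringBuilder sb = new StringBuilder();
           foreach (byte b in unicodeBytes)
           {
               sb.Append(b).Append(" ");
           }

           Console.WriteLine(sb);
           //Console.WriteLine(correct);
       }
   }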

The byte values returned for the garbled text are:
190 : 0 : 238 : 0 : 180 : 0 : 192 : 0 : 32 : 0 : 184 : 0 : 218 : 0 :
193 : 0 : 248 : 0 : 179 : 0 : 175 : 0

The byte values returned for the correct text are:
180 : 197 : 144 : 178 : 32 : 0 : 75 : 186 : 196 : 201 : 160 : 176
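
One way to compare the two dumps: every second byte of the garbled text is 0, and the remaining bytes (190 238 180 192 ...) look like a legacy Korean encoding rather than Unicode. The sketch below is not from the thread; it encodes the correct text as EUC-KR so the result can be compared with those values. The class and method names are hypothetical, and the RegisterProvider call assumes .NET Core / .NET 5 or later (on the .NET Framework of the time, "euc-kr" is available without it and that line would be dropped).

```csharp
using System;
using System.Text;

public static class EucKrBytes
{
    // Returns the EUC-KR byte values of the text in decimal, space-separated.
    public static string Dump(string text)
    {
        // Assumption: modern .NET, where legacy code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] bytes = Encoding.GetEncoding("euc-kr").GetBytes(text);
        return string.Join(" ", Array.ConvertAll(bytes, b => b.ToString()));
    }

    public static void Main()
    {
        // "\uC5B4\uB290 \uBA4B\uC9C4\uB0A0" is the correct text, written with
        // escapes so the source file's own encoding cannot interfere.
        Console.WriteLine(Dump("\uC5B4\uB290 \uBA4B\uC9C4\uB0A0"));
    }
}
```

If the printed values equal the non-zero bytes of the garbled dump, the garbled string is EUC-KR data that was decoded one byte per character.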

I realize that in the console window, if the text were to actually
come out correct it would be displayed as "?? ???" but I couldn't get
that either.  Thank you in advance for any help.

JC
Jon Skeet [C# MVP] - 27 Jun 2007 07:30 GMT
> I have been trying to implement a method to convert garbled text such
> as "¾î´À ¸ÚÁø³¯" to its proper form, "어느 멋진날" (which you'll see if
[quoted text clipped - 9 lines]
> somehow through a web-browser I am able to convert the garbled text. I
> have scoured the internet for help but to no avail.

Your code is currently converting from one *string* to another. You
should be converting from your original file's *binary* data into a
string using an encoding. It sounds like you still haven't got the
right encoding.
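
A minimal sketch of the binary-to-string conversion described above, assuming the original bytes are EUC-KR (the charset that JC reports works in the browser). The byte values are the non-zero bytes from JC's dump of the garbled text. The class and method names are hypothetical, and the RegisterProvider line is only needed on .NET Core / .NET 5 or later.

```csharp
using System;
using System.Text;

public static class EucKrDecode
{
    // Decodes raw EUC-KR bytes into a .NET (UTF-16) string.
    public static string Decode(byte[] raw)
    {
        // Assumption: modern .NET, where legacy code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("euc-kr").GetString(raw);
    }

    public static void Main()
    {
        byte[] raw = { 190, 238, 180, 192, 32, 184, 218, 193, 248, 179, 175 };
        string text = Decode(raw);

        // Print code points instead of the text itself, since the console
        // may not be able to display Hangul.
        foreach (char c in text)
        {
            Console.Write("U+{0:X4} ", (int)c);
        }
        Console.WriteLine();
    }
}
```

The code points printed can be checked against the UTF-16 byte pairs JC listed for the correct text (low byte first, e.g. 180 197 is U+C5B4).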

See http://www.yoda.arachsys.com/csharp/strings.html for some code
which will let you see the Unicode points for a string easily.

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

JC - 27 Jun 2007 08:14 GMT
> Your code is currently converting from one *string* to another. You
> should be converting from your original file's *binary* data into a
> string using an encoding. It sounds like you still haven't got the
> right encoding.

I understand now that I need the binary data, but then my question is:
how do I get the binary data of a filename?  I have been using FileInfo
to access information about the files; is there another class I should
be using to retrieve the binary data?  Ha, I'm sure it's a simple
solution, but I can't seem to wrap my head around it.  And for the most
part I'll be dealing with ID3 tags, and the data returned from the ID3
tag methods are all strings.
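
A possible approach when only a string is available, sketched under two assumptions that the thread does not confirm: that the ID3 library decoded the tag bytes as ISO-8859-1 (one byte per character, as the garbled dump suggests), and that the tag was actually written as EUC-KR. Encoding the string back to ISO-8859-1 recovers the original bytes, which can then be decoded as EUC-KR. For file contents, File.ReadAllBytes returns the raw bytes directly. The class and method names are hypothetical, and the RegisterProvider line is only needed on .NET Core / .NET 5 or later.

```csharp
using System;
using System.IO;
using System.Text;

public static class MojibakeRepair
{
    private static Encoding Korean()
    {
        // Assumption: modern .NET, where legacy code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("euc-kr");
    }

    // Undoes a wrong ISO-8859-1 decode, then decodes the bytes as EUC-KR.
    public static string Repair(string garbled)
    {
        byte[] raw = Encoding.GetEncoding("iso-8859-1").GetBytes(garbled);
        return Korean().GetString(raw);
    }

    // Reads a file's raw bytes and decodes them as EUC-KR.
    public static string ReadAllTextAsEucKr(string path)
    {
        return Korean().GetString(File.ReadAllBytes(path));
    }

    public static void Main()
    {
        // The garbled text from the thread, written with escapes.
        string garbled = "\u00BE\u00EE\u00B4\u00C0 \u00B8\u00DA\u00C1\u00F8\u00B3\u00AF";
        string repaired = Repair(garbled);

        // Compare against the correct text rather than printing Hangul.
        Console.WriteLine(repaired == "\uC5B4\uB290 \uBA4B\uC9C4\uB0A0");
    }
}
```

This round trip only works if the wrong decode was lossless; if the library replaced any bytes with "?", those characters cannot be recovered from the string and the raw tag bytes are needed instead.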

Thanks for the link, I've been to that site a couple times and skimmed
through it/bookmarked it and tried to make as much sense of it as
possible but overall it was very enlightening.
