
.NET Forum / .NET Framework / General / July 2004

Convert DOS Cyrillic text to Unicode

Nikolay Petrov - 27 Jul 2004 07:44 GMT
How can I convert DOS Cyrillic text to Unicode?
Jon Skeet [C# MVP] - 27 Jul 2004 08:16 GMT
> How can I convert DOS Cyrillic text to Unicode?

See http://www.pobox.com/~skeet/csharp/unicode.html

Signature

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Nikolay Petrov - 27 Jul 2004 09:52 GMT
I have read this and other material on the Unicode topic.
My question is how I can do it in VB. I need the code.

> > How can I convert DOS cyrillic text to Unicode
>
> See http://www.pobox.com/~skeet/csharp/unicode.html
Jon Skeet [C# MVP] - 27 Jul 2004 10:15 GMT
> I have read this and other info in Unicode topic
> My question is how can I do it in VB. I need the code.

I provide some C# code to read a file in one encoding and write it in
another. It's very simple code - it should be easy to understand and
rewrite in VB.NET. The important thing is really just the creation of
the StreamReader with the right encoding.
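
For readers without access to the linked page, here is a minimal sketch of the same idea: open the input with the encoding it was really written in, then write it out in the encoding you want. It is in Python rather than C# or VB.NET, and it is not the code from the linked article. The function name `convert_file`, the sample text, and the choice of `cp866` as the "DOS Cyrillic" code page are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def convert_file(src, dst, src_encoding="cp866", dst_encoding="utf-8"):
    """Read src as src_encoding and rewrite it to dst as dst_encoding.

    Hypothetical helper: the important part is naming the right
    source encoding when the reader is created.
    """
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=src_encoding, newline="") as reader, \
         open(dst, "w", encoding=dst_encoding, newline="") as writer:
        for line in reader:
            writer.write(line)


# Small demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "dos.txt"
    dst = Path(tmp) / "unicode.txt"
    # Simulate a file saved by a DOS program using code page 866.
    src.write_bytes("Привет, мир".encode("cp866"))
    convert_file(src, dst)
    converted = dst.read_text(encoding="utf-8")
```

The .NET equivalent would pass an `Encoding` object to the `StreamReader` constructor, which is the step described above.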

Nikolay Petrov - 27 Jul 2004 10:25 GMT
My problem is that I am not reading a file.
The DOS Cyrillic text is pasted into a textbox, and should appear in another.
That's all.
I don't have anything in binary.

> > I have read this and other info in Unicode topic
> > My question is how can I do it in VB. I need the code.
[quoted text clipped - 3 lines]
> rewrite in VB.NET. The important thing is really just the creation of
> the StreamReader with the right encoding.
Jon Skeet [C# MVP] - 27 Jul 2004 10:48 GMT
> My problem is that I am not reading a file.
> The DOS Cyrillic text is pasted into a textbox, and should appear in another.
> That's all.
> I don't have anything in binary.

If it's in a text box, you should have it as Unicode text already. All
strings are in Unicode in .NET.

Cor Ligthert - 27 Jul 2004 10:33 GMT
Hi Jon,

In the language.VB newsgroup I pointed Nikolay to you and Jay B. Jay has
answered a message in language.VB as well, but not completely enough for
Nikolay. Jay B will probably not be active on this newsgroup before 13:00
GMT.

I am curious as well: which encoding do you think is the right one for this
Cyrillic problem?

Nikolay wrote in the language.VB group that he pastes it from Notepad,
so I guess UTF-16?

:-)

Cor

...
> > I have read this and other info in Unicode topic
> > My question is how can I do it in VB. I need the code.
[quoted text clipped - 5 lines]
>
> --
Jon Skeet [C# MVP] - 27 Jul 2004 10:49 GMT
> I pointed Nikolay in the language.VB newsgroup on you and Jay B, who has
> answered a message in language.VB however as well not complete enough for
[quoted text clipped - 3 lines]
> I am curious as well, what is the right encoding you think about for this
> Cyrillic problem?

Not sure - but it sounds like it won't actually be a problem, as if
he's got the data in notepad to start with, there's no encoding change
required - cut and paste should sort everything out.

> Nikolay wrote in the language.VB group that he pastes it from Notepad,
> so I guess UTF-16?

No way - DOS predates UTF-16 by a long time!

Nikolay Petrov - 27 Jul 2004 11:09 GMT
The user pastes text from text files which contain DOS Cyrillic characters.
When it is pasted into a text box, or even into a Notepad window, it looks
like garbage.
I am not sure whether I can post a file here as an attachment so you can see it?

> > I have read this and other info in Unicode topic
> > My question is how can I do it in VB. I need the code.
[quoted text clipped - 3 lines]
> rewrite in VB.NET. The important thing is really just the creation of
> the StreamReader with the right encoding.
Jon Skeet [C# MVP] - 27 Jul 2004 11:26 GMT
> The user pastes text from text files which contain DOS Cyrillic characters.

What does he have the text open in? It sounds like the existing app is
probably not putting it into the clipboard in Unicode :(

> When it is pasted into a text box, or even into a Notepad window, it looks
> like garbage.

Ah - I thought you meant he had it working in notepad to start with.

> I am not sure whether I can post a file here as an attachment so you can see it?

It's probably best if you email it to me.

Cor Ligthert - 27 Jul 2004 11:31 GMT
Hi Jon,

>It's probably best if you email it to me.

I am also interested in this question, so why not mail to the newsgroup?

Cor
Jon Skeet [C# MVP] - 27 Jul 2004 11:46 GMT
> >It's probably best if you email it to me.
>
> I am also interested in this question, so why not mail to the
> newsgroup?

It's more that depending on the way of attaching the file, it might get
converted during the attachment process - that's less likely to happen
in a mail message.

Cor Ligthert - 27 Jul 2004 11:56 GMT
> It's more that depending on the way of attaching the file, it might get
> converted during the attachment process - that's less likely to happen
> in a mail message.

So I'll wait for the results, and then maybe you can send it to me when all is
clear?

Cor
Jon Skeet [C# MVP] - 27 Jul 2004 12:09 GMT
> > It's more that depending on the way of attaching the file, it might get
> > converted during the attachment process - that's less likely to happen
> > in a mail message.

> So I'll wait for the results, and then maybe you can send it to me when all is
> clear?

Yup, sure. I suspect there's nothing particularly interesting about the
file though - it's just I should be able to work out what encoding it's
in, so that if the OP *does* want to read it directly (rather than with
c'n'p) he should be able to.

Nikolay Petrov - 27 Jul 2004 12:16 GMT
OK guys, I have mailed it to both of you.

I'll also put some of this DOS text here, in case anyone else is interested:

???<?'? ??  6 ?. 2004??".

> > The user pastes text from text files which contain DOS Cyrillic characters.
>
[quoted text clipped - 9 lines]
>
> It's probably best if you email it to me.
Nikolay Petrov - 27 Jul 2004 14:24 GMT
New problem ;-(
The text is only partially converted.
All capital letters are fine, and some of the lowercase ones, but not all.
What may have caused this?

> How can I convert DOS cyrillic text to Unicode
Jon Skeet [C# MVP] - 27 Jul 2004 14:44 GMT
> New problem ;-(
> The text is only partially converted.

At what stage?

> All capital letters are fine, and some of the lowercase ones, but not all.
> What may have caused this?

No idea - are you saying the original files are corrupt, basically?
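
One possible cause, offered only as a guess since the thread never shows the conversion code: in code page 866 the capitals and the lowercase letters а-п sit in one continuous run (0x80-0xAF), while р-я are in a separate run (0xE0-0xEF). A converter that just adds a fixed offset to every high byte would therefore get all capitals and some lowercase letters right and the rest wrong. The sketch below is in Python; `naive_offset_decode` is a hypothetical buggy converter written for illustration, not anything posted in this thread.

```python
def naive_offset_decode(data: bytes) -> str:
    """Hypothetical buggy converter: shifts every byte >= 0x80 by a fixed
    offset so that 0x80 lands on U+0410 (Cyrillic capital A)."""
    return "".join(chr(b + 0x390) if b >= 0x80 else chr(b) for b in data)


sample = "Привет"
raw = sample.encode("cp866")          # bytes as a DOS program would store them

naive = naive_offset_decode(raw)      # fixed-offset guess
correct = raw.decode("cp866")         # table-driven decoding

# Pairs of (expected, actual) characters where the naive approach fails.
mismatches = [(c, n) for c, n in zip(correct, naive) if c != n]
for expected, actual in mismatches:
    print(f"expected {expected!r}, naive conversion gave {actual!r}")
```

If this is what is happening, using a real code page table (`Encoding.GetEncoding(866)` in .NET) instead of arithmetic on character codes should avoid it.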

Paul Gorodyansky - 31 Jul 2004 00:38 GMT
Hi,

> New problem ;-(
> The text is only partially converted.
> All capital letters are fine, and some of the lowercase ones, but not all.
> What may have caused this?
>
> > How can I convert DOS cyrillic text to Unicode

You did not answer Jon's question, but it was critical -
in what _program_ does your user open the text file with DOS Cyrillic?

I have been working with Cyrillic encodings since 1995 :) so I have dealt
with most of them, including CP-866.

The easiest way in your scenario would be:

Open that DOS Cyrillic .txt file in MS Word 2000 or newer,
choosing "Cyrillic (DOS)" encoding in the process:
http://ourworld.compuserve.com/homepages/PaulGor/cp_e.htm#open

Now your user should see normal Russian text - already converted to Unicode
by Word - and can paste it into your text box.

Otherwise, if you try to open a file that contains text in
DOS Cyrillic encoding in some regular MS Windows text editor,
you *will* see just gibberish - the editor expects one of the _Windows_
encodings, not a DOS one.
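
To make the gibberish concrete, here is a small Python sketch. It assumes the file is in code page 866 and that the editor reads it as Windows-1251; the sample word is made up for illustration. It also shows that, as long as every byte survived the wrong decoding, the damage can be undone by reversing the wrong step and redoing it with the right code page.

```python
original = "Привет"

# Bytes as a DOS program using code page 866 would have written them.
dos_bytes = original.encode("cp866")

# What a Windows editor shows if it assumes Windows-1251 instead.
garbled = dos_bytes.decode("cp1251")

# Undo the wrong decoding, then decode with the correct code page.
repaired = garbled.encode("cp1251").decode("cp866")

print("garbled: ", garbled)
print("repaired:", repaired)
```

The repair only works when no byte was dropped or replaced along the way. For example, byte 0x98 (Ш in code page 866) has no character assigned in Windows-1251, so text containing it may already be damaged by the time it reaches the clipboard.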

There are many more ways to get it done: converter programs that
make "Cyrillic (Windows), 1251" text from your DOS Cyrillic text,
I18n-aware editors that - like Word - let you specify explicitly
what the encoding of your file is, such as
http://www.esperanto.mv.ru/UniRed/ENG/
etc., etc.

Signature

Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet":
  http://RusWin.net
Russian On-screen Keyboard: http://Kbd.RusWin.net

