
.NET Forum / Languages / Managed C++ / March 2008

Can fopen tell me the coding of a file?

PLS - 28 Mar 2008 23:59 GMT
When I use fopen for reading with the CCS options, the library will look
at the byte order mark to determine the file type and read accordingly.

I have a more complicated case. I want to open a file for appending, and I
want the appended data to always be in UTF-8. If the existing file is in
another encoding I will have to copy and convert it before appending. In
the interest of speed I would prefer not to open the file separately
just to determine what the BOM is.

Is it possible to open for appending with CCS=UTF-8 and then determine
what the existing coding is? Then if the existing is wrong I can close
the file and convert it. This way I'm only doing the extra effort if the
file actually needs conversion.

I think the existing encoding is stored in the FILE structure. Is this
documented anywhere?

 Thanks,
   ++PLS
Jeroen Mostert - 29 Mar 2008 01:32 GMT
> When I use fopen for reading with the CCS options, the library will look
> at the byte order mark to determine the file type and read accordingly.
[quoted text clipped - 4 lines]
> the interest of speed I would prefer not to open the file separately
> just to determine what the BOM is.

How is opening a file and reading the first few bytes going to slow anything
down? You're going to be appending a whole lot more.

That said, not opening a file twice has other benefits which are more
important than any putative speed gain (such as not being caught by surprise
if the file changes during calls).

Note that the behavior you're trying to avoid (reopening) is exactly what
the CRT will do anyway if you use mode "a" -- it will first open the file
for reading to determine the BOM and then reopen it for writing.

> Is it possible to open for appending with CCS=UTF-8 and then determine
> what the existing coding is?

Sure. If you open the file with "a+", you can both read and append. Rewind
the file to the beginning and read the BOM.
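
A minimal sketch of that approach in portable C, under a few assumptions: the file is opened in binary mode ("a+b") rather than with a ccs= flag so the first bytes arrive untranslated, and the names bom_kind, classify_bom and open_append_detect are hypothetical, not part of the CRT. It only recognises the UTF-8 and UTF-16 marks; a file without a BOM is reported as BOM_NONE and could still be ANSI or BOM-less UTF-8.

```c
#include <stdio.h>

/* Encodings this sketch can recognise from a byte order mark. */
typedef enum {
    BOM_NONE,      /* no BOM found: ANSI or BOM-less UTF-8, cannot tell which */
    BOM_UTF8,      /* EF BB BF */
    BOM_UTF16_LE,  /* FF FE (a UTF-32 LE mark also starts this way) */
    BOM_UTF16_BE   /* FE FF */
} bom_kind;

/* Classify the first n (up to three) bytes of a file. */
static bom_kind classify_bom(const unsigned char *buf, size_t n)
{
    if (n >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (n >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    if (n >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}

/*
 * Open `path` once for reading and appending, report the BOM found at the
 * start of the file, and leave the stream positioned for appending.
 * Returns NULL if the file cannot be opened or repositioned.
 */
static FILE *open_append_detect(const char *path, bom_kind *kind)
{
    unsigned char buf[3];
    size_t n;
    FILE *fp = fopen(path, "a+b");   /* binary, so no newline or encoding translation */

    if (fp == NULL)
        return NULL;

    rewind(fp);                      /* make sure the read starts at offset 0 */
    n = fread(buf, 1, sizeof buf, fp);
    *kind = classify_bom(buf, n);

    /* A positioning call is needed between reading and writing on an update stream. */
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

A caller would check the reported kind: if it is BOM_UTF8, keep writing UTF-8 bytes to the returned stream; otherwise close it and run the conversion first. The file is opened only once in the common case.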

> Then if the existing is wrong I can close the file and convert it. This
> way I'm only doing the extra effort if the file actually needs
> conversion.
>
> I think the existing encoding is stored in the FILE structure.

It's not, at least not directly. The file structures are internal to the CRT.

> Is this documented anywhere?

No, and to the best of my knowledge there's no documented way of getting the
encoding used to open the file (or the actual encoding). The CRT support
isolates you from encoding issues. If you want to handle encoding issues
explicitly, you're going to have to handle them explicitly.
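
As an illustration of what handling it explicitly might involve for the conversion step, here is a rough sketch that transcodes UTF-16 LE bytes to UTF-8 by hand. The function names utf8_encode and utf16le_to_utf8 are hypothetical, the caller is assumed to have already skipped the BOM, and unpaired surrogates are replaced with U+FFFD, which is one possible policy rather than a required one.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8; returns the number of bytes written (1 to 4). */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Convert UTF-16 LE bytes in src[0..src_len) to UTF-8 in dst and return the
 * number of bytes written. dst must have room for 3 * (src_len / 2) bytes.
 * A trailing odd byte is ignored.
 */
static size_t utf16le_to_utf8(const unsigned char *src, size_t src_len,
                              unsigned char *dst)
{
    size_t i = 0, out = 0;

    while (i + 1 < src_len) {
        uint32_t cp = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {            /* high surrogate */
            uint32_t lo = 0;
            if (i + 1 < src_len)
                lo = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {        /* valid pair */
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = 0xFFFD;                           /* unpaired high surrogate */
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                               /* stray low surrogate */
        }
        out += utf8_encode(cp, dst + out);
    }
    return out;
}
```

On Windows a library routine such as WideCharToMultiByte with CP_UTF8 would normally do this job; the hand-written version is shown only to make the steps visible.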

J.

