VS.NET web.config file encoding

rdcpro - 28 Jan 2005 22:09 GMT
I've just noticed that the web.config file that Visual Studio .NET 2003
creates in an ASP.NET application is saved with an XML declaration stating
UTF-8 encoding, but appears to be in fact saved using UTF-16 encoding.  Is
there some reason for this?  Will it be fixed/changed?

Regards,
Mike Sharp
Gary Chang[MSFT] - 29 Jan 2005 09:33 GMT
Hi Mike,

>I've just noticed that the web.config file that Visual Studio .NET 2003
>creates in an ASP.NET application is saved with an XML declaration stating
>UTF-8 encoding, but appears to be in fact saved using UTF-16 encoding.  Is
>there some reason for this?  Will it be fixed/changed?

That depends on your machine's system locale (code page) setting.
Which system locale is your Windows installation using?

Thanks!

Best regards,

Gary Chang
Microsoft Community Support
--------------------
Get Secure! - www.microsoft.com/security
Register to Access MSDN Managed Newsgroups!
http://support.microsoft.com/default.aspx?scid=/servicedesks/msdn/nospam.asp&SD=msdn

This posting is provided "AS IS" with no warranties, and confers no rights.
rdcpro - 29 Jan 2005 17:55 GMT
Hi Gary, thanks for your help.

The locale is Seattle...a standard XP install from MSDN Univ. media of the
English version (LCID 1033).  This still seems strange, though.  I can save
files with either UTF-16 or UTF-8 encoding, and I can understand how VS.NET
would save it in UTF-16 if the web.config template was being generated from a
string (a BSTR is essentially UTF-16), but then I would expect the declaration
to show UTF-16.  If VS.NET was using my locale, wouldn't it be saved in
Windows-1252?

If VS.NET is going to save it using the local code page...say a double-byte
character set like Big5...would it modify the XML declaration?  The only
reason I care is that yesterday I happened to be working with XML
Serialization, and noticed that a serialized class, though serialized to
UTF-16, was actually saved as UTF-8 (I did not explicitly specify an encoding
at first).  This caused a parser warning in another application when I opened
the resulting file, which basically said "the file is UTF-8 but your
declaration is UTF-16".
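
A mismatch like this is easy to reproduce outside .NET. The following is a minimal Python sketch, not the serializer's actual code: the helper save_xml is hypothetical, and it assumes the desired fix is to rewrite the declaration so that it names the encoding the bytes are really written in.

```python
import re

# Matches the encoding pseudo-attribute inside an XML declaration.
_DECL_ENCODING = re.compile(r'(<\?xml[^>]*?encoding=)(["\'])([^"\']*)\2')

def declared_encoding(xml_text):
    """Return the encoding named in the XML declaration, or None."""
    match = _DECL_ENCODING.match(xml_text)
    return match.group(3) if match else None

def save_xml(xml_text, encoding):
    """Hypothetical helper: encode xml_text and keep the declaration in sync."""
    fixed = _DECL_ENCODING.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{encoding}{m.group(2)}",
        xml_text,
        count=1,
    )
    return fixed.encode(encoding)

# Text built in memory claims UTF-16, as a string-based serializer might.
doc = '<?xml version="1.0" encoding="utf-16"?><settings/>'

mismatched = doc.encode("utf-8")     # bytes are UTF-8, declaration still says utf-16
consistent = save_xml(doc, "utf-8")  # declaration rewritten to match the bytes

print(declared_encoding(mismatched.decode("utf-8")))  # utf-16
print(declared_encoding(consistent.decode("utf-8")))  # utf-8
```

The first result is the situation the parser warned about; the second is what a file looks like when the declaration and the byte encoding agree.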

I'm working toward deserializing custom configuration sections in my
web.config as part of a framework I'm developing for a project, and wanted to
be sure I understood how VS.NET saves web.config files by default, and if it
can be modified.  

Encoding problems can come back to haunt you! ;^)

Regards,
Mike Sharp

Steven Cheng[MSFT] - 31 Jan 2005 06:19 GMT
Hi Mike,

Thanks for your response. As for the further questions you mentioned,
here is my understanding.
The encoding declaration, such as
<?xml version="1.0" encoding="utf-16" ?>

is just part of the XML file's content, and it is possibly not the actual
encoding (charset) of the file. The declaration may be used by some XML
processing tools, but most text editors, such as Notepad, detect the
file's encoding from the BOM (byte order mark: the first two or three bytes
at the beginning of a text file). If there is no BOM, the tool will
generally read a certain number of bytes from the beginning of the file and
guess the encoding (Notepad does exactly this). As for the Windows-1252
you mentioned, I don't think VS.NET will save your web.config in that
encoding, because VS.NET will at least use an MBCS charset to save the file
so that it can contain multibyte characters, and Windows-1252 (Latin 1) is
an SBCS, which only covers the single-byte character range.
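
The BOM check described above can be sketched in a few lines. This is an illustrative Python version, not what Notepad or VS.NET actually runs; sniff_bom is a hypothetical name, and the no-BOM guessing step is deliberately left out.

```python
import codecs

# Longer BOMs first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

# A declaration that says utf-8, stored as UTF-16 with a BOM.
declaration = '<?xml version="1.0" encoding="utf-8"?>'
as_utf16 = codecs.BOM_UTF16_LE + declaration.encode("utf-16-le")

print(sniff_bom(as_utf16))                     # utf-16-le
print(sniff_bom(declaration.encode("utf-8")))  # None
```

Running a check like this on the first bytes of a web.config shows what the file is really stored as, whatever its declaration claims.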
Generally, all the files in our ASP.NET projects will be saved in
the encoding specified by the OS's "System Locale" (for XP, 2000, or later).
The System Locale (which, despite the name, isn't really a locale) is just a
charset used to handle MBCS text for non-Unicode applications. On some Far
East systems, the System Locale is likely an MBCS such as GB2312 for
Simplified Chinese, so that most applications can use this charset to handle
Chinese characters via GB2312 rather than Unicode. However, since your system
is an English one, with no need to deal with complex characters such as those
in Chinese or Japanese, maybe the system will automatically use UTF-16, the
default Unicode encoding on Windows.
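
To make the SBCS/MBCS distinction concrete, here is a small Python sketch (Python is used only as a neutral illustration). It assumes nothing about VS.NET; it only shows that a single-byte code page such as Windows-1252 cannot represent Chinese text, while GB2312 and UTF-16 can.

```python
import locale

text = "\u4e2d\u6587"  # two Chinese characters

# Windows-1252 is single-byte and has no code points for these characters.
try:
    text.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False

gb_bytes = text.encode("gb2312")        # multibyte: two bytes per character here
utf16_bytes = text.encode("utf-16-le")  # also two bytes per character, no BOM

print(fits_cp1252, len(gb_bytes), len(utf16_bytes))  # False 4 4

# What this Python process treats as the preferred encoding; varies by machine.
print(locale.getpreferredencoding(False))
```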

In addition, XML Serialization ends up using UTF-8 because .NET's
StreamWriter (and StreamReader) automatically uses the "UTF-8" encoding
if we don't explicitly specify one when constructing it. We can explicitly
construct a StreamWriter with a specified encoding so that the output text
will be saved in that charset (encoding).
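
The same idea, naming the encoding when the writer is constructed, looks like this in Python. This is an analogy rather than the StreamWriter API itself; write_text is a hypothetical helper, and the in-memory buffer stands in for a file.

```python
import codecs
import io

xml_text = '<?xml version="1.0" encoding="utf-16"?><settings/>'

def write_text(text, encoding):
    """Write text through a text-mode wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding=encoding, newline="")
    writer.write(text)
    writer.flush()
    return raw.getvalue()

utf16_bytes = write_text(xml_text, "utf-16")  # matches the declaration
utf8_bytes = write_text(xml_text, "utf-8")    # contradicts the declaration

# The UTF-16 output starts with a BOM in the platform's native byte order.
has_bom = utf16_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
print(has_bom, len(utf16_bytes), len(utf8_bytes))
```

Only the first call produces bytes that agree with the utf-16 declaration; the second reproduces the mismatch discussed earlier in the thread.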

Thanks & Regards,

Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)

