Hi Gary, thanks for your help.
The locale is Seattle: a standard XP install from MSDN Univ. media of the
English version (LCID 1033). This still seems strange, though. I can save
files with either UTF-16 or UTF-8 encoding, and I can understand how VS.NET
would save the file as UTF-16 if the web.config template were generated from a
string (a BSTR is essentially UTF-16), but then I would expect the declaration
to show UTF-16. If VS.NET were using my locale, wouldn't the file be saved as
Windows-1252?
If VS.NET is going to save it using the local code page, say a double-byte
character set like Big5, would it modify the XML declaration? The only
reason I care is that yesterday I happened to be working with XML
serialization, and noticed that a serialized class, though its declaration
said UTF-16, was actually saved as UTF-8 (I did not explicitly specify an
encoding at first). This caused a parser warning in another application when I
opened the resulting file, which basically said "the file is UTF-8 but your
declaration is UTF-16".
I'm working toward deserializing custom configuration sections in my
web.config as part of a framework I'm developing for a project, and wanted to
be sure I understood how VS.NET saves web.config files by default, and whether
that behavior can be changed.
Encoding problems can come back to haunt you! ;^)
Regards,
Mike Sharp
Steven Cheng[MSFT] - 31 Jan 2005 06:19 GMT
Hi Mike,
Thanks for your response. Regarding the further questions you mentioned,
here is my understanding:
The encoding declaration such as
<?xml version="1.0" encoding="utf-16" ?>
is just part of the XML file's content, so it is possibly not the actual
encoding (charset) of the XML file. Some XML processing tools use this
declaration, but most text editors, such as Notepad, detect the file's
encoding from the BOM (byte order mark, the first two or three bytes at the
beginning of a text file). If there is no BOM, the tool will generally read a
certain number of bytes from the beginning of the file and guess its encoding
(this is what Notepad does). As for the Windows-1252 you mentioned, I don't
think VS.NET will save your web.config in this encoding, because VS.NET will
at least use an MBCS charset to save the file so that it can contain MBCS
chars, and Windows-1252 (Western European, close to Latin-1) is an SBCS that
only contains single-byte chars.
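
To make the BOM point concrete, here is a minimal sketch of BOM-based
detection. It is written in Python purely as an illustration; it is not how
Notepad or VS.NET is actually implemented, and the name sniff_bom is a
hypothetical helper made up for this example. It shows a file whose
declaration says UTF-16 while its bytes are really UTF-8.

```python
import codecs

# BOM signatures, longest first: the UTF-32 LE BOM begins with the same two
# bytes (FF FE) as the UTF-16 LE BOM, so it has to be checked earlier.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none.

    Hypothetical helper for illustration only.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# The declaration claims UTF-16, but the bytes on disk are UTF-8 with a BOM.
xml_text = '<?xml version="1.0" encoding="utf-16" ?><configuration />'
data = codecs.BOM_UTF8 + xml_text.encode("utf-8")

print(sniff_bom(data))  # encoding according to the BOM, not the declaration
```

A tool that trusts the BOM and a tool that trusts the declaration will
disagree about a file like this, which is the kind of mismatch the parser
warning describes.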
Generally, all the files in our ASP.NET projects will be saved in the
encoding specified by the OS's "System Locale" (for XP, 2000 or later). The
System Locale (which, despite its name, doesn't actually mean a locale) is
just the charset used to handle MBCS text for non-Unicode applications. On
some Far East systems, the System Locale is likely an MBCS such as GB2312 for
Simplified Chinese, so that most applications can use this charset to handle
Chinese chars via GB2312 rather than Unicode. However, since yours is an
English system, with no need to deal with complex chars such as those in
Chinese or Japanese, maybe the system will automatically use UTF-16, the
Unicode encoding that Windows uses natively.
In addition, XML serialization ends up using UTF-8 because .NET's
StreamWriter (and StreamReader) will automatically use UTF-8 encoding if we
don't explicitly specify one when constructing it. We can explicitly
construct a StreamWriter with a specified encoding so that the output text
will be saved in that charset (encoding).
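
The same idea can be sketched outside .NET. The snippet below is Python, used
only as a stand-in: write_xml and the file name settings.xml are hypothetical
names made up for this example, and in .NET the corresponding step would be
passing an Encoding to the StreamWriter constructor. The point is that the
encoding is chosen explicitly and the declaration is built from that same
value, so the two cannot drift apart.

```python
import os
import tempfile


def write_xml(path, body, encoding):
    """Write body with a declaration naming the encoding actually used.

    Hypothetical helper for illustration only.
    """
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>\n'
    # The encoding is passed explicitly instead of relying on a default,
    # and newline="" keeps Python from translating line endings.
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(declaration + body)


path = os.path.join(tempfile.mkdtemp(), "settings.xml")
write_xml(path, "<settings />", "utf-16")

# Read the raw bytes back: the file starts with a UTF-16 BOM, and the text
# decodes to a declaration that agrees with the bytes on disk.
with open(path, "rb") as f:
    raw = f.read()
text = raw.decode("utf-16")
print(text.splitlines()[0])
```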
Thanks & Regards,
Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)