* lamxing@gmail.com wrote in microsoft.public.dotnet.xml:
> I've spent a long time trying to get the XmlDocument.Load method
>to handle UTF-8 characters, but no luck. Every time it loads a
>document that contains European characters (such as the one below,
>output from the Google Maps API), it reports an invalid character at
>position 229, which I believe is the "ß" character.
Then it is most likely that your document is not UTF-8 encoded. You will
have to check which bytes are actually at that position, e.g. using a
hex editor (e.g., use File.OpenFile ... /e:Binary in Visual Studio). If
the ß is encoded as two bytes C3 9F then that's either not the offending
character, or you have other encoding problems (for example, you might
have told the XML processor the document is US-ASCII encoded).
Note that loading XML documents in Internet Explorer and copying and
pasting the results does not help in any way to debug this kind of
problem; compressing the document and uploading it to some web server
is a more sensible approach.
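
The byte check described above can also be done in a few lines of code instead of a hex editor. The following is a minimal sketch in Python rather than .NET, used only to make the byte-level point concrete; the helper names hex_window and is_valid_utf8 and the sample word "Straße" are illustrative assumptions, not anything from the original document.

```python
def hex_window(data: bytes, position: int, radius: int = 4) -> str:
    """Return the bytes around `position` as space-separated hex pairs."""
    start = max(0, position - radius)
    end = min(len(data), position + radius + 1)
    return " ".join(f"{b:02X}" for b in data[start:end])


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the whole byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same text serialized two ways: ß is C3 9F in UTF-8
# but a single DF byte in Windows-1252.
utf8_bytes = "Straße".encode("utf-8")
cp1252_bytes = "Straße".encode("cp1252")

print(hex_window(utf8_bytes, 4, radius=1))    # 61 C3 9F
print(hex_window(cp1252_bytes, 4, radius=1))  # 61 DF 65
print(is_valid_utf8(utf8_bytes), is_valid_utf8(cp1252_bytes))  # True False
```

If the bytes at the reported position turn out to be a lone DF rather than C3 9F, the document is not UTF-8, whatever its XML declaration claims.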

Signature
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
lamxing@gmail.com - 31 Jan 2007 07:47 GMT
Thanks for your reply, Björn. Since this file comes from a dynamic
URL online, I just used the XmlDocument.Load(URL) method to load the
XML file. In this case, how do I tell the XML processor what encoding
the file uses before I load the document? I've saved the sample XML
file (dynamically generated by Google Maps) using IE's File > Save As...
and uploaded it to http://www.usctimes.com/gmap/geo.xml . It seems to
open fine in the browser; does that mean anything?
Martin Honnen - 31 Jan 2007 13:09 GMT
> Since this file is coming from a
> dynamic URL online, I just used the XmlDocument.Load(URL) method to
> load the xml file. In this case, how do I tell the XML processor what
> encoding the file would be before I load the document?
You don't have to specify the encoding. Pass the URL to the Load method
and the XML parser will check the XML declaration for the declared
encoding, or check for a byte order mark, and then use that information
to decode the bytes served into characters. If that is not possible you
get an error.
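
To make the detection step concrete, here is a simplified sketch in Python (not the actual .NET parser logic) of how an encoding can be picked from a byte order mark or from the XML declaration, falling back to UTF-8. The function name sniff_xml_encoding is hypothetical, and the sketch deliberately ignores UTF-32 and declarations that are themselves UTF-16 encoded without a BOM.

```python
import codecs
import re

# Byte order marks checked first; UTF-32 is left out of this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches encoding="..." inside an XML declaration at the very start.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding from a BOM, then the declaration, else UTF-8."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


# A document with no BOM and no declared encoding is treated as UTF-8,
# so a Windows-1252 encoded ß cannot be decoded.
body = "<a>ß</a>".encode("cp1252")
encoding = sniff_xml_encoding(body)
try:
    body.decode(encoding)
except UnicodeDecodeError as exc:
    print(f"cannot decode as {encoding}: bad byte at offset {exc.start}")
```

A mismatch between the detected encoding and the actual bytes is what surfaces as the "invalid character in the given encoding" error.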
> I've saved the
> sample XML file (dynamically generated from google map) from IE's
> File > Save As... , and uploaded the file to
> http://www.usctimes.com/gmap/geo.xml . It seems to open fine in the
> browser, does that mean anything?
It also loads fine with the Load method of System.Xml.XmlDocument
(tested with .NET 1.x and 2.0), so that file is properly encoded.

Signature
Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
lamxing@gmail.com - 31 Jan 2007 16:46 GMT
Hi Martin,
Thanks for the test result. It seems that if I load the file I
saved earlier using XmlDocument.Load(), it works fine. But when I
try to load the dynamically generated file directly from Google Maps'
server, it causes the "invalid character in the given encoding,
line 1, position 228" error. Does that mean Google Maps uses the wrong
encoding for that XML file? I don't think I can post the complete
Google Maps link here as the URL contains the Google Maps API key. But
the URL goes something like this:
http://maps.google.com/maps/geo?q=germaniastr%20134,%20berlin%20berlin&output=xml&key=GOOGLEKEY
Any thoughts?
Chris
Martin Honnen - 31 Jan 2007 16:58 GMT
> It seems that if I load the file I
> saved earlier using XmlDocument.Load(), it works fine. But when I
> try to load the dynamically generated file directly from Google Maps'
> server, it causes the "invalid character in the given encoding,
> line 1, position 228" error. Does that mean Google Maps uses the wrong
> encoding for that XML file?
It means that the XML is not properly encoded.

lamxing@gmail.com - 31 Jan 2007 22:16 GMT
Martin,
Do you have any suggestions on how I can load this dynamic file, or how
to make the XML document properly encoded?
Thanks!
Bjoern Hoehrmann - 01 Feb 2007 00:15 GMT
* lamxing@gmail.com wrote in microsoft.public.dotnet.xml:
>Do you have any suggestions on how I can load this dynamic file, or how
>to make the XML document properly encoded?
If the XML document is really not properly encoded, you should contact
Google to have their service fixed. Until then all you can do is try to
fix the XML document before parsing. For example, you could remove all
non-ASCII octets or you could transcode the document from Windows-1252
to UTF-8 using System.Text.Encoding.
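
Both workarounds operate on the raw bytes before any XML parsing happens. As a rough illustration, here is a minimal sketch in Python rather than with System.Text.Encoding; the function names and the sample address element are assumptions, and it only applies if the served bytes really are Windows-1252.

```python
def strip_non_ascii(data: bytes) -> bytes:
    """Drop every octet above 0x7F; lossy, but the result is valid UTF-8."""
    return bytes(b for b in data if b < 0x80)


def transcode_cp1252_to_utf8(data: bytes) -> bytes:
    """Reinterpret the octets as Windows-1252 and re-encode them as UTF-8.

    Python's cp1252 codec raises UnicodeDecodeError on the few byte
    values that Windows-1252 leaves undefined.
    """
    return data.decode("cp1252").encode("utf-8")


# Hypothetical response fragment, encoded the way the faulty service
# is assumed to have sent it.
raw = "<address>Germaniastraße 134</address>".encode("cp1252")

print(strip_non_ascii(raw))           # the ß is simply lost
print(transcode_cp1252_to_utf8(raw))  # the ß becomes the two bytes C3 9F
```

Stripping silently changes the data ("Germaniastrae"), so transcoding is usually the better choice when the real source encoding is known.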

lamxing@gmail.com - 01 Feb 2007 18:25 GMT
Hi Björn, can you provide an example of how to save an online XML
document and transcode it to UTF-8 with System.Text.Encoding? Thanks!
Helena Kotas [MSFT] - 06 Feb 2007 02:10 GMT
First you have to find out which encoding the dynamic document uses.
XmlDocument/XmlTextReader uses UTF-8 by default unless there is a BOM or
an encoding attribute in the XML declaration that says something else. Once
you find out the encoding, create a StreamReader over the input stream and
specify the document's encoding in its constructor. Then create an XmlReader
over this StreamReader and use XmlDocument.Load to load the document.
If you are sure that the document's encoding is indeed UTF-8 and there is an
invalid character in it, you can create an instance of UTF8Encoding that will
ignore invalid characters (see the UTF8Encoding constructor).
-Helena
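
The two approaches above share one idea: turn bytes into characters yourself, so the XML parser never has to guess the encoding. Here is a minimal sketch of that idea in Python with the standard xml.etree.ElementTree module, not the .NET StreamReader/XmlReader classes. The function names and sample document are hypothetical, and the lenient variant substitutes U+FFFD for bad bytes rather than dropping them.

```python
import xml.etree.ElementTree as ET


def parse_with_encoding(data: bytes, encoding: str) -> ET.Element:
    """Decode with an explicitly chosen encoding, then parse the text."""
    text = data.decode(encoding)
    return ET.fromstring(text)


def parse_lenient_utf8(data: bytes) -> ET.Element:
    """Decode as UTF-8, replacing undecodable bytes with U+FFFD."""
    text = data.decode("utf-8", errors="replace")
    return ET.fromstring(text)


# Assumed faulty response: declared as UTF-8 but encoded as Windows-1252.
served = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<address>Germaniastraße 134</address>"
).encode("cp1252")

print(parse_with_encoding(served, "cp1252").text)
print(parse_lenient_utf8(served).text)
```

Decoding with the correct encoding preserves the ß, while the lenient UTF-8 route loads the document but leaves a replacement character where the ß was.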
Tim Heap - 22 Mar 2007 13:33 GMT
Help!
I have the same problem and need to remove funny characters from my
source XML file. Please can someone supply an example?
Tim Heap
Software & Database Manager
POSTAR Ltd
www.postar.co.uk
tim@postar.co.uk