"Illegal characters in path" with XmlReader of .Net 1.1

Eckhard Schwabe - 28 Apr 2006 17:53 GMT
I only found one post on Google where someone mentions the same problem
with a DataSet:

XmlDataReader in .Net 1.1 can not read XML files from a path which
contains "%10" or "%3f".

Code to reproduce:

string filename = "%10.xml"; // an XML file with this name exists
XmlReader reader = new XmlTextReader(filename);
reader.Read();

This throws:
System.ArgumentException: "Illegal characters in path."
   at System.IO.Path.CheckInvalidPathChars(String path)
   at System.IO.Path.GetFileName(String path)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
   at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials)
   at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
   at System.Xml.XmlTextReader.CreateScanner()
   at System.Xml.XmlTextReader.Init()
   at System.Xml.XmlTextReader.Read()

All classes which use the XmlReader internally have the same problem
(DataSet, XmlDocument, ...), but XmlWriter can write to such a file.

This bug is fixed in .Net 2.0, and in 1.1 the workaround is simply to
open the file with a StreamReader first and hand that to the XmlTextReader:

// requires: using System.IO; using System.Xml;
StreamReader streamReader = new StreamReader(filename);
XmlReader reader = new XmlTextReader(streamReader);
...
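
A fuller version of this workaround might look like the sketch below. It is an illustration only: the class name XmlStreamWorkaround, the method ReadRootName, and the scratch file are made up for the example, and the using blocks, the static class and File.WriteAllText assume .Net 2.0 or later (on 1.1 you would call Close() explicitly instead). The point is that the file name reaches StreamReader as a plain path and is never interpreted as a URL.

```csharp
using System;
using System.IO;
using System.Xml;

static class XmlStreamWorkaround
{
    // Returns the name of the first element in the file, or null if there is none.
    // The file is opened through StreamReader, so the name is treated as a plain
    // path and is never parsed as a URL.
    public static string ReadRootName(string filename)
    {
        using (StreamReader streamReader = new StreamReader(filename))
        using (XmlTextReader reader = new XmlTextReader(streamReader))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    return reader.Name;
                }
            }
        }
        return null;
    }

    static void Main()
    {
        // Hypothetical scratch file, created only for this demonstration.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string filename = Path.Combine(dir, "EtOH60%10ml_method.xml");
        File.WriteAllText(filename, "<method><step>1</step></method>");

        Console.WriteLine(ReadRootName(filename)); // prints "method"

        Directory.Delete(dir, true);
    }
}
```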

What really interests me is:
- Why can't I find any info about this behaviour on MSDN? This would
have saved us from shipping a product which falls over "%" in the filename.
As our product is for chemists, the probability of choosing this
character in a file or directory name is not as low as one might expect
from "normal" usage: "EtOH60%10ml_method" being one example.
- What is so special about "%10" or "%3f"? Are there other
"problematic" character combinations?

As a side note:
If I try to add a file to a Solution in Visual Studio 2005 it tells me:

>Item and file names cannot:
>- contain any of the following characters: / ? : & \ * " < > | # %
Since when are "#" or "%" invalid characters in the Windows file system?

Eckhard

Bjoern Hoehrmann - 29 Apr 2006 14:12 GMT
* Eckhard Schwabe wrote in microsoft.public.dotnet.xml:
>I only found one post on Google where someone mentions the same problem
>with a DataSet:
>
>XmlDataReader in .Net 1.1 can not read XML files from a path which
>contains "%10" or "%3f".

To clarify, does the path include the string "%3f" or does it include
the string "?"? In case of the former you have to URL-escape the path
before passing it here, System.Uri has methods for that. "%253f" would
be the right string in this case.

>What really interests me is:
>- Why can't I find any info about this behaviour on MSDN? This would
>have saved us from shipping a product which falls over "%" in the filename.
>As our product is for chemists, the probability of choosing this
>character in a file or directory name is not so low as one might expect
>from "normal" usage: "EtOH60%10ml_method" being one example.

The documentation is rather clear that the arguments are URLs, not file
names. There is some overlap, but in cases like this the difference is
important.

>- What is so special about "%10" or "%3f"? Are there other
>"problematic" character combinations?

These map to U+0010 and U+003F, a control character and the question
mark. RFC 3986 has the details.

Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Eckhard Schwabe - 30 Apr 2006 14:10 GMT
>> XmlDataReader in .Net 1.1 can not read XML files from a path which
>> contains "%10" or "%3f".
>
> To clarify, does the path include the string "%3f" or does it include
> the string "?"?

It includes "%3f".
"?" is an illegal character for a path, so this wouldn't be a bug.

>In case of the former you have to URL-escape the path
> before passing it here, System.Uri has methods for that. "%253f" would
> be the right string in this case.

I don't understand what you mean.

> The documentation is rather clear that the arguments are URLs, not file
> names. There is some overlap, but in cases like this the difference is
> important.

And why has DotNet 2.0 no problems with the same file name?
From the Documentation for "DataSet.ReadXml" (which had the same
problem in DotNet 1.1):

>DataSet.ReadXml (String)  Reads XML schema and data into the DataSet
>using the specified file.

So it clearly states: "file", not "URL".

>> - What is so special about "%10" or "%3f"? Are there other
>> "problematic" character combinations?
>
> These map to U+0010 and U+003F, a control character and the question
> mark. RFC 3986 has the details.

So DotNet 1.1 erroneously translated "%3f" to a question mark, which IS
an invalid character, before trying to open the file. The bug was perhaps
in the "XmlUrlResolver" class?

Regards,

Eckhard

Bjoern Hoehrmann - 30 Apr 2006 18:38 GMT
* Eckhard Schwabe wrote in microsoft.public.dotnet.xml:
>>In case of the former you have to URL-escape the path
>> before passing it here, System.Uri has methods for that. "%253f" would
>> be the right string in this case.
>
>I don't understand what you mean.

URLs use %xx as escape sequence for special characters. The '%'
character is a special character that has to be escaped if it is
not part of such an escape sequence. %25 is the escape sequence
for the '%' character, %3f is the escape sequence for the '?'
character.
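
The escaping described above can be tried out with a short sketch. This is an illustration, not part of the original posts: the class name PercentEscapeDemo and its two helper methods are invented for the example, and Uri.EscapeDataString and Uri.UnescapeDataString are assumed to be available, which means .Net 2.0 or later rather than the 1.1 runtime discussed in this thread.

```csharp
using System;

static class PercentEscapeDemo
{
    // Escapes a literal file name so that URL decoding yields the same text again.
    public static string EscapeLiteral(string fileName)
    {
        return Uri.EscapeDataString(fileName);
    }

    // Decodes %xx escape sequences the way a URL consumer would.
    public static string Decode(string escaped)
    {
        return Uri.UnescapeDataString(escaped);
    }

    static void Main()
    {
        // Read as a URL, "%3f" is the escape sequence for the question mark.
        Console.WriteLine(Decode("%3f"));        // prints "?"

        // To mean the three literal characters % 3 f, the '%' itself is escaped.
        Console.WriteLine(EscapeLiteral("%3f")); // prints "%253f"

        // Escaping and then decoding gives back the original file name.
        string name = "EtOH60%10ml_method";
        Console.WriteLine(Decode(EscapeLiteral(name)) == name); // prints "True"
    }
}
```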

>And why has DotNet 2.0 no problems with the same file name?

The argument is not a file name. .NET 2.0 presumably has code to
work around authoring errors like this.

>So it clearly states: "file", not "URL".

It would be good if you could file a documentation bug report on
this then, e.g. using the "Send comments" link on the MSDN page.

