hi,
posting this a third time in the hope that someone may explain why it happens, or that MS will acknowledge a bug and fix it in 2.0...
i have a regular expression and very occasionally i'm getting an index out
of bounds exception from one of the inner framework methods, when i use
Match(). i don't know what the input is because it's in a production
environment and all debugging is turned off.
my code is as follows:
----------------------------
Regex rex = new Regex(@"http\:\/\/([a-zA-z0-9\-]*\.?)*?(\:[0-9]*)??\/",
    RegexOptions.IgnoreCase);
Match match = rex.Match(absoluteUrl); // exception happens here
if (match.Success)
    // strip out the absolute part of the entire url, returning the relative url.
    return "/" + absoluteUrl.Replace(match.ToString(), "");
-------------------------------
stack trace:
-------------------------------
IndexOutOfRangeException
at System.Text.RegularExpressions.RegexInterpreter.Go()
at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick)
at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
at System.Text.RegularExpressions.Regex.Match(String input)
-------------------------------

thanks for any help
tim mackey.
\\ email: tim at mackey dot ie //
\\ blog: http://tim.mackey.ie //
Justin Rogers - 12 Jun 2004 09:08 GMT
Sorry Tim, but I'm afraid we just can't help if we can't reproduce the scenario
you are running into. Any number of things could be going
wrong here, and the Interpreter code is sufficiently complex that
I'm not sure anyone could hazard a guess at why the exception is occurring.
Eat the cost of turning your debugging on for a few days and give us the
string that is tossing the exception.

Signature
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers
Jay B. Harlow [MVP - Outlook] - 12 Jun 2004 15:59 GMT
Tim,
Have you considered calling Microsoft directly with the problem? If there is
an actual bug you will not be charged for the support call.
Have you considered asking in a different newsgroup?
microsoft.public.dotnet.framework or microsoft.public.dotnet.general have a
larger following; someone in one of those may have come across your
problem...
As Justin suggested, have you considered putting the Regex.Match in a try
catch & writing out the URL that is causing problems, so as to identify the
URL that is causing the issue?
I would consider creating a custom exception class so as to log the URL &
other context info that is causing the exception...
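Match match; // declared outside the try so it is still in scope afterwards
try
{
    match = rex.Match(absoluteUrl); // exception happens here
}
catch (Exception ex)
{
    // throw a new exception with the input, pattern & inner exception
    // (Regex.ToString() returns the pattern the Regex was built from)
    throw new MyMatchException(absoluteUrl, rex.ToString(), ex);
}
if (match.Success)
    // strip out the absolute part of the entire url
    return "/" + absoluteUrl.Replace(match.ToString(), "");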
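A minimal sketch of what the MyMatchException class named above might look like. The class body, the property names Input and Pattern, and the SafeMatcher helper are all assumptions made for illustration; only the constructor arguments (input, pattern, inner exception) come from the snippet above. The demo passes null to force Regex.Match to throw, since the input that triggers the real failure is not known at this point.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical exception type: keeps the input and pattern that were in play
// when Regex.Match threw, so Exception.ToString() in a log shows them.
public class MyMatchException : Exception
{
    private readonly string input;
    private readonly string pattern;

    public MyMatchException(string input, string pattern, Exception inner)
        : base("Regex match failed. Pattern: '" + pattern + "', input: '" + input + "'", inner)
    {
        this.input = input;
        this.pattern = pattern;
    }

    public string Input { get { return input; } }
    public string Pattern { get { return pattern; } }
}

// Hypothetical helper that wraps the call in the try/catch shown above.
public static class SafeMatcher
{
    public static Match MatchOrThrowWithContext(Regex rex, string input)
    {
        try
        {
            return rex.Match(input);
        }
        catch (Exception ex)
        {
            // Regex.ToString() returns the pattern passed to the constructor.
            throw new MyMatchException(input, rex.ToString(), ex);
        }
    }
}

public static class MyMatchExceptionDemo
{
    public static void Main()
    {
        Regex rex = new Regex("http://[^/]+/");
        try
        {
            // A null input makes Regex.Match throw ArgumentNullException,
            // which stands in here for the unknown failing URL.
            SafeMatcher.MatchOrThrowWithContext(rex, null);
        }
        catch (MyMatchException ex)
        {
            // The logged text includes the message above plus the inner exception.
            Console.WriteLine(ex.ToString());
        }
    }
}
```

Putting the input and pattern into the message means that anything that logs Exception.ToString() picks them up without further changes.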
Note for production environments I find it invaluable to add global
exception handlers to my application, where the global exception handler
logs Exception.ToString to the EventLog. With a custom exception class, the
log would contain the URL & other context info that caused the problem. The
Exception Management Block is useful for this logging & provides options as
to how & where things are logged...
http://msdn.microsoft.com/webservices/building/frameworkandstudio/default.aspx?pull=/library/en-us/dnbda/html/emab-rm.asp
Depending on the type of application you are creating, .NET has three
different global exception handlers.
For ASP.NET look at:
System.Web.HttpApplication.Error event
Normally placed in your Global.asax file.
For console applications look at:
System.AppDomain.UnhandledException event
Use AddHandler in your Sub Main.
For Windows Forms look at:
System.Windows.Forms.Application.ThreadException event
Use AddHandler in your Sub Main.
It can be beneficial to combine the above global handlers in your app, as
well as wrap your Sub Main in a try catch itself.
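For the console case, a minimal C# sketch of hooking System.AppDomain.UnhandledException and also wrapping Main in a try/catch, as described above. The class name GlobalHandlerDemo, the Describe and Run helpers, and the use of standard error in place of the EventLog are assumptions for illustration only. In C# the handler is attached with += rather than VB's AddHandler.

```csharp
using System;

public static class GlobalHandlerDemo
{
    // Builds the text to log; Exception.ToString() includes the type,
    // message, stack trace and any inner exceptions.
    public static string Describe(object exceptionObject)
    {
        Exception ex = exceptionObject as Exception;
        return ex != null
            ? ex.ToString()
            : "Non-exception object thrown: " + exceptionObject;
    }

    public static int Main()
    {
        // Last-chance handler for exceptions nothing else catches.
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;

        try
        {
            Run();
            return 0;
        }
        catch (Exception ex)
        {
            // Exceptions thrown on the main thread are logged here.
            Console.Error.WriteLine(Describe(ex));
            return 1;
        }
    }

    private static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // Assumption: stderr stands in for the EventLog mentioned above.
        Console.Error.WriteLine(Describe(e.ExceptionObject));
    }

    // Placeholder for the application's real work.
    private static void Run()
    {
        Console.WriteLine("application work goes here");
    }
}
```

The same Describe helper could be called from an HttpApplication.Error or Application.ThreadException handler so that all three log in one format.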
There is an article in the June 2004 MSDN Magazine that shows how to
implement global exception handling in .NET and explains why & when you
would use more than one of the above handlers...
http://msdn.microsoft.com/msdnmag/issues/04/06/NET/default.aspx
For example: In my Windows Forms apps I would have a handler attached to the
Application.ThreadException event, plus a Try/Catch in my Main. The
Try/Catch in Main only catches exceptions raised by the constructor of the
MainForm, while the Application.ThreadException handler will catch all
uncaught exceptions from any form/control event handlers.
Hope this helps
Jay
Pandurang Nayak - 15 Jun 2004 11:34 GMT
Have you considered the possibility that certain URLs might be of the form:
http://www.somesite.com/something.aspx?site=http://somesitereference.com - in this case, your regex is probably matching the second URL, and then when you run the Replace command you're getting an out of range error because it exceeds the string length.
You could add a condition to check if the index received from the regex match is out of bounds (less than zero or greater than the string length, or if the string is empty). In any of these cases, you could log the URL for inspection later, as other people have suggested already.
Regards
Pandurang

Signature
blog: pandurang.thinkingMS.com
Tim Mackey - 19 Jun 2004 19:44 GMT
hi Pandurang,
unfortunately my code never got as far as the replace command because the
exception happens at the line above. i have added in an error logging
try/catch to inform me of the url that caused the problem. next time it
happens, i'll post the problem causing url here.
thanks
tim
Lasse Vågsæther Karlsen - 15 Jun 2004 21:25 GMT
> hi,
> posting this a third time in the hope that someone may explain why it
> happens, or that MS will acknowledge a bug and fix it in 2.0...
In order to verify that it's a bug, we would probably need a copy of the
url string that produces the exception as well.
You need to turn on some kind of logging so that you get hold of the value.

Signature
Lasse Vågsæther Karlsen
http://www.vkarlsen.no/
PGP KeyID: 0x0270466B
David Gutierrez[MSFT] - 24 Jun 2004 17:00 GMT
Tim, I did some experimenting with your code snippet and found that an
absolute url without the trailing / will cause this exception. For
example: "http://www.msn.com". I'll enter a bug to track this and it
should get fixed in the next version. Thanks for letting us know about
this!
David
Tim Mackey - 01 Jul 2004 17:41 GMT
hi David,
thanks for acknowledging that, glad it got brought to light before 2.0 is
officially released.
it could be caused by my dodgy regular expression as i'm new to this topic
:)
thanks
tim
\\ email: tim at mackey dot ie //
\\ blog: http://tim.mackey.ie //
Tim Mackey - 21 Jul 2004 11:49 GMT
hi david,
further to my last email, i have found another example of a string
that causes the index out of range exception.
for the regex: http://([a-zA-z0-9\-]*\.?)*?(:[0-9]*)??/
the string: http://ks%20med%20test/
causes the exception. it's the % character i believe.
hope you can include this in your testing.
thanks
tim
Niki Estner - 28 Jul 2004 23:58 GMT
Hi Tim,
I could reduce this to: "(a?)*?b", matching "a".
Seems like the regex engine doesn't like a lazy quantifier applied to
a subexpression that can match nothing.
I'd suggest avoiding the lazy quantifier as a workaround (i.e.
"http://([a-zA-z0-9\-]*\.?)*(:[0-9]*)??/"), or adding some required part to
the capture (e.g. "http://([a-zA-z0-9\-]+\.?)*?(:[0-9]*)??/".)
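A small sketch for comparing the original pattern with the two workarounds against the inputs reported in this thread. The class name LazyQuantifierCheck and the Probe helper are made up for illustration. Whether the first pattern actually throws depends on the framework version it runs on, so no output is claimed here; the third input is an extra URL with a trailing slash added for comparison. As an aside, the range A-z in the character class was probably meant to be A-Z, since A-z also covers the punctuation between "Z" and "a"; it is left as posted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyQuantifierCheck
{
    // Runs pattern against input and reports a match, no match, or the
    // type of exception thrown, without letting the exception escape.
    public static string Probe(string pattern, string input)
    {
        try
        {
            Match m = new Regex(pattern, RegexOptions.IgnoreCase).Match(input);
            return m.Success ? "match: " + m.Value : "no match";
        }
        catch (Exception ex)
        {
            return "threw " + ex.GetType().Name;
        }
    }

    public static void Main()
    {
        string[] patterns =
        {
            // As posted: lazy quantifier over a group that can match nothing.
            @"http://([a-zA-z0-9\-]*\.?)*?(:[0-9]*)??/",
            // Workaround 1: greedy quantifier on the group.
            @"http://([a-zA-z0-9\-]*\.?)*(:[0-9]*)??/",
            // Workaround 2: the group must consume at least one character.
            @"http://([a-zA-z0-9\-]+\.?)*?(:[0-9]*)??/",
        };

        string[] inputs =
        {
            "http://www.msn.com",              // no trailing slash
            "http://ks%20med%20test/",         // contains '%'
            "http://www.msn.com/default.aspx", // extra input for comparison
        };

        foreach (string pattern in patterns)
        {
            Console.WriteLine(pattern);
            foreach (string input in inputs)
            {
                Console.WriteLine("  " + input + " -> " + Probe(pattern, input));
            }
        }
    }
}
```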
Niki