>>However under VC++ 2005 Express February 2005 CTP we get for the
>
[quoted text clipped - 11 lines]
> Yes : I said that the regex matches an empty string : So here you match
> the empty string at the beginning of "bcdefghij".
I am not sure I understood this. There is no empty string in there.
> What you failed to see is that the IsMatch method tries to find a match
> inside the given string; it doesn't check that the full string is
> matched.
If I wanted the entire string to be matched, shouldn't I use
Regex::IsMatch(s, "^a*$")?
> Use the Regex.Match method to get the Match object : you'll
> see that it matches an empty string (length=0) at index 0 from the
[quoted text clipped - 3 lines]
> fact it finds a 0-length match at each position of the input string, so
> you get 10 matches!
So in essence it matches everything and is equivalent to
Regex::IsMatch(s, ".*")?
BTW why does Regex::IsMatch(s, "*") crash?
Unhandled Exception: System.ArgumentException: parsing "*" - Quantifier {x,y} following nothing.
at System.Text.RegularExpressions.RegexParser.ScanRegex()
at System.Text.RegularExpressions.RegexParser.Parse(String re, RegexOptions op)
at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options, Boolean useCache)
at System.Text.RegularExpressions.Regex.IsMatch(String input, String pattern)
at main() in c:\documents and settings\administrator\my documents\visual studio\projects\test\test\test.cpp:line 14
Press any key to continue . . .
> Yes, but "." matches anything, including whitespace.
Thanks, I did not know that.
Tom Widmer - 11 Mar 2005 16:10 GMT
>>> However under VC++ 2005 Express February 2005 CTP we get for the
>>
[quoted text clipped - 13 lines]
>
> I am not sure I understood this. There is no empty string in there.
The empty string is a substring of every string: an n-character string
contains n + 1 empty substrings, one at each position from index 0
through index n.
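To make the "empty string at every position" point concrete, here is a minimal sketch. It uses Python's `re` module as a stand-in for the .NET Regex class discussed in this thread, so it illustrates the idea rather than the exact .NET behaviour. The helper name `match_spans` is made up for this example.

```python
import re

def match_spans(pattern, text):
    """Return the (start, end) span of every match reported by re.finditer."""
    return [m.span() for m in re.finditer(pattern, text)]

# "bcdefghij" contains no "a", so "a*" can only match the empty string.
spans = match_spans(r"a*", "bcdefghij")

print(len(spans))  # number of zero-length matches found
print(spans)       # each span has start == end, i.e. length 0
```

The 9-character input has 10 positions (0 through 9), which is where the "10 matches" quoted above come from.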
>> What you failed to see is that the IsMatch method tries to find a match
>> inside the given string; it doesn't check that the full string is
>> matched.
Yes, IsMatch sees if any substring of the string matches the regex.
> If I wanted the entire string to be matched, shouldn't I use
> Regex::IsMatch(s, "^a*$")?
Yes.
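A small sketch of the difference between the unanchored and the anchored pattern, again using Python's `re.search` as an approximation of an IsMatch-style "does any substring match" test. The function `is_match` is a hypothetical helper for this example, not a .NET API.

```python
import re

def is_match(pattern, text):
    """True if the pattern matches somewhere in text (unanchored search)."""
    return re.search(pattern, text) is not None

# Unanchored: the empty match at index 0 is enough to succeed.
print(is_match(r"a*", "bcdefghij"))    # True

# Anchored: the whole string must consist of zero or more "a" characters.
print(is_match(r"^a*$", "bcdefghij"))  # False
print(is_match(r"^a*$", "aaaa"))       # True
print(is_match(r"^a*$", ""))           # True (zero "a"s is still allowed)
```

Note that the anchored pattern still accepts the empty string, because "a*" allows zero repetitions.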
>> Use the Regex.Match method to get the Match object : you'll
>> see that it matches an empty string (length=0) at index 0 from the
[quoted text clipped - 6 lines]
> So in essence it matches everything and is equivalent to
> Regex::IsMatch(s, ".*")?
"a*"? For IsMatch, yes they are equivalent, but as RegExes, they are
not. If you have the string:
"abaabb"
then ".*" can match any substring of it:
"" 7x
"a" 3x
"ab" 2x
"aba" 1x
"abaa" 1x
etc.
whereas
"a*" can match only runs of "a":
"" 7x
"a" 3x
"aa" 1x
and nothing else.
Matching isn't just a yes/no (unless you use IsMatch) - the regex
matches against some substring of the string.
> BTW why does Regex::IsMatch(s, "*") crash?
"*" is not a valid Regex. 0 to many of what? Similarly "+" and "{0,4}"
are not valid.
(apologies for any misinformation, regexp is not a major area of
expertise for me)
Tom
Arnaud Debaene - 11 Mar 2005 21:39 GMT
>>> However under VC++ 2005 Express February 2005 CTP we get for the
>>
[quoted text clipped - 13 lines]
>
> I am not sure I understood this. There is no empty string in there.
Yes, there are many! There is an empty string at index 0, another at index
1, another at index 2, etc. This is true for any string.
> If I wanted the entire string to be matched, shouldn't I use
> Regex::IsMatch(s, "^a*$")?
Yes, but this is a rather useless regex (as is "a*"): a regex that matches
the empty string doesn't make much sense, unless you filter the matches
afterwards, say, by keeping only matches more than x characters long. But in
that case, you'd better write a regex that does this filtering directly.
> So in essence it matches everything and is equivalent to
> Regex::IsMatch(s, ".*")?
As Tom explained, it is a bit more complex. In order to experiment, I
suggest you display all the Matches from both regexes on a given input
string.
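The experiment suggested here could look roughly like the following. This is a sketch in Python's `re` module, not the .NET Regex.Matches call, and different engines (or engine versions) may report a different number of empty matches. The helper `all_matches` and the sample input are assumptions made for the example.

```python
import re

def all_matches(pattern, text):
    """Return (start index, matched text) for each match re.finditer reports."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text)]

text = "abaabb"

for pattern in (r".*", r"a*"):
    print(f"pattern {pattern!r} on {text!r}:")
    for start, value in all_matches(pattern, text):
        print(f"  index {start}: {value!r} (length {len(value)})")
```

Running it shows that ".*" greedily takes the whole string in its first match, while "a*" only ever picks up runs of "a" plus zero-length matches in between.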
> BTW why does Regex::IsMatch(s, "*") crash?
>
> Unhandled Exception: System.ArgumentException: parsing "*" -
> Quantifier {x,y} following nothing.
The error description seems quite clear, no? "*" is a quantifier : it
specifies "0 to n instances of the token before it" : There is nothing
before the quantifier in your regex, so it is an invalid regex.
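The same rule can be checked quickly in code. The sketch below uses Python's `re` module, which rejects such patterns with `re.error` (the .NET class in this thread reports a System.ArgumentException instead, as shown in the trace above). The helper `is_valid_pattern` is a made-up name for illustration.

```python
import re

def is_valid_pattern(pattern):
    """Return True if the pattern compiles, False if the parser rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised, for example, when a quantifier has nothing to repeat.
        return False

for pattern in ("*", "+", "?", "*a", "a*", ".*"):
    status = "valid" if is_valid_pattern(pattern) else "invalid"
    print(f"{pattern!r}: {status}")
```

Catching the parse error up front like this avoids the unhandled exception when a pattern comes from user input.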
Arnaud
MVP - VC
ismailp - 13 Mar 2005 17:32 GMT
Yes, "*" is an illegal regular expression; it represents nothing. These
quantifiers, as Tom and Arnaud said, must follow something. * matches
0 to n occurrences of the preceding item (a character, a character group,
or a subexpression). A bare * by itself is meaningless, hence "illegal".
The same goes for ?, +, and {} when they do not follow anything.
"*a" is also wrong.