I don't have your original question, and I'm not sure you ever specified
what the rules should be. I also don't have the original regular expression
I posted for you; the one you posted has been modified. So I can't tell you
what rules I assumed where none were provided, nor can I tell you whether
the change you made to the regular expression has anything to do with the
problem.
Therefore, I went back into my personal library, and found a Regular
Expression I once created for another project, which identifies all
attribute names and values (in 2 groups) in a block of HTML text. The
original was this, to capture *all* attribute names and values:
(?i)\s+(?:(\w+)=(?:["']?([^"'>=]*)["']?)(?=\s|/?>)|\s*(?=\s|/?>))
The first group is defined by the sequence (\w+): any run of one or more
word characters (letters, digits, or underscores).
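As a rough illustration of how the two groups come out, here is a minimal sketch that runs the pattern above through Python's re module. Python is an assumption (the thread never names the regex engine), and the sample markup and the helper name find_attributes are hypothetical.

```python
import re

# Pattern copied verbatim from the post. Python's re is an assumed stand-in
# for whatever regex engine was actually used in the thread.
ATTR_PATTERN = re.compile(
    r"""(?i)\s+(?:(\w+)=(?:["']?([^"'>=]*)["']?)(?=\s|/?>)|\s*(?=\s|/?>))"""
)


def find_attributes(html):
    """Return (name, value) pairs for every attribute the pattern captures."""
    pairs = []
    for match in ATTR_PATTERN.finditer(html):
        # The second alternative matches bare whitespace before the end of a
        # tag and leaves both groups unset, so skip those matches.
        if match.group(1) is not None:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# Hypothetical sample markup, not taken from the thread.
sample = """<a href="x.html" class='btn'>"""
print(find_attributes(sample))
```

Group 1 holds the attribute name and group 2 the value with its quotes stripped. Note that the value class [^"'>=]* excludes '=', so a value that itself contains '=' is not captured by this pattern.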
I replaced that with the following:
(?i)\s+(?:(onclick)=(?:["']?([^"'>=]*)["']?)(?=\s|/?>)|\s*(?=\s|/?>))
This will only capture attributes with a name of "onclick"
(case-insensitive).
When I tested it with your script sample below, it correctly identified
only ONE of the attributes, the first one. It didn't identify the second
one you said it should because the second one is not syntactically
correct. In HTML, the '=' character in an attribute may not be preceded or
followed by any spaces.
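The behaviour described above can be reproduced with a small sketch, again assuming Python's re module as the engine. The three input strings are hypothetical stand-ins, since the script sample from the original question is not included here, and find_onclick_values is an illustrative helper name.

```python
import re

# The "onclick" variant of the pattern, copied verbatim from the post.
ONCLICK_PATTERN = re.compile(
    r"""(?i)\s+(?:(onclick)=(?:["']?([^"'>=]*)["']?)(?=\s|/?>)|\s*(?=\s|/?>))"""
)


def find_onclick_values(html):
    """Return the value of every onclick attribute the pattern captures."""
    return [
        match.group(2)
        for match in ONCLICK_PATTERN.finditer(html)
        if match.group(1) is not None  # ignore whitespace-only matches
    ]


# Hypothetical inputs: '=' written tightly, '=' surrounded by spaces,
# and an upper-case attribute name.
tight = '<input type="text" onclick="alert(1)">'
spaced = '<input type="text" onclick = "alert(1)">'
upper = """<a ONCLICK='go()' href="#">"""

for text in (tight, spaced, upper):
    print(text, "->", find_onclick_values(text))
```

The pattern requires '=' to follow the attribute name directly, so the spaced form produces no match at all, while the tight and upper-case forms are both captured.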

HTH,
Kevin Spencer
Microsoft MVP
Logostician
http://unclechutney.blogspot.com
Parabola is a mate of plane.
>> (?i)(?<=<[\w]+[^<\>=]+)(onclick)=(?:["']?([^"'>=]*)["']?)
Shawn B. - 29 Nov 2006 19:14 GMT
Kevin, thanks for your reply. Actually, I'm trying to detect cross-site
scripting vulnerabilities in input fields. While an '=' preceded or
followed by a space isn't valid HTML, the browser (IE) will still render
it and treat it the same, so it is a perfectly viable detection-evasion
technique. The expression you provided still lets a few false positives
through on our system, but I did find an expression that works
flawlessly:
(<[^>]*?(ONMOUSEOVER)\s*=.*?>)
This expression catches every one of our known vulnerabilities and does not
catch any of our known false positives. However, I'll take a closer look at
your expression and figure out if we can adapt it to other parts of our
scanning engine.
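To show why this pattern tolerates the spacing trick, here is a minimal sketch using Python's re module. Both the choice of Python and the IGNORECASE flag are assumptions: the pattern as posted spells ONMOUSEOVER in capitals and the post does not say which options the scanner applies. The helper name looks_suspicious and the sample inputs are hypothetical.

```python
import re

# Pattern copied from the post. re.IGNORECASE is an assumption; the post
# does not state which regex options the scanning engine uses.
XSS_PATTERN = re.compile(r"(<[^>]*?(ONMOUSEOVER)\s*=.*?>)", re.IGNORECASE)


def looks_suspicious(field_value):
    """Return True if the value contains a tag with an onmouseover handler."""
    return XSS_PATTERN.search(field_value) is not None


# Hypothetical inputs, not the known cases mentioned in the post.
samples = [
    '<input onmouseover="alert(1)">',    # tight '='
    '<input onmouseover = "alert(1)">',  # spaces around '='
    '<b onMouseOver\t=alert(1)>',        # mixed case, tab before '='
    '<input value="hello">',             # no handler
    'onmouseover = plain text',          # not inside a tag
]

for value in samples:
    print(repr(value), "->", looks_suspicious(value))
```

Whitespace before '=' is absorbed by \s*, and anything after it (including more whitespace) by the lazy .*? up to the closing '>', which is why the spaced variants are caught.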
Thanks,
Shawn
Kevin Spencer - 29 Nov 2006 22:08 GMT
My pleasure, Shawn. As always, figuring out the business rules is the
hardest part!
