
.NET Forum / .NET Framework / New Users / June 2007

Regular Expression problem

Zoodor - 20 Jun 2007 13:52 GMT
The crux of my problem is that I want a regular expression that will match a
sequence of numbers that have been "ANDed" or "ORed" together in text:
e.g.
1
1 OR 2
1 OR 2 OR 3...etc.
1 AND 2
1 AND 2 AND 3...etc.
(But not 1 OR 2 AND 3, which mixes ANDs and ORs)

I tried the regular expression:
(?:(?:[0-9]+(?: AND [0-9]+)*)|(?:[0-9]+(?: OR [0-9]+)*))

Given the string "1 OR 2" my expression will result in two matches (from a
call to Regex.Matches()), one matching "1" and another matching "2". I want
to match the whole string "1 OR 2".

Interestingly, given the string "1 AND 2", it does have the desired
behaviour (i.e. it matches the whole string as one match).
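
The behaviour is reproducible outside .NET. A minimal sketch, assuming Python's re module as a stand-in for Regex.Matches() (both are backtracking engines that try the alternatives of "|" from left to right); the helper name all_matches is hypothetical:

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"(?:(?:[0-9]+(?: AND [0-9]+)*)|(?:[0-9]+(?: OR [0-9]+)*))")

def all_matches(text):
    """Return the text of every non-overlapping match, left to right."""
    return [m.group(0) for m in PATTERN.finditer(text)]

print(all_matches("1 OR 2"))   # ['1', '2']
print(all_matches("1 AND 2"))  # ['1 AND 2']
```

The first alternative already succeeds on "1" alone, with zero repetitions of " AND ...", so the engine never tries the OR alternative at that position. Swapping the two alternatives would only move the problem to the AND case.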

Am I doing something silly or can I not do what I want?

Any help appreciated

Mark
Kevin Spencer - 21 Jun 2007 14:07 GMT
Hi Zoodor,

This one was a bit tricky. Here's the solution:

(?m)^\d+(?(?=.)(?:\s+(AND|OR)\s*))(?:\d+(?(?=.)(?:\s+\1\s*)))*$

It breaks down into 2 sections:

(?m)^\d+(?(?=.)(?:\s+(AND|OR)\s*))

First, the (?m) option makes '^' and '$' match at the start and end of each
line as well as of the whole string, so the match must begin at the start of
the string or of a line. Next, match a sequence of one or more digits. If the
digits are followed by any character other than a line break, they must be
followed by at least one whitespace character, then "AND" or "OR", then zero
or more whitespace characters. Whichever of (AND|OR) matched is stored in
Capturing Group 1.

The second section may be matched 0 or more times:

(?:\d+(?(?=.)(?:\s+\1\s*)))*$

Match a sequence of one or more digits. If it is followed by anything other
than a line break, it must be followed by at least one whitespace character,
then the text captured in Capturing Group 1, then zero or more whitespace
characters.

The result is that whichever of "AND" or "OR" is captured first is the only
operator allowed for the rest of the line. The operator is optional after the
first number, so a single number on its own still matches. The second
section is optional and may be repeated any number of times, but its last
repetition must finish at the end of a line. This ensures that any line
containing both AND and OR is rejected altogether.

I tested this against the following:

1                      * success
1 OR 2                 * success
1 OR 2 OR 3            * success
2 AND 5 AND 6 AND 7    * success
1 AND 5 OR 6           * fail
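
The same rule can also be sketched without the lookahead conditional (?(?=.)...). The following is a minimal sketch, assuming Python's re module, which has no equivalent of that construct; it is a simplified stand-in, not the pattern above, and the name is_uniform_chain is hypothetical. The first operator is captured in group 1, and the backreference \1 forces every later operator to repeat it:

```python
import re

# Simplified stand-in for the .NET pattern: the first operator is captured in
# group 1, and \1 requires every later operator to be the same text.
CHAIN = re.compile(r"\d+(?:[ \t]+(AND|OR)[ \t]+\d+(?:[ \t]+\1[ \t]+\d+)*)?")

def is_uniform_chain(line):
    """True if the whole line is numbers joined only by ANDs or only by ORs."""
    return CHAIN.fullmatch(line) is not None

# Same five inputs as the test list above.
for sample in ["1", "1 OR 2", "1 OR 2 OR 3", "2 AND 5 AND 6 AND 7", "1 AND 5 OR 6"]:
    print(sample, "->", "success" if is_uniform_chain(sample) else "fail")
```

Because this sketch requires a number after every operator, it also rejects a line that ends in a dangling operator, such as "1 OR ".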

HTH,

Kevin Spencer
Microsoft MVP

Printing Components, Email Components,
FTP Client Classes, Enhanced Data Controls, much more.
DSI PrintManager, Miradyne Component Libraries:
http://www.miradyne.net

Zoodor - 22 Jun 2007 16:36 GMT
Thanks Kevin, your reply was very helpful.

Mark
