
.NET Forum / Languages / C# / October 2006


RegEx to find a word not enclosed in parentheses

vmoreau@gmail.com - 31 Oct 2006 16:40 GMT
I have a text and I need to find a word that is not enclosed in
parentheses. Can it be done with a regex? Could someone help me?
I am not familiar with regex...

Example looking for WORD:
(there is a WORD in ( my string WORD )) and * WORD * to (find WORD)
and * WORD *

It should give me the two words between stars (the stars are not part of the string).

thanks a lot
Kevin Spencer - 31 Oct 2006 19:31 GMT
I don't believe this can be done using Regular Expressions, at least not
practically. I'll tell you why:

In order to identify the WORD you're looking for, the only rule that can be
applied is that it is preceded by the exact same number of left and right
parentheses. That means that the number of left parentheses before the WORD
and the number of right parentheses before the WORD must be the same,
whether 0 or more, but the exact same number of each.

In addition, the left and right parentheses have to be in order, that is, if
there are 2 left parentheses, they must be followed (at some point) by 2
right parentheses. In other words, you can't have 1 left parenthesis
followed by 2 right parentheses followed by one left parenthesis. And you
can't start with right parentheses. Any parenthesized group must begin with
one or more left parentheses, followed by some sequence of 0 or more
characters that is NOT "WORD", followed by the exact same number of right
parentheses.
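
As a minimal sketch of the counting rule described above: a position lies outside all parentheses when the number of left parentheses before it equals the number of right parentheses before it. The class and method names here are illustrative only, and the sketch assumes the parentheses are well ordered.

```csharp
using System;

public static class ParenRule
{
    // Hypothetical helper: returns true when the character at 'index' is
    // preceded by equal numbers of '(' and ')'.
    // Assumes the parentheses in 'text' are well ordered.
    public static bool IsOutsideParentheses(string text, int index)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (index < 0 || index > text.Length)
            throw new ArgumentOutOfRangeException("index");

        int leftCount = 0, rightCount = 0;
        for (int i = 0; i < index; i++)
        {
            if (text[i] == '(') leftCount++;       // count left parentheses
            else if (text[i] == ')') rightCount++; // count right parentheses
        }
        return leftCount == rightCount;
    }
}
```

For the string "a (b) c", the 'b' at index 3 is inside parentheses (one left, no right before it), while the 'c' at index 6 is outside (one of each before it).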

Since Regular Expressions, at least in their classical form, do not have the
capacity to count, this can't practically be done using Regular Expressions.
However, as I was able to determine the rule for identifying WORD, I also
have some idea of how it might be done using string and character manipulation.

Since you're looking for the incidences of a string within a string, you
don't need to actually match the string, but only to know what the indices
of the incidences of the string within the origin string are. That is, once
you know the indices of the incidences, and you know what the search string
is, you can find them all within the string any time you need to.
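
To illustrate why the indices alone are enough, here is a small sketch that recovers each match from its start index and the length of the search string. The helper name and the sample indices are assumptions made for illustration.

```csharp
using System;

public static class IndexDemo
{
    // Hypothetical helper: given the start indices of the matches,
    // rebuilds each matched substring from the origin string.
    public static string[] MatchesAt(string origin, string searchString, int[] indices)
    {
        string[] matches = new string[indices.Length];
        for (int i = 0; i < indices.Length; i++)
            matches[i] = origin.Substring(indices[i], searchString.Length);
        return matches;
    }
}
```

For the origin string "a WORD (WORD) WORD", the occurrences outside parentheses start at indices 2 and 14, so calling MatchesAt with those indices and the search string "WORD" gives back "WORD" twice.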

You would need 2 variables, one to keep a count of left parentheses, and one
to keep a count of right parentheses. When you hit a left parenthesis,
increment the left parenthesis variable; when you hit a right parenthesis,
increment the right parenthesis variable. If the 2 variables are not of equal
value, you don't do anything. If they are, you begin to check the characters
following for the search string ("WORD"). Here's an example. I've tested
this using all possible combinations, with one exception. It assumes that
left and right parentheses will always be in left-right order. That is, if
there is a stray parenthesis, or if the parentheses are somehow reversed in
the string, it may not work as advertised, and you may need to revise it:

/// <summary>
/// Finds the indices of all incidences of <paramref name="searchString"/>
/// found in <paramref name="origin"/> that are not
/// enclosed within parentheses.
/// </summary>
/// <param name="origin">String to Search.</param>
/// <param name="searchString">String to Find.</param>
/// <returns>An array of the indices of all incidences of
/// <paramref name="searchString"/> found in <paramref name="origin"/>
/// that are not enclosed within parentheses,
/// or an empty integer array if not found.</returns>
public static int[] IndicesWithoutParentheses(string origin, string searchString)
{
 // Nothing can be found in (or with) a null or empty string
 if (string.IsNullOrEmpty(origin) || string.IsNullOrEmpty(searchString))
  return new int[0];

 char c;
 int count = 0; // number of indices found so far
 int leftCount = 0, rightCount = 0;
 int originIndex;

 int originLength = origin.Length;
 int searchLength = searchString.Length;

 int[] indices = new int[originLength]; // holds indices found
 int[] result; // return value

 // Iterate through the origin string
 for (originIndex = 0; originIndex < originLength; originIndex++)
 {
  c = origin[originIndex]; // Current char
  if (c == '(') leftCount++; // Count left parentheses
  else if (c == ')') rightCount++; // Count right parentheses
  else if (leftCount == rightCount
   && originIndex + searchLength <= originLength
   && string.CompareOrdinal(origin, originIndex, searchString, 0, searchLength) == 0)
  {
   // We are outside all parentheses and searchString starts here
   indices[count++] = originIndex;
   // Skip the rest of the match (assumes searchString contains no
   // parentheses); the outer loop supplies the final increment.
   originIndex += searchLength - 1;
  }
 }

 // Copy only the indices that were actually found
 result = new int[count];
 Array.Copy(indices, result, count);
 return result;
}
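
Since the method above relies on the parentheses being in left-right order, one option is to check the input first. The following is only a sketch of such a pre-check; the class and method names are hypothetical and not part of the method above.

```csharp
using System;

public static class ParenCheck
{
    // Hypothetical pre-check: returns true when every ')' closes an
    // earlier '(' and no '(' is left open at the end of the string.
    public static bool HasWellOrderedParentheses(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        int depth = 0; // number of currently open parentheses
        foreach (char c in text)
        {
            if (c == '(') depth++;
            else if (c == ')' && --depth < 0) return false; // stray ')'
        }
        return depth == 0; // false if a '(' was never closed
    }
}
```

If the check fails, the caller can decide whether to reject the input or treat the stray parentheses in some other way before searching.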

HTH,

Kevin Spencer
Microsoft MVP
Short Order Coder
http://unclechutney.blogspot.com

The devil is in the yada yada yada

