
.NET Forum / .NET Framework / New Users / April 2007

strings vs regular expressions

AVL - 11 Apr 2007 09:26 GMT
hi,
I need to compare or check for substrings in a given string.
Which would give better performance: string-related compare functions or
regular expressions?
Michael Nemtsev - 11 Apr 2007 09:39 GMT
Hello AVL,

AFAIK String.Contains will be the fastest, because it only calls IndexOf,
whereas a regexp does a lot of wasted processing for a plain substring.
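
As a small illustration of the point above: in the .NET Framework, String.Contains is an ordinal (character-by-character, case-sensitive) search equivalent to an IndexOf call with StringComparison.Ordinal. The class and method names below (SubstringCheck, ContainsOrdinal) are made up for this sketch.

```csharp
using System;

public static class SubstringCheck
{
    // Hypothetical helper: the same ordinal test that String.Contains performs.
    public static bool ContainsOrdinal(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal) >= 0;
    }

    public static void Main()
    {
        string text = "The quick brown fox";
        Console.WriteLine(text.Contains("brown"));          // True
        Console.WriteLine(ContainsOrdinal(text, "brown"));  // True
        Console.WriteLine(ContainsOrdinal(text, "Brown"));  // False: ordinal is case-sensitive
    }
}
```

Note that IndexOf(string) without a StringComparison argument is culture-sensitive, so it is not interchangeable with Contains; pass StringComparison.Ordinal when you want the same behaviour.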

---
WBR,  Michael  Nemtsev [.NET/C# MVP].  
My blog: http://spaces.live.com/laflour
Team blog: http://devkids.blogspot.com/

"The greatest danger for most of us is not that our aim is too high and we
miss it, but that it is too low and we reach it" (c) Michelangelo

A> hi,
A> I need to compare or check for substrings in a given string.
A> which would give better performance - string related compare
A> functions or
A> regular expressions...
Henning Krause [MVP - Exchange] - 11 Apr 2007 09:48 GMT
Hello,

but string.IndexOf has a very bad implementation. If you want a fast string
search, look for a .NET implementation of the Boyer-Moore algorithm, which
is also used internally by the regular expression engine.

Depending on the length of the text being searched and the frequency, you
might want to consider a precompiled regex.

Anyway, you should perform some performance testing yourself. It really
depends on the circumstances.
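
A rough way to run such a test is sketched below with System.Diagnostics.Stopwatch. The haystack, the search term "needle", the iteration count, and the SearchTiming/Time names are all assumptions chosen for illustration; substitute your own data, because the results depend heavily on text length and on how often the same pattern is reused.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class SearchTiming
{
    // Runs the action `iterations` times and returns the total elapsed time.
    public static TimeSpan Time(int iterations, Action action)
    {
        action(); // one warm-up call so JIT compilation is not measured
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Made-up test data: a long run of filler with the term at the end.
        string haystack = new string('x', 10000) + "needle";

        // Build the regex once and reuse it; Regex.Escape keeps the term literal.
        Regex regex = new Regex(Regex.Escape("needle"), RegexOptions.Compiled);

        bool found = false;
        TimeSpan contains = Time(100000, () => found = haystack.Contains("needle"));
        TimeSpan compiled = Time(100000, () => found = regex.IsMatch(haystack));

        Console.WriteLine("String.Contains: {0} ms", contains.TotalMilliseconds);
        Console.WriteLine("Compiled Regex:  {0} ms", compiled.TotalMilliseconds);
        Console.WriteLine("Last result:     {0}", found);
    }
}
```

Treat the numbers as relative only, and run the comparison in a release build on data that resembles your real input.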

Best regards,
Henning Krause

> Hello AVL,
>
[quoted text clipped - 14 lines]
> A> functions or
> A> regular expressions....
Michael Nemtsev - 11 Apr 2007 09:54 GMT
Hello Henning Krause [MVP - Exchange],

H> Anyway, you should perform some performance testing yourself. It
H> really depends on the circumstances.

That's the point, because we don't know what the OP is looking for.

>> A> hi,
>> A> I need to compare or check for substrings in a given string.
>> A> which would give better performance - string related compare
>> A> functions or
>> A> regular expressions....
Jon Skeet [C# MVP] - 11 Apr 2007 11:29 GMT
On Apr 11, 9:48 am, "Henning Krause [MVP - Exchange]"
<newsgroups_rem...@this.infinitec.de> wrote:
> but string.IndexOf has a very bad implementation. If you want a fast string
> search, look for a .NET implementation of the Boyer-Moore algorithm - this
> is also used in regular expressions internals.

I wouldn't say that IndexOf has a "very bad" implementation. In *some*
cases it won't be as fast as doing the "pre-work" involved for Boyer-
Moore, but I suspect in the vast majority of cases used in the real
world, it's far quicker to use the "brute force" method, given that
you're only looking for the string once (as far as String.IndexOf is
concerned - you may be calling it multiple times, of course).

I suppose String.IndexOf could apply some heuristics and guess whether
it's worth building the tables (or whatever) for Boyer-Moore, but as I
say, in the vast majority of real cases it won't make any odds.
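
To make the "pre-work" concrete, here is a sketch of the Horspool simplification of Boyer-Moore (not the full algorithm, and not what String.IndexOf or the regex engine actually does). The shift table has to be built before any searching starts, which only pays off when the same pattern is searched for many times or the text is long. All names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class HorspoolSearch
{
    // Pre-work: for every pattern character except the last, record how far
    // the search window may slide when that character sits under the
    // pattern's final position.
    public static Dictionary<char, int> BuildShiftTable(string pattern)
    {
        Dictionary<char, int> shift = new Dictionary<char, int>();
        for (int i = 0; i < pattern.Length - 1; i++)
        {
            shift[pattern[i]] = pattern.Length - 1 - i;
        }
        return shift;
    }

    // Returns the index of the first ordinal match of pattern in text, or -1.
    public static int IndexOf(string text, string pattern, Dictionary<char, int> shift)
    {
        int m = pattern.Length;
        if (m == 0) return 0;

        int pos = 0;
        while (pos <= text.Length - m)
        {
            // Compare right to left within the current window.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) j--;
            if (j < 0) return pos;

            // Slide by the table entry for the character under the window's
            // last position, or by the whole pattern length if it is absent.
            int s;
            char last = text[pos + m - 1];
            pos += shift.TryGetValue(last, out s) ? s : m;
        }
        return -1;
    }
}
```

A caller would build the table once with BuildShiftTable(pattern) and then pass it to IndexOf for each text to be searched.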

> Depending on the length of the text being searched and the frequency, you
> might want to consider a precompiled regex.
>
> Anyway, you should perform some performance testing yourself. It really
> depends on the circumstances.

Agreed. If you know you're going to have to search for the same string
lots of times in a performance-critical environment, it may be worth
using regular expressions. I would use Contains until I'd actually
proved it was a bottleneck though :)

Jon
Henning Krause [MVP - Exchange] - 11 Apr 2007 14:48 GMT
Hi,

> Agreed. If you know you're going to have to search for the same string
> lots of times in a performance-critical environment, it may be worth
> using regular expressions. I would use Contains until I'd actually
> proved it was a bottleneck though :)

"The First Rule of Program Optimization: Don't do it. The Second Rule of
Program Optimization (for experts only!): Don't do it yet." - Michael A.
Jackson
Kevin Spencer - 11 Apr 2007 13:01 GMT
Hi AVL,

Just to clear things up regarding regular expressions versus string
functions. Use regular expressions when looking for a *pattern* of
characters in a string, which may be different characters in the same
pattern, and string functions for looking for substrings. What I mean by
"patterns" is, for example, a hyperlink in an HTML document.

A hyperlink is a string that must follow certain rules. It must begin with
the character sequence "<a" followed by one or more white space characters,
followed by 0 or more attribute name=value pairs, followed by the ">"
character. This is followed by a string of text that is followed by the
"</a>" character sequence. Note that only some of the characters are
specified, and you don't know what the rest of them will be. So, how do you
look for a string that satisfies these rules? Example:

(?m)(?i)(?<=<a)(?:(?:\s+href=(?<href>[^>]+))|(?:\s+[^=>]+=[^>]+))*>(?<innerHtml>[^<]*)(?=</a>)

The above is a regular expression that identifies substrings that satisfy
those rules. In addition, it captures 2 named groups: "href" for the value of
the href attribute, and "innerHtml" for the link text inside the anchor.

You could not find this pattern with a single string function call.
Generally, string functions are faster than regular expressions for plain
substrings, but when looking for patterns (groups of characters that satisfy
rules), regular expressions are usually the most practical method.
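
For illustration, the expression above can be used from C# roughly as follows. The pattern is copied unchanged from this post; the sample HTML and the AnchorFinder name are invented for the example. Note that the "href" group takes everything up to the closing ">", so the value keeps its surrounding quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorFinder
{
    // The pattern from the post, unchanged. Built once and reused.
    public static readonly Regex Anchor = new Regex(
        @"(?m)(?i)(?<=<a)(?:(?:\s+href=(?<href>[^>]+))|(?:\s+[^=>]+=[^>]+))*>(?<innerHtml>[^<]*)(?=</a>)",
        RegexOptions.Compiled);

    public static void Main()
    {
        // Sample input invented for this example.
        string html = "<p>See <a href=\"http://example.com\">Example</a> now.</p>";

        foreach (Match m in Anchor.Matches(html))
        {
            // Named groups are read back by the names used in the pattern.
            Console.WriteLine("href:      " + m.Groups["href"].Value);
            Console.WriteLine("innerHtml: " + m.Groups["innerHtml"].Value);
        }
    }
}
```

This is only a sketch for simple, well-formed anchors; an anchor with several attributes or nested tags would need a more careful pattern or an HTML parser.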

HTH,

Kevin Spencer
Microsoft MVP

Printing Components, Email Components,
FTP Client Classes, Enhanced Data Controls, much more.
DSI PrintManager, Miradyne Component Libraries:
http://www.miradyne.net

> hi,
> I need to compare or check for substrings in a given string.
> which would give better performance - string related compare functions or
> regular expressions....
