.NET Forum / Languages / C# / November 2006

Serious Bug System.Collections Sort

william.hooper@gmail.com - 09 Nov 2006 12:46 GMT
There is a longer article about this subject here:
http://www.codeproject.com/useritems/SortedList_Bug.asp
See the main article and the reply thread started by Robert Rohde.

Alternatively look at this code:

ArrayList a=new ArrayList();

string s1 = "-0.67:-0.33:0.33";
string s2 = "0.67:-0.33:0.33";
string s3 = "-0.67:0.33:-0.33";

a.Add(s1);
a.Add(s2);
a.Add(s3);

a.Sort();
for (int i=0; i<3; i++) Console.WriteLine( a[i] );

Console.WriteLine();

a.Clear();
a.Add(s1);
a.Add(s3);
a.Add(s2);

a.Sort();
for (int i=0; i<3; i++) Console.WriteLine( a[i] );

This code produces the following six lines of output:

-0.67:0.33:-0.33
0.67:-0.33:0.33
-0.67:-0.33:0.33

-0.67:-0.33:0.33
-0.67:0.33:-0.33
0.67:-0.33:0.33

Note that the .Sort produces different outputs depending on the order
the strings are added.

It looks like the Sort algorithm is ignoring the "-" mark.

This is a very serious Bug impacting the System.Collections Array,
SortedList etc.
Marc Gravell - 09 Nov 2006 13:11 GMT
This appears to relate to the culture-specific comparer, which is presumably
ignoring symbols (one of the CompareOptions flags); perhaps switch to
ordinal comparison, which resolves this. You may be able to use
StringComparer.Ordinal; can't remember if that exists in 1.1, but if not
something like this should do (and use it in the Sort() calls).

using System.Collections;

// Compares strings by character code, bypassing culture-specific rules.
public class OrdinalStringComparer : IComparer
{
    public static readonly OrdinalStringComparer Singleton = new OrdinalStringComparer();
    private OrdinalStringComparer() { }
    public int Compare(object x, object y)
    {
        return string.CompareOrdinal((string) x, (string) y);
    }
}
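
As a minimal sketch of passing an ordinal comparer to Sort(), assuming .NET 2.0 or later where StringComparer.Ordinal is available (the class and method names OrdinalSortDemo and SortOrdinal are made up for illustration), the original test could look like this:

```csharp
using System;
using System.Collections;

public static class OrdinalSortDemo
{
    // Hypothetical helper: copies the strings into an ArrayList and sorts
    // them by ordinal (character-code) order.
    public static ArrayList SortOrdinal(params string[] items)
    {
        ArrayList list = new ArrayList(items);
        // StringComparer.Ordinal implements the non-generic IComparer,
        // so it can be handed directly to ArrayList.Sort.
        list.Sort(StringComparer.Ordinal);
        return list;
    }

    public static void Main()
    {
        string s1 = "-0.67:-0.33:0.33";
        string s2 = "0.67:-0.33:0.33";
        string s3 = "-0.67:0.33:-0.33";

        // The same two insertion orders as in the original post.
        foreach (string s in SortOrdinal(s1, s2, s3)) Console.WriteLine(s);
        Console.WriteLine();
        foreach (string s in SortOrdinal(s1, s3, s2)) Console.WriteLine(s);
    }
}
```

With ordinal comparison "-" (code 45) sorts before "0" (code 48), so both insertion orders should come out as s1, s3, s2. On 1.1 the OrdinalStringComparer.Singleton above would take the place of StringComparer.Ordinal.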

Marc
william.hooper@gmail.com - 09 Nov 2006 13:39 GMT
I don't agree....

string s2 = "0.67:-0.33:0.33";
string s3 = "-0.67:0.33:-0.33";

Console.WriteLine( String.Compare(s2,s3));
Console.WriteLine( String.Compare(s2,s3));

returns -1 and 1 showing that the String.Compare function is working as
expected.

I don't think this is a culture issue and writing an IComparer for
every System.Collections string comparison is a nightmare.
Marc Gravell - 09 Nov 2006 14:19 GMT
Well... it would appear that the problem is a dodgy comparer; try this
(using your previous values for s1, s2, s3):

           Console.WriteLine(Comparer.Default.Compare(s1, s2));
           Console.WriteLine(Comparer.Default.Compare(s2, s3));
           Console.WriteLine(Comparer.Default.Compare(s3, s1));

Yields 1, 1, 1 meaning there is a comparer loop. Oops! Anybody [MS?] want to
comment on whether this is intentional or a fubar? It would appear to
violate the "transitive" rule of comparers, which should be enforced for
non-zero results (0 being transitive as long as it agrees in each
direction).
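
The loop test can be sketched as a small helper, assuming .NET 2.0 or later (ComparerCycleCheck and HasCycle are illustrative names, not framework APIs). It reports a loop when all three comparisons around the triangle are non-zero and share the same sign:

```csharp
using System;
using System.Collections;

public static class ComparerCycleCheck
{
    // True when a, b, c form a loop under the comparer, i.e.
    // a > b > c > a or a < b < c < a.
    public static bool HasCycle(IComparer comparer, object a, object b, object c)
    {
        int ab = Math.Sign(comparer.Compare(a, b));
        int bc = Math.Sign(comparer.Compare(b, c));
        int ca = Math.Sign(comparer.Compare(c, a));
        return ab != 0 && ab == bc && bc == ca;
    }

    public static void Main()
    {
        string s1 = "-0.67:-0.33:0.33";
        string s2 = "0.67:-0.33:0.33";
        string s3 = "-0.67:0.33:-0.33";

        // Culture-sensitive comparer: the result depends on the runtime
        // version and the current culture.
        Console.WriteLine("Default: " + HasCycle(Comparer.Default, s1, s2, s3));
        // Ordinal comparer: plain character-code comparison.
        Console.WriteLine("Ordinal: " + HasCycle(StringComparer.Ordinal, s1, s2, s3));
    }
}
```

The ordinal comparer should never report a loop for these values; what Comparer.Default reports will vary with the framework version and culture in use.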

Actually, it's quite lucky that this returns at all! Perhaps there is a
panic "oops, shouldn't possibly have taken more than n^2 iterations...".

Anyway, the point of my post was that this can be avoided using an ordinal
comparer. And you can re-use the same one each time... no need to write
anything more.

Marc
william.hooper@gmail.com - 09 Nov 2006 14:45 GMT
Yes, Marc, sorry you are right.

The code

string s1 = "-0.67:-0.33:0.33";
string s2 = "0.67:-0.33:0.33";
string s3 = "-0.67:0.33:-0.33";

Console.WriteLine( String.Compare(s1,s2));
Console.WriteLine( String.Compare(s2,s3));
Console.WriteLine( String.Compare(s3,s1));

Console.WriteLine();

Console.WriteLine( String.CompareOrdinal(s1, s2));
Console.WriteLine( String.CompareOrdinal(s2, s3));
Console.WriteLine( String.CompareOrdinal(s3, s1));

returns

1
1
1

-3
3
3

I.e. String.Compare cannot handle the "-" marks but
String.CompareOrdinal can.

Is there not some regional setting you can put once in the code so
String.Compare and ArrayList.Sort all work that way from then on? Would
much prefer this...

----

Note it annoys me that the documentation writes:

"The .NET Framework supports word, string, and ordinal sort
rules....For example, the hyphen ("-") might have a very small weight
assigned to it so that "coop" and "co-op" appear next to each other in
a sorted list...."

But in fact the hyphen is being completely ignored rather than given a
low weight.

I think this behaviour by default is very undesirable.
Marc Gravell - 09 Nov 2006 15:01 GMT
Well, it isn't being ignored. If it was being ignored I would expect the
result to be 0,0,0.

> I think this behaviour by default is very undesirable.
I think it is buggy, but it isn't in System.Collections - it is in
String.CompareTo. I can't think of a valid occasion in a well-ordered system
when a < b < c < a or a > b > c > a. It just makes no logical sense unless
you are Maurits Escher.

Now, a = b = c = a = 0 I could live with (i.e. hyphens completely ignored).

Marc
w.hooper@hotmail.com - 09 Nov 2006 15:10 GMT
If you play around with examples like this:

s1 = "0.67:0.33:-0.33";
s2 = "0.67:-0.33:0.33";
s3 = "-0.67:0.33:0.33";

Console.WriteLine( String.Compare(s1,s2));
Console.WriteLine( String.Compare(s2,s3));
Console.WriteLine( String.Compare(s3,s1));

It all looks OK. But when you put two hyphens into the lines the
String.Compare gets confused and gives silly results.

OK the String.CompareOrdinal function fixes the problem but this is not
behaviour by design surely?

We need a comment from our lord and master MS....
Marc Gravell - 09 Nov 2006 15:33 GMT
Logged:
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=236900


Marc
w.hooper@hotmail.com - 09 Nov 2006 15:43 GMT
Thanks very much that's great. I will keep an eye on it.
Marc Gravell - 13 Nov 2006 13:25 GMT
Still no MS viewpoint?

For ref, I think this is actually quite important, as it could (as
illustrated on the now-deleted CodeProject link) cause a whole range of
sort-critical operations to fail... SortedList etc, or any custom
collections that assume that .Sort() might actually work, and then
trust the results.

Any thoughts?

Marc
william.hooper@gmail.com - 13 Nov 2006 17:30 GMT
Yes absolutely, it cost me lots of time in the SortedList failing. I
think it's to do with the hyphen algorithm which is designed to sort
words with hyphens in a smart way. Obviously with two hyphens in the
word it fails. Anyone who puts two hyphens in strings is taking a huge
risk - yet using hyphens in strings is common enough. A really
dangerous bug I would say, but I guess Microsoft haven't seen that
yet. You could mention on the article that it causes problems with
SortedList and you think it's a very serious problem. I think we have
discovered a real corker and it deserves to be given a lot of attention.
Chris Dunaway - 09 Nov 2006 14:34 GMT
> I don't agree....
>
[quoted text clipped - 9 lines]
> I don't think this is a culture issue and writing an IComparer for
> every System.Collections string comparison is a nightmare.

From the docs, note how it mentions that the hyphen might be given a
low weight so that similar words will sort together:

"The .NET Framework supports word, string, and ordinal sort rules. A
word sort performs a culture-sensitive comparison of strings in which
certain nonalphanumeric Unicode characters might have special weights
assigned to them. For example, the hyphen ("-") might have a very small
weight assigned to it so that "coop" and "co-op" appear next to each
other in a sorted list. A string sort is similar to a word sort, except
that there are no special cases and all nonalphanumeric symbols come
before all alphanumeric Unicode characters. An ordinal sort compares
strings based on the numeric value of each Char in the string. For more
information about word, string, and ordinal sort rules, see the
System.Globalization.CompareOptions topic.

Comparison and search procedures are case-sensitive by default and use
the culture associated with the current thread unless specified
otherwise. By definition, any string, including the empty string (""),
compares greater than a null reference, and two null references compare
equal to each other.

If your application makes security decisions based on the result of a
comparison or case change operation, then the operation should use the
invariant culture to ensure the result is not affected by the value of
the current culture. For more information, see the
CultureInfo.InvariantCulture topic."

I don't know if this is what is causing the behavior you reported, but
at least it's worth looking into.
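
To look into it, the three sort rules from the docs can be compared side by side. This is only a sketch, assuming the invariant culture; SortRulesDemo and its method names are made up for illustration, while CompareInfo.Compare and CompareOptions.StringSort are the framework members the docs refer to:

```csharp
using System;
using System.Globalization;

public static class SortRulesDemo
{
    private static readonly CompareInfo Invariant = CultureInfo.InvariantCulture.CompareInfo;

    // Word sort: culture-sensitive, some symbols may get special weights.
    public static int WordSort(string x, string y)
    {
        return Invariant.Compare(x, y, CompareOptions.None);
    }

    // String sort: culture-sensitive, but without the special-case weights.
    public static int StringSort(string x, string y)
    {
        return Invariant.Compare(x, y, CompareOptions.StringSort);
    }

    // Ordinal sort: compares the numeric value of each Char.
    public static int OrdinalSort(string x, string y)
    {
        return string.CompareOrdinal(x, y);
    }

    public static void Main()
    {
        string[] v = { "-0.67:-0.33:0.33", "0.67:-0.33:0.33", "-0.67:0.33:-0.33" };

        // Walk the triangle s1->s2, s2->s3, s3->s1 under each rule.
        for (int i = 0; i < v.Length; i++)
        {
            string a = v[i];
            string b = v[(i + 1) % v.Length];
            Console.WriteLine("{0,2} {1,2} {2,2}",
                Math.Sign(WordSort(a, b)),
                Math.Sign(StringSort(a, b)),
                Math.Sign(OrdinalSort(a, b)));
        }
    }
}
```

If the word-sort column shows the same sign on all three rows while the other columns do not, that would point at the hyphen weighting rather than at System.Collections.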

Chris
Chris Dunaway - 09 Nov 2006 15:03 GMT
And FWIW, I didn't see this issue officially reported on the feedback
site.  You may wish to post it there.
w.hooper@hotmail.com - 09 Nov 2006 15:40 GMT
> And FWIW, I didn't see this issue officially reported on the feedback
> site.  You may wish to post it there.

Chris, what is the feedback site? I would like to post it there. It's
really annoying and you never know, they might take an interest.
Marc Gravell - 09 Nov 2006 15:44 GMT
See my previous post; since you didn't seem familiar with "connect" I posted
it as a bug. Feel free to go in and click "validate"... and vote!
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=236900


And note: I wasn't trying to steal your thunder... just to get it logged
with the least fuss...

Marc
