> It was my understanding that when comparing strings using
> "OrdinalIgnoreCase" as the method to compare the strings, .NET compared
> the strings by first capitalizing all of the characters in the string and
> then making an ordinal comparison (Unicode code point comparison).
The process of capitalization itself is culture-sensitive, which is
what's tripping you up. Your call to ToUpper is returning plain "I" in
both cases, because it's using the thread's current culture - if you
specify CultureInfo.InvariantCulture as the culture to use when upper
casing, you'll get the same results for both comparisons.
In this case, I believe that from a culture-neutral point of view,
they're different letters rather than just differently capitalised
letters. It's all a bit tricksy though, to be honest.
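To see the difference for yourself, here is a minimal sketch that prints the
code points produced by a culture-sensitive ToUpper and by an invariant one.
The class name CasingDemo and the helper CodePoints are made up for
illustration, and the variable names are borrowed from the question. What the
current-culture call prints depends on the thread's culture and the runtime,
so no output is promised for that line.

```csharp
using System;
using System.Globalization;

static class CasingDemo
{
    // Hypothetical helper: formats each UTF-16 code unit as U+XXXX so that
    // look-alike letters can be told apart in the console.
    public static string CodePoints(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            parts[i] = "U+" + ((int)s[i]).ToString("X4");
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        string smallLetterDotlessI = "\u0131";
        string capitalLetterI = "I";

        // Culture-sensitive: uses Thread.CurrentThread.CurrentCulture,
        // so the result can differ between machines.
        Console.WriteLine(CodePoints(smallLetterDotlessI.ToUpper()));

        // Explicitly invariant: does not depend on the current culture.
        Console.WriteLine(CodePoints(
            smallLetterDotlessI.ToUpper(CultureInfo.InvariantCulture)));
        Console.WriteLine(CodePoints(
            capitalLetterI.ToUpper(CultureInfo.InvariantCulture)));
    }
}
```

If the first two lines differ on your machine, the current culture is what
makes the two comparisons disagree.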
Hope this at least explains a bit of what's going on...

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Rene - 06 Jul 2007 02:06 GMT
Sure enough, I added
System.Threading.Thread.CurrentThread.CurrentCulture =
System.Globalization.CultureInfo.InvariantCulture;
before doing any comparing and voilà, I got the same answers this time (both
show as not being equal).
Looks like I have to do some more reading on "string.Compare". I didn't
think that learning about string/Unicode/culture/etc. would take me as long
as it has; the more I research, the more new stuff I keep bumping into...
damn it!
Thanks.
Rene - 06 Jul 2007 17:11 GMT
OK, I did some more digging around, according to the following site:
http://www.fileformat.info/info/unicode/char/0131/index.htm
The *Unicode* uppercase equivalent for 'LATIN SMALL LETTER DOTLESS I'
(U+0131) is 'LATIN CAPITAL LETTER I' (U+0049).
Having said that, I was under the impression that the OrdinalIgnoreCase flag
would use the *Unicode conversion tables* (no culture involved) to convert
the characters in the string to uppercase, which means that the uppercase
conversion should always be the same no matter what culture is being used.
If the above is true, the result for "compare2" should be zero because:
The "smallLetterDotlessI" variable capitalized using the Unicode tables
should return (U+0049).
The "capitalLetterI" variable is already a capital character so after
capitalizing using the Unicode tables should return (U+0049).
So you may think that the line of code below should return zero:
int compare2 = string.Compare(smallLetterDotlessI, capitalLetterI,
StringComparison.OrdinalIgnoreCase);
But it does not. So what's going on? What logic is .NET using when
comparing with the OrdinalIgnoreCase flag? Is it not uppercasing all
characters using the Unicode conversion tables?
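For anyone who wants to reproduce this, here is a minimal sketch of the
comparison. The class name OrdinalIgnoreCaseRepro, the helper
CompareIgnoringCase, and the extra smallLetterI variable are made up for
illustration; the other two variable names come from the code above.

```csharp
using System;

static class OrdinalIgnoreCaseRepro
{
    // Hypothetical wrapper around the exact call used in the question.
    public static int CompareIgnoringCase(string a, string b)
    {
        return string.Compare(a, b, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        string smallLetterI = "i";             // U+0069
        string smallLetterDotlessI = "\u0131"; // U+0131
        string capitalLetterI = "I";           // U+0049

        // Ordinary ASCII pair: prints 0.
        Console.WriteLine(CompareIgnoringCase(smallLetterI, capitalLetterI));

        // Dotless i against capital I: this is "compare2". Prints whether
        // the result is nonzero, which is what the thread reports.
        Console.WriteLine(
            CompareIgnoringCase(smallLetterDotlessI, capitalLetterI) != 0);
    }
}
```

The comparison does not read the thread's current culture, so the result
should be the same whatever CurrentCulture is set to.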
Thanks.
Rene - 06 Jul 2007 17:25 GMT
Well, I think I found the answer here:
http://blogs.msdn.com/michkap/archive/2005/03/10/391564.aspx
Basically the page says:
"Windows and the .NET Framework mainly support simple, reversible casing --
which is to say single code point casing that have ToUpper() and ToLower()
as inverse operations that can "undo" each other."
So in my example, 'LATIN SMALL LETTER DOTLESS I' (U+0131) would need to
uppercase to 'LATIN CAPITAL LETTER I' (U+0049), but then 'LATIN CAPITAL
LETTER I' (U+0049) should in turn lowercase to 'LATIN SMALL LETTER DOTLESS
I' (U+0131). That is not the case, because it lowercases to 'LATIN SMALL
LETTER I' (U+0069) instead. Since this conversion is not reversible,
OrdinalIgnoreCase does not uppercase U+0131 to U+0049 at all, and that is
why "compare2" does not return zero.
At least that's what I think is going on.
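A rough way to check the reversibility idea is to upper-case a lowercase
character with the invariant culture, lower-case the result, and see whether
the original comes back. This is only a sketch: the class name RoundTripCheck
and the helper RoundTrips are made up for illustration, and the check only
makes sense for characters that start out lowercase.

```csharp
using System;

static class RoundTripCheck
{
    // Hypothetical helper: true when invariant upper-casing followed by
    // invariant lower-casing gives back the character we started with.
    public static bool RoundTrips(char c)
    {
        return char.ToLowerInvariant(char.ToUpperInvariant(c)) == c;
    }

    static void Main()
    {
        // 'i' -> 'I' -> 'i', so this prints True.
        Console.WriteLine(RoundTrips('i'));

        // Dotless i: if the invariant tables leave it unmapped, it trivially
        // round-trips to itself.
        Console.WriteLine(RoundTrips('\u0131'));

        // Whether invariant upper-casing actually maps U+0131 to U+0049.
        Console.WriteLine(char.ToUpperInvariant('\u0131') == 'I');
    }
}
```

If the last line prints False, the invariant casing leaves U+0131 alone,
which would fit the "simple, reversible casing" description quoted above.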