I have a situation with a c# application attempting to store and look up
some unicode value in an NVARCHAR. Apparently the system is having issues
dealing with some Unicode values on inserts or lookups.
My tables:
/* table containing terms */
CREATE TABLE [dbo].[TermTable] (
[TermID] [int] IDENTITY (1, 1) NOT NULL ,
[Term] [nvarchar] (256) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
) ON [PRIMARY]
/* table containing many-to-many relation from terms to something else, with
a Unique index on FID and TERMID */
CREATE TABLE [dbo].[TermRelTable] (
[FID] [int] NOT NULL , /* foreign key to related item */
[TermID] [int] NOT NULL
) ON [PRIMARY]
GO
Code uses SQLCommands and parameters, with the statements prepared before
execution.
Pseudocode
int foreignkeyid;
string [] Terms;
foreach term in Terms
Lookup term in TermTable, get TERMID
if not found, insert row in TermTable with Term, and get the
SCOPE_IDENTITY into TERMID
insert row in TermRelTable with foreignkeyid and TERMID.
For some sets of data values (this is just one of the failing sets) the
operaiton fails.
Assume Terms[0]='xpl'
Assume Terms[1]='xpl?' where the last character is 0x0224
When the system attempts to see if the second term is in the TermTable, it
finds the FIRST term - and the subsequent insert into the TermRelTable fails
with an integrity constraint.
the SQLProfiler Trace of the relevant operations :
-----------------------------
exec sp_execute 2, @Term = N'xpl' // look up XPL
exec sp_executesql N'INSERT INTO TermTable(Term) VALUES (@Term);Select
SCOPE_IDENTITY()', N'@Term nvarchar(256)', @Term = N'xpl' // didn't find
it - insert it into TermTable
exec sp_execute 2, @FID = 192239, @TermID = 1550945 // insert row into
relationship table
exec sp_execute 2, @Term = N'xpl?' // look up term in TermTable, expecting
to NOT find it
exec sp_execute 2, @FID = 192239, @TermID = 1550945 // it found the OTHER
term - and attempts to insert an entry in the TEMRELTABLE - generating an
exception.
--------------------------------------
So - what do I look at to get this to work? I do not control the source of
the terms - they can come from anywhere and so can contain any unicode data
value (this one happens to be a latin capital letter z with hook)
Note : The c# code does not have actual char constants - the characters are
read from a text file (using a StreamReader). I included the two
assumptions to show the data values in particular that the application was
having an issue with.
I know that the characters ARE read from the file as Unicode - as I can
watch them as they're assigned to be the parameter which is sent to the dB
command and shown in the SQLProfile log.
sp_execute 2, @Term = N'xpl'
sp_execute 2, @Term = N'xpl?'
where the ? character exactly corresponds with the position of the value
0x0224 in the string passed to the SQLParameter.
(already posted in microsoft.public.sqlserver.programming - but it was
suggested that I cross-post to here)
Michael (michka) Kaplan [MS] - 05 Jul 2005 21:12 GMT
What happens when you use a real Windows collation rather than a limited
range, SQL compatibility one?
Did you mean LATIN CAPITAL LETTER Z WITH HOOK or some other letter when you
hadd 0x0224 below?

Signature
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap
This posting is provided "AS IS" with
no warranties, and confers no rights.
>I have a situation with a c# application attempting to store and look up
> some unicode value in an NVARCHAR. Apparently the system is having issues
[quoted text clipped - 83 lines]
> (already posted in microsoft.public.sqlserver.programming - but it was
> suggested that I cross-post to here)
bryan - 06 Jul 2005 13:04 GMT
When I use a real collation, such as Latin1_General_BIN, it works just fine.
And yes, I did mean LATIN CAPITAL LETTER Z WITH HOOK but 0x0224played better
in the original newsgroup where I had posed the question.
> What happens when you use a real Windows collation rather than a limited
> range, SQL compatibility one?
>
> Did you mean LATIN CAPITAL LETTER Z WITH HOOK or some other letter when
> you hadd 0x0224 below?
Michael (michka) Kaplan [MS] - 08 Jul 2005 09:37 GMT
Stay away from the compat collations -- they really kind of suck for almost
anything Unicode-based.

Signature
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap
This posting is provided "AS IS" with
no warranties, and confers no rights.
> When I use a real collation, such as Latin1_General_BIN, it works just
> fine.
[quoted text clipped - 7 lines]
>> Did you mean LATIN CAPITAL LETTER Z WITH HOOK or some other letter when
>> you hadd 0x0224 below?