Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsFree MagazinesWhite PapersSubmit Content
Discussion GroupsASP.NETWindows FormsLanguages.NET FrameworkVisual Studio.NET
Articles.NET FrameworkASP.NETToolsWindows Forms
.NET DirectoryOpen Source ProjectsUser GroupsWeb Resources
Related Topics
Visual Basic 6SQL ServerMS AccessOther DB ProductsMS Server ProductsMore Topics ...

.NET Forum / .NET Framework / Internationalization / July 2005

Tip: Looking for answers? Try searching our database.

SQLServer & NVARCHAR not handling unicode

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
bryan - 28 Jun 2005 20:19 GMT
I have a situation with a c# application attempting to store and look up
some unicode value in an NVARCHAR.  Apparently the system is having issues
dealing with some Unicode values on inserts or lookups.

My tables:

/* table containing terms */
CREATE TABLE [dbo].[TermTable] (
[TermID] [int] IDENTITY (1, 1) NOT NULL ,
[Term] [nvarchar] (256) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
) ON [PRIMARY]

/* table containing many-to-many relation from terms to something else, with
a Unique index on FID and TERMID  */
CREATE TABLE [dbo].[TermRelTable] (
[FID] [int] NOT NULL , /* foreign key to related item */
[TermID] [int] NOT NULL
) ON [PRIMARY]
GO

Code uses SQLCommands and parameters, with the statements prepared before
execution.

Pseudocode
int foreignkeyid;
string [] Terms;
foreach term in Terms
   Lookup term in TermTable, get TERMID
   if not found, insert row in TermTable with Term, and get the
SCOPE_IDENTITY into TERMID
   insert row in TermRelTable with foreignkeyid and TERMID.

For some sets of data values (this is just one of the failing sets) the
operaiton fails.

Assume Terms[0]='xpl'
Assume Terms[1]='xpl?' where the last character is 0x0224

When the system attempts to see if the second term is in the TermTable, it
finds the FIRST term - and the subsequent insert into the TermRelTable fails
with an integrity constraint.

the SQLProfiler Trace of the relevant operations :
-----------------------------
exec sp_execute 2, @Term = N'xpl' // look up XPL

exec sp_executesql N'INSERT INTO TermTable(Term) VALUES (@Term);Select
SCOPE_IDENTITY()', N'@Term nvarchar(256)', @Term = N'xpl'  // didn't find
it - insert it into TermTable

exec sp_execute 2, @FID = 192239, @TermID = 1550945 // insert row into
relationship table

exec sp_execute 2, @Term = N'xpl?'  // look up term in TermTable, expecting
to NOT find it

exec sp_execute 2, @FID = 192239, @TermID = 1550945    // it found the OTHER
term - and attempts to insert an entry in the TEMRELTABLE - generating an
exception.
--------------------------------------

So - what do I look at to get this to work?  I do not control the source of
the terms - they can come from anywhere and so can contain any unicode data
value (this one happens to be a latin capital letter z with hook)

Note : The c# code does not have actual char constants - the characters are
read from a text file (using a StreamReader).  I included the two
assumptions to show the data values in particular that the application was
having an issue with.

I know that the characters ARE read from the file as Unicode - as I can
watch them as they're assigned to be the parameter which is sent to the dB
command and shown in the SQLProfile log.

sp_execute 2, @Term = N'xpl'
sp_execute 2, @Term = N'xpl?'

where the ? character exactly corresponds with the position of the value
0x0224 in the string passed to the SQLParameter.

(already posted in microsoft.public.sqlserver.programming - but it was
suggested that I cross-post to here)
Michael (michka) Kaplan [MS] - 05 Jul 2005 21:12 GMT
What happens when you use a real Windows collation rather than a limited
range, SQL compatibility one?

Did you mean LATIN CAPITAL LETTER Z WITH HOOK or some other letter when you
hadd 0x0224 below?

Signature

MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.

>I have a situation with a c# application attempting to store and look up
> some unicode value in an NVARCHAR.  Apparently the system is having issues
[quoted text clipped - 83 lines]
> (already posted in microsoft.public.sqlserver.programming - but it was
> suggested that I cross-post to here)
bryan - 06 Jul 2005 13:04 GMT
When I use a real collation, such as Latin1_General_BIN, it works just fine.

And yes, I did mean LATIN CAPITAL LETTER Z WITH HOOK but 0x0224played better
in the original newsgroup where I had posed the question.

> What happens when you use a real Windows collation rather than a limited
> range, SQL compatibility one?
>
> Did you mean LATIN CAPITAL LETTER Z WITH HOOK or some other letter when
> you hadd 0x0224 below?
Michael (michka) Kaplan [MS] - 08 Jul 2005 09:37 GMT
Stay away from the compat collations -- they really kind of suck for almost
anything Unicode-based.

Signature

MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.

> When I use a real collation, such as Latin1_General_BIN, it works just
> fine.
[quoted text clipped - 7 lines]
>> Did you mean LATIN CAPITAL LETTER Z WITH HOOK or some other letter when
>> you hadd 0x0224 below?

Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.