I'm wrestling with the best way to create some C# code to find all of
the unique email id strings I receive within an array. Basically I
want to efficiently eliminate all of the duplicates. Typically there
will be between 1 and a few hundred unique email ids, and in the worst case
it might be 20,000 to 50,000 unique email ids.
My first pass was to use a C# hashtable. I use the hashtable to tell
me whether or not I've already seen the email id string in the
array. I don't care about the ordering of the email ids; all I care
about is finding all of the unique email ids.
Does anyone have any suggestions for a better solution?
Are there any gotchas for using a C# hashtable for this solution?
Jon Skeet [C# MVP] - 22 Feb 2008 15:47 GMT
> I'm wrestling with the best way to create some C# code to find all of
> the unique email id strings I receive within an array. Basically I
[quoted text clipped - 10 lines]
>
> Are there any gotchas for using a C# hashtable for this solution?
It would help if you'd say which version of .NET you're using.
In .NET 3.5 I'd just call Distinct() on the array.
In .NET 2.0 I'd use a Dictionary<string,string> using the same value as
the key.
In .NET 1.1 I'd use Hashtable.
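The three options above can be sketched roughly as follows. This is only an illustration, not code from the thread: the class and method names (UniqueEmails, WithDistinct, WithDictionary, WithHashtable) are made up for the example, and it assumes the input array contains no null entries, since Dictionary and Hashtable reject null keys.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; the names are illustrative only.
static class UniqueEmails
{
    // .NET 3.5: LINQ's Distinct() with the default string comparer
    // (ordinal, case-sensitive).
    public static string[] WithDistinct(string[] emails)
    {
        return emails.Distinct().ToArray();
    }

    // .NET 2.0: a Dictionary used as a set, storing each string
    // as both key and value.
    public static string[] WithDictionary(string[] emails)
    {
        Dictionary<string, string> seen = new Dictionary<string, string>();
        foreach (string email in emails)
        {
            // The indexer overwrites an existing entry, so a duplicate
            // does not throw the way Add() would.
            seen[email] = email;
        }
        string[] result = new string[seen.Count];
        seen.Keys.CopyTo(result, 0);
        return result;
    }

    // .NET 1.1: the non-generic Hashtable, used the same way.
    public static string[] WithHashtable(string[] emails)
    {
        Hashtable seen = new Hashtable();
        foreach (string email in emails)
        {
            seen[email] = email;
        }
        string[] result = new string[seen.Count];
        seen.Keys.CopyTo(result, 0);
        return result;
    }
}
```

All three are hash-based, so even 50,000 ids should be handled in roughly linear time. None of them promises any particular ordering of the result, which matches the stated requirement.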

Signature
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk
Arne Vajhøj - 22 Feb 2008 22:07 GMT
> I'm wrestling with the best way to create some C# code to find all of
> the unique email id strings I receive within an array. Basically I
[quoted text clipped - 8 lines]
>
> Does anyone have any suggestion for a better solution?
If you're on 3.5, then HashSet is a possibility.
Arne
Jon Skeet [C# MVP] - 22 Feb 2008 22:17 GMT
> > I'm wrestling with the best way to create some C# code to find all of
> > the unique email id strings I receive within an array. Basically I
[quoted text clipped - 10 lines]
>
> If on 3.5 then HashSet was a possibility.
You *could* explicitly use a HashSet - but why go to the work of doing
it yourself when Enumerable.Distinct() does it all for you? :)
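For comparison, the explicit HashSet version might look something like the sketch below. The class and method names are hypothetical, and the case-insensitive comparer is an assumption added for illustration: nothing in the thread says whether "Bob@example.com" and "bob@example.com" should count as the same id, so drop the comparer argument if they should not.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative only.
static class UniqueEmailsWithHashSet
{
    public static HashSet<string> Collect(IEnumerable<string> emails)
    {
        // Assumption: ids that differ only by case are duplicates.
        // Use new HashSet<string>() for ordinal, case-sensitive matching.
        HashSet<string> unique = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string email in emails)
        {
            // Add() returns false when the value is already present,
            // so duplicates are skipped without an explicit check.
            unique.Add(email);
        }
        return unique;
    }
}
```

Distinct() also has an overload that takes an IEqualityComparer<string>, so the same comparer choice is available there, e.g. emails.Distinct(StringComparer.OrdinalIgnoreCase).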
Arne Vajhøj - 23 Feb 2008 03:35 GMT
>>> I'm wrestling with the best way to create some C# code to find all of
>>> the unique email id strings I receive within an array. Basically I
[quoted text clipped - 12 lines]
> You *could* explicitly use a HashSet - but why go to the work of doing
> it yourself when Enumerable.Distinct() does it all for you? :)
When you have a hammer, problems tend to look like nails.
:-)
Is Distinct() using a HashSet internally?
Arne