Object identity
Stephan Keil - 22 Jan 2006 18:23 GMT
Hi all,
I am a novice with .NET and I am wondering if there is something like an "identity value" of an object. I mean something like the object's address in C++ or C, i.e. a fixed unique value per object, which can be used e.g. to put objects in an associative container (a hash value is not an alternative, as it is not fixed during the object's lifetime). I know that the garbage collector moves objects around in memory, so the gc pointer cannot be used (at least as long as it is not "pinned"). Is there anything like a fixed pointer in .NET?
Thx & regards,
Stephan
Vadym Stetsyak - 22 Jan 2006 18:29 GMT
object.GetHashCode() - "serves as a hash function for a particular type, suitable for use in hashing algorithms and data structures like a hash table"
-- Vadym Stetsyak aka Vadmyst http://vadmyst.blogspot.com
Stephan Keil - 22 Jan 2006 20:22 GMT
> object.GetHashCode() -
Now, after re-reading the documentation of GetHashCode(), I am totally confused :-o I will open another thread for this.
Let me describe with an example the kind of problem I would like to tackle: Suppose you have a container with, say, 1 million object references. Now you get another object reference, and the job is to efficiently find out whether this particular object is already contained (by identity) in the container. (Side note: the objects could have changed their state after they were inserted into the container; as I understand it, this makes GetHashCode() useless for this problem.)
You could iterate the whole container and check for identity equality with each container element. This results in 1 million comparisons!
If there were something like a fixed object identity value with a total order (like pointers in C++), you could sort the container by that value and do a binary search. The "contained check" would then be possible with approx. 20 comparisons!
Thus, a fixed object identity could be of great value, but I have to admit that I don't know how the .NET framework could make such an identity available without overhead.
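For readers who have not seen the C++ technique mentioned above (sort by address, then binary search), it could be sketched roughly as follows. This is only an illustration: `Item`, `sort_by_identity`, `contains_by_identity` and `identity_lookup_demo` are made-up names, and the approach relies on object addresses staying fixed, which holds in C++ but not for managed .NET objects that the garbage collector may move.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

struct Item { int state = 0; };  // hypothetical payload type

// Sort the index by address. std::less gives a total order over pointers,
// which plain operator< does not guarantee for unrelated objects.
void sort_by_identity(std::vector<const Item*>& index) {
    std::sort(index.begin(), index.end(), std::less<const Item*>());
}

// O(log n) membership test by identity; `index` must already be sorted.
bool contains_by_identity(const std::vector<const Item*>& index, const Item* p) {
    assert(std::is_sorted(index.begin(), index.end(), std::less<const Item*>()));
    return std::binary_search(index.begin(), index.end(), p,
                              std::less<const Item*>());
}

// Small demonstration: returns true if lookups behave as expected.
bool identity_lookup_demo() {
    std::vector<std::unique_ptr<Item>> owner;  // keeps the objects alive
    std::vector<const Item*> index;
    for (int i = 0; i < 1000; ++i) {
        owner.push_back(std::make_unique<Item>());
        index.push_back(owner.back().get());
    }
    sort_by_identity(index);

    owner[42]->state = 7;  // changing state does not change the address
    Item outsider;         // never added to the index

    return contains_by_identity(index, owner[42].get()) &&
           !contains_by_identity(index, &outsider);
}
```

With one million entries, the binary search needs about 20 comparisons, since 2^20 is roughly one million.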
- Stephan
Ted Williams - 22 Jan 2006 20:52 GMT
If these are your own classes, use a Guid to uniquely identify them and initialize the Guid at the time the object is constructed. If you override the GetHashCode() method to generate a hash code based on the value of the Guid, all the standard .NET container classes will function as expected.
-Ted
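The idea in Ted's post (an identity assigned once at construction, with hashing based only on that identity) might look something like the sketch below. It is written in C++ to match the code elsewhere in this thread, and it swaps the Guid for a simple process-wide counter; `Entity`, `EntityIdHash`, `EntityIdEqual` and `identity_field_demo` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical class carrying an identity assigned once at construction.
// A process-wide counter stands in for the Guid suggested in the post.
class Entity {
public:
    explicit Entity(std::string name) : id_(next_id()), name_(std::move(name)) {}
    Entity(const Entity&) = delete;             // copying would duplicate the id
    Entity& operator=(const Entity&) = delete;

    std::uint64_t id() const { return id_; }
    void rename(std::string name) { name_ = std::move(name); }  // mutable state

private:
    static std::uint64_t next_id() {
        static std::atomic<std::uint64_t> counter{0};
        return ++counter;
    }
    const std::uint64_t id_;
    std::string name_;
};

// Hash and equality look only at the id, never at the mutable state.
struct EntityIdHash {
    std::size_t operator()(const Entity* e) const {
        return std::hash<std::uint64_t>()(e->id());
    }
};
struct EntityIdEqual {
    bool operator()(const Entity* a, const Entity* b) const {
        return a->id() == b->id();
    }
};

bool identity_field_demo() {
    Entity a("first"), b("second");
    assert(a.id() != b.id());       // each object gets its own id
    std::unordered_set<const Entity*, EntityIdHash, EntityIdEqual> seen;
    seen.insert(&a);
    a.rename("renamed");            // the hash is unaffected by this change
    return seen.count(&a) == 1 && seen.count(&b) == 0;
}
```

The design point is that the hash never looks at state that can change, so the object can be modified freely while it sits in the container.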
Stephan Keil - 22 Jan 2006 21:27 GMT
> If these are your own classes, use a Guid to uniquely identify them
What if not? Sorry for being pertinacious, I just want to know if .NET has something to offer to generally solve these issues (e.g. how does the .NET serialization mechanism prevent cycles?).
- Stephan
Patrik Löwendahl [C# MVP] - 22 Jan 2006 21:35 GMT
Hello Stephan,
It uses the reference to identify the objects, as Mattias stated in an earlier post.
-- Patrik Löwendahl [C# MVP] http://www.lowendahl.net http://www.cornerstone.se
Lasse Vågsæther Karlsen - 23 Jan 2006 09:30 GMT
I think you're going to have a problem however you want to do this.
Let me just talk my way through this and I'll try to explain what I mean.
First, you need to add all those objects to the container in such a way that they can be quickly retrieved. Let's say you have something of an "identity" value for each of those objects.
This identity value would either have to be:
- based on the values in the object
- a unique value not related to the contents of the object
Let's go with the first option and let's use the hash code of the object as returned by GetHashCode() as this value. If, after adding an object to a hash table, you modify the object in such a way that the hash code returned by GetHashCode() for that object is now different from the one used when the object was added, then yes, you have a problem.
In this context, you cannot modify the values of the keys in such a way that the hash code changes. Put differently, the keys should be immutable, if not enforcable then at least you should treat them as such and not use them.
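The failure described above can be reproduced with a deliberately tiny hand-rolled hash set. The sketch is in C++ to match the code elsewhere in this thread, and `Key`, `TinyHashSet` and `mutated_key_demo` are made-up names; a real hash table is more elaborate, but the bucket lookup goes wrong in the same way.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Key { int value; };  // hypothetical key whose hash depends on mutable state

// Deliberately tiny hash set: the bucket is chosen from the key's
// current contents, just like a content-based hash code.
class TinyHashSet {
public:
    void add(const Key* k) { buckets_[bucket_of(*k)].push_back(k); }

    bool contains(const Key* k) const {
        const std::vector<const Key*>& b = buckets_[bucket_of(*k)];
        return std::find(b.begin(), b.end(), k) != b.end();
    }

private:
    static constexpr std::size_t kBuckets = 8;

    static std::size_t bucket_of(const Key& k) {
        return static_cast<std::size_t>(k.value) % kBuckets;
    }

    std::array<std::vector<const Key*>, kBuckets> buckets_;
};

bool mutated_key_demo() {
    Key k{1};
    TinyHashSet set;
    set.add(&k);                     // stored in bucket 1
    bool before = set.contains(&k);  // found: lookup also goes to bucket 1
    assert(before);

    k.value = 2;                     // the hash now points at bucket 2
    bool after = set.contains(&k);   // not found: the entry is still in bucket 1

    return before && !after;
}
```

The object is still in the table, but no lookup will reach it until its contents hash to the original bucket again.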
The bonus of this is that if you later on construct a wholly new object with the same values internally as an object already present in the hash table, then those two objects should return the same hash code and thus you could easily detect that the object values are already in another object in your container.
This is, as I see it, what you want.
Now, let's go with the other option. Make each object have a unique value unrelated to the contents of the object.
This would of course make it possible for you to change the contents of the object without messing up the hash table, as the hash code you would use in the hash table is still the same as the original one.
However, when you later on construct a new object with the same values as an existing object in the hash table, this new object gets a new, unique value that will not be found in the hash table. Or, if it is found, chances are it's not the object you're interested in, i.e. another object just happens to have that particular hash code.
Another option would be to base the hash code on the values from the object and then cache it so that subsequent calls to GetHashCode() would return the same value even if the contents of the object have changed.
This is also problematic regarding a hash table as it will use an equality test once it finds the hashed values in the table to determine which one in particular you want, and if the contents have changed...
Basically, it all comes down to one thing, the keys in the hash table should never change. If they do then you need to take the key+value out of the hash table and re-add it with the new key. Anything else won't work.
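The "take it out and re-add it" step could be sketched like this, again in C++ for consistency with the rest of the thread. `rekey` and `rekey_demo` are hypothetical helpers, not part of any library.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Move the value stored under old_key to new_key. Returns false if old_key
// is absent or new_key is already taken (in which case nothing changes).
template <typename Map>
bool rekey(Map& map, const typename Map::key_type& old_key,
           const typename Map::key_type& new_key) {
    auto it = map.find(old_key);
    if (it == map.end() || map.count(new_key) != 0) return false;
    auto value = std::move(it->second);      // take the value out ...
    map.erase(it);                           // ... remove the old entry ...
    map.emplace(new_key, std::move(value));  // ... and re-add under the new key
    return true;
}

bool rekey_demo() {
    std::unordered_map<std::string, int> ages{{"alice", 30}};
    bool moved = rekey(ages, "alice", "alicia");
    assert(ages.size() == 1);  // still exactly one entry
    return moved && ages.count("alice") == 0 && ages.at("alicia") == 30;
}
```

The key stored in the table is never modified in place; the entry is removed under its old key and inserted again under the new one, so the table's bucket structure stays consistent.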
-- Lasse Vågsæther Karlsen http://usinglvkblog.blogspot.com/ mailto:lasse@vkarlsen.no PGP KeyID: 0x2A42A1C2
Lasse Vågsæther Karlsen - 23 Jan 2006 09:33 GMT Some typos. I still need to regulate my coffee intake it seems :P
<snip>
> In this context, you cannot modify the values of the keys in such a way
> that the hash code changes. Put differently, the keys should be
> immutable, if not enforcable then at least you should treat them as such
... if not enforcable by the compiler/runtime then at least ...
> and not use them.
.. and not change them.
<snip>
-- Lasse Vågsæther Karlsen http://usinglvkblog.blogspot.com/ mailto:lasse@vkarlsen.no PGP KeyID: 0x2A42A1C2
guy - 23 Jan 2006 03:16 GMT
Hi Vadym,
Do not use the hash. Most of the time it is OK, but if, for example, you have a long (64 bit), multiple values will map to the same hash (32 bit), and it cannot be guaranteed that the same won't happen for other objects.
hth
guy
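The point about 64-bit values sharing a 32-bit hash is easy to show concretely. The sketch below folds a 64-bit value into 32 bits by XOR-ing its two halves, which is one common scheme and is used here only as an illustration; `fold_hash`, `same_key` and `collision_demo` are made-up names. Collisions by themselves are expected: a hash table only uses the hash to pick a bucket and then relies on an equality test, so a hash code works as a bucket selector but not as a unique identity.

```cpp
#include <cassert>
#include <cstdint>

// Fold a 64-bit value into 32 bits by XOR-ing its halves (illustrative scheme).
std::uint32_t fold_hash(std::uint64_t v) {
    return static_cast<std::uint32_t>(v) ^ static_cast<std::uint32_t>(v >> 32);
}

// Two values are the same key only if they compare equal;
// a matching hash merely puts them in the same bucket.
bool same_key(std::uint64_t a, std::uint64_t b) { return a == b; }

bool collision_demo() {
    const std::uint64_t a = 0;
    const std::uint64_t b = 0x0000000100000001ULL;  // high half 1, low half 1
    assert(a != b);
    // Both fold to 0, yet they are clearly different values.
    return fold_hash(a) == fold_hash(b) && !same_key(a, b);
}
```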
Paul Gielens - 22 Jan 2006 19:02 GMT
Hi Stephan,
No, since you have no control where on the heap an object is created, you can't get its address. We use (not necessarily for database mapping) the IdentityField (http://www.martinfowler.com/eaaCatalog/identityField.html) pattern.
Best regards, Paul Gielens
Visit my blog @ http://weblogs.asp.net/pgielens/
Mattias Sjögren - 22 Jan 2006 19:04 GMT
Stephan,
>i.e. a fixed unique value per object, which can be used e.g.
>to put objects in an associative container
Can't you store the object reference itself?
>I know that the garbage collector moves objects around in the memory, so
>the gc pointer cannot be used (at least as long as it is not "pinned").
When that happens all references are updated.
Mattias
-- Mattias Sjögren [C# MVP] mattias @ mvps.org http://www.msjogren.net/dotnet/ | http://www.dotnetinterop.com Please reply only to the newsgroup.
Stephan Keil - 22 Jan 2006 19:42 GMT
Wow, that's a lot of answers in a very short time. Thanks to all.
>>i.e. a fixed unique value per object, which can be used e.g.
>>to put objects in an associative container
>
> Can't you store the object reference itself?
It's a general question; I don't have a particular problem to solve. But in C++, for example, I sometimes use an STL set of object pointers to track which objects I have already seen (an STL set is an associative container, typically implemented as a sorted, balanced binary tree for efficient lookup and manipulation). E.g. (sorry for posting C++ code):
#include <set>

class Y { ... };

class X {
    std::set<Y*> m_processed; // keep track of already processed Ys
public:
    // ...
    void ProcessY(Y* y) {
        // process special Y object, if not already done
        if (m_processed.find(y) != m_processed.end()) {
            return; // already processed
        }
        // ... process y ...
        m_processed.insert(y); // remember y
    }
};
How, e.g. does serialization work in .NET? During serialization, the already serialized objects must be remembered by identity in some data structure to prevent cycling (or they must be marked, which is only possible with runtime support). How is _efficient_ identity lookup possible here?
- Stephan
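As an illustration of the "remember already serialized objects by identity" idea raised in the question above, here is a minimal C++ sketch that walks an object graph containing cycles and visits each object once, using a hash set keyed on the object's address. `Node`, `serialize`, `serialize_graph` and `cycle_demo` are hypothetical names, and the sketch makes no claim about how the .NET serializer is actually implemented.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

struct Node {                   // hypothetical object graph node
    std::string name;
    std::vector<Node*> refs;    // outgoing references, may form cycles
};

// Depth-first walk that "serializes" each node exactly once.
// `seen` remembers visited nodes by address, i.e. by identity.
void serialize(const Node* n, std::unordered_set<const Node*>& seen,
               std::string& out) {
    if (n == nullptr || !seen.insert(n).second) return;  // null or already visited
    out += n->name;
    out += ';';
    for (const Node* r : n->refs) serialize(r, seen, out);
}

std::string serialize_graph(const Node* root) {
    std::unordered_set<const Node*> seen;
    std::string out;
    serialize(root, seen, out);
    return out;
}

bool cycle_demo() {
    Node a{"a", {}}, b{"b", {}};
    a.refs.push_back(&b);
    b.refs.push_back(&a);   // cycle a -> b -> a
    a.refs.push_back(&a);   // self reference
    std::string out = serialize_graph(&a);
    assert(!out.empty());
    return out == "a;b;";   // each node appears once despite the cycles
}
```

The lookup is a hash-set membership test with expected constant cost per object, so the whole walk stays linear in the size of the graph.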