.NET Forum / .NET Framework / Performance / March 2005
Collection sorted on last accessed object
|
|
Thread rating:  |
Vani Murarka - 25 Mar 2005 18:37 GMT Hi Everyone,
Does .NET offer any collection class which will give me objects last *accessed* such that I may build a least-recently-used cache that kills off objects that haven't been used for awhile?
Or is there any other way to implement this kind of a cache / collection where one can do this kind of cleanup based on least-recently-used objects?
Java has a collection class called LinkedHashSet which enables one to do this. Is there something similar in .NET - or some other way of doing this?
Any pointers will be really appreciated.
Thanks a ton
Regards
Vani
at - 25 Mar 2005 19:55 GMT Just move it to the front on access and kill those at the back. Any collection class that allows the move and the kill will do for this, like ArrayList.
> Hi Everyone, > [quoted text clipped - 17 lines] > > Vani Helge Jensen - 26 Mar 2005 10:24 GMT > Just move it to the front on access and kill those at the back. Any > collection class that allows the move and the kill will do for this, like > ArrayList. You may run into performance-problems, ArrayList really doesn't perform that well on anything but appending.
You could look into using a splay-tree data-structure, it fits your requirements pretty nice and is really easy to implement.
 Signature Helge Jensen mailto:helge.jensen@slog.dk sip:helge.jensen@slog.dk -=> Sebastian cover-music: http://ungdomshus.nu <=-
at - 26 Mar 2005 12:43 GMT That interests and surprises me, I have not measured the ArrayList's performance for moving elements around but could you provide some links to confirm your statement? I do not contradict your statement but I would like some confirmation.
Then, an ArrayList comes standard meaning less code to write and as long as its performance is ok why not stick with it? One can always change to another container as the need arises and have the process itself up and running first and then see what the performance is.
>> Just move it to the front on access and kill those at the back. Any >> collection class that allows the move and the kill will do for this, like [quoted text clipped - 5 lines] > You could look into using a splay-tree data-structure, it fits your > requirements pretty nice and is really easy to implement. Helge Jensen - 26 Mar 2005 15:19 GMT > That interests and surprises me, I have not measured the ArrayList's > performance for moving elements around but could you provide some links to > confirm your statement? I do not contradict your statement but I would like > some confirmation. Output of attached source (measured execution time since start):
00:00:00 mutation from end... 00:00:00.0100144 mutation from end... done 00:00:00.0100144 mutation from start... 00:00:12.3777984 mutation from start... done
It's not really suprising, since lists implemented as arrays has to copy the tail of the list when inserting/removing.
> Then, an ArrayList comes standard meaning less code to write and as long as > its performance is ok why not stick with it? One can always change to > another container as the need arises and have the process itself up and > running first and then see what the performance is. Yes, the initial implementation can easily be done using ArrayList, and if the profiler shows a performance problem there you can re-implement.... but, I have implemented caching in the past, and array's really aren't a good datastructure for it.
BTW: How are you going to search the cache? if it gets mederately large you should probably have a seperate indexing on it, by hashing for example.
If my memory seves me right wrt. LinkedHashSet(JAVA), there is no way to rearrange the ordering, moving most-used elements to the front, so it really isn't very good for caching either.
 Signature Helge Jensen mailto:helge.jensen@slog.dk sip:helge.jensen@slog.dk -=> Sebastian cover-music: http://ungdomshus.nu <=-
at - 26 Mar 2005 15:39 GMT I thought ArrayList was backed by a doubly linked list, I guess I was wrong. If implemented using fixed size arrays you are completely right.
Whatever data structure, as long as it is has the operations that a doubly linked list has (implemented as such or as some tree flavour) the one most in front is the most recently accessed one, the next one the one accessed before that and so on. From the other end it works the same way hence te requirement for doybly linked list semantics.
I am not considering random access here, just access starting from head and starting from tail and step from there.
Well, thanks anyway, for pointing out the ArrayList inefficiency.
>> That interests and surprises me, I have not measured the ArrayList's >> performance for moving elements around but could you provide some links [quoted text clipped - 31 lines] > rearrange the ordering, moving most-used elements to the front, so it > really isn't very good for caching either. --------------------------------------------------------------------------------
> using System; > using System.Collections; [quoted text clipped - 23 lines] > } > } at - 26 Mar 2005 17:26 GMT But, I tried the following
I measured
ArrayList al = new ArrayList(); for(int i = 0; i < 100000; i++) { al.Add(new TestItem(i)); }
TestItem ti; int j = 99999; Console.WriteLine("{0} starting turn around", DateTime.UtcNow); for(int i = 0; i < 100000; i++) { ti = (TestItem)al[j]; al.RemoveAt(j); al.Insert(0, ti); j--; } Console.WriteLine("{0} turn around finished", DateTime.UtcNow);
and
public class TestItem { private int m;
public TestItem(int i) { m = i; } }
With the following result:
3/26/2005 4:18:20 PM starting turn around 3/26/2005 4:18:59 PM turn around finished
That is about 0.0004 seconds per move, I'd say that is better than fast enough, at least it sufficiently fast so if moving an element to the front is all that is needed I would initially just use an ArrayList.
Regards,
At
>I thought ArrayList was backed by a doubly linked list, I guess I was >wrong. If implemented using fixed size arrays you are completely right. [quoted text clipped - 76 lines] >> } >> } Helge Jensen - 26 Mar 2005 23:44 GMT >>I thought ArrayList was backed by a doubly linked list, I guess I was >>wrong. If implemented using fixed size arrays you are completely right. It's backed by an Array :)
Based on experimental evidence, the Array is reallocated when full. I'm guessing it uses reallocation by multiplying the current size (O(n) amortized for n inserts).
>>I am not considering random access here, just access starting from head >>and starting from tail and step from there. Didn't you mention a cache? what do you do lookup based on?
>>Well, thanks anyway, for pointing out the ArrayList inefficiency. Well, it may or may not be a problem... atleast you know why it could be slow.
 Signature Helge Jensen mailto:helge.jensen@slog.dk sip:helge.jensen@slog.dk -=> Sebastian cover-music: http://ungdomshus.nu <=-
Helge Jensen - 26 Mar 2005 23:50 GMT > That is about 0.0004 seconds per move, I'd say that is better than fast > enough, at least it sufficiently fast so if moving an element to the front > is all that is needed I would initially just use an ArrayList. Good for you.
The expected expense of a randomly remove/insert would be O((n/2(^2)).
If the cache is smaller than 10k this may be an acceptible delay for you, especially if the cached calculation is very expensive or lookups are infrequent.
Of course you can always change to a "better" implementation later.
 Signature Helge Jensen mailto:helge.jensen@slog.dk sip:helge.jensen@slog.dk -=> Sebastian cover-music: http://ungdomshus.nu <=-
Jeff J. - 27 Mar 2005 18:58 GMT If the important element is expiration of older items, rather than sorting by access, you can use the Cache object from System.Web.Caching. It supports timed expirations, both fixed and sliding, as well as callbacks fired on removed items.
I don't know what the associated overhead is, but I hope that helps.
Good luck.
~ Jeff
> Hi Everyone, > [quoted text clipped - 17 lines] > > Vani Vani Murarka - 28 Mar 2005 13:08 GMT Hi Everyone,
Thanks for all the responses.
Firstly, the Java class I had intended to mention is the LinkedHashMap, because I do need to keep key-value pairs and refer to them for my operations apart from the clean-up task of removing items which have not been accessed for long.
Regarding using System.Web.Caching - This is a small server component (say S1) that I am making, which is to be used by several server business-logic components (A, B, C etc.) exposed via a web service, which in turn will be called by the main web application ASPX pages.
A, B, C components will all be calling S1 which remains available commonly available (S1 will be a Singleton - non-instantiable Class with all methods as static). The web application is not supposed to be aware of S1, nor is S1 to be aware of the web pages of the web application.
In such a scenario, I guess System.Web.Caching cannot be used - right?
What I am doing at present is - Keeping the information in a private static DataTable of S1. First column is the primary key column by which I normally have to get information. Second column is LastAccessed - which is updated with current server time everytime that item is read, or inserted or updated.
For the normal operations of S1, the DataTable will be used by its first column which is being defined as the primary key.
For the clean up task to remove items from the DataTable, I am thinking of keeping another thread running, which will, time to time, look for items (filter via dataview) where (current time - lastaccessed) = greater than a set timeout value and delete those items.
Is there any better way to do this?
ArrayList will not be appropriate I think because moving items around might be an overhead and because I need to access items by that key value.
Regards
Vani
> If the important element is expiration of older items, rather than > sorting by access, you can use the Cache object from [quoted text clipped - 28 lines] > > > > Vani Helge Jensen - 28 Mar 2005 15:05 GMT > A, B, C components will all be calling S1 which remains available > commonly available (S1 will be a Singleton - non-instantiable Class > with all methods as static). The web application is not supposed to be That's not the singleton pattern from the GOF book.
The singleton from GOF would be:
public class S1 { public static S1 Intance = new S1(); protected S1() {}; public T f(...); }
Only one static, the "instance" operation.
> For the clean up task to remove items from the DataTable, I am > thinking of keeping another thread running, which will, time to time, > look for items (filter via dataview) where (current time - > lastaccessed) = greater than a set timeout value and delete those > items. Usually caching means keeping N instances around, not just removing "too-old" ones.
> Is there any better way to do this? Dunno... why don't you keep cache in memory?
> ArrayList will not be appropriate I think because moving items around > might be an overhead and because I need to access items by that key > value. You would can use two data-structures, an IDictionary for the key-based lookup and a another where the objects are sorted after their last-accessed property (this is also what a database would do for multiple-indexed data).
You could also do something like in the attached file, using a bit more memory (3 refs pr. node) to get a runtime-efficient impl, but if you can live with having the cache-data in a databse-like thingy you probably don't need that.
 Signature Helge Jensen mailto:helge.jensen@slog.dk sip:helge.jensen@slog.dk -=> Sebastian cover-music: http://ungdomshus.nu <=-
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|