.NET Forum / Languages / C# / September 2007
REQ: High Performance Access to a Static Object's List<string>
|
|
Thread rating:  |
Mark S. - 12 Sep 2007 05:12 GMT Hello,
The app in question is lives on a Windows 2003 server with .NET 2.0 running IIS 6. The page of the app in question processes 2000 get requests a second during peak loads.
The app uses a Static Object. In this object is a generic List<String>. For every page request this list is looped over only reading not writing each value.
Question is, would it be a better practices to create an copy of the List before looping over it? List<string> myList = MyStaticObject.myList; If this is just a reference? If so, then it would seem to be less efficient? If that's true, would it be better to deep clone the list and then loop?
Testing on the dev machine under stress loading hasn't shown any light on the best practices way to go to reach optimal efficiency.
Thank you in advance for your thoughts.
M
Marc Gravell - 12 Sep 2007 05:27 GMT > List<string> myList = MyStaticObject.myList; > If this is just a reference? If so, then it would seem to be less efficient? > If that's true, would it be better to deep clone the list and then loop? How would that make it any less efficient? As long as the contents aren't changing occasionally (so that locking is an issue) then a deep- clone can only add time. Technically, I belive a string[] is marginally quicker than a List<string>, so perhaps swap for that? However - better is to look at a profiler and see where your time is really going. On a string[], I believe (IIRC) indexer access is again /marginally/ quicker - but your own tests may help.
Marc
Mark S. - 12 Sep 2007 05:38 GMT > How would that make it any less efficient? If it's a reference then why create a new list? I'd simply loop over the static object's list directly.
Maximum concurrency is what I'm striving for. If only N number of requests can read the static object's list then I was wondering if making a copy of the list would more quickly release the read to be used by another request. Otherwise that read hold stays in effect during the loop. Which then begs question, does the creating the copy consume more the CPU thereby negating any gains.
Peter Duniho - 12 Sep 2007 06:40 GMT >> How would that make it any less efficient? > > If it's a reference then why create a new list? I'd simply loop over the > static object's list directly. Well, one reason to copy to a local variable would be so that the static object's reference could be updated by one thread without affecting the instance being used by another.
This would assume that the list instances themselves are immutable, as I suggested in my other reply.
If changes to the list are always in place then, yes...you'll need to synchronize access to the list so that, at a minimum, writing to the list only occurs when no other thread is also trying to read from it.
Easier would be to synchronize all access to the list, but then even readers of the list would wind up serialized, which could hinder performance. It's a classic trade-off...easy and optimal are not always the same. :)
> Maximum concurrency is what I'm striving for. If only N number of requests > can read the static object's list then I was wondering if making a copy of > the list would more quickly release the read to be used by another request. To some extent, that depends on what you're doing during the read. If each element of the list requires extensive time to process during the iteration then it's possible copying the list could speed things up. However, you may still have synchronization issues between individual list elements if you do a shallow copy, or you may not find there's any significant performance benefit if you do a deep copy.
> Otherwise that read hold stays in effect during the loop. Which then begs > question, does the creating the copy consume more the CPU thereby negating > any gains. Only by measuring it will you know for sure. It all depends on how expensive a deep copy is, whether you even need a deep copy, and how much work the actual use of the list involves.
Personally, I'd go for the immutable list design, but then that's probably already apparent since I mentioned it twice already. :) I think that, at the very least, copying the list each time it changes (which would be required if the list instance itself is immutable) is likely to be much more efficient than copying it each time you actually use it.
Pete
Marc Gravell - 12 Sep 2007 10:01 GMT > If it's a reference then why create a new list? When did I suggest that?
> I'd simply loop over the static object's list directly. Wasn't that your original question?
Peter Duniho - 12 Sep 2007 06:14 GMT > [...] > The app uses a Static Object. In this object is a generic List<String>. For [quoted text clipped - 3 lines] > Question is, would it be a better practices to create an copy of the List > before looping over it? Not if you create a new copy each time you need to loop over it. That would obviously just make things slower, because you'd iterate the list twice instead of just once.
However, if you make the list essentially immutable, only replacing it if and when it needs to be changed by some other part of your architecture, then you should be able to have as many threads reading simultaneously from the list as you like without any problems and without having to create any copies of the list.
Make the reference itself volatile, initialize a new version when necessary, and just assign the reference to the new version when it's been completely initialized.
> List<string> myList = MyStaticObject.myList; If this is just a reference? If > so, then it would seem to be less efficient? Less efficient than what? How? Yes...all you are doing is copying the reference to the list from one variable or property to the other variable. It seems to me that should be very efficient. Vastly more efficient than cloning, deep or otherwise, the entire list.
> If that's true, would it be > better to deep clone the list and then loop? In what way would cloning be better?
> Testing on the dev machine under stress loading hasn't shown any light on > the best practices way to go to reach optimal efficiency. If you're only processing 2000 requests per second, you may not be putting enough load on the system to note significant differences between designs. Maybe try increasing the load until you reach the maximum throughput for a given design. Then you can modify the design and see if things get better or worse, by testing the system again to maximum load and seeing whether the new maximum load is different from the previous one, and by how much.
Pete
Jon Skeet [C# MVP] - 12 Sep 2007 07:49 GMT > The app in question is lives on a Windows 2003 server with .NET 2.0 running > IIS 6. The page of the app in question processes 2000 get requests a second [quoted text clipped - 3 lines] > every page request this list is looped over only reading not writing each > value. How long is the list? I wouldn't expect iterating over a list 2000 times a second would be significant in terms of performance unless it's really pretty long.
For instance, here's code to iterate over a list of 10000 entries, 20000 times. On my laptop (which is far from "server class") it takes about 3 seconds. That's actually longer than I expected, but then 10000 is quite a long list. How long is your actual list?
using System; using System.Collections.Generic; using System.Diagnostics;
public class Test { static List<string> list = new List<string>(); const int Iterations = 5000; static void Main() { for (int i=0; i < 10000; i++) { list.Add(i.ToString()); } int total=0; Stopwatch sw = Stopwatch.StartNew(); for (int i=0; i < Iterations; i++) { foreach (string x in list) { total += x.Length; } } Console.WriteLine(sw.ElapsedMilliseconds); Console.WriteLine(total); } }
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Willy Denoyette [MVP] - 12 Sep 2007 11:06 GMT >> The app in question is lives on a Windows 2003 server with .NET 2.0 >> running [quoted text clipped - 46 lines] > } > } Jon, Seems quite slow, here is what I have:
csc /platform:x86 /o+ .... result: 692 194450000
It's hard to believe my system is 5X faster than yours, unless your process runs at a lower speed as a result of power management policy. Note that I run this on X64 W2K3, so I need to specify X86 in order to load the 32 bit CLR and JIT32.
Willy.
Jon Skeet [C# MVP] - 12 Sep 2007 11:12 GMT On Sep 12, 11:06 am, "Willy Denoyette [MVP]" <willy.denoye...@telenet.be> wrote:
> Seems quite slow, here is what I have: > [quoted text clipped - 7 lines] > Note that I run this on X64 W2K3, so I need to specify X86 in order to load > the 32 bit CLR and JIT32. Nope, there's a really good reason for the discrepancy. I posted the *numbers* for doing 20000 iterations, but the *code* for doing 5000 iterations. So your machine is just a little bit faster than mine, which is very plausible.
Oops :(
Jon
Willy Denoyette [MVP] - 12 Sep 2007 11:48 GMT > On Sep 12, 11:06 am, "Willy Denoyette [MVP]" > <willy.denoye...@telenet.be> wrote: [quoted text clipped - 20 lines] > > Jon Yep, really good reason indeed.....
2617 777800000
Willy.
Mark S. - 12 Sep 2007 11:25 GMT > string[] vs List<string> Thanks for the tip. I'll look into it.
> List length I should have detailed that, it's a short list, most the time under 20, but could grow to 1000.
> How long will each loop take to process The entire page should never toke more than 100 mil seconds to complete worse case, the process in the loop take 70% of the processing time of the entire page.
> is the list immutable? The list is never altered by code in the page. There is a single thread timer background process that builds a tempory list from the db, once built, it locks the entire static object and replaces the old object with a new one object which contains the new list. (oldStaticObj = tempNewObj) This happens every 2 seconds.
Thank you all for your continuing helpful feedback.
Jon Skeet [C# MVP] - 12 Sep 2007 11:35 GMT > > List length > > I should have detailed that, it's a short list, most the time under 20, but > could grow to 1000. In that case it sounds like you're worrying prematurely.
On my work machine, the test program takes only about 20 milliseconds to iterate through a list of 1000 items 2000 times. So for 2000 hits per second, that's 2% of your performance - but only using one CPU. In real life with multiple threads, you'll get cache misses etc - but the kind of chnage you're talking about isn't going to alter that.
Do you have any reason for concentrating on this particular part of performance? Do you have numbers to suggest you even have a performance issue?
Jon
Göran Andersson - 12 Sep 2007 11:38 GMT >> string[] vs List<string> > Thanks for the tip. I'll look into it. > >> List length > I should have detailed that, it's a short list, most the time under 20, but > could grow to 1000. That's still short. :)
>> How long will each loop take to process > The entire page should never toke more than 100 mil seconds to complete > worse case, the process in the loop take 70% of the processing time of the > entire page. Doing the actual looping is a small fraction of that time, so if you have any performance problems, you have to do something time consuming inside the loop.
>> is the list immutable? > The list is never altered by code in the page. There is a single thread > timer background process that builds a tempory list from the db, once built, > it locks the entire static object and replaces the old object with a new one > object which contains the new list. (oldStaticObj = tempNewObj) This happens > every 2 seconds. In that case you should just copy the reference to the list when you want to read it. That way the page will read from the old list if the list is replaced while the page is looping, and the lock is only while you copy the reference, not while you loop the list.
 Signature Göran Andersson _____ http://www.guffa.com
Mark S. - 12 Sep 2007 15:46 GMT > Do you have numbers to suggest you even have a performance issue? No issue, just looking for the best practices for this simple yet vital part of the app.
> In that case you should just copy the reference to the list when you want > to read it. That way the page will read from the old list if the list is > replaced while the page is looping, and the lock is only while you copy > the reference, not while you loop the list. I understand your point in theory, and it makes sense, but here's the code implimentation of this understanding, do I have it wrong?
myApp.aspx List <String> myNonStaticInstanceOfTheMasterList = MyStaticObject.ShortList; for (int i=0; i<myNonStaticInstanceOfTheMasterList; i++) { //do stuff }
> Sam questions We're using and the other techniques you mentioned in other parts of the app, but for this part of the app the loop each request is necessary.
Jon Skeet [C# MVP] - 12 Sep 2007 16:00 GMT > > Do you have numbers to suggest you even have a performance issue? > > No issue, just looking for the best practices for this simple yet vital part > of the app. First best practice is to keep an eye on the performance throughout development and have appropriate targets, but not worry too much about little things like this until they've been proved to be a problem.
> > In that case you should just copy the reference to the list when you want > > to read it. That way the page will read from the old list if the list is [quoted text clipped - 10 lines] > > } I assume that's meant to have a ".Count" in there?
Unless you really need the index, it would be easier to do:
foreach (string x in MyStaticObject.ShortList)
Jon
Peter Duniho - 12 Sep 2007 18:42 GMT > [...] > I understand your point in theory, and it makes sense, but here's the code [quoted text clipped - 5 lines] > //do stuff > } Other than what Jon wrote, no...seems fine.
Note that if you copy the reference and mark it as "volatile", then you should not need to do any actual locking when updating it. Just create the entire list first in a local variable, and then assign that reference to the static class's reference variable when the list has been completely initialized.
Pete
Samuel R. Neff - 12 Sep 2007 14:02 GMT What exactly are you doing when you loop over the list? Can you cache the results of the loop and just use that or do you really need to loop through the items in every request, 2000 times a second? Or is there another collection that would be more efficient.. are you looking for an item? can you use a dictionary? Would sorting the list allow you to loop differently and skip items?
HTH,
Sam
------------------------------------------------------------ We're hiring! B-Line Medical is seeking .NET Developers for exciting positions in medical product development in MD/DC. Work with a variety of technologies in a relaxed team environment. See ads on Dice.com.
>Hello, > [quoted text clipped - 18 lines] > >M
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|