.NET Forum / .NET Framework / Performance / November 2003
What would be a faster alternative to boxing/unboxing?
|
|
Thread rating:  |
Mountain Bikn' Guy - 20 Nov 2003 17:34 GMT I have a situation where an app writes data of various types (primitives and objects) into a single dimensional array of objects. (This array eventually becomes a row in a data table, but that's another story.) The data is written once and then read many times. Each primitive read requires unboxing. The data reads are critical to overall app performance. In the hopes of improving performance, we have tried to find a way to avoid the unboxing and casting. So far, nothing has worked well.
One solution we tried was to create our own simple object to wrap the primitives. In this object we used a public field that allowed reading of the primitive data value without a property or unboxing. However, overall performance went down. We are still investigating, but our current speculations are that: 1. the overhead of creating the object was significantly greater than the overhead of boxing and this offset any gains we might have gotten on the reads (a guess). 2. wrapping each primitive forced us to introduce a new virtual method call in place of the boxing/casting. Maybe the overhead of this call is what caused the performance to go down. (Another guess.)
BTW, performance went down 15%.
Does anyone have any ideas on how to implement the following:
A 1-dimension array (with normal indexing) containing heterogenous strongly typed data that allows reading/writing without casting and without boxing/unboxing.
All thoughts/suggestions are appreciated.
Paul Robson - 20 Nov 2003 17:47 GMT > Does anyone have any ideas on how to implement the following: > [quoted text clipped - 3 lines] > > All thoughts/suggestions are appreciated. Have to excuse me as I'm new to C# :)
My understanding is you can't have an array of objects with value types without boxing. As the instant you do obArray[x] = 32 it boxes it.
Can you use a second array in parallel with your first array ; casting data if you need to ; this might be quicker. Wasteful of storage :)
Other thing is to hack into the system and write inline code. Don't know if you can do this in C# even if it is a good idea.
Sorry if these ideas are dumb :)
100 - 20 Nov 2003 18:19 GMT Hi Paul,
> Have to excuse me as I'm new to C# :) > > My understanding is you can't have an array of objects with value types > without boxing. As the instant you do obArray[x] = 32 it boxes it. Not quite true. If you have
struct MyStruct { ..... }
MyStruct [] arr = new MyStruct[XXX]; arr[0] = new MyStruct() ; //that won't cause any boxing. MyStruct ms = arr[0]; //won't cause any unboxing because nothing has been boxed
Anyway, it might not cause boxing, but the value will be copied which will introduce some overhead compared to the case if we use reference type instead of value type.
B\rgds 100
Mountain Bikn' Guy - 20 Nov 2003 18:34 GMT This suggestion is almost exactly what I referred to in my original posting. Don't know why, but it was slower than boxing/unboxing. Copying isn't an issue because I'm dealing with value types (primitives) which will be copied in any case -- and this is the behavior I desire.
> Hi Paul, > > Have to excuse me as I'm new to C# :) [quoted text clipped - 21 lines] > B\rgds > 100 Paul Robson - 21 Nov 2003 09:24 GMT > Hi Paul, > [quoted text clipped - 19 lines] > introduce some overhead compared to the case if we use reference type > instead of value type. Hm. So structs are this sort of wierd interim type between values and objects ? I find the whole thing not confusing but maybe inconsistent.
If you can assign a struct to an object reference what happens if you have
sometype somefunc(object[] a) { a[1] = new MyStruct() }
My understanding is that MyStruct[] is allocated off the stack not the heap. So when the stack frame is removed, what does a[1] point to ?
It doesn't seem very "safe" - unless it does a struct copy (as in struct1=struct2) and allocates that on the heap ?
Sorry if these are dumb questions :) I only started C# a couple of days ago.
100 - 21 Nov 2003 15:12 GMT Hi Paul,
No, the question is not dumb at all.
> Hm. So structs are this sort of wierd interim type between values and > objects ? I find the whole thing not confusing but maybe inconsistent. No, the struct are value types. If you look at the docs value types are: primitives (int32, int64, char, etc), enums and structures. So take a look at MSDN. Int32 for example is decalred as a *struct*. They are called primitives because IL has special instructions to work with them directly. That is why you won't see any overloads of the arithmetic operators defined for them. Decimal, though, is not a primitive so it has overloaded the arithmetic operators and this is the reason why the performance is worse when you work with them compared with double for example.
> If you can assign a struct to an object reference what happens if you have > [quoted text clipped - 10 lines] > > Sorry if these are dumb questions :) I only started C# a couple of days ago. Value types may be in two states: boxed and unboxed, how you know. When they are in boxed state they are always in the managed heap. When they are in unboxed state they might be in the heap or in the stack. In c# we can have unboxed value type in the heap only when they are part of the reference object (which is allocated always in the heap). I said in C# because for CLR's point of view the operation of unboxing doesn't do the copy of the value type from the heap to the stack. You might have boxed value type in the heap and then doing *unbox* it gets a pointer to the data part of the object (pointer to the unboxed object) and pushed it in the evaluation stack. It doesn't copy the data from the heap. However, because in most of the cases the opertation of unboxing is followed by operatation of copying the data C# designers desided not to provide unboxing in the heap. In C# unboxing is always followed by copy. You can do it in VC++ or ILAsm, though.
So arrays are reference types. As reference types they reside in the heap. But if they are of type *value type* the items are in unboxed state in the heap as if they were data members of a class. So, if you have
int[] intArr = new int[10]; intArr[0] = 10; the value type 10 won't be boxed in the array it will be copied in the heap in unboxed state.
What happens here
> sometype somefunc(object[] a) > { > a[1] = new MyStruct() > } new MyStruct will create indeed the strucutre in the stack. But then the assignment will copy this value form the stack to the heap. The original one which is in the stack will go away when the stack frame gets removed but its verbatim copy in the heap will stay as part of the array.
HTH B\rgds 100
Paul Robson - 21 Nov 2003 16:02 GMT > What happens here > [quoted text clipped - 11 lines] > B\rgds > 100 Thanks ! It's much clearer now :)
100 - 20 Nov 2003 19:13 GMT Hi, What do you mean by primitives? Do you mean real primitives like (int, float, char, etc) or you are talking about value types in general (which could be primitives as well as your own defined structures)?
> 1. the overhead of creating the object was significantly greater than the > overhead of boxing and this offset any gains we might have gotten on the > reads (a guess). Wrapping primitives as long as I can see is a process of:
1. Creating an reference type calling its constructor, which I assume accept the primitives as a parameter. This constructor then saves the primitive in internal data member of the same type as the primitive (to avoid boxing) Now let see how many times we copy our primitive: once calling the constuctor we are making a copy in the stack, twise when we copy it in the internal variable.
I asked if your promitives are real primitives because if they are they are not so big. Anyway, even if you use reference types you have to copy the reference (address - 4 bytes) which is not a big difference. In addition the JITer will optimize the calls using the CPU registers. If you have structures, though, it might be too big to be optimized and they will be indeed copied. You will end up with two copy operation + all work done to create the wrapping object + calling the constructor. If you have used boxing you would have - creating object in the heap + one copy (no calling constructors).
2. Then when you read data again you will have at leas one copy operation of the primitive. As far as I understand you use virtual method to retrieve the data so the method is too general to be used for all primitives you may have. So type conversion(which means boxing + unboxing) is necessary. Again only one unboxing is better.
It is not surprising (at least to me) that the wrapping classes are slower than boxing/unboxing
" 2. wrapping each primitive forced us to introduce a new virtual method call
> in place of the boxing/casting. Maybe the overhead of this call is what > caused the performance to go down. (Another guess.) I don't thing with virtual method you can avoid boxing/casting. IMHO this will lead to boxing/casting May be I am wrong. A code snipped will be of help here. Anyway don't worry about the performance of virtual methods. In the way CLR works the difference is not as big as in C++ for example. Even non-virtual methods are called indirectly. Virtual methods need only one step of indirection more + it doesn't adjust the pointer as C++ does. The diference is not as big as if you compare to C++.
> BTW, performance went down 15%. > [quoted text clipped - 3 lines] > typed data that allows reading/writing without casting and without > boxing/unboxing. Strongly-typed and heterogenous? IMHO those two guys are mutullay exclusive.
B\rgds 100
Mountain Bikn' Guy - 20 Nov 2003 20:56 GMT Thanks for all your tips and thoughts. Your comments helped me see why some alternatives we tried were slower than boxing/unboxing.
More responses inline.
> > Does anyone have any ideas on how to implement the following: > > [quoted text clipped - 4 lines] > > Strongly-typed and heterogenous? IMHO those two guys are mutullay exclusive. Yes, but that doesn't mean some creative thinking can't result in a satisfactory solution. When I was in college, I created a data structure that allows (psuedo) random access into dynamically allocated memory without any external indexes. My goal was to have array style random access into dynamically allocated (non-contiguous) memory -- two goals that seem incompatible. The result of my effort was a data structure that out-performed height-balanced AVL trees. So I'm hoping to have some luck and come up with something creative that lets me get the benefits of strong typing in a heterogenous array. ... maybe it will happen, maybe not.
100 - 20 Nov 2003 22:30 GMT Hi,
> Yes, but that doesn't mean some creative thinking can't result in a > satisfactory solution. Don't get me wrong. I'm not arguing against creativity. I just wanted to say that heterogenous storage means (as far as I understand that word) storage that contains different types of data. Strongly-typed would mean (in most of the casses) storage that contains only one type of data and doesn't accept any other.
But in my second thought it could be called like that if you have storage that accept only certain types of data (say integers, floats and my Foo class) and reject the others.
So if that is what you mean. I see one posible solution which may satisfy your needs of speed. For each value type you may have designated array to keep the values. In this case you won't have boxing/unboxing operations going behind the scene. Reference types you may hold in one common array (or any other kind of storage).
Provide as many overloads of the methods for adding and retrieving data as many different types you have
something like this (suppose we want to accept int, double and strings(this is my reference type))
class Storage { int[] intArr= new int[10]; double[] dblArr = new int[10]; object[] refArr = new object[10];
public int this[int i] { get{return intArr[i];} set{intArr[i] = value;} }
public double this[int i] { get{return dblArr[i];} set{dblArr[i] = value;} }
public string this[int i] { get{return (string)refArr[i];} //Conversion - it is not as bad as boxing/unboxing set{refArr[i] = value;} }
....... } Of course we have to implement logic to expand the storage when the capacity is exceeded and so on. We waste too much memory; to be more concrete you'll have N*M unused slots where N is the number of elemens in the storage and M is the number of the value types which can be stored. It could be huge amount of unused memory if you are planing to store a lot of data, but for small quantities it could be acceptable and we won't have any boxing/unboxing.
B\rgds 100
Paul Robson - 21 Nov 2003 09:26 GMT > class Storage > { [quoted text clipped - 9 lines] > > public double this[int i] I thought that C# couldn't do polymorphism on return types only ? Am I wrong ?
100 - 21 Nov 2003 15:29 GMT Hi Paul, You are hundred percent right. That was my fault. I wrote this minutes before I went home yesterday and its my fault that I posted this without any revision.
Thanks for bringing this out. We can't use indexer for that. What we can do is to use separate operations SetData and GetData with as many overloads as different types we want to accept. GetData cannot return value because ot will introduce the same problem that we have already with the indexer. So we can have prototype like
void GetData(int index, out <type> value) { }
this should fix the problem.
B\rgds 100
> > class Storage > > { [quoted text clipped - 12 lines] > I thought that C# couldn't do polymorphism on return types only ? Am I > wrong ? 100 - 21 Nov 2003 16:50 GMT The real problem with this idea is - how we can possibly know which type is the object at given index in order to know what overload to use to retrieve the data. We can provide method that returns the type of an object by index value. But then we need to check this type and call the appropriate method which will introduce more overhead when reading data. It may turns that boxing/unboxing is the best solution and the problem with the prformance is not there.
That's why I think we can't come up with good idea unless Mountain gives us an example how he is planning to use this storage. He may not need to have the objects ordered and then this solution will work and we won't have the problem of wasting the memory.
B\rgds 100
> Hi Paul, > You are hundred percent right. That was my fault. I wrote this minutes [quoted text clipped - 32 lines] > > I thought that C# couldn't do polymorphism on return types only ? Am I > > wrong ? Mountain Bikn' Guy - 21 Nov 2003 22:01 GMT Here's the answer to your question (I hope):
The ordering is determined by computational dependencies. Just before processing starts, an ordering is determined dynamically. Then during processing, items are written to the array in that order. There is actually an external index that is used to find items in the array.
Does that answer your question?
I do currently use 'out' parameters in the overloaded methods that retreive data from the array, as you figured out.
I've written code for other approaches too. I have lots of alternative implementations that I have tested. As a side note -- which isn't really relevant to the main topic -- when I need to have overloaded return types, I use this approach:
<type> GetData(int index, <type> value) { return (type)valueList[index];//cast to <type> }
What this does is use a parameter to chose an overload, and each overload has a different return type. It is very similar to the 'out' approach except the data is returned in a more familiar way.
I hope mentioning this doesn't divert this thread from the main topic. My concern is still more with a design that might solve the problem more creatively. I have an interesting approach that I tested a couple days ago. I'll post some example code this weekend.
Thanks for your interest in the topic! Mountain
> The real problem with this idea is - how we can possibly know which type is > the object at given index in order to know what overload to use to retrieve [quoted text clipped - 52 lines] > > > I thought that C# couldn't do polymorphism on return types only ? Am I > > > wrong ? Frank Hileman - 22 Nov 2003 02:41 GMT That's not hard. Use an array of unsigned ints; upper 3 can be type identifier; remaining bits index into array of a specific type. For example, wrap it in a struct to make it easier:
enum ArrayType { Int = 0x1, Double = 0x2,... }
struct ArrayEntry { const unsigned int typeMask = 0x7 << 29; const unsigned int indexMask = ~ typeMask;
usigned int data;
public ArrayEntry(ArrayType type, int index) { data= 0; // for compiler warning Type = type; Index = index; }
ArrayType Type { get { return (ArrayType)(data>> 29); } set { value = Index | ((int)data) << 29); } }
int Index { get { return data & indexMask; } set { data = Type | (value & indexMask); } } }
Then if you have ArrayEntry[] a, you do a[i].Index, a[i].Type, then use that info to index into the type-specific arrays, which I would call pools, if you choose to use them that way.
Regards, Frank Hileman Prodige Software Corporation
check out VG.net: www.prodigesoftware.com/screen5.png An animated vector graphics system integrated in VS.net beta requests: vgdotnetbeta at prodigesoftware.com
> The real problem with this idea is - how we can possibly know which type is > the object at given index in order to know what overload to use to retrieve [quoted text clipped - 52 lines] > > > I thought that C# couldn't do polymorphism on return types only ? Am I > > > wrong ? Mountain Bikn' Guy - 22 Nov 2003 15:49 GMT Frank, That's awesome! Beautiful! Just what I was looking for, and the style is consistent with other code I've written, so it's a beautiful fit! Thanks.
I think I'll use 6 bits for the types (I probably have about 15 different types I use already). My (upper) array length will not need to be more than 1 million, so this leaves me plenty of room. I might even allocate another bit for the types.
The obvious way to do the last step you mention, indexing into the type-specific sub arrays (pools), is to select the type specific array via a switch or if/else block. Do you have a better suggestion? This step could end up negating some of the potential performance gains if it isn't done correctly. Another indexing/dereferencing operation at this step would be great. Maybe I can put all the type-specific sub arrays in an object[] and cast from an object to a type[] (ie, double[], bool[], int[] etc.).
object[] poolArray = new object[numberOfTypes]; //array of type specific arrays
//this statement would exist inside an overloaded "Get" method that requests a double: return ((double[])poolArray[typeNdx])[itemNdx];
I have the above code because I was putting type specific arrays of length 1 inside each slot in the upper array as a way to avoid boxing and unboxing. (In that case the 2nd index was always 0. This is the experimental solution I previously mentioned that I would post. It's fairly simple to implement, so I'll just skip a full post.)
Also, I will need to come up with a nice way to properly size the type specific arrays (pools) before processing starts. My current design should support this, with a little work. I already know the total size (ie, total number of items to go into the array) in advance. However, I do not currently know the size (count) needed on a per-type basis, so I'll have to figure a way to compute this. This step isn't really as peformance critical because it only needs to be done once.
Regards, Mountain
P.S. Your code is a perfect example of why I love C# so much.
> That's not hard. Use an array of unsigned ints; upper 3 can be type > identifier; remaining bits index into array of a specific type. For example, [quoted text clipped - 65 lines] > > B\rgds > > 100 Frank Hileman - 22 Nov 2003 19:11 GMT > I think I'll use 6 bits for the types (I probably have about 15 different > types I use already). My (upper) array length will not need to be more than > 1 million, so this leaves me plenty of room. I might even allocate another > bit for the types. With 5 bits you have 16 different enum values that can be put in (0-15).
> The obvious way to do the last step you mention, indexing into the > type-specific sub arrays (pools), is to select the type specific array via a [quoted text clipped - 10 lines] > a double: > return ((double[])poolArray[typeNdx])[itemNdx]; That would work, but I think the switch would be faster. The one thing I worry about with your method is possible array type conversion. I don't think it would happen but the compiler might put some dynamic type checking in there that is really unnecessary (to make sure the cast is fair). So I think a switch is most likely faster (small switches optimize down to tight machine code), since no dynamic cast is needed. Also no array bounds checking is needed:
switch (typeNdx) { ArrayType.Int: return intArray[itemNdx]; ... }
In C of course, the array indexing would not cause typechecking or array bounds checking. But even there the switch would probably win because it does not require an extra memory dereference (every array index operation is an extra dereference).
> Also, I will need to come up with a nice way to properly size the type > specific arrays (pools) before processing starts. My current design should [quoted text clipped - 3 lines] > figure a way to compute this. This step isn't really as peformance critical > because it only needs to be done once. I would not precompute the sizes at all. Instead, allocate the typespecific arrays on the fly, in the same way an arraylist works. For example, here is a types-pecific arraylist sort of collection. It is a bit of code, I will put it in the next message.
> P.S. Your code is a perfect example of why I love C# so much. Thanks! Structs in C# wrap bitfields beautifully. I think perhaps Brad Adams and others should not discourage mutable structs so much, because they are very useful when used as fields in objects. And the compiler can catch the use of a temporary struct as an lvalue.
I thought of a way to clean up the "magic numbers" a bit (0x7 and 29 in my sample). In the ArrayType enum:
[Flags] enum ArrayType : unsigned int { Int = 0x1, Double = Int << 1, ... All = Int | Double | ... }
That would give you an unshifted bit mask in the enum, called All. Then, if you write a function which can count the bits in an unsigned int, you could compute the constants this way:
static readonly insigned int typeShift = 32 - BitFunctions.CountBits(ArrayType.All); static readonly unsigned int typeMask = ArrayType.All << typeShift; static readonly unsigned int indexMask = ~typeMask;
and in the property:
ArrayType Type { get { return (ArrayType)(data >> typeShift); } set { data = Index | ((unsigned int)value) << typeShift); } }
Note the change to the set function! I made a mistake before, and swapped the words "value" and "data". I just wrote the code out without an editor or compiler.
- Frank
Frank Hileman - 22 Nov 2003 19:28 GMT > With 5 bits you have 16 different enum values that can be put in (0-15). Whoops! I meant, with 4 bits you have 16 possible enum values. And the way I assigned the values was wrong, they are not bitflags:
enum ArrayType : unsigned int { Int = 1, Double = 2, String = 3, ... Max= what ever the biggest one is... }
Then in the bit shifting computation, you would use Max instead, but if it is not all 1 bits, you have to count the position of the topmost set bit, and not the number of bits set.
Sorry about that. Should always run these things through a compiler...
Mountain Bikn' Guy - 22 Nov 2003 19:48 GMT Frank, Thanks for your additional comments. My responses are inline. Regards, Mountain
> That would work, but I think the switch would be faster. The one thing I > worry about with your method is possible array type conversion. I don't [quoted text clipped - 3 lines] > machine code), since no dynamic cast is needed. Also no array bounds > checking is needed: I'll take your advice on this. My only experience in this regard is a data structure I wrote in C++ that used dereferencing extensively and in place of conditionals whenever possible. It was super fast, out performing the optimum data structure in this category. (Just so this doesn't sound too vague, I'll add that it was a binary tree that outperformed an identically implemented height balanced AVL tree.) This experience has made me inclined to favor dereferencing whenever possible, but obviously that was a limited situation.
> I would not precompute the sizes at all. Instead, allocate the typespecific > arrays on the fly, in the same way an arraylist works. For example, here is > a types-pecific arraylist sort of collection. It is a bit of code, I will > put it in the next message. Thanks. I will use a "type-specific arraylist sort of collection" if I can implement it such that when I clear my arrays and prepare for new data I don't have to reallocate anything. That way the cost would be incurred only once. I'm sure this won't be a problem. All subsequent writes (thousands of them) of new data (after the first full population and clear) will be identically sized and identically indexed.
> Note the change to the set function! I made a mistake before, and swapped > the words "value" and "data". I just wrote the code out without an editor > or compiler. I didn't see that one, but I did see this: "usigned int data;"
That clued me in that you wrote it without an editor/compiler -- I was impressed. If I don't check my code samples by compiling, I typically end up with something nonsensical in there somewhere.
Frank Hileman - 23 Nov 2003 01:13 GMT > I'll take your advice on this. My only experience in this regard is a data > structure I wrote in C++ that used dereferencing extensively and in place of [quoted text clipped - 4 lines] > to favor dereferencing whenever possible, but obviously that was a limited > situation. Yep, sounds like you had a killer implementation. It all depends on the assembly generated. When you start poking at the assembly level you can see that every pointer deref has a cost that can be avoided. We are probably making a lot of assumptions about the generated code from a high level language. I would be curious to see if there was any speed diff between the switch and the generic array method. The other thing mentioned, a union, sounds interesting too, but that requires special permissions at run-time, because that technique theoretically has security implications. I miss the union from C. Union would have been my first thought in C++ for this problem. Seems like they might have put union somehow in C# without this security problem -- perhaps by forcing the CLR to clear the bits before you cast it differently.
have fun - Frank
Frank Hileman - 22 Nov 2003 19:36 GMT Here is the basic code for an arraylist style collection. I took out some of the ICollection stuff, enumerator, etc. It is called PointList, but really it is an array of Vector structs (x and y values). So you can see how it can be used for any data type. Just swap your type name for "Vector".
public class PointList { // -------- static members -------- private const int DefaultCapacity = 4;
// -------- instance members -------- private Vector[] items; private int count;
publicPointList() { items = new Vector[DefaultCapacity]; }
public int Count { get { return count; } }
public int Capacity { get { return items.Length; } set { if (items.Length == value) return; Check.Argument(value >= count, "value", "Capacity must be greater or equal to Count"); int newCapacity = value > DefaultCapacity ? value : DefaultCapacity; Vector[] newItems = new Vector[newCapacity]; if (count > 0) Array.Copy(items, 0, newItems, 0, count); items = newItems; } }
public Vector this[int index] { get { return items[index]; } set { if (index >= count) throw new ArgumentOutOfRangeException("index", index, "Index must be smaller than Count."); if (items[index] == value) return; items[index] = value; } }
public int Add(Vector point) { if (items.Length == count) Capacity = items.Length * 2; int index = count; items[index] = point; ++count; return index; }
public void AddRange(Vector[] points) { int minCapacity = count + points.Length; if (minCapacity > items.Length) Capacity = minCapacity > DefaultCapacity ? minCapacity : DefaultCapacity; Array.Copy(points, 0, items, count, points.Length); count += points.Length; }
public virtual void Clear() { Array.Clear(items, 0, count); count = 0; } }
Mountain Bikn' Guy - 25 Nov 2003 20:58 GMT Frank, I assume you've seen the Rotor source and maybe used it as a "go-by" for your arraylist-style collection. I'm thinking about just doing a search/replace of "object" with my type (double, etc.) to create all my strongly typed expandable collections. I wanted to ask if you know of any issues or problems in the Rotor ArrayList source code that I should be aware of, or any other reason why I shouldn't take the "easy way out" via search/replace. That is of course, assuming that you may have already done (or considered and rejected) what I'm thinking about doing. Regards, Mountain
> Here is the basic code for an arraylist style collection. I took out some of > the ICollection stuff, enumerator, etc. It is called PointList, but really > it is an array of Vector structs (x and y values). So you can see how it can > be used for any data type. Just swap your type name for "Vector". Frank Hileman - 26 Nov 2003 00:51 GMT Mr. Mountain,
No, actually, that did not come from Rotor. I extracted it from some common list code we have been using a very long time now, (over a year and a half) so it is stable. Not much to it really, except the Array calls. I think if you review it carefully you can see how it is straightforward logic.
Search and replace is unfortunately the way to go. We have a ListBase that encapsulates the Array stuff but the array itself must be strongly typed so we were never able to reuse much code with a base class. In the future generics will eliminate this problem...
By the way I created several strongly type collections with search/replace and it worked fine.
regards, Frank
> Frank, > I assume you've seen the Rotor source and maybe used it as a "go-by" for [quoted text clipped - 14 lines] > can > > be used for any data type. Just swap your type name for "Vector". Mountain Bikn' Guy - 26 Nov 2003 03:09 GMT Hi Frank, Thanks again for your feedback. I just finished an implementation of a DoubleArrayList based on the Rotor source I've seen. It ended up not being a simple search/replace because I had to take the IList implementation out. I implemented IEnumerable, ICollection and ICloneable, and I also implemented all the private nested classes, so there are classes like this: private class IListWrapper : DoubleArrayList, IList that do implement IList.
So far all I've done is compile it. This definitely wasn't the easy way to go. The DoubleArrayList class is almost 3000 lines of code. I wouldn't do it this way again, especially because its useful life will end when generics are available. However, for this one time it was probably worth it just to become more intimately acquainted with the Rotor implementation of the ArrayList class.
Let me know if you care to see any of it. I doubt it contains anything you would find interesting. It's just a more complex version of what you already have -- and the complexity is stuff I might not even use now -- stuff like the various wrappers and thread safe code.
Once I test this class, I'll just use search/replace to create the other type specific versions.
Regards, Mountain
> Mr. Mountain, > [quoted text clipped - 37 lines] > > can > > > be used for any data type. Just swap your type name for "Vector". Frank Hileman - 26 Nov 2003 22:15 GMT Thanks, Mountain, but that complexity is why we did not use it. - Frank
> Let me know if you care to see any of it. I doubt it contains anything you > would find interesting. It's just a more complex version of what you already > have -- and the complexity is stuff I might not even use now -- stuff like > the various wrappers and thread safe code. 100 - 24 Nov 2003 15:14 GMT Hi Frank, Yes, you can do that. Haw I said "We can provide method that returns the type of an object by index value. But then we need to check this type and...". As far as I remember Mountain was concerned about reading performance. My question now is: what is your suggestion about the storage-class (as a whole) interface. So if you have class MyStorage { ..... }
What interface you suggest to retrieve the data. If you have one method XXXX GetData(int index) What is the type of XXXX if it is an 'obect' we have to unbox the object (Montian wanted to avoid exactly this). If your idea is not to use one class, but insted bunch of different arrays for each type + one for ArrayElements We'll have something like this ArrayElements e = arrElements[index]; switch(e.Type) { case Integer: int i = intArr[e.Index]; /// do sth with int break; case Double: double d = dblArr[e.Index]; /// do sth with double break; ......
}
Ok, we avoided boxing/unboxing. But how many copy operations we do have?
1 for returning the ArrayElement value + 1 for extracting the index + 1 for extracting the type + 1 for extracting the concrete value from the type-cpecific array. At least 4 copy operations + supporting opeations as shift and logical ops + 2 array indexing(all bound checks, etc). Mountain finds unboxing(one copy + one typecheck) for expensive. Do you thing it is more cheap?
B\rgds 100
> That's not hard. Use an array of unsigned ints; upper 3 can be type > identifier; remaining bits index into array of a specific type. For example, [quoted text clipped - 108 lines] > > > > I thought that C# couldn't do polymorphism on return types only ? Am I > > > > wrong ? Frank Hileman - 24 Nov 2003 17:46 GMT If you cannot use strong typing throughout, and avoid casting to object, then you may as well stick to an ArrayList and boxing. The only interface that can be used to efficiently retrieve the data is a strongly typed one, GetInt, GetDouble, etc. Otherwise the whole concept is defeated.
Regarding the overhead from the copy, shift, logical ops, etc. These are all very fast and inlined. This is the great thing about structs. Time it and see. Make sure the class you use is not derived from MarshalByRefObject, which defeats inlining. Based on my own perf tuning experience strongly typed arrays should beat the boxing alternative hands down.
regards, Frank
> Hi Frank, > Yes, you can do that. Haw I said "We can provide method that returns the [quoted text clipped - 41 lines] > B\rgds > 100 Mountain Bikn' Guy - 25 Nov 2003 19:34 GMT my comments are inline Mountain
> If you cannot use strong typing throughout, and avoid casting to object, > then you may as well stick to an ArrayList and boxing. The only interface > that can be used to efficiently retrieve the data is a strongly typed one, > GetInt, GetDouble, etc. Otherwise the whole concept is defeated. Absolutely! I use an overloaded GetData method that includes an 'out' parameter for each data type. The GetData overloads simply call GetInt, GetDouble, etc.
> Regarding the overhead from the copy, shift, logical ops, etc. These are all > very fast and inlined. This is the great thing about structs. Time it and > see. Agreed. I'm doing timing tests at each step. I've often been surprised to get results I didn't expect, which, in this case, is what led to my initial post on this topic.
>Make sure the class you use is not derived from MarshalByRefObject, > which defeats inlining. Good tip. I wasn't aware of this.
>Based on my own perf tuning experience strongly > typed arrays should beat the boxing alternative hands down. Frank, you've given me some great tips. I'm working on the implementation now. As a side effect of using strongly typed arrays, at steps further down the line I will be able to eliminate the use of reflection, which should give even further performance gains.
Regards, Mountain
> regards, Frank > [quoted text clipped - 44 lines] > > B\rgds > > 100 Frank Hileman - 26 Nov 2003 00:54 GMT > Frank, you've given me some great tips. I'm working on the implementation > now. As a side effect of using strongly typed arrays, at steps further down > the line I will be able to eliminate the use of reflection, which should > give even further performance gains. Yes, you are damn right there! We got a great speedup once by replacing reflection with generated, strongly typed code. Of course it has to be something in a fairly tight loop to make a difference.
100 - 25 Nov 2003 20:24 GMT Hi Frank,
> If you cannot use strong typing throughout, and avoid casting to object, > then you may as well stick to an ArrayList and boxing. I didn't get that. Sorry ;(
Do you want to say that if we use strong typing we cannot avoid casting?
> The only interface > that can be used to efficiently retrieve the data is a strongly typed one, > GetInt, GetDouble, etc. Otherwise the whole concept is defeated. That is the point. you cannot know, which method overload to use without checking the type of the item at given index. I haven't tested it, but I might do it when I have more time. At the moment I'm very sceptic that reading data from two arrays (one array for items's info, which involves copying data - the expensive part of unboxing and one for the actual data, which copy data anyway) will be faster than reading reference type of reference array + type checking+unboxing. These doubts of mine are only regarding the reading operation; storing value-type data in reference array will be much slower. Of course using vaue types and arrays (not dynamic collections) as a storage gives good opportunity to c# compiler and JITter for optimizations. So it might be faster.
> Regarding the overhead from the copy, shift, logical ops, etc. These are all > very fast and inlined. This is the great thing about structs. Time it and > see. Yes, bringing shift and logical ops was overkill. BTW they are not inlined because there is nothing to inline. There are no methods behind them. About the copy operation as operation which doesn't have to be considered as a factor when it comes for performance.... hmmm. Again I'm having doubts. I have timed reading 10,000,000 structures from an array (structs and arrays good for optimization) and the size of data has noticeable impact on the results.
> Make sure the class you use is not derived from MarshalByRefObject, > which defeats inlining. It would be really unfortunate if it does inlining ;)
B\rgds 100
> > Hi Frank, > > Yes, you can do that. Haw I said "We can provide method that returns the [quoted text clipped - 42 lines] > > B\rgds > > 100 Frank Hileman - 26 Nov 2003 01:08 GMT > Do you want to say that if we use strong typing we cannot avoid casting? Once you cast to object you box anyway. So if there is a cast to object anywhere in the chain just use an object array.
> I haven't tested it, but I might do it when I have more time. At the moment > I'm very sceptic that reading data from two arrays (one array for items's [quoted text clipped - 3 lines] > mine are only regarding the reading operation; storing value-type data in > reference array will be much slower. Not sure I understand what you mean here. With a strongly typed array, the strongly typed value is just copied right out of the array, and through a function call. Arrays are fast. You are hitting two or more arrays, but it is contiguous blocks. Boxing allocates memory on the heap and messes up locality of reference, pressures the GC. Really I did not time it but I hope someone else will.
> Yes, bringing shift and logical ops was overkill. BTW they are not inlined > because there is nothing to inline. There are no methods behind them. I meant the struct functions.
100 - 26 Nov 2003 15:15 GMT Hi Frank,
> > Do you want to say that if we use strong typing we cannot avoid casting? > > Once you cast to object you box anyway. So if there is a cast to object > anywhere in the chain just use an object array. I'm still confused what we are talking about here. Anyway, it is not so important. I agree with you for what you said here.
> > I haven't tested it, but I might do it when I have more time. At the > moment [quoted text clipped - 8 lines] > > Not sure I understand what you mean here. Ok, I'll try to explain. Lets assume that you want to store two different value types in our storage. As far as I understand your suggestion to Mountain is to use 2 separate arrays (strongly typed) for each of it and one array for your structure ArrayEntry. When we want to get a value you check in array-entry array to see in which data array and at which index we can find the value. Am I right? So, My point is that using this will be slower (in terms of reading) than using one array of untyped references (reference to Object) and chek the type and unbox to apropriate value type.
To back up my point I did some timing of both techniques esterday. Unboxing beats the other method off. It is indeed faster. Once again I'm talking about reading. If we want performance for writing (storing) your method is definitely the winner and I never argue against it. But as far as I remember Mountains issues were about the performance at reading or maybe I'm wrong. So, if I were to implement such a storage I would use your method (or something similar) if I want to make a storage for general use (since the performance is good for reading and writing) or I would go with unboxing if the performance for reading is critical and I don't care for writing since, for example, it is done only once (the impression I got from the first Mountain's post).
If you are curious I can give you the test project. I don't want to attach it to this post because I believe attaching files to posts in newsgroups is bad practice.
> > Yes, bringing shift and logical ops was overkill. BTW they are not inlined > > because there is nothing to inline. There are no methods behind them. > > I meant the struct functions. Yes, they are good for oprimization.
BTW your method of keeping datatype(table type) and index in one unsigned value is used internaly from CLR as metadata tokens for indexing metdata in different tables.
B\rgds 100
Mountain Bikn' Guy - 26 Nov 2003 19:58 GMT Hi 100, See my comments inline. Regards, Mountain
> To back up my point I did some timing of both techniques esterday. Unboxing > beats the other method off. It is indeed faster. Once again I'm talking > about reading. If we want performance for writing (storing) your method is > definitely the winner and I never argue against it. > But as far as I remember Mountains issues were about the performance at > reading or maybe I'm wrong. Yes, reading occurs more often than writing. The ratio is about 2:1.
> So, if I were to implement such a storage I would use your method (or > something similar) if I want to make a storage for general use (since the [quoted text clipped - 4 lines] > > If you are curious I can give you the test project. Yes, I'm interested in taking a look at it. I appreciate you following up on this in so much depth. Please email it to me.
the first part of my address is "Number_235" and the domain is yahoo dot com
Richard A. Lowe - 27 Nov 2003 09:02 GMT Hmmm, I'm not convinced that we can make any real conclusions about the performance of such an application as this one without having the parameters nailed down a bit better. For example, I wrote another sample that used multiple arrays like Frank wisely proposed and it was 80% faster than a similar object[] sample I wrote. I guess with all of us writing our own samples to test ideas, we're probably not making the same assumptions about anything (the proportion of reference types to value types, the *size* of the objects and value types etc.)
Since I'm so deep in this, MBG, could you post your object array sample that show the baseline performance that you want to improve? (Or not, if you have ideas your are happy with already)...
Richard
 Signature Veuillez m'excuser, mon Fran?ais est tr?s pauvre. Cependant, si vous voyez mauvais C #, c'est mon d?faut!
> Hi 100, > See my comments inline. [quoted text clipped - 27 lines] > the first part of my address is "Number_235" > and the domain is yahoo dot com Mountain Bikn' Guy - 27 Nov 2003 14:12 GMT Hi Richard,
I certainly would like to reach a conclusion on this, if for no other reason than for the benefit of anyone reading this thread who may be interested in general solutions that could improve upon boxing/unboxing when different types are stored in a (hetereogeneous) array. I am sending you an email with more detail (I used the corrected version of <chadich@yumspamyumYahoo.com>).
I do not yet have a good solution. Here is the issue I'm thinking through at the moment. In order to use Franks very nice suggestion, I have to create yet another index (I think). We already have about 100MB of indexes on this object[]. Right now, each data item is indexed 0 to N for an array of object[N+1]. Let's say we use Frank's approach and there are just 3 types: ints, doubles and a "myObj" type. There will be 3 "pools" as Frank called them. However, the original index values will have mapped from 0 to N to three new index sets: 0 to n1, 0 to n2, 0 to n3.
We already have some added complexity of needing a switch statement to select the "pool". The addition of another index mapping scheme (on top of 100MB of existing indexes) has me discouraged at the moment. And 100's test seemed to indicate that this solution may not yield strong performance gains. So I'm brainstorming at the moment. Your test is encouraging and I would like to explore this until we reach a definitive conclusion (and hopefully the holiday interruption won't kill momentum).
Regards, Mountain
> Hmmm, I'm not convinced that we can make any real conclusions about the > performance of such an application as this one without having the parameters [quoted text clipped - 45 lines] > > the first part of my address is "Number_235" > > and the domain is yahoo dot com 100 - 27 Nov 2003 16:31 GMT Hi Richard, You are right. There are a lot of factors that may lead to different results. I played a bit more with my test program and I kind of give up of the conclusion I made in my previous post. It seems like unbox method suffers worse performance if a structure has reference type field(s) for example (which may have some reasonable explanation, but I can't come up with any). Anyway I'm not sure even of the correctness of the results I got. So now I'm leaning against the method Frank suggests because even if in some cases unboxing might be a bit faster the difference wouldn't be that big, however writing operation using boxing, though, is big times slower.
Mountaing, stick with Frank's idea it is better than unboxing.
B\rgds 100
> Hmmm, I'm not convinced that we can make any real conclusions about the > performance of such an application as this one without having the parameters [quoted text clipped - 45 lines] > > the first part of my address is "Number_235" > > and the domain is yahoo dot com Eric Gunnerson [MS] - 20 Nov 2003 20:58 GMT Mountain (can I call you Mountain...),
The overhead from boxing should be roughly equivalent to the overhead of creating the wrapper class, as they're doing nearly exactly the same thing. I'm not sure how to contrast the overhead of the two options in #2 - my guess is that virtual dispatch is roughly as expensive as an unbox, but if you do the virtual every time and the unbox only on value types, it could explain your results.
You should also probably spend some time using the CLR profiler to examine your application.
http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A -94635BEEBDDA&displaylang=en
 Signature Eric Gunnerson
Visit the C# product team at http://www.csharp.net Eric's blog is at http://blogs.gotdotnet.com/ericgu/
This posting is provided "AS IS" with no warranties, and confers no rights.
> I have a situation where an app writes data of various types (primitives and > objects) into a single dimensional array of objects. (This array eventually [quoted text clipped - 25 lines] > > All thoughts/suggestions are appreciated. 100 - 20 Nov 2003 21:51 GMT Hi Eric,
I agree that using wrapping classes is probably as slow (even it could be slower) than boxing/unboxing. And I gave my point in my prevoius post. However I'm not qiute sure what you are saying here
> guess is that virtual dispatch is roughly as expensive as an unbox In .NET dispatching of virtual-method calls is pretty simple and doesn't take much work than the calls of non-virtual methods. So if what you are saying is true it means that every function calls is as expensive as unbox operation.
B\rgds 100
> You should also probably spend some time using the CLR profiler to examine > your application. http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A -94635BEEBDDA&displaylang=en
> > I have a situation where an app writes data of various types (primitives > and [quoted text clipped - 29 lines] > > > > All thoughts/suggestions are appreciated. Eric Gunnerson [MS] - 21 Nov 2003 20:48 GMT Well, it is just a guess, but here's my reasoning.
The virtual dispatch by itself isn't a big deal - it's only an extra few instructions. But because it's virtual (and there could be derived classes loaded later), the JIT doesn't inline virtual calls, so you do pay a non-trivial penalty there.
Unbox, on the other hand, is a runtime type check (fairly cheap), and then a copy of memory from the reference location to the stack location. That's pretty cheap, and my guess is that it's cheaper than the lost inline
 Signature Eric Gunnerson
Visit the C# product team at http://www.csharp.net Eric's blog is at http://blogs.gotdotnet.com/ericgu/
This posting is provided "AS IS" with no warranties, and confers no rights.
> Hi Eric, > [quoted text clipped - 13 lines] > > You should also probably spend some time using the CLR profiler to examine > > your application. http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A -94635BEEBDDA&displaylang=en
> rights. > > > I have a situation where an app writes data of various types (primitives [quoted text clipped - 33 lines] > > > > > > All thoughts/suggestions are appreciated. 100 - 21 Nov 2003 21:34 GMT Hi Eric,
> Well, it is just a guess, but here's my reasoning. > [quoted text clipped - 6 lines] > copy of memory from the reference location to the stack location. That's > pretty cheap, and my guess is that it's cheaper than the lost inline Pretty wild guess I think ;)
1. There is no guarantee that a non-virtual method will be inlined.
2. Unboxing is a cheep operation. The copy operation which c# generates always after unboxing could be expensive depending on the size of the value type data. Calling virtual function is like indirect call with two indirections and if the method doesn't have any parameters (which have to be copied in the stack) - like simple GetMethod - I don't think it would be more expensive then unboxing (event if it does have paramters if they are primitives or reference types the fist two will be transfered via CPU registeres, the return value as well could be passed back thru register(s) ).
Of course this guess is as wild as yours ;)
B\rgds 100
> > Hi Eric, > > [quoted text clipped - 14 lines] > examine > > > your application. http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A -94635BEEBDDA&displaylang=en
> > rights. > > > > I have a situation where an app writes data of various types [quoted text clipped - 38 lines] > > > > > > > > All thoughts/suggestions are appreciated. Mountain Bikn' Guy - 20 Nov 2003 22:33 GMT Hi Eric, You can certainly call me Mountain ;) I have a small mountain bike track outside my back door (it's more like a BMX track). About the only time I leave the computer is when I go ride my mountain bike (or when my wife forces me to go to sleep).
Thanks for your input and the link to the CLR profiler. I think using the CLR profiler is a good next step. Regards, Dave
> Mountain (can I call you Mountain...), > [quoted text clipped - 7 lines] > You should also probably spend some time using the CLR profiler to examine > your application. http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A -94635BEEBDDA&displaylang=en
> > I have a situation where an app writes data of various types (primitives > and [quoted text clipped - 29 lines] > > > > All thoughts/suggestions are appreciated. Richard A. Lowe - 22 Nov 2003 22:47 GMT I had to work up my own example since you didn't post any code (Hey, I was bored today) so maybe my assumptions are really different from yours, but I found one way of doing it that was 28% faster than the object array in pure C# and 41% faster with a few easy IL modifications.
Rather than create a specific container *class* create a container *union struct* with all value types residing in the same space on your struct. Now I only tried with with one reference type and two value types, but it did work a lot faster (a plain struct was slower, for some reason creating a union via the StructLayout attribute was faster).
My 4 tests are downloadable from here: http://chadich.mysite4now.com/FastestHeterogenousArray.zip
There's a PDF/Word doc that shows a snapshot of the IL I took out of EACH of the sturct constructors (should be obvious which code is redundant). But post again if you are unsure how to use ILDasm and ILAsm to de/re-compile it.
Richard
 Signature Veuillez m'excuser, mon Fran?ais est tr?s pauvre. Cependant, si vous voyez mauvais C #, c'est mon d?faut!
> I have a situation where an app writes data of various types (primitives and > objects) into a single dimensional array of objects. (This array eventually [quoted text clipped - 25 lines] > > All thoughts/suggestions are appreciated. 100 - 24 Nov 2003 18:19 GMT Hi Richard, Thanx for the post. I was palnning to make the suggestion with "unions ", but I didn't because I found potential problems with it. I'll try to explain my issues. I modified a bit your examples and I got the following results: * Example using normal obect array - reading 10M items for 0.34s (29,069,637.13 items/s) * UnionStruct - reading 10M items for 0.30s (33,025,715.36 items/s). The others are way worse so I don't want to discuss them. Way to go "UnionStruct" ;))
IMHO we gain this performance because using unions we copy less data when we read Value items.
Now my issues. 1. In your case you have value types and one reference type. As you know we can make an union only with value types. You cannot map reference types to the same memory with other references or value types. If we try the UnionStruct example with 2 reference types and one value type the performance goes as bad as ValueArry exmaple (10M for 0.45s or 22,381,528.52 items/s ). The ObjectArray with the same types is doing as good as before.
2. Even if you have only value types like big structures "UnionStruct" example suffer performance hit. ObjectArray is doing better. I tried your examaples with 1 ref type, 1 double and one structure defined as struct MyStruct { public int a; public int b; public int c; } The *double* and the struct share the same memory.
So, the results were UnionStruct example - 0.44s; ObjectArray - 0.4s.
IMHO "unions" could be good in some special casses, but they are not easy for maintenance (hard to be extended, etc). So, I give my preferences to the ObjectArrays
B\rgds 100
> I had to work up my own example since you didn't post any code (Hey, I was > bored today) so maybe my assumptions are really different from yours, but I [quoted text clipped - 50 lines] > > > > All thoughts/suggestions are appreciated.
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|