Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsFree MagazinesWhite PapersSubmit Content
Discussion GroupsASP.NETWindows FormsLanguages.NET FrameworkVisual Studio.NET
Articles.NET FrameworkASP.NETToolsWindows Forms
.NET DirectoryOpen Source ProjectsUser GroupsWeb Resources
Related Topics
Visual Basic 6SQL ServerMS AccessOther DB ProductsMS Server ProductsMore Topics ...

.NET Forum / Languages / C# / March 2008

Tip: Looking for answers? Try searching our database.

C# generic containers from a "C++ perspective"

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Giovanni Dicanio - 21 Mar 2008 13:04 GMT
When I have a List<T> in C#, I think that the "list" (i.e.: dynamically
growing array) stores "pointers" to T instances, e.g. considering a
pseudo-equivalent C++ code, my understanding is that the C# version:

List< MyClass >

is equivalent to something like this in C++:

std::vector< MyClass * >

Is that correct?

Now, the question is: if we have List of "primitive" types (like an 'int',
or 'double'...), how are these "primitive types" stored in the List? Are
they stored "by value"? Or are they stored "by pointers"?

i.e. the C# version:

 List< double >

is equivalent to C++ :

vector< double >

or instead:

 vector< double * >

?

Thanks in advance,
Giovanni
Anthony Jones - 21 Mar 2008 14:04 GMT
> When I have a List<T> in C#, I think that the "list" (i.e.: dynamically
> growing array) stores "pointers" to T instances, e.g. considering a
[quoted text clipped - 7 lines]
>
> Is that correct?

Close enough.  All memory for classes are allocated from a heap anyway and
all references point to that space in the heap.  Hence in C# there is no
need to differentiate between a class instance and a class pointer.

> Now, the question is: if we have List of "primitive" types (like an 'int',
> or 'double'...), how are these "primitive types" stored in the List? Are
[quoted text clipped - 13 lines]
>
> ?

They are stored by value IOW vector< double >.

Signature

Anthony Jones - MVP ASP/ASP.NET

ignacio machin - 21 Mar 2008 15:34 GMT
On Mar 21, 8:04 am, "Giovanni Dicanio" <giovanni.dica...@invalid.com>
wrote:
> When I have a List<T> in C#, I think that the "list" (i.e.: dynamically
> growing array) stores "pointers" to T instances, e.g. considering a
[quoted text clipped - 5 lines]
>
>  std::vector< MyClass * >

Hi,

They are stored by value. That is the good thing about generics
in .NET, you do not need to box (create a reference) a value type.
The opposite happens when you use for example ArrayList , in that case
the value type is converted to a reference type

Ignacio Machin ( .NET/ C# MVP )
machin TA laceupsolutions.com
Jon Harrop - 21 Mar 2008 16:22 GMT
> i.e. the C# version:
>
[quoted text clipped - 7 lines]
>
>   vector< double * >

The former. This is one of the things that C++ screwed up and is fixed in
C#, making it much nicer to program in.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Giovanni Dicanio - 21 Mar 2008 18:34 GMT
>> i.e. the C# version:
>>
[quoted text clipped - 10 lines]
> The former. This is one of the things that C++ screwed up and is fixed in
> C#, making it much nicer to program in.

I agree with you.
And that is not the only thing done better in C# than in C++ (other things
include better string management and built-in support for Unicode strings,
reflection, etc.)

However, there are also some things that IMHO are better in C++ than C#,
like RAII and deterministic destruction.
(The garbage collector is good for memory resources, but there are also
*non*-memory resources, like files, sockets, textures... for them garbage
collector is useless, and RAII and deterministic destruction are better.)

Giovanni
Ben Voigt [C++ MVP] - 21 Mar 2008 18:42 GMT
>> i.e. the C# version:
>>
[quoted text clipped - 10 lines]
> The former. This is one of the things that C++ screwed up and is
> fixed in C#, making it much nicer to program in.

Go ahead, spread the FUD.  Or explain exactly how you feel the .NET (not C#)
way is better than the stl containers.  Surely the OP would be interested in
knowing the differences and advantages of each.

Actually templates and generics are different in a lot of ways, but
templates are much more powerful, effectively a superset of generics (modulo
some bad error messages which will be fixed in C++0x).
Jon Skeet [C# MVP] - 21 Mar 2008 20:54 GMT
<snip>

> Actually templates and generics are different in a lot of ways, but
> templates are much more powerful, effectively a superset of generics (modulo
> some bad error messages which will be fixed in C++0x).

In terms of being a superset, a couple of things I don't *think* you
can do with C++ templates:

1) Create a constructed type at execution time via reflection. I don't
know how much reflection exists at *all* in C++ (or standard/common
libraries) but my understanding of templates would suggest this would
be harder.

2) Ship a generic type without any source. I could very well be wrong
on this, but my understanding is that when you use a template you need
the source code, so the compiler can build the specialized version. In
.NET the compiler just says "I want a List<int>" and refers to the
*binary* List<T>. This comes with a bunch of corollaries to do with
versioning/upgrades, of course.

Please, please correct me if I'm wrong though :)

Signature

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk

Ben Voigt [C++ MVP] - 21 Mar 2008 22:28 GMT
> <snip>
>
[quoted text clipped - 9 lines]
> libraries) but my understanding of templates would suggest this would
> be harder.

C++ reflection gives dynamic type testing, but nothing more.  COM gives more
reflection-like capabilities, as long as you are using type libraries.
Using CoCreateInstance or the factory pattern for instantiation, if you
compiled the template against an interface type (pure abstract class), you
could then use it with any coclass implementing that interface, pretty much
the same as a generic.  These execution-time constructed types aren't
offering compile-time type safety in .NET anyway, so I don't consider it a
drawback that C++/COM might require calls to QueryInterface.

> 2) Ship a generic type without any source. I could very well be wrong
> on this, but my understanding is that when you use a template you need
> the source code, so the compiler can build the specialized version. In
> .NET the compiler just says "I want a List<int>" and refers to the
> *binary* List<T>. This comes with a bunch of corollaries to do with
> versioning/upgrades, of course.

The C++ standard supports this, but Visual C++ doesn't.  On other platforms
that define a C++ ABI, this might give you some ability to upgrade templates
without a recompile, but not much.

> Please, please correct me if I'm wrong though :)

.NET generics do preserve "type identity" across instantiations in different
modules as well, while C++ needs COM to accomplish that.

But this list pales in comparison to the things templates do that generics
don't and can't emulate.
Arne Vajhøj - 24 Mar 2008 02:26 GMT
> .NET generics do preserve "type identity" across instantiations in different
> modules as well, while C++ needs COM to accomplish that.

If C++ needs COM to do something then the C++ language/C++
standard environment does not support it.

It is a fair assumption that C# code is running on Windows, but
it is a very bad assumption that C++ code is running on Windows.

Arne
Jon Harrop - 22 Mar 2008 02:14 GMT
>> The former. This is one of the things that C++ screwed up and is
>> fixed in C#, making it much nicer to program in.
>
> Go ahead, spread the FUD.

The design flaws in C++ are widely appreciated.

> Or explain exactly how you feel the .NET (not C#) way is better than the
> stl containers.  Surely the OP would be interested in knowing the
> differences and advantages of each.

This has nothing to do with containers or even templates. The OP's confusion
stems from the fact that C++ forces you to use pointers in order to make
full use of its object system. In reality, pointers are not necessary for
OOP.

> Actually templates and generics are different in a lot of ways, but
> templates are much more powerful, effectively a superset of generics

Turing argument.

Your subjective notion of "power" equates to slow compilation and poor error
messages in practice, neither of which are desirable. The advantage of
templates is a rather naff form of metaprogramming, long since obsoleted by
JIT compilers.

> (modulo some bad error messages which will be fixed in C++0x).

And Fortran 2003 will rule the world...

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Arne Vajhøj - 24 Mar 2008 02:31 GMT
>> Actually templates and generics are different in a lot of ways, but
>> templates are much more powerful, effectively a superset of generics
[quoted text clipped - 5 lines]
> templates is a rather naff form of metaprogramming, long since obsoleted by
> JIT compilers.

Try and search for C# generic calculate.

There are plenty of functionality in C++ templates that are not
in C# generics.

And JIT does not change anything about that.

Arne

PS: That something is missing from a language does not necessarily mean
    that it will be good to add it. More is not always better.
Jon Harrop - 24 Mar 2008 05:47 GMT
> Try and search for C# generic calculate.

I cannot find what you are indicating. Do you have a specific reference?

> There are plenty of functionality in C++ templates that are not
> in C# generics.

Exactly. C# provides parametric polymorphism in the form of generics (which
is one use of C++ templates) and JIT compilation for metaprogramming (which
is the other use of templates).

> And JIT does not change anything about that.

On the contrary, template metaprogramming in C++ is used to work around the
absence of a JIT compiler. This is a fundamental aspect of lots of
mainstream C++ libraries like Blitz++ and Boost. Reams have been written on
the subject. But it is all a waste of time because run-time code generation
is vastly more productive than template metaprogramming: orders of
magnitude faster to compile and easier to debug.

> PS: That something is missing from a language does not necessarily mean
>      that it will be good to add it. More is not always better.

The problem is that people like Ben Voigt spread mindless propaganda
like "templates are much more powerful, effectively a superset of
generics", implying that the Turing completeness of templates makes them
superior with no regard for the crippling degradation in programming
productivity incurred by slow compile times and awful error messages in
C++.

In practice, the usability of generics easily outweights the benefits of
templates.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 24 Mar 2008 14:54 GMT
>> Try and search for C# generic calculate.
>
> I cannot find what you are indicating. Do you have a specific
> reference?

C# generics cannot possibly use any feature of the type argument that is not
expressed as an interface implementation.  Some things, like constructor
parameters, cannot be expressed with interfaces, while others, like simple
arithmetics, aren't on the built-in types.  Whatever the reason, if there's
no interface, C# generics can't do it.

Just so you understand fully, I challenge you to create a .NET generic for a
moving average (ring buffer of N entries that returns the mean) which can be
instantiated on both double and decimal (the only operations needed are add
and divide).  For bonus points make it work with any UDT that defines those
two operations.

The C++ template is both more straightforward and more efficient.

>> There are plenty of functionality in C++ templates that are not
>> in C# generics.
[quoted text clipped - 7 lines]
> On the contrary, template metaprogramming in C++ is used to work
> around the absence of a JIT compiler. This is a fundamental aspect of

Or gain the same capabilities but precompiled, so the deployed application
is lightweight and ready to run immediately.  JIT is a disaster if you need
fast startup.

> lots of mainstream C++ libraries like Blitz++ and Boost. Reams have
> been written on the subject. But it is all a waste of time because
> run-time code generation is vastly more productive than template
> metaprogramming: orders of magnitude faster to compile and easier to
> debug.

More FUD, especially the "easier to debug" part is just outright wrong.
It's like saying that dynamic languages are more productive -- well they are
for small tasks, but for larger ones the benefits of static typing outweigh
the additional effort.

As an example of how much "easier to debug" run-time generated code is, try
just getting a stack trace of a .NET program from a minidump.  It's both far
more complicated a user interface and far more complicated underneath, plus
it won't work with a selective dump, you need all the internal JIT data
structures included, bloating your dumps by hundreds of megabytes.

>> PS: That something is missing from a language does not necessarily
>>      mean that it will be good to add it. More is not always better.
[quoted text clipped - 8 lines]
> In practice, the usability of generics easily outweights the benefits
> of templates.
Jon Harrop - 24 Mar 2008 22:21 GMT
> C# generics cannot possibly use any feature of the type argument that is
> not
> expressed as an interface implementation.  Some things, like constructor
> parameters, cannot be expressed with interfaces, while others, like simple
> arithmetics, aren't on the built-in types.  Whatever the reason, if
> there's no interface, C# generics can't do it.

Sure. I'm not contesting that templates try to solve problems that generics
do not. I am saying that all of the other problems solved by templates are
better solved by other techniques, most notably explicit code generation.
Moreover, in the overlap between generics and templates, generics are a
clear win because they are so much easier to use.

> Just so you understand fully, I challenge you to create a .NET generic for
> a moving average (ring buffer of N entries that returns the mean) which
[quoted text clipped - 4 lines]
>
> The C++ template is both more straightforward and more efficient.

You have drawn that conclusion before looking at any evidence.

Let's look at some real examples. For trivial cases like the (useless)
static factorial function you can encode your metaprograms in C++
templates:

 http://aszt.inf.elte.hu/~gsd/halado_cpp/ch06s04.html

Here is their factorial implementation:

 #include <iostream>
 
 template <int N>
 struct Factorial
 {
     enum { value = N * Factorial<N-1>::value };
 };
 
 template <>
 struct Factorial<1>
 {
     enum { value = 1 };
 };
 
 int main()
 {
     const int fact5 = Factorial<15>::value;
     std::cout << fact5 << endl;
     return 0;
 }

This is useless because you can just use a lookup function.

Now let's look at something non-trivial, like a high-performance FFT
implementation. The fastest cross-platform FFT implementation is FFTW,
which is C code generated by an OCaml program.

Many C++ advocates have tried to do the same code generation using C++
templates because it would make such a compelling example of templates
actually being good for metaprogramming. However, every single person to
have ever attempted to do this has completely and utterly failed. Dr Dobb's
even published an article on exactly this:

 http://www.ddj.com/cpp/199500857

If you look at the OCaml source code of FFTW you will see many non-trivial
symbolic optimizations implemented using pattern matching over variant
types that optimize the generated code in ways that are prohibitively
difficult to implement using C++ templates.

The fact that C++ templates are theoretically capable of doing this because
they are Turing complete is useless because it is practically impossible to
do so because templates are so cumbersome to use for metaprogramming.

>> On the contrary, template metaprogramming in C++ is used to work
>> around the absence of a JIT compiler. This is a fundamental aspect of
>
> Or gain the same capabilities

That is wrong. C++ templates cannot perform specialization based upon
run-time values.

> but precompiled,

JIT provides both precompilation and run-time compilation. C++ only provides
precompilation.

> so the deployed application is lightweight and ready to run immediately.

That is also wrong. Because C++ templates cannot handle run-time values the
only efficient solution is to precompile every conceivable permutation of
inputs. This results in combinatorial code explosion with incredibly
heavy-weight libraries and very slow loading times. C++ is famously bad for
this.

In contrast, JIT compilation can lazily compile code as and when it is
needed. This results in leaner libraries and executables that load faster
and even work cross-platform.

> JIT is a disaster if you need fast startup.

If your application requires submillisecond startup times, sure.

>> lots of mainstream C++ libraries like Blitz++ and Boost. Reams have
>> been written on the subject. But it is all a waste of time because
[quoted text clipped - 6 lines]
> are for small tasks, but for larger ones the benefits of static typing
> outweigh the additional effort.

I'm just going to pull rank here. You need to learn some decent languages.
Look at modern statically-typed functional programming languages for a
start.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 24 Mar 2008 23:01 GMT
>> C# generics cannot possibly use any feature of the type argument
>> that is not
[quoted text clipped - 19 lines]
>>
>> The C++ template is both more straightforward and more efficient.

[snip talking about OCaml]

> I'm just going to pull rank here. You need to learn some decent
> languages. Look at modern statically-typed functional programming
> languages for a start.

Last I checked we were comparing C++ templates to .NET generics.  Are you
claiming that you could insert F# everywhere you said OCaml and all your
claims remain true?  Last I heard you were complaining about the miserable
performance of F#'s allocator.

You need to learn about .NET generics before you start saying they are
oh-so-much-better than C++ templates.  They aren't.  Maybe OCaml generic
programming is, I haven't played with that for a while, but definitely .NET
generics fall far short.

Also you completely avoided my example.

For that matter, I don't think the OCaml moving average is going to beat the
C++ template either, for performance or readability.  Plus anyone trying to
do an FFT with metaprogramming is doing it for the challenge, no other
reason.  A much better way is selecting between several pre-optimized
routines based on a couple of conditions like magnitude of length and
whether the length is a power of 2.

For example, how much .NET code (pick your language) would you need to
accomplish the same thing as this?

template <typename Integral>
inline bool isPowerOfTwo(Integral const n)
{
   return (n & (n-1)) == 0;
}
Jon Harrop - 25 Mar 2008 05:54 GMT
> Last I checked we were comparing C++ templates to .NET generics.

Yes. In the process, you brought up a problem (metaprogramming) that
templates can solve but generics do not. My point is that .NET languages
already have better ways to solve that problem, independently of generics.

> Are you claiming that you could insert F# everywhere you said OCaml and
> all your claims remain true?

Yes.

> Last I heard you were complaining about the miserable performance of F#'s
> allocator.

The single-threaded symbolic optimizer from FFTW would currently be ~5x
slower in F# than it is in OCaml. There is no easy way to improve this.

> You need to learn about .NET generics before you start saying they are
> oh-so-much-better than C++ templates.  They aren't.

The error reporting from generics is much better than templates.

> Maybe OCaml generic programming is, I haven't played with that for a
> while, but definitely .NET generics fall far short.

That's because you're trying to use a tool for parametric polymorphism for
code generation. They are completely different problems best solved by
completely independent solutions like generics and JIT compilation.

> Also you completely avoided my example.

Your problem statement does not make sense to me. If you want the moving
average to work properly over ints and floats then you must use different
algorithms. Or do you want to cast the ints to floats?

> For that matter, I don't think the OCaml moving average is going to beat
> the C++ template either, for performance or readability.

On what basis?

> Plus anyone trying to do an FFT with metaprogramming is doing it for the
> challenge, no other reason.

On the contrary, this is how professionals write production-quality code.
FFTW has millions of users and is at the core of one of the world's most
successful technical computing environments, MATLAB.

> A much better way is selecting between several pre-optimized
> routines based on a couple of conditions like magnitude of length and
> whether the length is a power of 2.

All of the world's best FFT implementations used to do that. Actually, they
even wrote dozens of painstakingly hand-optimized routines and chose
between them at run-time.

Then FFTW came along with its thousands of automatically optimized routines
(codelets) and beat all other implementations hands down on performance
whilst remaining platform independent.

> For example, how much .NET code (pick your language) would you need to
> accomplish the same thing as this?
[quoted text clipped - 4 lines]
>     return (n & (n-1)) == 0;
> }

In F#:

 let inline is_pow2(n : #Math.INumeric) =
   n &&& (n - n.One) = n.Zero

As long as you stick to trivial examples you won't see the downside of C++
templates.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 25 Mar 2008 14:32 GMT
>> Last I checked we were comparing C++ templates to .NET generics.
>
[quoted text clipped - 19 lines]
>
> The error reporting from generics is much better than templates.

That is a known issue which I spoke of in my first post in this thread.
C++0x fixes that too.

>> Maybe OCaml generic programming is, I haven't played with that for a
>> while, but definitely .NET generics fall far short.
[quoted text clipped - 3 lines]
> problems best solved by completely independent solutions like
> generics and JIT compilation.

Generics, which are early bound, don't provide parametric polymorphism.
Templates do.

>> Also you completely avoided my example.
>
> Your problem statement does not make sense to me. If you want the
> moving average to work properly over ints and floats then you must
> use different algorithms. Or do you want to cast the ints to floats?

What part of "both double and decimal" didn't you understand?

And why would different algorithms be needed?  In C++ the same algorithm
will work for both int and float (until overflow becomes a problem, but the
accumulator can be a different type) except for roundoff, which is expected
for int.

>> For that matter, I don't think the OCaml moving average is going to
>> beat the C++ template either, for performance or readability.
[quoted text clipped - 19 lines]
> routines (codelets) and beat all other implementations hands down on
> performance whilst remaining platform independent.

through the use of JIT?

No, as I recall the comparison process is very slow (I may be thinking of
ATLAS rather than FFTW), and the compile time is only a small fraction of
that.

>> For example, how much .NET code (pick your language) would you need
>> to accomplish the same thing as this?
[quoted text clipped - 9 lines]
>  let inline is_pow2(n : #Math.INumeric) =
>    n &&& (n - n.One) = n.Zero

Oh, F# defines the bitwise and operation for all types it pretends to
implement INumeric for?  It's not part of the INumeric interface.  It's not
even part of the IIntegral interface.  And it must turn that into something
pretty ugly in order for the CLR to accept it.

> As long as you stick to trivial examples you won't see the downside
> of C++ templates.

If they are so trivial, how come .NET generics can't solve them?
Jon Harrop - 25 Mar 2008 16:37 GMT
> That is a known issue which I spoke of in my first post in this thread.
> C++0x fixes that too.

Has C++0x ever been implemented?

> Generics, which are early bound, don't provide parametric polymorphism.

I suggest you read up on parametric polymorphism.

> What part of "both double and decimal" didn't you understand?

"decimal" because it is meaningless.

> And why would different algorithms be needed?

Because these types have completely different properties in this context:

. Ints do not accumulate round-off error. Floats do.

. The "/" operator you use is Euclidean quotient for ints and an
approximation to real division for floats.

> In C++ the same algorithm
> will work for both int and float (until overflow becomes a problem, but
> the accumulator can be a different type) except for roundoff, which is
> expected for int.

As I thought, your algorithm is broken. This is difficult to demonstrate
because you failed to provide working C++ code.

>> Then FFTW came along with its thousands of automatically optimized
>> routines (codelets) and beat all other implementations hands down on
>> performance whilst remaining platform independent.
>
> through the use of JIT?

FFTW precompiles the most common codelets and puts them in its standard
library. With a JIT, you can compile any codelet as and when you need it.

> No, as I recall the comparison process is very slow (I may be thinking of
> ATLAS rather than FFTW), and the compile time is only a small fraction of
> that.

That is generally the case, yes.

>>> For example, how much .NET code (pick your language) would you need
>>> to accomplish the same thing as this?
[quoted text clipped - 12 lines]
> Oh, F# defines the bitwise and operation for all types it pretends to
> implement INumeric for?

No, the INumeric interface is only used for the n.One and n.Zero. The
inlining makes this function generic over everything else just like a C++
template.

> It's not part of the INumeric interface.

Yes.

> It's not even part of the IIntegral interface.

Yes, of course.

> And it must turn that into something pretty ugly in order for the CLR to
> accept it.

I've no idea how it is compiled but it provides strictly more functionality
than your C++ template.

>> As long as you stick to trivial examples you won't see the downside
>> of C++ templates.
>
> If they are so trivial, how come .NET generics can't solve them?

How come a ripe banana can't solve them?

Nobody cares when modern languages are more expressive in every respect.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 25 Mar 2008 17:08 GMT
>>>> For example, how much .NET code (pick your language) would you need
>>>> to accomplish the same thing as this?
[quoted text clipped - 16 lines]
> inlining makes this function generic over everything else just like a
> C++ template.

Ok, now I understand.

The ironic thing is that you've proved my point, by showing that F# needed
to implement a feature similar to C++ templates in order to overcome the
deficiencies in .NET generics.
Jon Harrop - 25 Mar 2008 19:39 GMT
>> No, the INumeric interface is only used for the n.One and n.Zero. The
>> inlining makes this function generic over everything else just like a
[quoted text clipped - 5 lines]
> to implement a feature similar to C++ templates in order to overcome the
> deficiencies in .NET generics.

Exactly. Templates were obsoleted by a variety of alternatives (for
parametric polymorphism, inlining and metaprogramming). Pretending that C++
is better because templates provide an unusably poor solution to problems
that generics did not even attempt to address is just deceitful propaganda.

As I said, templates can do more metaprogramming than a ripe banana. That
information is also of no use but at least it isn't deliberately
misleading.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 25 Mar 2008 17:15 GMT
>> What part of "both double and decimal" didn't you understand?
>
> "decimal" because it is meaningless.

We are posting in a C# group about .NET.

decimal is a C# keyword just like double.
System.Decimal is a standard .NET type.

Before you "pull rank" you should make sure you are familar with the subject
matter.
http://msdn2.microsoft.com/en-us/library/x53a06bb.aspx
http://msdn2.microsoft.com/en-us/library/364x0z75.aspx
Jon Harrop - 25 Mar 2008 19:31 GMT
>>> What part of "both double and decimal" didn't you understand?
>>
[quoted text clipped - 4 lines]
> decimal is a C# keyword just like double.
> System.Decimal is a standard .NET type.

My explanation of why your algorithm is fundamentally broken still stands.
If you implement the floating-point and integer algorithms properly you
will see that there is nothing worth factoring out between them.

> Before you "pull rank"...

Out of context.

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 25 Mar 2008 22:58 GMT
>>>> What part of "both double and decimal" didn't you understand?
>>>
[quoted text clipped - 9 lines]
> properly you will see that there is nothing worth factoring out
> between them.

The implementation would be:

Maintain the last N items in a ring buffer.  (This consists of an array of
size N and the index of the oldest element).
Maintain the sum of all the items in an accumulator.
On arrival of a new data point, deduct the oldest datum from the sum,
overwrite it with the new one (updating the index), add the new one to the
sum, return sum / N.

This works very well for any numeric data type as long as the accumulator
doesn't overflow, but the accumulator doesn't need to be the same type as
each buffer element.
Jon Harrop - 25 Mar 2008 23:42 GMT
>> My explanation of why your algorithm is fundamentally broken still
>> stands. If you implement the floating-point and integer algorithms
[quoted text clipped - 9 lines]
> overwrite it with the new one (updating the index), add the new one to the
> sum, return sum / N.

That is the fundamentally-broken algorithm that I was expecting you to
describe: you have implicitly assumed that addition and subtraction
commute, which is true for integers and not floats.

Specifically, your algorithm will suffer from cumulative round-off errors
and will not handle special values (e.g. infinities) at all.

> This works very well for any numeric data type as long as the accumulator
> doesn't overflow,

That is not true. You only need one element 10^17x larger than any other
currently in the buffer and you lose all precision in your accumulator from
catastrophic round-off error, hundreds of orders of magnitude before any
overflow.

> but the accumulator doesn't need to be the same type as each buffer
> element.

I see. And what type would you recommend for the accumulator? An
arbitrary-precision rational, perhaps?

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 26 Mar 2008 14:38 GMT
>>> My explanation of why your algorithm is fundamentally broken still
>>> stands. If you implement the floating-point and integer algorithms
[quoted text clipped - 30 lines]
> I see. And what type would you recommend for the accumulator? An
> arbitrary-precision rational, perhaps?

The issue you describe is theoretically valid, but not applicable to
real-world data created via analog-to-digital sampling and linear functions
thereof, which is probably somewhere upwards of 99% of all data sets.  If it
is a problem, you can always recalculate the sum directly from the stored
elements.
Jon Harrop - 28 Mar 2008 13:54 GMT
> The issue you describe is theoretically valid, but not applicable to
> real-world data created via analog-to-digital sampling and linear
> functions thereof, which is probably somewhere upwards of 99% of all data
> sets.

I guess my experience has been tainted by my extensive use of really
esoteric mathematical functions like sqrt and even quadratics. Nobody would
want to use those so-called non-linear functions in the "real world". Or
big numbers. Nobody uses big numbers in the real world either. Or small
numbers.

Would you also say that positive numbers account for around 99% of all "real
world" numbers too?

Signature

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Ben Voigt [C++ MVP] - 28 Mar 2008 18:36 GMT
>> The issue you describe is theoretically valid, but not applicable to
>> real-world data created via analog-to-digital sampling and linear
[quoted text clipped - 7 lines]
> big numbers. Nobody uses big numbers in the real world either. Or small
> numbers.

The question isn't whether people use big numbers, or people use small
numbers.  The question is whether people have both really big and really
small numbers in the same dataset.  Not often.  In datasets where you want
to talk about the moving average (via arithmetic mean)... almost never.

> Would you also say that positive numbers account for around 99% of all
> "real
> world" numbers too?

Probably not 99%.  If I had to guess I'd say non-negative (as opposed to
strictly positive) are around 92%.  Still a large enough fraction that an
algorithm that doesn't handle any negative numbers is still pretty doggone
useful.  Which might be why most programming languages have "unsigned"
types.

Just because some mathematical number exists that an algorithm can't handle
doesn't mean the algorithm is useless.
Marc Gravell - 24 Mar 2008 23:52 GMT
> Just so you understand fully, I challenge you to create a .NET generic for a
> moving average (ring buffer of N entries that returns the mean) which can be
> instantiated on both double and decimal (the only operations needed are add
> and divide).  For bonus points make it work with any UDT that defines those
> two operations.

If the mood took me, I'm pretty sure U could take that challenge, and
the bonus points for UDT:
www.pobox.com/~skeet/csharp/genericoperators.html

(full download available in MiscUtil, and for the record, I also have
a .NET 2.0 version as well as the .NET 3.5 version cited)

> The C++ template is both more straightforward and more efficient.

Stating it (by itself) isn't exactly conclusive. Besides; C++
templates are compile time; I can make new types and use them with
generics at runtime. The JIT isn't perfect, but generally does a
pretty good job... pretty flexible.

Marc
Marc Gravell - 24 Mar 2008 23:53 GMT
>sure U could

oops - I meant "I" could... QWERTY slip...

Marc
Ben Voigt [C++ MVP] - 25 Mar 2008 15:12 GMT
>> Just so you understand fully, I challenge you to create a .NET
>> generic for a moving average (ring buffer of N entries that returns
[quoted text clipped - 15 lines]
> generics at runtime. The JIT isn't perfect, but generally does a
> pretty good job... pretty flexible.

Except that .NET's JIT is neutered when it comes to reference types as
generic arguments.  Or does your .NET 3.5 Expression-based implementation
overcome that as well?

> Marc
Marc Gravell - 25 Mar 2008 16:53 GMT
It depends on what you mean by neutered - plus in all the cases discussed
above (decimal vs double vs int etc) we are talking about value-types, not
reference types.
As far as the JIT is concerned, it doesn't *need* more than one
reference-type IL version, since every reference takes the same size and is
accessed the same (possibly losing out on some "sealed" optimisations, but
heh...)

As it happens, the generic-operator stuff will cope with either
reference-type or value-type without issue (and in particular without
boxing) - but if you clarify the question I might be able to give a better
answer.

Marc
Jon Skeet [C# MVP] - 25 Mar 2008 17:03 GMT
> It depends on what you mean by neutered - plus in all the cases discussed
> above (decimal vs double vs int etc) we are talking about value-types, not
[quoted text clipped - 8 lines]
> boxing) - but if you clarify the question I might be able to give a better
> answer.

I think what Ben's driving at is that the JIT doesn't inline code which
could be inlined for reference types. It might do for value types,
because it JITs on a per-type basis, but for reference types it has to
assume that any virtual method *might* be overridden, because it's
going to use the same native code for all type arguments which are
reference types.

There may well be other optimisations with similar issues.

Signature

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk

Ben Voigt [C++ MVP] - 25 Mar 2008 17:12 GMT
> It depends on what you mean by neutered - plus in all the cases
> discussed above (decimal vs double vs int etc) we are talking about
[quoted text clipped - 8 lines]
> boxing) - but if you clarify the question I might be able to give a
> better answer.

I was referring to this line in the page you linked me to:

<quote>
the JIT-compiler is remarkably good as inlining the simple code that invokes
the methods; enough that runtime performance is within a few percent of
regular compiled code.
</quote>

AFAIK, this sort of inlining doesn't take place at all when the generic
argument is a reference type.  I was wondering if maybe using the Expression
class is causing the JIT to specialize for each different reference type,
overcoming the problem.

> Marc
Marc Gravell - 25 Mar 2008 21:18 GMT
I'm not sure that is a meaningful comparison to standard generics,
since operators are static and thus outside of what generics offers by
default. But yes; I think (from your and Jon's comments) that this
relates back to the "sealed" comment in my post above - it may well
miss some optimisations due to not being optimised by type. The
generic operator code *has* to be per-type (even for reference-types)
simply to exist, since it isn't driven by generics. Basically I
compile a delegate on the fly and cache it against the type.

In reality, the cost of invoking a delegate is probably (untested)
higher than the cost of a virtual call, so I don't think that the
delegate approach is going to be a blanket fix - but it would /
perhaps/ merrit testing *if* there is a demonstrable case where the
virtual call is actually being a bottleneck.

Marc

Rate this thread:







Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.