.NET Forum / Windows Forms / WinForm General / January 2007
Why does 28.08 show up as 28.080000000000002 in DataGridView
|
|
Thread rating:  |
Daniel Manes - 09 Jan 2007 20:53 GMT I'm baffled. I have a column in a SQL Server Express database called "Longitude," which is a float. When I view the table in a DataGridView, some of the numbers, which only have two decimal places in the database show up with *15* decimal places and are ever so slightly off (in the example in the subject line, by about 2E-15).
I'm not doing any operations on this column. It's just running a stored procedure which performs a pretty basic SELECT on the table. If I run the stored procedure in Management Studio Express, all numbers show up fine (just two decimal places).
What's going on here?
Thanks,
-Dan
AlterEgo - 09 Jan 2007 21:01 GMT Daniel,
Float is an approximate datatype. Lookup approximate numeric data in BOL.
-- Bill
> I'm baffled. I have a column in a SQL Server Express database called > "Longitude," which is a float. When I view the table in a DataGridView, [quoted text clipped - 12 lines] > > -Dan Aaron Bertrand [SQL Server MVP] - 09 Jan 2007 21:01 GMT FLOAT is an approximate data type. If you want precision, then use an appropriate DECIMAL instead of FLOAT.
> I'm baffled. I have a column in a SQL Server Express database called > "Longitude," which is a float. When I view the table in a DataGridView, [quoted text clipped - 12 lines] > > -Dan Jon Skeet [C# MVP] - 09 Jan 2007 21:33 GMT > FLOAT is an approximate data type. If you want precision, then use an > appropriate DECIMAL instead of FLOAT. Decimal is as "approximate" as float, in that neither can represent every possible rational number exactly. They just have different bases - decimal will represent numbers like 0.1234 exactly, but will be inaccurate with 1/3 in the same way that float is.
Both are floating point types - float is a floating *binary* point type, and decimal is a floating *decimal* point type.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Jon Skeet [C# MVP] - 09 Jan 2007 21:44 GMT > > FLOAT is an approximate data type. If you want precision, then use an > > appropriate DECIMAL instead of FLOAT. [quoted text clipped - 6 lines] > Both are floating point types - float is a floating *binary* point > type, and decimal is a floating *decimal* point type. Apologies - some clarification is required here. The above is certainly true for the C# float/decimal types. After a bit of digging (I don't have SQL Server help available at the minute) I believe that any particular column with a defined precision and scale, a DECIMAL column in SQL Server is effectively "fixed point" (i.e. the value itself doesn't specify where the decimal point is, the column does).
That doesn't mean it's "precise" in a way that FLOAT isn't, it just alters the storage (and therefore the range and precision available).
The above probably isn't terribly clear, but the bottom line is that a floating point number is a very precise number - it has an exact value - but not every number can be exactly represented as a floating point number (given the base/storage size etc). That's true for fixed point numbers as well, and it's not the "fixedness" that makes DECIMAL more appropriate for business calculations, but the fact that it uses base 10 instead of base 2.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 09 Jan 2007 22:17 GMT What I meant by "not approximate" is that if you put 28.08 in a decimal(5,2), you are never going to get 28.08000000000000002
>> > FLOAT is an approximate data type. If you want precision, then use an >> > appropriate DECIMAL instead of FLOAT. [quoted text clipped - 24 lines] > appropriate for business calculations, but the fact that it uses base > 10 instead of base 2. Jon Skeet [C# MVP] - 09 Jan 2007 22:34 GMT > What I meant by "not approximate" is that if you put 28.08 in a > decimal(5,2), you are never going to get 28.08000000000000002 True. If you ask SQL server to divide 1 by 3, however, you certainly *won't* get an exact answer in a DECIMAL though (or FLOAT, admittedly).
The reason I'm bothering to make the distinction is that there's a widespread myth that decimal types are "exact" in a way that floating binary point types aren't - it's just down to people having a natural bias to base 10. If you consider numbers in base 3 (or 7, or 11, etc) instead, DECIMAL is just as bad as FLOAT.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 09 Jan 2007 23:42 GMT >> What I meant by "not approximate" is that if you put 28.08 in a >> decimal(5,2), you are never going to get 28.08000000000000002 [quoted text clipped - 5 lines] > widespread myth that decimal types are "exact" in a way that floating > binary point types aren't - I think that most of us recognize the difference between SET @foo = 0.33 and SET @foo = 1.0/3 ... in the former, we're choosing to limit the "exact" nature of the result, and if we choose to store it in a DECIMAL as opposed to a FLOAT, we know we're going to get 0.33 every time...
Jon Skeet [C# MVP] - 09 Jan 2007 23:58 GMT > >> What I meant by "not approximate" is that if you put 28.08 in a > >> decimal(5,2), you are never going to get 28.08000000000000002 [quoted text clipped - 10 lines] > nature of the result, and if we choose to store it in a DECIMAL as opposed > to a FLOAT, we know we're going to get 0.33 every time... But if you were to express floating binary point numbers in a binary format (eg 0.01101) then you'd get the exact same result back when you reformatted it as binary. The nature of the literal formats available doesn't alter whether or not a data type is essentially approximate or not.
The "approximate" nature of binary floating point isn't because it's floating point, and isn't changed by using a decimal format. It's just the inability to exactly represent all *decimal* numbers, which isn't the same thing as all numbers.
I personally believe that's *not* well understood - hence people (e.g. 3 different people in this thread) referring to floating binary point numbers as "approximate" when they're not at all - they're exact numbers which may only be approximately equal to the number you originally tried to set them to.
As an example of what I mean, consider integer types. Are they "approximate"? No - they represent exact numbers. However, if you do (in C#):
int x = (int) 10.3;
The value of x won't be 10.3, it will be 10. 10 is an exact number, but the conversion from 10.3 was an approximation. The same is true if you have a float and do
SET @my_float = 0.33 (SQL) or float myFloat = 0.33f; (C#)
my_float represents an exact number, but the conversion from the literal "0.33" to floating binary point is an approximating one. The differences between the floating binary point conversion and the conversion to an integer earlier are:
1) The integer conversion is explicit, highlighting that data may be lost 2) When reformatted as a decimal value, an integer is always "tidy" whereas a float often isn't (as evidenced by the subject line of this thread)
I don't see either of those as reasons to describe floating binary point numbers as "approximate" when integers aren't described in that way.
Given a floating point value, I can tell you *exactly* what that number is as a decimal expansion. In what sense is the value (or the type) "approximate"? It's important (IMO) not to be blinded by the bias of the fact that humans tend to represent numbers as decimals.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Mike C# - 10 Jan 2007 01:50 GMT > The "approximate" nature of binary floating point isn't because it's > floating point, and isn't changed by using a decimal format. It's just > the inability to exactly represent all *decimal* numbers, which isn't > the same thing as all numbers. That seems like hair-splitting, for real.
> I personally believe that's *not* well understood - hence people (e.g. > 3 different people in this thread) referring to floating binary point > numbers as "approximate" when they're not at all - they're exact > numbers which may only be approximately equal to the number you > originally tried to set them to. People in this newsgroup refer to binary floating point numbers as "approximate" because that's what the ANSI SQL standard calls them.
"The data types NUMERIC, DECIMAL, INTEGER, and SMALLINT are collectively referred to as exact numeric types. The data types FLOAT, REAL, and DOUBLE PRECISION are collectively referred to as approximate numeric types. Exact numeric types and approximate numeric types are collectively referred to as numeric types. Values of numeric type are referred to as numbers." -- ANSI SQL-92
Jon Skeet [C# MVP] - 10 Jan 2007 04:12 GMT > > The "approximate" nature of binary floating point isn't because it's > > floating point, and isn't changed by using a decimal format. It's just > > the inability to exactly represent all *decimal* numbers, which isn't > > the same thing as all numbers. > > That seems like hair-splitting, for real. I see nothing hair-splitting about recognising that 1/3 is a number.
> > I personally believe that's *not* well understood - hence people (e.g. > > 3 different people in this thread) referring to floating binary point [quoted text clipped - 4 lines] > People in this newsgroup refer to binary floating point numbers as > "approximate" because that's what the ANSI SQL standard calls them. You may notice that there are multiple newsgroups involved here...
> "The data types NUMERIC, DECIMAL, INTEGER, and SMALLINT are collectively > referred to as exact numeric types. The data types FLOAT, REAL, and DOUBLE > PRECISION are collectively referred to as approximate numeric types. Exact > numeric types and approximate numeric types are collectively referred to as > numeric types. Values of numeric type are referred to as numbers." -- ANSI > SQL-92 Just because ANSI says something doesn't mean it's right, IMO.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Mike C# - 10 Jan 2007 17:16 GMT >> > The "approximate" nature of binary floating point isn't because it's >> > floating point, and isn't changed by using a decimal format. It's just [quoted text clipped - 4 lines] > > I see nothing hair-splitting about recognising that 1/3 is a number. Listen to some of your arguments:
Argument A ======== 0. "All numeric data types are *approximate* and none are *exact* because you can't represent every single combination, including infinitely repeating combinations, using them."
By extension (and we may need to enter the realm of philosophy here), your argument says there is no such thing as an exact data type of *any kind* since you can't represent every combination of whatever their base type is using them; including infinitely repeating series. VARCHAR(8000) is now *approximate* because I can't store 8,001 'Z' characters in it. INT is *approximate* because I can't store 999999999999999999999999999999999999999 in it. Personally I find that silly, IMO. An exact data type is not *exact* based on being able to store every infinite combination; it is exact because the valid values it does store are stored using *exact* representations instead of approximations.
Argument B ======== 0. "I type in the FLOAT value 10.0," 1. "The computer stores it as an approximate value of 9.999999," 2. "The binary representation of 9.999999 is *exact*," 3. "Therefore FLOAT is an *exact* data types."
Really...
>> People in this newsgroup refer to binary floating point numbers as >> "approximate" because that's what the ANSI SQL standard calls them. > > You may notice that there are multiple newsgroups involved here... Yes I did notice, and I'm responding from the sqlserver newsgroup, just FYI. SQL folks call floating point numbers "approximate" because that's what the standard says.
>> "The data types NUMERIC, DECIMAL, INTEGER, and SMALLINT are collectively >> referred to as exact numeric types. The data types FLOAT, REAL, and [quoted text clipped - 8 lines] > > Just because ANSI says something doesn't mean it's right, IMO. I agree with that. However, they did just fine with this one. Unless you can find a better argument than "You can't store an infinitely repeating decimal exactly in a fixed-decimal point type, on a machine with a finite amount of memory, therefore the fixed-decimal point type is not an *exact* type", you're not going to convince ANSI or just about anyone else of the justness of your cause.
Jon Skeet [C# MVP] - 10 Jan 2007 20:02 GMT <snip>
> > Just because ANSI says something doesn't mean it's right, IMO. > [quoted text clipped - 4 lines] > type", you're not going to convince ANSI or just about anyone else of the > justness of your cause. Just *one* point I can't resist here: I've never claimed that decimal isn't exact. I've claimed that float is exact too.
Once you've got a float value, you can always convert it to an exact decimal string too, and indeed I've got C# code to do it, referenced from http://www.pobox.com/~skeet/csharp/floatingpoint.html There's also an online version where you can type in the decimal value you want to find the closest double to, and then show the exact value of the double. (These are C# floats and doubles, but the principle is the same). For example, the closest double to 28.08 is *exactly* 28.0799999999999982946974341757595539093017578125
My point (which I thought I'd made again and again, but clearly not enough) is that it's the conversion from a decimal literal to a floating point value which is an "approximate" conversion, *not* the float value itself. The point about base 3 is that the same would be true - convert a base 3 number to a decimal and you'd end up with an exact value which is the closest decimal to that number, but that may not be exactly the same as the original value. (eg 1/3 becoming 0.333). In both cases the conversion is "lossy" but the resultant value is exact.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 20:14 GMT > Just *one* point I can't resist here: I've never claimed that decimal > isn't exact. I've claimed that float is exact too. The problem is, if I hard-code a value like 28.08, and store it in a FLOAT, when I ask for that value back, I may not get that EXACT value. It doesn't matter to me if that approximation occurred during conversion, or translation, or storage, or divine intervention. The value I thought I had sent to the database has been changed on me.
A
Jon Skeet [C# MVP] - 10 Jan 2007 20:27 GMT > > Just *one* point I can't resist here: I've never claimed that decimal > > isn't exact. I've claimed that float is exact too. > > The problem is, if I hard-code a value like 28.08, and store it in a FLOAT, > when I ask for that value back, I may not get that EXACT value. Indeed - and that's why you want to use DECIMAL in such a situation. I've never denied the usefulness of DECIMAL in the slightest.
> It doesn't matter to me if that approximation occurred during > conversion, or translation, or storage, or divine intervention. The > value I thought I had sent to the database has been changed on me. It may not matter to you, but accuracy in description of types matters to me. It's a bit like people claiming that "objects are passed by reference by default" in C#, or that "value types live on the stack and reference types live on the heap" - both statements sound okay on first hearing, but dig a little deeper and they're inaccurate. I like to be accurate from the start (where I can, at least). Why claim that it's the type which is approximate rather than the conversion? Where's the benefit?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 20:36 GMT > Why claim that it's > the type which is approximate rather than the conversion? Where's the > benefit? Because in SQL Server there is no plausible distinction. I pass in 28.08, I get back a lot more decimals, it really is irrelevant whether it's the type of the conversion or something else.
Jon Skeet [C# MVP] - 10 Jan 2007 20:50 GMT > > Why claim that it's > > the type which is approximate rather than the conversion? Where's the [quoted text clipped - 3 lines] > get back a lot more decimals, it really is irrelevant whether it's the type > of the conversion or something else. You don't like to understand what's going on beneath the surface? Why there's a loss, and why there *wouldn't* be a loss if you pass in a value which is already in binary, and why (if you were able to do so) you'd still lose information if you passed a base 3 value into a DECIMAL?
<shrug> I guess I'm just inquisitive.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 21:45 GMT >> Because in SQL Server there is no plausible distinction. I pass in >> 28.08, I [quoted text clipped - 3 lines] > > You don't like to understand what's going on beneath the surface? It's interesting, sure. But I don't have a problem with the standards calling it an "approximate data type" if it's meant to describe how data will potentially be stored differently from how it is provided to the database. As I've said multiple times, it really isn't relevant in this case whether the approximation occured during conversion or during storage or during retrieval. And again, base 3 is about as useful as fool's gold here, because base 3 does not come into play.
Unlike INT or DECIMAL, where "approximation" can occur because you used a value that doesn't fit into the constraints you've defined, with FLOAT there is much less control over how this happens, and it is much more surprising to most users I have come across who chose FLOAT for reasons other than the fact that they knew the nature of its conversion and storage. Do you know how many people I have seen choose FLOAT for storing monetary values like prices, or even quantities? And do you know how surprised they are to find out that SQL Server changed 28.08 to 28.0800000000002? SQL Server approximated the data that they passed in, and they don't really care how it happened beneath the surface.
Jon Skeet [C# MVP] - 10 Jan 2007 21:57 GMT > > You don't like to understand what's going on beneath the surface? > [quoted text clipped - 5 lines] > or during retrieval. And again, base 3 is about as useful as fool's gold > here, because base 3 does not come into play. Well, I think it's useful for demonstrating that it's not the type that's exact, merely the conversion from data which is already in decimal form. I never claimed real world usefulness.
> Unlike INT or DECIMAL, where "approximation" can occur because you used a > value that doesn't fit into the constraints you've defined, with FLOAT there [quoted text clipped - 6 lines] > approximated the data that they passed in, and they don't really care how it > happened beneath the surface. Do you think it would be hard to explain that it's the *conversion* which loses the data, rather than giving the impression that FLOAT is inherently inexact?
I don't doubt that people use FLOAT inappropriately, and that they are indeed surprised by the lossy conversion, but I don't see how describing the type as inaccurate is any clearer than saying (accurately) that it's the conversion which loses the information.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 22:08 GMT > Do you think it would be hard to explain that it's the *conversion* > which loses the data, rather than giving the impression that FLOAT is > inherently inexact? I don't think it would be too hard, but I think it would be useless. As I said before, most of the people don't care why it happened, they just want to know how to fix it. When you get your oil changed, do you want to know exactly how impure your oil was and how this affected your engine and gas mileage down to the nano-details? Or do you just want new oil?
I've had an article about this since 2003, and you're the first person I've ever heard even raise a peep about it. http://classicasp.aspfaq.com/general/why-does-3-2-1-5-4-7000000000000002.html
And I still argue that it is valid to call the data type approximate, because frankly, conversion is a part of the data type, as far as I'm concerned. YMMV.
A
Jon Skeet [C# MVP] - 10 Jan 2007 22:19 GMT > > Do you think it would be hard to explain that it's the *conversion* > > which loses the data, rather than giving the impression that FLOAT is [quoted text clipped - 5 lines] > exactly how impure your oil was and how this affected your engine and gas > mileage down to the nano-details? Or do you just want new oil? I guess it depends on what kind of person you are. If I get results I don't expect, I want to get a good, *accurate* description of what's going on - in this case, where I'm losing information. The fact that it's the conversion which is lossy would be important to me.
> I've had an article about this since 2003, and you're the first person I've > ever heard even raise a peep about it. > http://classicasp.aspfaq.com/general/why-does-3-2-1-5-4-7000000000000002.html I have no problems with that article.
> And I still argue that it is valid to call the data type approximate, > because frankly, conversion is a part of the data type, as far as I'm > concerned. YMMV. The conversion from decimal is part of what you can do with the type, but I don't think that's enough to deem the type itself as approximate. To go back to the base 3 case, if there *were* a conversion from base 3 to decimal, would that make the decimal type in some way approximate in a way that it isn't at the moment? It would still be able to represent the same data it can today, so how would it have become less exact?
(No, I'm still not saying that a base 3 type will ever come about - it's merely a thought experiment to examine how reasonable the "exact" vs "approximate" distinction is.)
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 22:26 GMT > The conversion from decimal is part of what you can do with the type, > but I don't think that's enough to deem the type itself as approximate. It is if the limitations of the type affect how certain data is stored there. I'm not going to keep beating dead horse here (and once again, I don't really care about base 3, if I hadn't already beat that dead horse enough).
A
Jon Skeet [C# MVP] - 10 Jan 2007 22:32 GMT [Removed VB group]
> > The conversion from decimal is part of what you can do with the type, > > but I don't think that's enough to deem the type itself as approximate. [quoted text clipped - 3 lines] > don't really care about base 3, if I hadn't already beat that dead horse > enough). I think we can work towards a compromise then. How about the following statements:
<proposal> FLOAT is an approximate type with respect to some values which are exactly representable in base 10. It is an exact type with respect to values which are exactly representable in base 2.
DECIMAL is an approximate type with respect to some values which are exactly representable in base 3. It is an exact type with respect to values which are exactly representable in base 10. </proposal>
Is there anything in there that you'd actually disagree with (rather than just not caring about)?
The difference between those statements and the simplistic statement that "FLOAT is approximate, DECIMAL is exact" is that the above make it perfectly clear that it's only the base involved which determines the "exactness", not anything else about the type.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 22:39 GMT > DECIMAL is an approximate type with respect to some values which are > exactly representable in base 3. It is an exact type with respect to [quoted text clipped - 3 lines] > Is there anything in there that you'd actually disagree with (rather > than just not caring about)? I don't think there's any reason to point out that decimal can't store base 3 precisely, since this is completely irrelevant in SQL Server. If you really like those definitions, then propose them to the standards bodies and to Microsoft (connect.microsoft.com/sql). I don't feel very inclined to adopt them.
> The difference between those statements and the simplistic statement > that "FLOAT is approximate, DECIMAL is exact" is that the above make it > perfectly clear that it's only the base involved which determines the > "exactness", not anything else about the type. Irrelevant in the context of this discussion, imho, since you cannot alter the base for any of the numeric data types in SQL Server.
Jon Skeet [C# MVP] - 10 Jan 2007 22:46 GMT > > DECIMAL is an approximate type with respect to some values which are > > exactly representable in base 3. It is an exact type with respect to [quoted text clipped - 6 lines] > I don't think there's any reason to point out that decimal can't store base > 3 precisely, since this is completely irrelevant in SQL Server. The reason for pointing out the situations in which decimal would be "approximate" is to throw more light on why float is "approximate" with regards to base 10. It shows (correctly, IMO) that there's not much *fundamental* difference between the two types - just the base involved. (The precision and scale being fixed or floating is another difference, of course, but not related to this issue IMO.)
> If you really like those definitions, then propose them to the > standards bodies and to Microsoft (connect.microsoft.com/sql). I > don't feel very inclined to adopt them. I was merely trying to find some common ground between us.
> > The difference between those statements and the simplistic statement > > that "FLOAT is approximate, DECIMAL is exact" is that the above make it [quoted text clipped - 3 lines] > Irrelevant in the context of this discussion, imho, since you cannot alter > the base for any of the numeric data types in SQL Server. The context of this discussion, as far as I'm concerned at least, is a consistent treatment of "approximate" compared with "exact". My statements gave a consistent treatment, whereas I see inconsistency in the way that float and decimal are described otherwise. There's a context of "with respect to numbers which are exactly representable in base 10" in the simplistic statement which I feel would be better made explicit than implicit.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 10 Jan 2007 22:52 GMT > The reason for pointing out the situations in which decimal would be > "approximate" is to throw more light on why float is "approximate" with > regards to base 10. It shows (correctly, IMO) that there's not much > *fundamental* difference between the two types - just the base > involved. And since the base is fixed and not negotiable in SQL Server, we'll just have to agree to disagree.
> I was merely trying to find some common ground between us. I don't think you will, because I still consider FLOAT and REAL to be approximate data types in SQL Server. I don't think you'll convince many people to change their opinions by trying to segregate conversion from storage, because quite frankly, 99% of them don't care, and the other 1% are likely busy solving more important problems. :-)
A
Jon Skeet [C# MVP] - 10 Jan 2007 23:10 GMT > > The reason for pointing out the situations in which decimal would be > > "approximate" is to throw more light on why float is "approximate" with [quoted text clipped - 4 lines] > And since the base is fixed and not negotiable in SQL Server, we'll just > have to agree to disagree. Do you see no value in thought experiments then? Not interested in what the impact of a base 3 type *would* be if it were introduced in the next version of SQL Server? (No, I'm still not suggesting it will be.)
But you're right, we will have to agree to disagree.
> > I was merely trying to find some common ground between us. > [quoted text clipped - 3 lines] > storage, because quite frankly, 99% of them don't care, and the other 1% are > likely busy solving more important problems. :-) I'll try for one last time to find common ground: do you agree that your statements about exact/approximate are only true from a base 10 perspective?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Barry Kelly - 10 Jan 2007 23:25 GMT Jon Skeet wrote:
> The reason for pointing out the situations in which decimal would be > "approximate" is to throw more light on why float is "approximate" with > regards to base 10. It shows (correctly, IMO) that there's not much > *fundamental* difference between the two types - just the base > involved. It is justified, however, for reasons of human culture and in particular accounting and legal requirements in accounting.
It's not a technical slight or insult; it's about people.
-- Barry
 Signature http://barrkel.blogspot.com/
Jon Skeet [C# MVP] - 10 Jan 2007 23:41 GMT > > The reason for pointing out the situations in which decimal would be > > "approximate" is to throw more light on why float is "approximate" with [quoted text clipped - 6 lines] > > It's not a technical slight or insult; it's about people. The reason DECIMAL exists is certainly about people/accounting/etc - but I don't see that the differences between DECIMAL and FLOAT need to made to seem greater than they are.
(The impression that decimal is exact and float/double are approximate occurs in more than SQL land - it's a common belief in the C#/.NET communities as well, usually where people don't really understand how the types work under the hood.)
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Barry Kelly - 10 Jan 2007 23:52 GMT Jon Skeet wrote:
> > > The reason for pointing out the situations in which decimal would be > > > "approximate" is to throw more light on why float is "approximate" with [quoted text clipped - 10 lines] > but I don't see that the differences between DECIMAL and FLOAT need to > made to seem greater than they are. It's probably because most people posting here are people, I expect :P
> (The impression that decimal is exact and float/double are approximate > occurs in more than SQL land - it's a common belief in the C#/.NET > communities as well, usually where people don't really understand how > the types work under the hood.) I think the people who understand less are better guided (i.e. more easily, more emotionally, less intellectually demanding) by the rule of thumb that decimal is exact, float / double is fast, and would thus choose decimal where fidelity in especially important matters like money is required. And the people who understand more aren't confused by the shorthand.
In short, I'm not seeing who the victim is, unless one is trying to eradicate ignorance from all programmers (I hopeless task, I fear).
-- Barry
 Signature http://barrkel.blogspot.com/
Jon Skeet [C# MVP] - 11 Jan 2007 00:18 GMT > > (The impression that decimal is exact and float/double are approximate > > occurs in more than SQL land - it's a common belief in the C#/.NET [quoted text clipped - 10 lines] > In short, I'm not seeing who the victim is, unless one is trying to > eradicate ignorance from all programmers (I hopeless task, I fear). Well, I like to eradicate ignorance where I can. I'm not sure I see that people are better guided by a "fudge" (as I see it) of the truth rather than the truth itself. I've seen many similar situations where over-simplifications (or just not mentioning the context, such as "in base 10" in this case) have ended up doing more harm than good, and where a more direct explanation is actually no harder to understand.
Of course, having a rule of thumb like the one above is absolutely fine - it's when the *explanation* of the rule of thumb skirts around the issue that I worry a bit, that's all.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 11 Jan 2007 02:07 GMT > over-simplifications (or just not mentioning the context, such as "in > base 10" in this case) Since in SQL Server there is only base 10, this is like saying the speed limit sign should have an asterisk stating that this is true as long as you're driving a vehicle and not walking or flying a plane.
A
Mike C# - 11 Jan 2007 16:13 GMT >> over-simplifications (or just not mentioning the context, such as "in >> base 10" in this case) > > Since in SQL Server there is only base 10, this is like saying the speed > limit sign should have an asterisk stating that this is true as long as > you're driving a vehicle and not walking or flying a plane. Not a bad idea :) How about posting all speed limit signs in Base-5?
Mike C# - 10 Jan 2007 22:31 GMT > Do you think it would be hard to explain that it's the *conversion* > which loses the data, rather than giving the impression that FLOAT is > inherently inexact? SQL is an abstraction, as are C#, C++, VB, yadda yadda. Part of that abstraction is designed to pull people away from the underbelly of the machine and give them tools they can use to get a particular job done. While it's useful from a CS perspective to understand the IEEE floating point standard, how chip designers and manufacturers implement FPU's, the differences between binary floating point/binary coded decimal/packed binary coded decimal, the Assembly Language instructions happening under the hood, etc., it's just not *necessary* to have this intimate knowledge to do 99.9999999999999999% of what most SQL developers, DBA's, etc., do. Just like it's not necessary to have Ph.D.'s in mechanical engineering, civil engineering, physics, mathematics, and electrical engineering to drive a car. It might be nice to have all that knowledge, but it's not really necessary.
> I don't doubt that people use FLOAT inappropriately, and that they are > indeed surprised by the lossy conversion, but I don't see how > describing the type as inaccurate is any clearer than saying > (accurately) that it's the conversion which loses the information. Speaking of "inaccurate", the FLOAT type is "approximate", not "inaccurate". I don't recall seeing anyone call the FLOAT type "inaccurate" until this post.
As for why it's described as "approximate"; because SQL is an abstraction, and in SQL (which is not object-oriented) you cannot override operators like the assignment operator. The operators are closely tied to the data types, and to almost all users (and the vast majority of developers) it's enough to know that if you assign a FLOAT variable a value and later retrieve it, it might not be the same value you assigned. That's the level of abstraction brought by SQL - it's several layers of abstraction higher than Assembler Language, C++, or even C#.
Stephany Young - 10 Jan 2007 22:02 GMT I have a fair idea. And they're the same people who get a surprise when they light the barbecue after pouring petrol (gasoline) on the charcoal.
It's the best argument that there is for not allowing users to muck about with things that they don't understand.
<snip>
> fact that they knew the nature of its conversion and storage. Do you know > how many people I have seen choose FLOAT for storing monetary values like > prices, or even quantities? And do you know how surprised they are to > find out that SQL Server changed 28.08 to 28.0800000000002? SQL Server </snip>
Daniel Manes - 10 Jan 2007 22:44 GMT LOL. Good thing I don't bbq!
-Dan "still has his eyebrows" The Man
> I have a fair idea. And they're the same people who get a surprise when they > light the barbecue after pouring petrol (gasoline) on the charcoal. [quoted text clipped - 9 lines] > > </snip> Mike C# - 10 Jan 2007 20:36 GMT > Why claim that it's > the type which is approximate rather than the conversion? Where's the > benefit? Because some MVP will jump in and claim that the conversion cannot technically be called "approximate". After all, the conversion is a hard-coded set of instructions that perform the same function every time and produce the same value given the same input every time. Sounds fairly exact to me.
Jon Skeet [C# MVP] - 10 Jan 2007 20:53 GMT > > Why claim that it's > > the type which is approximate rather than the conversion? Where's the [quoted text clipped - 5 lines] > produce the same value given the same input every time. Sounds fairly exact > to me. That means it's deterministic, not that it's exact. It certainly *is* an approximation, and it's losing information. I'd certainly be perfectly happy to argue the "approximate" side of that debate :)
Do either you or Aaron actually disagree that when you separate the idea of "type" from "conversion" it's the *conversion* which is approximate rather than the FLOAT type itself? If you do, it's probably worth continuing to debate (despite my 5am post!) - if not, it's presumably just a matter of how much we all do/don't care about making the distinction between types and conversions. That's a matter of opinion and probably not worth arguing about.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Mike C# - 10 Jan 2007 22:40 GMT >> > Why claim that it's >> > the type which is approximate rather than the conversion? Where's the [quoted text clipped - 11 lines] > an approximation, and it's losing information. I'd certainly be > perfectly happy to argue the "approximate" side of that debate :) We could debate the exactness there :) After all, the conversion process boils down to a bunch of machine-level instructions that perform the exact same task each and every time. So that in addition to being deterministic, they perform a very exact process. There's nothing approximate about the machine language instructions that perform the conversion; they are exactly defined, exactly performed, and you retrieve an exact result.
> Do either you or Aaron actually disagree that when you separate the > idea of "type" from "conversion" it's the *conversion* which is [quoted text clipped - 3 lines] > the distinction between types and conversions. That's a matter of > opinion and probably not worth arguing about. I can agree with that to an extent; however, the lossy conversion is necessary because of the limitations of the storage format. Consider BCD format which stores each digit in 3 bits. Absolutely no loss occurs during the character conversion process from "28.08" to the internal BCD representation, precisely because the format is not limited in that respect.
SQL is a highly abstract language, and as I mentioned most users and developers don't care about the intricacies of IEEE floating point representation, mantissas (mantissae?), etc. As Aaron pointed out, SQL users tend to focus on higher level aspects, like if I put in 28.08 will I get back 28.08? Or will I get 28.079237402371? That's the power of higher level languages - you don't need to know the intricate details of the hardware to do useful things.
Shuurai - 10 Jan 2007 20:59 GMT > > It doesn't matter to me if that approximation occurred during > > conversion, or translation, or storage, or divine intervention. The [quoted text clipped - 8 lines] > the type which is approximate rather than the conversion? Where's the > benefit? The benefit is that you're left with a useful definition. By your reasoning, no datatype can ever be "exact" so the term is rendered essentially useless. We can't ever store the "exact" value of 1/3 in any meaningful way.
In this case the terminology is more or less describing the way the datatype can be expected to perform. When storing a valid value (28.08) in a DECIMAL column you can expect to get exactly the same value back again. Store the same value in FLOAT and you get an approximation.
Adrian Gallero - 10 Jan 2007 21:13 GMT > We can't ever store the "exact" value of 1/3 in > any meaningful way. Hate to add more noise to this thread, but I dissagree.
You can store 1/3 exactly, just not on base 10. If you use for example to base 24 (who does not use base 24 on a day by day basis...), you can store 1/3 of a day exactly. 8 hours.
Jon Skeet [C# MVP] - 10 Jan 2007 21:22 GMT > > It may not matter to you, but accuracy in description of types matters > > to me. It's a bit like people claiming that "objects are passed by [quoted text clipped - 9 lines] > essentially useless. We can't ever store the "exact" value of 1/3 in > any meaningful way. No, by my definition pretty much all datatypes are exact. (I can't think of any that aren't, offhand.) Conversions between datatypes (at least numeric ones) tend to be lossy.
Let's look at another example - INTEGER. Is INTEGER approximate? I wouldn't say so - but if you tried to convert 28.08 into an INTEGER, you'd certainly lose information, because that conversion is lossy. Is INTEGER described as an "approximate" type by ANSI? I doubt it somehow...
> In this case the terminology is more or less describing the way the > datatype can be expected to perform. When storing a valid value > (28.08) in a DECIMAL column you can expect to get exactly the same > value back again. Store the same value in FLOAT and you get an > approximation. So when there's no lossy conversion involved, there's no loss of information, but when there *is* a lossy conversion involved, information is lost. That's not exactly a surprise, and has nothing to do with the types themselves, as far as I can see.
I don't see why people believe it's so difficult to explain/understand that it's the conversion which is lossy, not the type.
Note that if you pass in floating binary point data rather than decimal data, you *won't* lose information - because hey, again, there's no conversion involved. That isn't clear if you regard the type itself as being lossy, but it's absolutely obvious when you separate conversions from types.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Shuurai - 11 Jan 2007 16:08 GMT > > The benefit is that you're left with a useful definition. By your > > reasoning, no datatype can ever be "exact" so the term is rendered [quoted text clipped - 4 lines] > think of any that aren't, offhand.) Conversions between datatypes (at > least numeric ones) tend to be lossy. Either way you render the definition useless. The way it's used in SQL explains the behavior that can be expected.
> Let's look at another example - INTEGER. Is INTEGER approximate? I > wouldn't say so - but if you tried to convert 28.08 into an INTEGER, > you'd certainly lose information, because that conversion is lossy. > Is INTEGER described as an "approximate" type by ANSI? I doubt it > somehow... When you convert 28.02 into an integer you are storing it as 28. Not an approximation, but exactly 28. The definition describes the behavior of the data type.
> > In this case the terminology is more or less describing the way the > > datatype can be expected to perform. When storing a valid value [quoted text clipped - 6 lines] > information is lost. That's not exactly a surprise, and has nothing to > do with the types themselves, as far as I can see. Exactly - but the definitions are there to describe to a user (like our OP) who wants to know which data type will give him the same number he intended to store.
> I don't see why people believe it's so difficult to explain/understand > that it's the conversion which is lossy, not the type. It's not that it's difficult to explain or understand; it's just not the intent of the defintion.
> Note that if you pass in floating binary point data rather than decimal > data, you *won't* lose information - because hey, again, there's no > conversion involved. That isn't clear if you regard the type itself as > being lossy, but it's absolutely obvious when you separate conversions > from types. Note that the OP asked why he was getting a 28.080000000000002 instead of 28.08 :)
Using the terminology common to SQL, we can tell him to use one of the various data types described as "exact" rather than the "approximate" data type he is using now. Simple, correct answer that solves his problem immediately.
Or, we could give him the nitty-gritty details of converting base2 numbers to base10, which while certainly interesting isn't all that helpful.
Daniel Manes - 11 Jan 2007 19:18 GMT As the OP (!), I can tell you that, yes, this whole debate has been interesting, but the helpful part is knowing which data type to use when.
For the particular case at hand (trying to display lat/long values back to the user exactly as originally entered), I decided to change the data type of my lat/long columns in SQL to DECIMAL(18, 15). Note, more precision than necessary for this application but just wanted to see what happens.
The result: When displayed in a .NET DataGridView, every number displayed had 15 decimal digits. So, a number orginally entered as "23.5" showed up as "23.500000000000000." Not what I was hoping for. Using floats, the DataGridView would display only as many digits as necessary. With decimals, it shows them all, whether they add useful information or not.
Apparently, I want a hybrid of the two. Give me VARDECIMAL please :) I want the ability to store any decimal number within a reasonable range and have it spit back exactly what I entered but without any unnecessary decimal places.
Anyway, my stop-gap measure was to change the format of the affected columns in the DataGridView to "0.000," which limits the displayed precision to three decimal places. This certainly improves readability, but it will lead to a loss of accuracy if the user tries to edit one of these formatted cell, because the missing digits don't naturally reappear.
Of course, I could write some custom code to make the remaining digits appear, but, besides being a pain in the a.s, if all of a sudden 13 trailing zeros appear when the user goes into edit mode, that's not really a good thing for usability, etc.
I'm starting to wonder if I should just store the lat/long values as VARCHAR and convert them to decimals when it comes time to actually display them on the map or do calculations with them, but something tells me that will cause a whole other set of problems.
Color me frustrated.
Help greatly appreciated.
-Dan
> > > The benefit is that you're left with a useful definition. By your > > > reasoning, no datatype can ever be "exact" so the term is rendered [quoted text clipped - 56 lines] > numbers to base10, which while certainly interesting isn't all that > helpful. Razvan Socol - 11 Jan 2007 19:34 GMT > [...] Apparently, I want a hybrid of the two. Give me VARDECIMAL please :) [...] Actually, there is something called vardecimal in SQL Server 2005 SP2, but I think you are not refferring to this. It is a "storage format", not a "data type" (it is enabled at the table level, after configuring a database-level option). It allows decimal columns to use less space, but it only works on Enterprise Edition (and Developer Edition and Evaluation Edition), and only for databases using the Simple recovery model. For more informations, see the updated Books Online. (Note: Currently, SP2 is in the CTP phase)
Razvan
Aaron Bertrand [SQL Server MVP] - 11 Jan 2007 19:39 GMT > The result: When displayed in a .NET DataGridView, every number > displayed had 15 decimal digits. So, a number orginally entered as > "23.5" showed up as "23.500000000000000." Not what I was hoping for. So why don't you use the ToString() and supply format arguments (e.g. #,##0.000)? That's what they're there for, and that tier is the proper place to handle presentation / formatting. The database's job is to store and retrieve data, not to make it look pretty. What if SQL Server returned only non-zero trailing decimal values, but I *want* all the decimal places, for example to make sure the values have a consistent number of digits? If the database determined how to present the data, one of us would be having fits. In the meantime, you have a good workaround, imho.
> Apparently, I want a hybrid of the two. Give me VARDECIMAL please :) SQL Server 2005's Service Pack 2 adds VARDECIMAL, but it is not implemented the way you desire. It is about storage, not presentation. Here are a couple of articles that describe it.
http://sqlservergems.blogspot.com/2006/11/vardecimal.html
http://weblogs.sqlteam.com/mladenp/archive/2006/11/10/19546.aspx
http://blogs.msdn.com/sqlserverstorageengine/archive/2007/01/05/boundary-conditi ons-for-enabling-vardecimal-storage-format.aspx
> Anyway, my stop-gap measure was to change the format of the affected > columns in the DataGridView to "0.000," which limits the displayed [quoted text clipped - 7 lines] > trailing zeros appear when the user goes into edit mode, that's not > really a good thing for usability, etc. That is debatable, I suppose. If the users are going to be editing the 13th decimal place when it is non-zero, it is arguable that they aren't going to complain much to see digits there when they are simply zeros. In any case, SQL Server is not in a position to magically fix this problem for you, nor would I expect it to be.
> I'm starting to wonder if I should just store the lat/long values as > VARCHAR and convert them to decimals when it comes time to actually > display them on the map or do calculations with them, but something > tells me that will cause a whole other set of problems. Yes, it will.
Daniel Manes - 11 Jan 2007 20:40 GMT > > The result: When displayed in a .NET DataGridView, every number > > displayed had 15 decimal digits. So, a number orginally entered as > > "23.5" showed up as "23.500000000000000." Not what I was hoping for.
> So why don't you use the ToString() and supply format arguments (e.g. > #,##0.000)? That's what they're there for, and that tier is the proper [quoted text clipped - 4 lines] > the database determined how to present the data, one of us would be having > fits. In the meantime, you have a good workaround, imho. Two things:
1. With floats, you can specify the size of the mantissa, but the decimal place can be located anywhere. With decimals, you need to specify where the decimal place is located or it defaults to zero (no decimal places). In .NET, double and decimal seem to behave pretty much the same in this regard--the decimal point really can float. Hence my thought that VARDECIMAL would be nice.
2. Over in .NET, when you try to display a float column in a DataGridView, it truncates trailing zeros, but when you display a decimal column, you get trailing zeros galore. I would be more than happy to manually set the format of any decimal column in my DataGridView to "act like you're displaying a float even though you're not," but I have yet to figure out how.
> > Of course, I could write some custom code to make the remaining digits > > appear, but, besides being a pain in the a.s, if all of a sudden 13 > > trailing zeros appear when the user goes into edit mode, that's not > > really a good thing for usability, etc.
> That is debatable, I suppose. If the users are going to be editing the 13th > decimal place when it is non-zero, it is arguable that they aren't going to > complain much to see digits there when they are simply zeros. Let's go back to what the user is trying to do. I'm trying to enter a latitude, say, 25.5 degrees. I type in 25.4. Later I realize, oops, that should have been 25.5. I go in to edit the cell.
The cell now contains 25.400000000000000, but all I see is 00000000, because the column is not wide enough to display all the digits. Now I have to put the cursor in the cell, hold down left arrow (or try to nudge it with the mouse) until I can once again see the 4. Then I need to carefully select just the 4 and type a 5.
If the cell simply contained 25.4, I could just place cursor to the right of the 4, hit backspace, type 5. Done.
So, for this scenario at least, I see the usability difference as quite non-debatable.
> In any case, > SQL Server is not in a position to magically fix this problem for you, nor > would I expect it to be. It's really an interaction between SQL Server and .NET, so I'm not saying it's SQL Server's "fault," but I can't see how it would *hurt* to have something like a vardecimal.
There are four combinations of things a programmer may need:
1. Floating decimal point/no trailing zeros + faithful base-10 storage/retrieval 2. Floating decimal point/no trailing zeros + efficient storage/processing 3. Fixed decimal point/mandatory trailing zeros + faithful base-10 storage/retrieval 4. Fixed decimal point/mandatory trailing zeros + efficient storage/processing
Right now, I can only do 2 and 3, and I think that's a problem.
-Dan
Aaron Bertrand [SQL Server MVP] - 11 Jan 2007 21:16 GMT > Let's go back to what the user is trying to do. I'm trying to enter a > latitude, say, 25.5 degrees. I type in 25.4. Later I realize, oops, > that should have been 25.5. I go in to edit the cell. How many times is the user going to enter 25.4672362763232 and want to edit that?
A
Daniel Manes - 11 Jan 2007 21:33 GMT > > Let's go back to what the user is trying to do. I'm trying to enter a > > latitude, say, 25.5 degrees. I type in 25.4. Later I realize, oops, > > that should have been 25.5. I go in to edit the cell.
> How many times is the user going to enter 25.4672362763232 and want to edit > that? If you're trying to say my example was an exaggeration, fine. Sometimes people do that to make a point. All I'm saying is that trailing zeros are sometimes a nuisance, and it would be nice if there were an easy way to get rid of them without losing any data in the process.
-Dan
Aaron Bertrand [SQL Server MVP] - 11 Jan 2007 21:59 GMT > If you're trying to say my example was an exaggeration, fine. Sometimes > people do that to make a point. All I'm saying is that trailing zeros > are sometimes a nuisance, and it would be nice if there were an easy > way to get rid of them without losing any data in the process. I'm sure in C# you could work out a way to format the data as a string, removing trailing zeros but keeping all relevant decimal places.
A
Mike C# - 11 Jan 2007 21:14 GMT > As the OP (!), I can tell you that, yes, this whole debate has been > interesting, but the helpful part is knowing which data type to use [quoted text clipped - 12 lines] > necessary. With decimals, it shows them all, whether they add useful > information or not. 6 digits past the decimal point is accurate to about 4.4 inches in geocoding. Five digits past the decimal point is about 3.6 feet of accuracy. Unless you're trying to use a spy satellite to read license plates, 5 places past the decimal point should be plenty.
Daniel Manes - 11 Jan 2007 22:07 GMT > 6 digits past the decimal point is accurate to about 4.4 inches in > geocoding. Five digits past the decimal point is about 3.6 feet of > accuracy. Unless you're trying to use a spy satellite to read license > plates, 5 places past the decimal point should be plenty. Thanks, Mike,
That's exactly what I was just trying to calculate. I was figuring accurate to a yard would be plenty and was just going to base my decimal digits on that. I did my calculations a little differently, though, since users can enter in either degrees or degs-mins-secs. I somehow came up with ~two feet per hundredth of a second (centisecond?) of arc. Which is equivalent to 0.0000028 degrees. Either way, looks like decimal(9, 6) ought to be enough, even for detecting relatively small movements.
-Dan
Stephany Young - 11 Jan 2007 22:23 GMT Rule of thumb is:
For decimal places of decimal degrees of longitude at the equator:
3rd decimal place is sub-Kilometer
4th decimal place is sub-Hectometer
5th decimal place is sub-Decameter
6th decimal place is sub-meter
7th decimal place is sub-centimeter
8th decimal place is sub-millimeter
>> 6 digits past the decimal point is accurate to about 4.4 inches in >> geocoding. Five digits past the decimal point is about 3.6 feet of [quoted text clipped - 13 lines] > > -Dan Stephany Young - 11 Jan 2007 22:37 GMT Sorry! Correction.
Rule of thumb is:
For decimal places of decimal degrees of longitude at the equator:
1st decimal place is sub-Kilometer
2nd decimal place is sub-Hectometer
3rd decimal place is sub-Decameter
4th decimal place is sub-meter
5th decimal place is sub-decimeter
6th decimal place is sub-centimeter
7th decimal place is sub-millimeter
> Rule of thumb is: > [quoted text clipped - 29 lines] >> >> -Dan Mike C# - 12 Jan 2007 14:45 GMT > That's exactly what I was just trying to calculate. I was figuring > accurate to a yard would be plenty and was just going to base my [quoted text clipped - 4 lines] > like decimal(9, 6) ought to be enough, even for detecting relatively > small movements. There's a blog entry over at geocoder.us where he breaks down the digits of precision in a table format: http://geocoder.us/blog/2006/03/23/how-many-digits-are-enough/. His table has decimals, degrees, statute miles, feet and inches. For East-West distances it's different (since the distance between lines of longitude gets shorter as you move away from the equator), but it's pretty close to the same thing over the continental U.S. If you're geocoding much farther north or far south of the equator, you will probably need to make adjustments. At any rate, 6 digits past the decimal point should be adequate regardless of where you're geocoding.
Jon Skeet [C# MVP] - 11 Jan 2007 21:04 GMT > > > The benefit is that you're left with a useful definition. By your > > > reasoning, no datatype can ever be "exact" so the term is rendered [quoted text clipped - 7 lines] > Either way you render the definition useless. The way it's used in SQL > explains the behavior that can be expected. ... at the cost of making it less clear what's actually going on, IMO.
> > Let's look at another example - INTEGER. Is INTEGER approximate? I > > wouldn't say so - but if you tried to convert 28.08 into an INTEGER, [quoted text clipped - 5 lines] > an approximation, but exactly 28. The definition describes the > behavior of the data type. When you convert 28.08 into a float, you are storing it as 28.0799999999999982946974341757595539093017578125. Not an approximation, but exactly 28.0799999999999982946974341757595539093017578125
Now, that's certainly less intuitive than 28.08 -> 28, but it's still an exact number.
> > So when there's no lossy conversion involved, there's no loss of > > information, but when there *is* a lossy conversion involved, [quoted text clipped - 4 lines] > OP) who wants to know which data type will give him the same number he > intended to store. Unfortunately, by doing so it loses sight of the fact that floats store numbers exactly - they just can't store all decimal numbers exactly.
> > I don't see why people believe it's so difficult to explain/understand > > that it's the conversion which is lossy, not the type. > > It's not that it's difficult to explain or understand; it's just not > the intent of the defintion. I believe that definitions should be *accurate*, and in my view (I'm afraid no-one's changed my mind, and I don't expect I've changed anyone else's mind) the definition *isn't* accurate.
> > Note that if you pass in floating binary point data rather than decimal > > data, you *won't* lose information - because hey, again, there's no [quoted text clipped - 9 lines] > data type he is using now. Simple, correct answer that solves his > problem immediately. We could have told him that without giving the impression that the float type is "approximate" though:
"When a decimal number is converted into a FLOAT, information can be lost. It's a lossy conversion because not all decimal numbers can be exactly represented as binary floating point numbers. Use the DECIMAL type instead to keep the exact decimal value."
> Or, we could give him the nitty-gritty details of converting base2 > numbers to base10, which while certainly interesting isn't all that > helpful. Does the above go into nitty gritty? Nope - but it doesn't give a misleading impression either, IMO.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Shuurai - 12 Jan 2007 14:56 GMT Jon Skeet [ C# MVP ] wrote:
> > > > The benefit is that you're left with a useful definition. By your > > > > reasoning, no datatype can ever be "exact" so the term is rendered [quoted text clipped - 9 lines] > > ... at the cost of making it less clear what's actually going on, IMO. While making it more clear what can be expected as a result. Those of us who know how it's working don't need the definition as much as someone who's just looking to get the same number out that they put in.
> > > Let's look at another example - INTEGER. Is INTEGER approximate? I > > > wouldn't say so - but if you tried to convert 28.08 into an INTEGER, [quoted text clipped - 10 lines] > approximation, but exactly > 28.0799999999999982946974341757595539093017578125 And you'll note that 28.0 != 28.0799999999999982946974341757595539093017578125
> Now, that's certainly less intuitive than 28.08 -> 28, but it's still > an exact number. Yes, it's an exact number. It's just not the exact number that we stored.
> > > So when there's no lossy conversion involved, there's no loss of > > > information, but when there *is* a lossy conversion involved, [quoted text clipped - 7 lines] > Unfortunately, by doing so it loses sight of the fact that floats store > numbers exactly - they just can't store all decimal numbers exactly. So your argument is that they store SOME numbers exactly?
> > > I don't see why people believe it's so difficult to explain/understand > > > that it's the conversion which is lossy, not the type. [quoted text clipped - 5 lines] > afraid no-one's changed my mind, and I don't expect I've changed anyone > else's mind) the definition *isn't* accurate. Definitions need to have a balance between usefulness and accuracy. Your definition, even if technically accurate (which I don't believe it is) just isn't useful.
> > > Note that if you pass in floating binary point data rather than decimal > > > data, you *won't* lose information - because hey, again, there's no [quoted text clipped - 17 lines] > exactly represented as binary floating point numbers. Use the DECIMAL > type instead to keep the exact decimal value." Which is absolutely correct. The definition is used to describe what can be expected of the data type. It doesn't address (or need to address) anything that is going on in the background. The point of the definition is to tell the user which types will store 28.08 as 28.08.
> > Or, we could give him the nitty-gritty details of converting base2 > > numbers to base10, which while certainly interesting isn't all that > > helpful. > > Does the above go into nitty gritty? Nope - but it doesn't give a > misleading impression either, IMO. Not misleading; just not particularly useful.
Mike C# - 12 Jan 2007 16:00 GMT > So your argument is that they store SOME numbers exactly? From his posts so far his argument seems to boil down to "all data types are stored as bits which are either *exactly* on or *exactly* off".
Of course by that reasoning you could create a data type that completely disregards anything you put in and stores something entirely different, but it would be just as *exact* as any other data type. It would be interesting (though admittedly pointless) to have a SURPRISE data type where you could put in "28.08" and retrieve "A suffusion of yellow", "999999.9283746", "c^2 = a^2 + b^2", "3.141592", or "9i + 12"; depending on what the computer decided to store in it at any given moment. But since bits are exactly either on or off, this would be another "exact" data type.
Another one of those "It's technically accurate, but completely useless" type things.
Jon Skeet [C# MVP] - 12 Jan 2007 22:21 GMT > > > Either way you render the definition useless. The way it's used in SQL > > > explains the behavior that can be expected. [quoted text clipped - 4 lines] > us who know how it's working don't need the definition as much as > someone who's just looking to get the same number out that they put in. Well, I don't think it makes it any clearer, but that's clearly a matter of opinion.
> > > When you convert 28.02 into an integer you are storing it as 28. Not > > > an approximation, but exactly 28. The definition describes the [quoted text clipped - 7 lines] > And you'll note that 28.0 != > 28.0799999999999982946974341757595539093017578125 (I assume you meant 28.08 rather than 28.0?)
Yes, just as 28 != 28.08 in your integer example. Just as with integer you are storing exactly 28 (i.e. the closest integer to 28.08 with appropriate rounding behaviour), with float you're storing exactly the closest float to 28.08 with the specified rounding behaviour.
> > Now, that's certainly less intuitive than 28.08 -> 28, but it's still > > an exact number. > > Yes, it's an exact number. It's just not the exact number that we > stored. Well, it's the exact number that you *stored*, just not the exact number which was the source of the conversion.
My point is that the number stored in the float example is just as exact as the number stored in the integer example. Both types have rules for converting an arbitrary number to an exact value within the set of representable values for that type. Where do you see the difference that makes integer exact and float approximate?
> > > Exactly - but the definitions are there to describe to a user (like our > > > OP) who wants to know which data type will give him the same number he [quoted text clipped - 4 lines] > > So your argument is that they store SOME numbers exactly? All types available only store *some* numbers exactly. Decimal doesn't store all numbers exactly: "a third" is a number, but it can't be stored exactly. Be careful to distinguish between "decimal numbers" and "numbers".
> > I believe that definitions should be *accurate*, and in my view (I'm > > afraid no-one's changed my mind, and I don't expect I've changed anyone [quoted text clipped - 3 lines] > Your definition, even if technically accurate (which I don't believe it > is) just isn't useful. Unsurprisingly, I disagree. I don't expect either of us to persuade the other.
> > We could have told him that without giving the impression that the > > float type is "approximate" though: [quoted text clipped - 8 lines] > address) anything that is going on in the background. The point of the > definition is to tell the user which types will store 28.08 as 28.08. It's not clear to me what you're regarding as "absolutely correct". The bit in quotes in my post was my suggested alternative to just saying that float is an approximate type. If you agree that my text is correct, and tells the user which type will store 28.08 as 28.08, what's the disadvantage of using my text?
> > > Or, we could give him the nitty-gritty details of converting base2 > > > numbers to base10, which while certainly interesting isn't all that [quoted text clipped - 4 lines] > > Not misleading; just not particularly useful. What does it *not* address that would be useful?
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] - 12 Jan 2007 22:30 GMT > My point is that the number stored in the float example is just as > exact as the number stored in the integer example. Both types have > rules for converting an arbitrary number to an exact value within the > set of representable values for that type. Where do you see the > difference that makes integer exact and float approximate? Maybe it's the surprise factor.
Maybe it's that we are used to being able to supply *less* than the number of acceptable significant digits, and not having the values change on us after the fact.
Maybe it's because integers are simpler, and the rounding of 28.08 -> 28 is much more intuitive, clear, expected, and predictable. This is stuff we all learned in grade school. I think it takes a much better understanding of mathematics to draw the same conclusions from the approximation of 28.08 -> 28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
A
Jon Skeet [C# MVP] - 12 Jan 2007 22:44 GMT > > My point is that the number stored in the float example is just as > > exact as the number stored in the integer example. Both types have [quoted text clipped - 7 lines] > of acceptable significant digits, and not having the values change on us > after the fact. FWIW, I've always thought that describing floating binary point types as being accurate to a certain number of *decimal* digits is asking for trouble.
> Maybe it's because integers are simpler, and the rounding of 28.08 -> 28 is > much more intuitive, clear, expected, and predictable. This is stuff we all > learned in grade school. I think it takes a much better understanding of > mathematics to draw the same conclusions from the approximation of 28.08 -> > 28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Yes, I dare say it takes a bit more understanding if you want to know exactly why it stores the values it does.
I would even have been *somewhat* happy with an original answer (when I started this whole business) of "yes, we know that float is exact really, but explaining it as approximate is easier than bringing conversions into the conversation". I would still have argued against that, *but* the fact is that everyone's been arguing that float really *is* inherently approximate, and from a mathematical standpoint which doesn't have a base 10 bias I just can't accept that as being true. Just because a conversion takes a bit more understanding doesn't mean the value stored isn't exact.
All the types we've talked about (integer, decimal, float) have the following in common:
1) They have a well-defined set of values they can exactly represent.
2) If you ask them to store a value which isn't in that set, they will store (exactly) an approximation to that value. The rules for that approximation are also well-defined.
3) If you ask them to store a value which *is* in that set, they will store that value exactly.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Mike C# - 13 Jan 2007 02:45 GMT >> > My point is that the number stored in the float example is just as >> > exact as the number stored in the integer example. Both types have [quoted text clipped - 46 lines] > 3) If you ask them to store a value which *is* in that set, they will > store that value exactly. And the difference is: If you store a value in an INTEGER or DECIMAL that is coincident with that type you do not risk introducing error or data loss. With REAL or FLOAT you risk introducing error or losing data when storing values that are coincident with the type.
Jon Skeet [C# MVP] - 13 Jan 2007 08:23 GMT <snip>
> > All the types we've talked about (integer, decimal, float) have the > > following in common: [quoted text clipped - 12 lines] > With REAL or FLOAT you risk introducing error or losing data when storing > values that are coincident with the type. See other replies. By defining =3Fcoincident with the type=3F to require evaluation from a string of *decimal* digits, you=3Fre shifting the goalposts from where I at least interpreted them to be.
I had originally understood your =3Fvalues coincident with the type=3F to be =3Fvalues which can be exactly represented by that type=3F, and that=3Fs the definition I was considering when I agreed with your definition of an exact data type. If you look up =3Fcoincident=3F with a Google search of =3Fdefine: coincident=3F then the definitions given there are a lot closer to =3Fvalues which can be exactly represented by that type=3F than =3Fworks with a given length of decimal digits=3F (a simplification of what you said, but the main thrust).
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
Mike C# - 13 Jan 2007 19:03 GMT |
|