Subject: NUMBERVALUE and VALUE (was: [office-formula] Groups - OpenFormulaSpecification 2007-03-22-wheeler (ODT) (openformula-20070322-wheeler.odt)uploaded)
Hi David,

Sorry for the delay, but Monday I had a day off and yesterday I was busy with other things. I actually also wanted to send out this mail earlier today but needed some time to make up my mind; see below.

On Friday, 2007-03-23 22:38:11 +0000, David A. Wheeler wrote:

> Okay folks - to make ideas more concrete, I've posted another modified
> version of the document. Consider this a "non-official" version, feel free
> to continue working off the March 22 release.

I'll take your modified version as a basis to work on. Browsing through the changes I think we should accept most, but there are some details I'm not content with.

> VALUE is now clearly locale-dependent, and there are specific requirements
> for locale en_US (so that we can test it!).

It now says that "commas are ignored", which isn't quite correct. The group separator should be "ignored" only if it is used according to the locale's rules, which for most locales makes it equal to a thousands separator. A string "1,2,3" usually does not convert to the numeric value 123, but generates an error instead. And of course 1,.2 is invalid as well. Therefore the regexp also isn't correct:

  [+|-]? \$?((\.[0-9]+)|([0-9,]+(\.[0-9]+)?([eE][+-]?[0-9]+)?))%?

Note also that blanks between the sign and the currency symbol are optional, as are blanks between the currency symbol and the digits. Some applications do not accept the sign before '$', some do not accept blanks in between. As both the integer part and the fractional part with a leading decimal separator can be present standalone, some simplification can be made there. I also doubt that applications should parse the percent sign after an exponential value. I think that for the minimal requirement the regexp should be

  [+|-]?\$?([0-9]+(,[0-9]{3})*)?(\.[0-9]+)?(([eE][+-]?[0-9]+)|%)?

if I didn't make a mistake; could someone verify? However, that would still not catch the cases where a separator was inserted only every 6th digit.
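To make the difference between the two regexps easier to check, here is a minimal Python sketch that runs both against a few of the strings mentioned above. The names DRAFT, PROPOSED and accepts are made up for this illustration, and the blank after the sign in the draft regexp is assumed to be a formatting artifact and is left out.

```python
import re

# Regexp from the draft, with the blank after the sign removed (assumption).
# "[0-9,]+" lets commas appear anywhere in the integer part.
DRAFT = re.compile(
    r"[+|-]?\$?((\.[0-9]+)|([0-9,]+(\.[0-9]+)?([eE][+-]?[0-9]+)?))%?"
)

# Regexp proposed in this message: every comma must be followed by
# exactly three digits, and exponent and percent sign exclude each other.
PROPOSED = re.compile(
    r"[+|-]?\$?([0-9]+(,[0-9]{3})*)?(\.[0-9]+)?(([eE][+-]?[0-9]+)|%)?"
)


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["1,234.5", "$1,234", "1,2,3", "1,.2", "1e5%", "123456,789", "|5", ""]
    for text in samples:
        print(repr(text), accepts(DRAFT, text), accepts(PROPOSED, text))
```

Run this way, the proposed regexp rejects "1,2,3", "1,.2" and "1e5%", which the draft regexp accepts, and it still accepts "123456,789", as noted above. Two further things may be worth a look when verifying: because every part of the proposed regexp is optional it also matches the empty string, and inside a character class "|" is a literal, so "[+|-]" in both regexps accepts "|" as a sign.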
> I've also added NUMBERVALUE,
> which takes two parameters (the text to convert, and the character to use
> as the decimal point)... that won't handle ALL locales, but it'll handle a
> whole lot without needing to deal with "all possible locales".

Given the wording

| Regardless of the current locale, the implementation shall accept text
| representations that match this regular expression when DecimalPoint is
| “.” (a period):

  [+|-]? \$?((\.[0-9]+)|([0-9,]+(\.[0-9]+)?([eE][+-]?[0-9]+)?))%?

I don't see why this should parse a '$' currency symbol if the decimal separator is a period. I also don't see why it should parse a comma group separator but not others like a blank or apostrophe, or why it would not parse group separators at all if the decimal separator is not a period.

Instead, I propose to leave out currency symbols and introduce a 3rd parameter that specifies the group separator to be used, defaulting to a comma if the decimal separator is a period, and defaulting to a period if the decimal separator is a comma. Here indeed the group separator could be ignored instead of requiring a specific grouping, so the regexp could read

  [+|-]?([0-9]+(,[0-9])*)?(\.[0-9]+)?(([eE][+-]?[0-9]+)|%)?

I also don't think that having

| If the provided text does not match the pattern, an implementation must
| at least accept the same formats as VALUE does, and should accept the
| given DecimalPoint where appropriate (e.g., HH:MM:SS.sss or HH:MM:SS,sss
| depending on the DecimalPoint value).

is actually a good idea, for two reasons. First, if we define NUMBERVALUE it should serve one specific purpose: parse numbers. Not dates, not times, or anything else. Second, if NUMBERVALUE was used with a specific separator, the user did that on purpose; otherwise he could have used the locale-dependent VALUE. If NUMBERVALUE couldn't parse a string, falling back to VALUE with the decimal separator exchanged most certainly would not deliver the result intended, even if it did parse _something_.
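As a rough illustration of the three-parameter idea, here is a minimal Python sketch. It is not taken from the draft: the function name numbervalue, its parameter names and the use of ValueError to signal an error are all assumptions. It does not use the regexp above literally; it reads "the group separator could be ignored" as "a group separator is tolerated between any digits of the integer part", and for a decimal separator other than period or comma, where no default is stated, it assumes that no group separator is accepted.

```python
import re


def numbervalue(text, decimal_sep=".", group_sep=None):
    """Hypothetical sketch: parse a plain number, no currency, dates or times."""
    if group_sep is None:
        # Proposed defaults: "," for ".", "." for ","; otherwise none (assumption).
        group_sep = {".": ",", ",": "."}.get(decimal_sep, "")
    if len(decimal_sep) != 1 or group_sep == decimal_sep:
        raise ValueError("invalid separators")

    d = re.escape(decimal_sep)
    # Group separators are only tolerated between digits of the integer part.
    groups = rf"(?:{re.escape(group_sep)}[0-9]+)*" if group_sep else ""
    pattern = re.compile(
        rf"(?P<sign>[+-]?)"
        rf"(?P<int>[0-9]+{groups})?"
        rf"(?:{d}(?P<frac>[0-9]+))?"
        rf"(?:(?P<exp>[eE][+-]?[0-9]+)|(?P<pct>%))?"
    )

    m = pattern.fullmatch(text)
    # Require at least an integer or a fractional part; no fallback to VALUE.
    if m is None or (m["int"] is None and m["frac"] is None):
        raise ValueError(f"not a number: {text!r}")

    int_part = m["int"] or "0"
    if group_sep:
        int_part = int_part.replace(group_sep, "")
    number = float(f"{m['sign']}{int_part}.{m['frac'] or '0'}{m['exp'] or ''}")
    return number / 100 if m["pct"] else number


if __name__ == "__main__":
    print(numbervalue("1,234.5"))            # period decimal, default comma groups
    print(numbervalue("1.234,5", ","))       # comma decimal, default period groups
    print(numbervalue("1'234.5", ".", "'"))  # explicit apostrophe group separator
```

In this sketch a string such as "$1.00" or "12:30:00" is rejected outright instead of being handed to a VALUE-like fallback, which is the behaviour argued for above.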
  Eike

-- 
Automatic string conversions considered dangerous. They are the GOTO statements of spreadsheets.
--Robert Weir on the OpenDocument formula subcommittee's list.