fix: Uniformize bigint construction #3375
OK, I have marked this as draft for two reasons:
Looking forward to your call here, then I can complete this PR to be ready for review. |
Thanks for putting together this draft PR. I think in general each factory function ( Right now, the rounding behavior of About the The improvements in the examples are nice. I'm not sure though why you have added indentation to some of the inline comments |
What facilities should we then provide to support a use that needs a bignumber but needs to avoid the creation of illusory precision where there is none? That is exactly our use case, and mathjs is generally supportive of avoiding spurious precision. So I am fine to change this particular signature, but we need a safety mechanism, and then I need to put it in this PR, since it was created and filed precisely to allow us to adopt v14 of mathjs, and our adoption is waiting on such a mechanism. Thanks for your advice/design here. |
I am fine of course with keeping the rounding config. The current behavior is |
I think History in the source file is much more maintainable than central history: you must edit the function's source file to make the change anyway, so you add a line to the History. Once every function has such a section, I think it will become very natural, and the reviewer can easily check it was done just by glancing at the diffs. However, if I need to augment this with an autoparser to get it merged, let me know and I will do it -- but I would recommend inserting the results in the source file in the History section so you can see them there and possibly tweak them; otherwise I think they are not going to come out very readable. |
It is because and only because the doc test ignores indented comments, and so a cheap way to fix a doc test that is failing because of free form comments that confuse its attempts to grab the expected return value of an expression is to just indent the comment. This behavior came from a number of pre-existing doc sections that used indented comments as sort of section headers. So the "ignoring indented comments" exception was already there, and I was using it to get more tests to pass. The goal of course is to get to all doc tests passing, at which point they can finally switch to being enforced. |
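To make the indentation trick concrete, here is a minimal sketch of how a doc-test extractor with such an exception could behave. It is not the actual mathjs doc-test code: the function name `collectExpectations` and the exact parsing rules are assumptions chosen only to illustrate why indenting a free-form comment stops it from being read as an expected value.

```javascript
// Hypothetical sketch, NOT the real mathjs doc-test implementation.
// Assumed rules: an unindented comment is read as the expected value of the
// preceding expression; an indented comment is ignored entirely.
function collectExpectations (lines) {
  const result = []
  let pending = null // expression still waiting for its expected value
  for (const line of lines) {
    // Indented comment: treated as free-form commentary and skipped.
    if (/^\s+\/\//.test(line)) continue
    if (line.startsWith('//')) {
      // Unindented comment: taken as the expected value of the pending expression.
      if (pending !== null) {
        result.push({ expression: pending, expected: line.slice(2).trim() })
        pending = null
      }
      continue
    }
    const idx = line.indexOf('//')
    if (idx >= 0) {
      // Expression with a trailing comment such as "// returns 2".
      const expected = line.slice(idx + 2).trim().replace(/^returns\s+/, '')
      result.push({ expression: line.slice(0, idx).trim(), expected })
      pending = null
    } else if (line.trim() !== '') {
      pending = line.trim()
    }
  }
  return result
}
```

Under these assumed rules, a note written flush left after `f(1)` would be captured as the "expected value" of `f(1)`, while the same note indented by two spaces would simply be skipped.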
|
I personally would be happier with options on the conversion functions than new functions -- there are a lot of functions, which gets a bit unwieldy. This PR currently makes a Note there is a need for some safety mechanism for number => bigint as well as for bigint => number, because if someone tries to convert a
Yes, I get it. Happy to discuss in the history issue #3341. Does that mean you would like me to remove the History sections I added and the code that supported them from this PR, or should I leave them in as not doing any harm, and to keep around the information I generated, and they can always be massaged to the decided-on scheme later? Fine either way, just want to know what to do to get this PR in shape in the most direct way possible.
I will give it a try and let you know. |
OK, got through this. Unsurprisingly (with the free form nature of doc comments) it was a bit more convoluted than
I will work on the last one ASAP and look forward to your feedback on the first two. |
(!) Ok yes let's go for a solution with (2) Thanks. Can you please remove the History sections from this PR? (maybe you can create a new temporary feature branch containing the current History sections for later reuse, so we don't throw away your work) (5) Nice that you managed to get the issue with example lines with comments solved 👍 |
Sure, I can add other safe options. Bigint => number safety is clear, as is BigNumber => number. But number => bignumber is much murkier. Nominally, every number can be exactly represented as a bignumber. But I think the "safety" you have in mind is not wanting 1.20000000001 to become a bignumber because it will make what seems to be roundoff error seem precise. I think we have discussed this before -- there is an algorithm for guessing whether a number comes from a precise rational number, based on continued fractions. But we didn't seem to reach consensus on this before. So my preference would be not to address that type of safety in this PR lest it become bogged down. |
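For readers unfamiliar with the continued-fraction idea mentioned above, the following is a rough sketch of one way it could work. It is not necessarily the algorithm discussed in the earlier issue: the helper name `guessRational` and the `maxDenominator` cutoff are assumptions made for illustration. The idea is to expand the number as a continued fraction and check whether a fraction with a small denominator reproduces the double exactly.

```javascript
// Hypothetical helper, not part of mathjs: find a simple fraction p/q
// (q <= maxDenominator) from the continued-fraction expansion of x, and
// report whether that fraction reproduces x exactly as a double.
function guessRational (x, maxDenominator = 1000) {
  if (!Number.isFinite(x)) throw new RangeError('x must be finite')
  const sign = x < 0 ? -1 : 1
  const target = Math.abs(x)
  let rest = target
  // Convergents use the standard recurrence
  // p_k = a_k * p_{k-1} + p_{k-2},  q_k = a_k * q_{k-1} + q_{k-2}.
  let [pPrev, p] = [0, 1]
  let [qPrev, q] = [1, 0]
  for (let i = 0; i < 64; i++) {
    const a = Math.floor(rest)
    const pNext = a * p + pPrev
    const qNext = a * q + qPrev
    if (qNext > maxDenominator) break // next convergent is too complicated
    ;[pPrev, p] = [p, pNext]
    ;[qPrev, q] = [q, qNext]
    const frac = rest - a
    if (frac === 0 || p / q === target) break // expansion terminated
    rest = 1 / frac
  }
  return { numerator: sign * p, denominator: q, exact: (sign * p) / q === x }
}
```

With this sketch, `guessRational(0.75)` finds 3/4 and reports it as exact, whereas `guessRational(1.20000000001)` stops at 6/5 and reports it as inexact, which is the kind of signal a "safe" number => bignumber conversion could use to refuse the conversion.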
OK I have moved the History work to a branch feat/function_history in the main repo. |
Thanks!
Yes, agreed, let's address those ideas in another PR 👍 |
The relevant discussion is in #1485. |
Slightly confused: am implementing the |
Hm. Let me think. Would we do anyone a favor by returning So then the description of the option "safe" would be something like: When true, it will only convert when this will not lead to a loss of information, like when converting a number with digits into a bigint. When false, it will convert also when this will lead to a loss of information. In both cases, an exception will be thrown when the conversion is not possible at all, like when trying to convert the string "foo" into a number. Does that make sense? |
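As a rough illustration of the "safe" semantics described in this comment, here is a small sketch in plain JavaScript. It is not the mathjs implementation: the function name `toBigInt`, the `{ safe }` option shape, and the choice to round in the unsafe case are all assumptions made for the example.

```javascript
// Hypothetical sketch, NOT the mathjs implementation: convert a number or a
// string to a bigint, with `safe` deciding whether lossy conversions throw.
function toBigInt (value, { safe = true } = {}) {
  if (typeof value === 'string') {
    const text = value.trim()
    // A string that is not an integer literal cannot be converted at all,
    // so this throws regardless of `safe`.
    if (!/^-?\d+$/.test(text)) {
      throw new SyntaxError(`Cannot convert "${value}" to bigint`)
    }
    return BigInt(text)
  }
  if (typeof value !== 'number' || !Number.isFinite(value)) {
    throw new TypeError(`Cannot convert ${String(value)} to bigint`)
  }
  // With safe: true, refuse anything that is not an exactly representable integer.
  if (safe && !Number.isSafeInteger(value)) {
    throw new RangeError(`${value} cannot be converted to bigint without loss of information`)
  }
  // With safe: false, convert anyway (rounding here is an assumption).
  return BigInt(Math.round(value))
}
```

In this sketch `toBigInt(24.5)` throws because information would be lost, `toBigInt(24.5, { safe: false })` converts anyway, and `toBigInt('foo', { safe: false })` still throws because no conversion is possible at all.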
You could alternatively take the perspective that any string can be converted into a number without throwing when However, the conversion from 'string' to every other scalar type does currently take the point of view that there are some strings that represent values of the type and other strings that simply are not representations of that type and throw errors, even when It's unfortunate that the two ways of looking at it (consistency with other types to number, vs. consistency with string to other types) lead to different conclusions. There are a couple other considerations:
-- The main upshot is that whichever of the below alternatives you pick, some cases of some conversions are going to have to change (but I don't think to the point of this PR being a breaking change; I think whichever way you change, all of the behavior changes can be looked at as either extensions or bugfixes to current behavior rather than breaking changes). So in light of all the above, please let me know if you would like me to:
(A) When
(B) Make type conversions always throw on those elements of the source type that "just don't represent" any element of the destination type. So all conversions from string to other scalar types would have some strings which "just don't represent that type" and throw an error regardless of the "safe" setting; when "safe" is true, even if the overall format of the string is OK, if there are any characters (e.g. too many digits) that aren't reflected in the value, also throw an error. If we go this way, I will add a conversion of Complex to number that always throws when the imaginary part is nonzero, and returns the real part when the imaginary part is zero, under the concept that a Complex number with nontrivial imaginary part "just doesn't represent" a (real) number. Moreover, the consistent behavior under this option for converting 24.5 to bigint would be to always throw (regardless of whether
(C) Make type conversions from string always throw on certain strings that "don't represent" the definition type, and all other scalar conversions always return something when
(D) Maybe there's some other option I missed that you would prefer to any of the above?
Oh and please also let me know your thoughts on whether the implicit conversions (generated by typed-function rules) should be changed to call these constructor functions with |
Maybe I should rephrase
Into:
I think converting a string to a number is really different from converting another numeric data type to a number, because a string can contain anything, whereas a numeric type can only contain numeric information and so is already in the same "realm" as a number.
About the ideas (A) to (D): I think it is important to look at it from the user perspective and come up with a pragmatic approach here. Maybe it's easiest to look at the individual cases. What do you think about the following approach?
|
I feel strongly that number(complex()) should be implemented, for two reasons: (a) when the complex number happens to have an imaginary part, then it obviously corresponds to a specific number -- in fact, (b) in our project, we have MaJE formulas that users can freely enter. Right now, there is not a way to check what is the data type that their formula produces (that should come with a mathjs core overhaul, but in the meantime...). But the TypeScript code that receives the value needs an entity of a particular type, say number. So there needs to be a uniform way to coerce the result of the formula into the type the code needs. The fewer exceptions there are to that process, and the fewer opportunities there are for it to throw, the better (cleaner, simpler) it is for our code. (In this vein please see also #3451 which I just discovered.) I also feel that conversions should follow a disciplined principle or philosophy, as they are a notoriously touchy part of any software system that needs to do them, and there needs to be guidance for implementing future ones; I think lack of such principle/guidance led to the concerns this PR is addressing. But which principle? Clearly you're not interested in (A) above, and that's totally fine. You also don't seem to align with (B), because you recommend So whether we could adopt (C) depends on how far you are willing to go with conversions from complex. If it's maintained that all conversions from complex throw, or even just that an unsafe conversion from complex(1,1) to number (or other real type) throws, then (C) is out the window as well. On the other hand, if you were to allow an unsafe conversion from complex(1,1) to any of the other number types to return 1 converted to that type, and only have a safe conversion throw, then it seems like (C) would be a good fit. If (C) is untenable, then it becomes less clear how to proceed with deciding on the behavior of all of the conversions. 
They are idiosyncratic and have bugs now, as some of the examples in this conversation show, and those things are causing problems in our project, which is after all why I opened this PR. Here is another possible, somewhat more complicated, position that could be adopted: (E) All unsafe conversions between types that represent real numbers (in the mathematical sense), including strings that are printed representations of real numbers in any of the most common conventions for writing out such numbers, should succeed as best as possible, and never throw. Safe conversions throw if there is loss of precision, or introduction of illusory precision. Conversions from strings that aren't (entirely) customary representations of numbers, to any number type, throw. Conversions (even unsafe ones) from other number systems besides the real numbers to any type representing real numbers throw when [two sub-alternatives here: (i) the value to be converted does not directly correspond to a real number; or (ii) always]. Conversions from other data types that cannot be characterized as number systems, to any number type, should always throw. (The current situation with Unit in this regard is pretty painful, see #3451, but I wouldn't propose dealing with that concern in this PR, as it would be a breaking change. ) Conversions from number types to other non-numeric data types are up to the design/semantics of that other data type. At least, (E) would pretty much provide a blueprint for how all conversions should behave. And obviously I would argue strongly for sub-option (E.i) rather than (E.ii) as it will definitely make my life easier. But even (E.ii) would be a specification that could be documented and would direct the completion of this PR and future conversions. Looking forward to your thoughts. |
I'm not sure there is a JavaScript philosophy; JavaScript can be a mess ;). Anyway, let's focus on coming up with a clear description of the philosophy for mathjs:
I'm not sure on your preference regarding complex conversions. Do you prefer to let I've added a case |
OK, how about something like this: unsafe conversions throw when there is a complete loss of any information (parts of a string that don't encode a number, the nonzero imaginary part of a complex); safe conversions throw in addition when there is either loss of precision or creation of illusory precision. That seems fairly simple and principled, and seems to align well with both the status quo and your preferences. In this scenario, number(complex) would always throw on nonzero imaginary part, and bignumber(complex, {safe: true}) would in addition throw when there is danger that the real part has roundoff error. |
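The two-tier rule proposed here can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain `{ re, im }` object in place of a real Complex type, the name `complexToBigInt` and the `{ safe }` option are hypothetical, and rounding in the unsafe case is an assumption rather than agreed behavior.

```javascript
// Hypothetical sketch of the two-tier rule, NOT mathjs code.
// `z` is a plain { re, im } object standing in for a complex value.
function complexToBigInt (z, { safe = false } = {}) {
  // Tier 1 (always): a nonzero imaginary part would be discarded entirely,
  // which is a complete loss of information, so throw even when unsafe.
  if (z.im !== 0) {
    throw new RangeError('Cannot convert a complex value with nonzero imaginary part')
  }
  if (!Number.isFinite(z.re)) {
    throw new RangeError('Cannot convert a non-finite value to bigint')
  }
  // Tier 2 (safe only): rounding, or leaving the safe-integer range,
  // would lose precision, so throw only when `safe` is requested.
  if (safe && !Number.isSafeInteger(z.re)) {
    throw new RangeError('Conversion to bigint would lose precision')
  }
  return BigInt(Math.round(z.re))
}
```

Here `complexToBigInt({ re: 3, im: 0 })` succeeds and `complexToBigInt({ re: 3, im: 2 })` throws in both modes, while `complexToBigInt({ re: 2.5, im: 0 })` throws only when `safe: true` is passed.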
That is a clear description, thanks. Would this approach also work for the project that you're currently working on? And just to be sure: would you like to have a complex number throw (that's my preference) or not? |
Yes, this approach will be fine. I think it means that number(3+0i) succeeds and number(3 + 2i) throws, which is also fine for us. I am assuming that is OK with you. I will proceed on this basis as soon as I am able. I think it's higher priority than nanomath, correct, since it's not good to have this PR "hanging out there"? |
Yes indeed, sounds good! I'm glad we have a satisfying way ahead now.
Feel free to pick up either, depending on your mood :). Since this is not a hotfix, I think there is no time pressure there. |
Adds options to the bigint constructor function `math.bigint()` to control whether/how non-integer inputs are rounded or errors thrown, and whether "unsafe" values outside the range where integers can be uniquely represented are converted. Changes `math.numeric(x, 'bigint')` to call the bigint constructor configured to allow unsafe values but throw on non-integers (closest approximation to its previous buggy behavior). Also restores documentation generation for constructor functions, and initiates History sections in function doc pages. Resolves josdejong#3366. Resolves josdejong#3368. Resolves josdejong#3341.
…throws" Plus knock-on changes. This "simple" change unsurprisingly altered what doc tests occurred, leading to shaking out a bug in doc testing, which in turn shook out a bug in logical `and` (!). All should be well now. Also prevent `numeric(blah, 'bigint')` from throwing when it needs to round.
719d3a8 to 2fae716
OK I rebased (except for the logo commit you performed just a short while ago), and updated the top-level datatypes page in the docs, in an effort to reflect everything we've discussed and agreed on. Please could you review the new doc page and let me know if you agree with everything I've written? If so, I will go ahead and complete the PR by making sure that all of the conversion functions conform to the new documentation. Note that conforming effort will include implementing a function |
P.S. On |
I've read up on docs/datatypes/index.md in the PR. I like the diagram, it helps to get an overview! Maybe some day we can replace it with an SVG image. Your description with the sections "implicit" and "explicit" is really clear, thanks! Maybe we can try to rephrase the following sentence:
Into something like:
On a side note: wow, there were quite a few errors in the "Example usage" section, thanks for improving on that 😅.
Adding a function |
Excellent. Will proceed, including the doc wording improvements, as soon as I can. |
Adds options to the bigint constructor function `math.bigint()` to control whether/how non-integer inputs are rounded or errors thrown, and whether "unsafe" values outside the range where integers can be uniquely represented are converted. Changes `math.numeric(x, 'bigint')` to call the bigint constructor configured to round and to allow unsafe values.
Also restores documentation generation for constructor functions and initiates History sections in function doc pages.
Resolves #3366.
Resolves #3368.
Resolves #3341.