Alternative assignment rework #1442

Merged
merged 16 commits into from
Aug 11, 2025
Changes from 2 commits
43 changes: 26 additions & 17 deletions mathics/builtin/assignments/assignment.py
@@ -4,7 +4,7 @@
"""
from typing import Optional

from mathics.core.atoms import String
from mathics.core.atoms import Integer1, String
from mathics.core.attributes import (
A_HOLD_ALL,
A_HOLD_FIRST,
@@ -137,9 +137,9 @@ class Set(InfixOperator):

summary_text = "assign a value"

def eval(self, lhs, rhs, evaluation):
"lhs_ = rhs_"

def eval(self, expr, evaluation):
"Pattern[expr, _ = _]"
Member

See #1440 for how to do this without introducing Pattern[expr, ...

Member

On further inspection, we don't use expr here at all. So the original seems better to me.

Contributor Author

Ok, let's return to the previous pattern.
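
The two docstring styles discussed in this thread differ in what gets bound and handed to the handler. The following is only a toy sketch of that difference, not Mathics3 code: expressions are plain tuples, and `bind_parts`, `bind_whole`, and `dispatch` are hypothetical names standing in for the pattern-matching machinery.

```python
from typing import Callable, Dict, Tuple

# Toy expression: (head, element1, element2, ...). Not a Mathics3 type.
Expr = Tuple[object, ...]


def bind_parts(expr: Expr) -> Dict[str, object]:
    """Bindings in the spirit of "lhs_ = rhs_": one name per element."""
    _head, lhs, rhs = expr
    return {"lhs": lhs, "rhs": rhs}


def bind_whole(expr: Expr) -> Dict[str, object]:
    """Bindings in the spirit of "Pattern[expr, _ = _]": one name for the match."""
    return {"expr": expr}


def eval_with_parts(lhs: object, rhs: object) -> object:
    return rhs


def eval_with_whole(expr: Expr) -> object:
    _lhs, rhs = expr[1:]  # the handler has to take the expression apart itself
    return rhs


def dispatch(expr: Expr, binder: Callable, handler: Callable) -> object:
    # The bound names become keyword arguments of the handler.
    return handler(**binder(expr))


assignment = ("Set", "x", 42)
result_parts = dispatch(assignment, bind_parts, eval_with_parts)
result_whole = dispatch(assignment, bind_whole, eval_with_whole)
print(result_parts, result_whole)
```

Both handlers return the same value; the whole-expression style only pays off when the handler actually needs `expr`, which is the reviewer's point here.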

lhs, rhs = expr.elements
eval_assign(self, lhs, rhs, evaluation)
return rhs

@@ -217,11 +217,9 @@ class SetDelayed(Set):

summary_text = "test a delayed value; used in defining functions"

def eval(
self, lhs: BaseElement, rhs: BaseElement, evaluation: Evaluation
) -> Symbol:
"lhs_ := rhs_"

def eval(self, expr: BaseElement, evaluation: Evaluation) -> Symbol:
"Pattern[expr, _ := _]"
lhs, rhs = expr.elements
if eval_assign(self, lhs, rhs, evaluation):
return SymbolNull

@@ -355,14 +353,21 @@ class UpSet(InfixOperator):
attributes = A_HOLD_FIRST | A_PROTECTED | A_SEQUENCE_HOLD
grouping = "Right"

messages = {
"normal": "Nonatomic expression expected at position `1` in `2`.",
"nosym": "`1` does not contain a symbol to attach a rule to.",
}
summary_text = (
"set value and associate the assignment with symbols that occur at level one"
)

def eval(
self, lhs: BaseElement, rhs: BaseElement, evaluation: Evaluation
) -> Optional[BaseElement]:
"lhs_ ^= rhs_"
def eval(self, expr: BaseElement, evaluation: Evaluation) -> Optional[BaseElement]:
"Pattern[expr, _ ^=_]"
Member

I've been thinking about the use of Pattern[expr] to create a placeholder for expr, as opposed to using something like Expression() or to_expression(), and I don't see using Pattern[] here as an improvement, even in those cases where get_eval_Expression() can't be used. (More on get_eval_Expression() below.)

Previously, I mentioned that the docstring pattern is now a little more complicated and its intent harder to understand: capturing is put first and foremost, while everything else follows on the line.

It was claimed that this was somehow also faster. And here, I am not so sure...

First, upon loading, there is code to read the docstring and process it. This is up-front work, whether or not the builtin is ever used. Second, on each evaluation, there is code to bind this variable. So the question is whether that work is less than the work done by Expression() or to_expression(). If this is done only to provide a more complete error message, then one expects that most of the time, and in those cases where it matters most, lazy expression creation is also a time win.

Of course, for me this concern is secondary to the misdirection of adding Pattern[expr] to a docstring whose job is to describe the pattern that causes the function to be invoked.

Now let me come back to get_eval_Expression() and how it differs from Expression(). In PR #1446, I made a pass over the Mathics3 Python code to convert Expression() and to_expression() into get_eval_Expression(). Here is what I learned from this.

One difference can occur in the form of the expression. get_eval_Expression() shows the expression closer to how it was entered, whereas with Expression() you can tweak it however you want. But typically, the rebuilt expression will reflect the reordering of parameters and canonicalization.

This kind of thing is most apparent in algebraic expressions.

The other "difference" is that Cython can have trouble introspecting the call stack. I guess this has to do with its "optimization". Personally, I think we should stop running Cython over Mathics3 modules, unconditionally. As best as I can tell, it is not giving us any speed advantage nowadays. As for using it as a type checker, I think other type checkers, like Mypy, are much better nowadays.
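
The "lazy expression creation" argument in this comment can be illustrated with a toy count of how often the expression is rebuilt. This is a sketch under stated assumptions, not Mathics3 code: `build_expression`, `up_set_eager`, and `up_set_lazy` are hypothetical names, tuples stand in for expressions, and the check on `lhs` only loosely mirrors the `hasattr(lhs, "elements")` test in the diff.

```python
class BuildCounter:
    """Counts how many toy expressions were constructed."""

    def __init__(self) -> None:
        self.built = 0


def build_expression(counter, head, *elements):
    counter.built += 1
    return (head, *elements)


def up_set_eager(lhs, rhs, messages, counter):
    # The whole expression is available on every call, needed or not.
    expr = build_expression(counter, "UpSet", lhs, rhs)
    if not isinstance(lhs, tuple):  # toy stand-in for "lhs is atomic"
        messages.append(("normal", 1, expr))
        return None
    return rhs


def up_set_lazy(lhs, rhs, messages, counter):
    if not isinstance(lhs, tuple):
        # The expression is rebuilt only on the error path.
        expr = build_expression(counter, "UpSet", lhs, rhs)
        messages.append(("normal", 1, expr))
        return None
    return rhs


calls = [(("f", "a"), 1), (("g", "b"), 2), ("a", 3), (("h", "c"), 4)]
eager_counter, lazy_counter = BuildCounter(), BuildCounter()
eager_messages, lazy_messages = [], []
for lhs, rhs in calls:
    up_set_eager(lhs, rhs, eager_messages, eager_counter)
    up_set_lazy(lhs, rhs, lazy_messages, lazy_counter)

print(eager_counter.built, lazy_counter.built)
```

Both variants report the same message; the lazy one builds the expression once instead of four times. How this compares with the real cost of binding a named pattern in Mathics3 is not something the sketch measures.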

Contributor Author

> I've been thinking about the use of Pattern[expr] to create a placeholder for expr, as opposed to using something like Expression() or to_expression(), and I don't see using Pattern[] here as an improvement, even in those cases where get_eval_Expression() can't be used. (More on get_eval_Expression() below.)

OK. In any case, it was not the main goal of this PR, but I took the chance to check whether this works and whether there was some consensus to change in that direction.

> Previously, I mentioned that the docstring pattern is now a little more complicated and its intent harder to understand: capturing is put first and foremost, while everything else follows on the line.
>
> It was claimed that this was somehow also faster. And here, I am not so sure...
>
> First, upon loading, there is code to read the docstring and process it. This is up-front work, whether or not the builtin is ever used.

OK, but this overhead happens at load time, and can probably be avoided if we compile and store the rules.

> Second, on each evaluation, there is code to bind this variable. So the question is whether that work is less than the work done by Expression() or to_expression(). If this is done only to provide a more complete error message, then one expects that most of the time, and in those cases where it matters most, lazy expression creation is also a time win.

Actually, what would be faster is something like expr:Set[___], just because matching that pattern (when there are no other rules) would be faster than matching Set[x_,y_]. In any case, this would impact performance only if we use several assignment statements in a loop.

> Of course, for me this concern is secondary to the misdirection of adding Pattern[expr] to a docstring whose job is to describe the pattern that causes the function to be invoked.
>
> Now let me come back to get_eval_Expression() and how it differs from Expression(). In PR #1446, I made a pass over the Mathics3 Python code to convert Expression() and to_expression() into get_eval_Expression(). Here is what I learned from this.
>
> One difference can occur in the form of the expression. get_eval_Expression() shows the expression closer to how it was entered, whereas with Expression() you can tweak it however you want. But typically, the rebuilt expression will reflect the reordering of parameters and canonicalization.

An even simpler way to find the expression that is actually evaluated would be to store it in another attribute of the Evaluation object. This does not require any call to introspection functions or loops: just access to a Python variable. In any case, I think the proposal of using get_eval_Expression() is good in connection with the current development of the debugger API.

> This kind of thing is most apparent in algebraic expressions.
>
> The other "difference" is that Cython can have trouble introspecting the call stack. I guess this has to do with its "optimization". Personally, I think we should stop running Cython over Mathics3 modules, unconditionally. As best as I can tell, it is not giving us any speed advantage nowadays. As for using it as a type checker, I think other type checkers, like Mypy, are much better nowadays.

That is a separate problem. For now, since the goal is to improve the mathics.eval.assignment module, I removed the changes in mathics.builtin except for those required to pass the tests.

Contributor Author

> An even simpler way to find the expression that is actually evaluated would be to store it in another attribute of the Evaluation object. This does not require any call to introspection functions or loops: just access to a Python variable. In any case, I think the proposal of using get_eval_Expression() is good in connection with the current development of the debugger API.

See #1450 as an example of this approach
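
The idea of keeping the expression under evaluation in an attribute of the evaluation object might look roughly like the sketch below. It does not reproduce #1450: `ToyEvaluation`, `current_expression`, and `evaluate` are hypothetical names, and expressions are plain tuples.

```python
class ToyEvaluation:
    """Toy stand-in for an evaluation context; not the Mathics3 class."""

    def __init__(self) -> None:
        self.current_expression = None  # hypothetical attribute
        self.messages = []

    def message(self, symbol: str, tag: str) -> None:
        # No stack introspection: the expression is read from an attribute.
        self.messages.append((symbol, tag, self.current_expression))


def up_set_handler(expr, evaluation):
    _head, lhs, rhs = expr
    if not isinstance(lhs, tuple):  # toy check for an atomic lhs
        evaluation.message("UpSet", "normal")
        return None
    return rhs


HANDLERS = {"UpSet": up_set_handler}


def evaluate(expr, evaluation):
    previous = evaluation.current_expression
    evaluation.current_expression = expr
    try:
        handler = HANDLERS.get(expr[0])
        return handler(expr, evaluation) if handler else expr
    finally:
        # Restore the outer expression so nested evaluations do not leak.
        evaluation.current_expression = previous


evaluation = ToyEvaluation()
result = evaluate(("UpSet", "a", 1), evaluation)
print(result, evaluation.messages)
```

Saving and restoring the previous value in `finally` is a design choice of this sketch, made so that nested evaluations see the right expression.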

lhs, rhs = expr.elements
if not hasattr(lhs, "elements"):
# This should be the argument of this method...
evaluation.message(self.get_name(), "normal", Integer1, expr)
return None

eval_assign(self, lhs, rhs, evaluation, upset=True)
return rhs
@@ -395,10 +400,14 @@ class UpSetDelayed(UpSet):
"with symbols that occur at level one"
)

def eval(
self, lhs: BaseElement, rhs: BaseElement, evaluation: Evaluation
) -> Symbol:
"lhs_ ^:= rhs_"
def eval(self, expr: BaseElement, evaluation: Evaluation) -> Symbol:
"Pattern[expr, _ ^:=_]"
lhs, rhs = expr.elements
if not hasattr(lhs, "elements"):
# This should be the argument of this method...
expression = Expression(Symbol(self.get_name()), lhs, rhs)
evaluation.message(self.get_name(), "normal", Integer1, expr)
return None

if eval_assign(self, lhs, rhs, evaluation, upset=True):
return SymbolNull
5 changes: 5 additions & 0 deletions mathics/builtin/patterns/composite.py
@@ -173,6 +173,11 @@ def match(self, expression: Expression, pattern_context: dict):
# yield new_vars_dict, rest
self.pattern.match(expression, pattern_context)

def get_sort_key(self, pattern_sort=True):
if not pattern_sort:
Member

Instead of the parameter pattern_sort, why not define a function get_pattern_sort_key()?

It would be nice to add a docstring that explains how each sort key is used, e.g. get_sort_key is used in ordering expression elements, while get_pattern_sort_key is used to prioritize which rule to select.
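
A minimal sketch of the split suggested here, with one documented method per purpose. `ToyElement` and the key values are made up for illustration; the real key tuples in Mathics3 are more involved.

```python
class ToyElement:
    """Toy object carrying two independent orderings."""

    def __init__(self, name: str, expr_key: tuple, pattern_key: tuple) -> None:
        self.name = name
        self._expr_key = expr_key
        self._pattern_key = pattern_key

    def get_sort_key(self) -> tuple:
        """Key used to order elements inside an expression."""
        return self._expr_key

    def get_pattern_sort_key(self) -> tuple:
        """Key used to decide which rule is tried first (smaller wins)."""
        return self._pattern_key


items = [
    ToyElement("x_", ("x_",), (2,)),          # most general pattern
    ToyElement("a", ("a",), (0,)),            # literal, most specific
    ToyElement("x_Integer", ("x_Integer",), (1,)),
]

by_element_order = [i.name for i in sorted(items, key=ToyElement.get_sort_key)]
by_rule_priority = [i.name for i in sorted(items, key=ToyElement.get_pattern_sort_key)]
print(by_element_order)
print(by_rule_priority)
```

With two methods, a call site states which ordering it wants instead of passing a boolean flag.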

return self.expr.get_sort_key()
return self.pattern.get_sort_key(True)


class Longest(Builtin):
"""
29 changes: 29 additions & 0 deletions mathics/core/assignment.py
@@ -202,6 +202,35 @@ def normalize_lhs(lhs, evaluation):
return lhs, lookup_name


def pop_focus_head(lhs: Expression, focus: BaseElement):
"""
Convert expressions of the form
```
Head1[Head2[...Headn[FocusHead[a,b],p1],p2,...]..]
```
into
```
FocusHead[Head1[Head2[...Headn[a,p1],p2,...]..],b]
```
Used in eval_assign_[n|format|...]
"""
if lhs is focus:
return lhs

lhs_head = lhs.get_head()
if lhs_head is focus:
return lhs

elems = lhs.elements
focus_expr = elems[0]
if focus_expr.get_head() is not focus:
focus_expr = pop_focus_head(focus_expr, focus)

focus_elems = focus_expr.elements
inner = Expression(lhs_head, focus_elems[0], *elems[1:])
return Expression(focus, inner, *focus_elems[1:])
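
To make the transformation concrete, here is a toy re-statement of `pop_focus_head` on plain tuples of the form `(head, element1, ...)`. It is a sketch, not the Mathics3 implementation: it compares heads with `==` instead of `is`, and it raises `ValueError` when the focus head is not found along the first-element chain, a case the function in the diff does not handle explicitly.

```python
def pop_focus_head(lhs, focus):
    """Lift `focus` from the first-element chain of `lhs` to the top level."""
    if not isinstance(lhs, tuple):
        raise ValueError(f"{focus!r} not found along the first-element chain")
    if lhs[0] == focus:
        return lhs

    lhs_head, first, *rest = lhs
    # After this call, focus_expr is headed by `focus`.
    focus_expr = pop_focus_head(first, focus)
    _focus, focus_first, *focus_rest = focus_expr

    # Re-wrap the first element of the focus expression in the outer head,
    # and keep the remaining focus elements (e.g. a condition) outside.
    inner = (lhs_head, focus_first, *rest)
    return (focus, inner, *focus_rest)


# H1[H2[Focus[a, b], p1], p2]
nested = ("H1", ("H2", ("Focus", "a", "b"), "p1"), "p2")
lifted = pop_focus_head(nested, "Focus")
print(lifted)
```

For example, with `Condition` as the focus head, a left-hand side like `f[Condition[x, test], y]` would come out as `Condition[f[x, y], test]` in this toy model.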


def repl_pattern_by_symbol(expr: BaseElement) -> BaseElement:
"""
If `expr` is a named pattern expression `Pattern[symb, pat]`,
7 changes: 6 additions & 1 deletion mathics/core/builtin.py
@@ -462,12 +462,17 @@ def get_functions(self, prefix="eval", is_pymodule=False):
if pattern is None: # Fixes PyPy bug
continue
else:
# TODO: consider to use a more sophisticated
# TODO 1: consider to use a more sophisticated
# regular expression, which handles breaklines
# more properly, that supports format names
# with contexts (context`name) and be less
# fragile against leaving spaces between the
# elements.
#
# TODO 2: allow
# expr: pat
# to allow passing the whole expression instead their elements.
# This requires to change how Format rules are stored...
m = re.match(
r"[(]([\w,]+),[ ]*[)]\:\s*(.*)", pattern.replace("\n", " ")
)
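
The fragility that TODO 1 mentions can be seen by running the regular expression from this hunk on a few sample strings. The regular expression is copied from the diff; `split_format_prefix` and the sample docstrings are made up for illustration.

```python
import re

# Same pattern as in get_functions() above.
FORMAT_PREFIX = re.compile(r"[(]([\w,]+),[ ]*[)]\:\s*(.*)")


def split_format_prefix(docstring: str):
    """Return (format names, remaining pattern); names are empty on no match."""
    m = FORMAT_PREFIX.match(docstring.replace("\n", " "))
    if m is None:
        return [], docstring
    return m.group(1).split(","), m.group(2)


ok = split_format_prefix("(StandardForm,TraditionalForm,): MakeBoxes[x_]")
plain = split_format_prefix("lhs_ = rhs_")
# A space after the comma is enough to make the match fail.
spaced = split_format_prefix("(StandardForm, TraditionalForm,): MakeBoxes[x_]")
# So is a context mark, because a backtick is not matched by \w.
with_context = split_format_prefix("(System`OutputForm,): MakeBoxes[x_]")

print(ok)
print(plain)
print(spaced)
print(with_context)
```

Only the first sample yields format names; the last two fall through as if they had no format prefix at all.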
34 changes: 24 additions & 10 deletions mathics/core/definitions.py
@@ -161,9 +161,6 @@ def __init__(
"Global`",
)
self.inputfile = ""

# These are used by TraceEvaluation to```
# whether what information to show.
self.trace_evaluation = False
self.trace_show_rewrite = False
self.timing_trace_evaluation = False
@@ -872,9 +869,10 @@ def strip_pattern_name_and_condition(pat) -> BaseElement:
# We have to use get_head_name() below because
# pat can either SymbolCondition or <AtomPattern: System`Condition>.
# In the latter case, comparing to SymbolCondition is not sufficient.
if pat.get_head_name() == "System`Condition":
if len(pat.elements) > 1:
return strip_pattern_name_and_condition(pat.elements[0])
if pat.has_form(("System`Condition", "System`PatternTest"), 2):
return strip_pattern_name_and_condition(pat.elements[0])
if pat.has_form("System`HoldPattern", 1):
return strip_pattern_name_and_condition(pat.elements[0])
# The same kind of get_head_name() check is needed here as well and
# is not the same as testing against SymbolPattern.
if pat.get_head_name() == "System`Pattern":
@@ -996,10 +994,26 @@ def insert_rule(values: List[BaseRule], rule: BaseRule) -> None:

"""

for index, existing in enumerate(values):
if existing.pattern.sameQ(rule.pattern):
del values[index]
break
def is_conditional(x):
"""
Check if the replacement rule is a conditional replacement.
FunctionApplyRules are always considered "conditional", while
replacement rules are conditional if the replace attribute is
a conditional expression.
"""
return not hasattr(x, "replace") or x.replace.has_form("System`Condition", 2)

# If the rule is not conditional, and there are
# equivalent rules which are not conditional either,
# remove them.
if not is_conditional(rule):
for index, existing in enumerate(values):
if is_conditional(existing):
continue
if existing.pattern.sameQ(rule.pattern):
del values[index]
break

# use insort_left to guarantee that if equal rules exist, newer rules will
# get higher precedence by being inserted before them. see DownValues[].
bisect.insort_left(values, rule)
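
The new replacement policy can be tried out in isolation with a toy rule type. This is a sketch under stated assumptions: `ToyRule` is hypothetical, patterns are compared as strings instead of with `sameQ`, and the sort keys are invented so that conditional rules sort before unconditional ones.

```python
import bisect
from dataclasses import dataclass, field
from typing import List


@dataclass(order=True)
class ToyRule:
    sort_key: tuple  # the only field used for ordering
    pattern: str = field(compare=False)
    replace: str = field(compare=False)
    conditional: bool = field(compare=False, default=False)


def insert_rule(values: List[ToyRule], rule: ToyRule) -> None:
    # Only an unconditional rule replaces an existing unconditional rule
    # with the same pattern; conditional rules accumulate.
    if not rule.conditional:
        for index, existing in enumerate(values):
            if existing.conditional:
                continue
            if existing.pattern == rule.pattern:
                del values[index]
                break
    # insort_left puts the new rule before existing rules with an equal key.
    bisect.insort_left(values, rule)


rules: List[ToyRule] = []
insert_rule(rules, ToyRule((1,), "f[x_]", "a"))
insert_rule(rules, ToyRule((0,), "f[x_]", "c1", conditional=True))
insert_rule(rules, ToyRule((0,), "f[x_]", "c2", conditional=True))
insert_rule(rules, ToyRule((1,), "f[x_]", "b"))  # replaces "a"

order = [r.replace for r in rules]
print(order)
```

Both conditional rules survive, the newer one first, while the second unconditional definition overwrites the first.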
28 changes: 22 additions & 6 deletions mathics/core/expression.py
@@ -613,19 +613,35 @@ def evaluate(

def evaluate_elements(self, evaluation) -> "Expression":
"""
return a new expression with the same head, and the
evaluable elements evaluated.
return a new expression with the head and the
evaluable elements evaluated, according to the attributes.
"""
head = self._head
if isinstance(head, EvalMixin):
head = head.evaluate(evaluation) or head
attributes = head.get_attributes(evaluation.definitions)
if (A_HOLD_ALL | A_HOLD_ALL_COMPLETE) & attributes:
return Expression(head, *self._elements)
if A_HOLD_REST & attributes:
first, *rest = self._elements
if isinstance(first, EvalMixin):
first = first.evaluate(evaluation) or first
return Expression(head, first, *rest)

elements = []
for element in self._elements:
for pos, element in enumerate(self._elements):
if pos == 0 and (A_HOLD_FIRST & attributes):
elements.append(element)
continue

if isinstance(element, EvalMixin):
result = element.evaluate(evaluation)
if result is not None:
element = result
elements.append(element)
head = self._head
if isinstance(head, Expression):
head = head.evaluate_elements(evaluation)

# if isinstance(head, Expression):
# head = head.evaluate_elements(evaluation)
return Expression(head, *elements)
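
The attribute handling added to `evaluate_elements` can be paraphrased with a toy model. Everything here is an assumption for illustration: `Attr` is a made-up flag enum, "evaluating" an element means looking it up in a dictionary, and head evaluation is left out. The sketch also guards against an empty element list before unpacking, which the hunk as shown does not appear to do; whether that case can arise in Mathics3 is not something this sketch establishes.

```python
from enum import IntFlag


class Attr(IntFlag):
    """Toy attribute flags; the bit values are arbitrary."""

    NONE = 0
    HOLD_FIRST = 1
    HOLD_REST = 2
    HOLD_ALL = 4
    HOLD_ALL_COMPLETE = 8


def evaluate_elements(head, elements, attributes, env):
    def ev(element):
        # Toy evaluation: replace known names by their values.
        return env.get(element, element)

    if attributes & (Attr.HOLD_ALL | Attr.HOLD_ALL_COMPLETE):
        return (head, *elements)
    if attributes & Attr.HOLD_REST and elements:
        first, *rest = elements
        return (head, ev(first), *rest)

    result = []
    for pos, element in enumerate(elements):
        if pos == 0 and attributes & Attr.HOLD_FIRST:
            result.append(element)  # held: left unevaluated
            continue
        result.append(ev(element))
    return (head, *result)


env = {"x": 1, "y": 2}
no_hold = evaluate_elements("f", ["x", "y"], Attr.NONE, env)
hold_first = evaluate_elements("f", ["x", "y"], Attr.HOLD_FIRST, env)
hold_rest = evaluate_elements("f", ["x", "y"], Attr.HOLD_REST, env)
hold_all = evaluate_elements("f", ["x", "y"], Attr.HOLD_ALL, env)
print(no_hold, hold_first, hold_rest, hold_all)
```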

def filter(self, head, cond, evaluation: Evaluation, count: Optional[int] = None):
42 changes: 36 additions & 6 deletions mathics/core/rules.py
@@ -57,7 +57,7 @@ def eval_f(self, x, evaluation) -> Optional[Expression]:
from mathics.core.evaluation import Evaluation
from mathics.core.expression import Expression
from mathics.core.pattern import BasePattern, StopGenerator
from mathics.core.symbols import strip_context
from mathics.core.symbols import SymbolTrue, strip_context


def _python_function_arguments(f):
@@ -76,6 +76,15 @@ class StopGenerator_BaseRule(StopGenerator):
pass


class RuleApplicationFailed(Exception):
"""
Exception raised when a condition fails
in the RHS, indicating that the match have failed.
"""

pass


class BaseRule(KeyComparable, ABC):
"""This is the base class from which the FunctionApplyRule and
Rule classes are derived from.
@@ -134,9 +143,10 @@ def yield_match(vars, rest):
if isinstance(self, FunctionApplyRule)
else self.apply_rule
)
new_expression = apply_fn(expression, vars, options, evaluation)
if new_expression is None:
new_expression = expression
try:
new_expression = apply_fn(expression, vars, options, evaluation)
except RuleApplicationFailed:
return None
if rest[0] or rest[1]:
result = Expression(
expression.get_head(),
@@ -254,6 +264,13 @@ def apply_rule(
new = self.replace.replace_vars(vars)
new.options = options

while new.has_form("System`Condition", 2):
new, cond = new.get_elements()
if isinstance(cond, Expression):
cond = cond.evaluate(evaluation)
if cond is not SymbolTrue:
raise RuleApplicationFailed()

# If options is a non-empty dict, we need to ensure
# reevaluation of the whole expression, since 'new' will
# usually contain one or more matching OptionValue[symbol_]
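
The interplay between the `Condition` loop in `apply_rule` and the `RuleApplicationFailed` handler in `yield_match` can be sketched with toy data. The assumptions are loud ones: replacements are nested tuples `("Condition", body, test)`, tests are Python callables over the bindings, and `try_rules` is a hypothetical stand-in for trying the next rule after a failure.

```python
class RuleApplicationFailed(Exception):
    """Raised when a condition on the right-hand side does not hold."""


def apply_rule(replacement, bindings):
    new = replacement
    # Peel off Condition wrappers from the outside in.
    while isinstance(new, tuple) and len(new) == 3 and new[0] == "Condition":
        _head, new, cond = new
        if callable(cond):
            cond = cond(bindings)  # toy evaluation of the test
        if cond is not True:
            raise RuleApplicationFailed()
    return new


def try_rules(replacements, value):
    for replacement in replacements:
        try:
            return apply_rule(replacement, {"x": value})
        except RuleApplicationFailed:
            continue  # treat the rule as not matching and move on
    return None


replacements = [
    ("Condition", "positive", lambda b: b["x"] > 0),
    (
        "Condition",
        ("Condition", "small negative", lambda b: b["x"] > -10),
        lambda b: b["x"] < 0,
    ),
    "other",
]

results = [try_rules(replacements, v) for v in (5, -3, -50)]
print(results)
```

As in the hunk, only a test that evaluates to exactly true lets the rule apply; anything else is treated as a failed match.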
@@ -284,6 +301,16 @@ def get_replace_value(self) -> BaseElement:
def __repr__(self) -> str:
return "<Rule: %s -> %s>" % (self.pattern, self.replace)

def get_sort_key(self, pattern_sort=True) -> tuple:
# FIXME: check if this makes sense:
if not pattern_sort:
return tuple((self.system, self.pattern.get_sort_key(False)))

sort_key = list(self.pattern.get_sort_key(True))
if self.replace.has_form("System`Condition", 2):
sort_key[-1] = 0
return tuple((self.system, tuple(sort_key)))
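
The effect of zeroing the last component of the pattern key can be seen with plain tuples. The key shapes below are invented; the real pattern sort keys in Mathics3 are longer, and the sketch assumes, as the hunk suggests, that the last component is nonzero for a rule without a condition.

```python
def rule_sort_key(system: bool, pattern_key: tuple, is_conditional: bool) -> tuple:
    """Toy version of Rule.get_sort_key(pattern_sort=True)."""
    key = list(pattern_key)
    if is_conditional:
        key[-1] = 0  # conditional rules sort ahead of unconditional ones
    return (system, tuple(key))


same_pattern = (2, 1)  # hypothetical pattern key, last component nonzero
unconditional = rule_sort_key(False, same_pattern, is_conditional=False)
conditional = rule_sort_key(False, same_pattern, is_conditional=True)
builtin = rule_sort_key(True, same_pattern, is_conditional=True)

ordered = sorted([unconditional, builtin, conditional])
print(ordered)
```

In this sketch the conditional rule is tried before the unconditional one with the same pattern, and rules with `system` set to `False` sort before those with `True` because `False < True` in Python.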


class FunctionApplyRule(BaseRule):
"""
@@ -358,9 +385,12 @@ def apply_function(
# context marks.
vars_noctx = dict(((strip_context(s), vars[s]) for s in vars))
if options:
return self.function(evaluation=evaluation, options=options, **vars_noctx)
return (
self.function(evaluation=evaluation, options=options, **vars_noctx)
or expression
)
else:
return self.function(evaluation=evaluation, **vars_noctx)
return self.function(evaluation=evaluation, **vars_noctx) or expression

def __repr__(self) -> str:
# Cython doesn't allow f-string below and reports: