Skip to content

Mathml tweaks indent#1665

Open
mmatera wants to merge 17 commits intomathml_tweaksfrom
mathml_tweaks_indent
Open

Mathml tweaks indent#1665
mmatera wants to merge 17 commits intomathml_tweaksfrom
mathml_tweaks_indent

Conversation

@mmatera
Copy link
Contributor

@mmatera mmatera commented Jan 30, 2026

This PR implements the indentation in MathMLForm. What is missing here is to adjust the tests.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

@rocky, shall we go with this version of #1661?

@rocky
Copy link
Member

rocky commented Feb 1, 2026

@rocky, shall we go with this version of #1661?

I was just waiting for you to acknowledge that you understand that for properties of specific boxes, especially inherent (immutable) properties like nesting level, it may be better to store those in the box, leaving options for more user customizable properties, like the number of spaces per indent level, or the color or font style indicated across several boxes.

This is kind of a boundary case here, and there is room for interpretation.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

@rocky, shall we go with this version of #1661?

I was just waiting for you to acknowledge that you understand that for properties of specific boxes, especially inherent (immutable) properties like nesting level, it may be better to store those in the box, leaving options for more user customizable properties, like color or font style indicated across several boxes.

This is kind of a boundary case here, and there is room for interpretation.

OK, but nesting level is something that you can evaluate on the fly. And that is how this implementation works. I do not see the need to add that attribute to each specific BoxExpression. Other options like color or fontface are stored in StyleBox options, which propagate to the sub-elements at render time.

Also, because the way BoxExpressions are built, to associate a level level to a BoxExpression is not practical. Notice for example that a String (which is a singleton object, and a BoxElement) can appear at different levels. Consider for example the BoxExpression

RowBox[{"F", "[",RowBox[{"G", "[", "x" ,"]"}] ,"]"}]

In this expression, the string String("[") appears in two different container objects with two different levels.

Consider also this other example:

StyleBox[GridBox[{{"a","b"},{"c","d"}}], FontColor->Red]

It does not make sense to propagate to the Strings "a", "b" , "c", "d" the property FontColor (which has the same problem with the singleton chracracter of String objects) . The property is propagated to inside from the top level, like it happens in html code:

<div style="color:#FF000&">
<table>
<tr><td>a</td><td>b</td></tr>
<tr><td>c</td><td>d</td></tr>
</table>
</div>

On the other hand, if for some render it is needed to specify the color element by element, it can be done on the fly.

@rocky
Copy link
Member

rocky commented Feb 1, 2026

@rocky, shall we go with this version of #1661?

I was just waiting for you to acknowledge that you understand that for properties of specific boxes, especially inherent (immutable) properties like nesting level, it may be better to store those in the box, leaving options for more user customizable properties, like the number of spaces per indent level, or the color or font style indicated across several boxes.

This is kind of a boundary case here, and there is room for interpretation.

In the back of my mind, I have been thinking that the "self" parameter under mathics.format might be better named box. It has a type basebox or some such thing, to make it clearer that there is a box object around.

BTW, that's another thing that distinguishes this from the eval routines. In Boxing routines, the objectness of boxes is used, and constantly varies as the boxing, forming, and rendering occur. In mathics.eval, the objectness of the caller e.g. Plus, Integrate, is not used. There is a global "evaluation" parameter, but that is mainly used in error messages.

@rocky
Copy link
Member

rocky commented Feb 1, 2026

@rocky, shall we go with this version of #1661?

I was just waiting for you to acknowledge that you understand that for properties of specific boxes, especially inherent (immutable) properties like nesting level, it may be better to store those in the box, leaving options for more user customizable properties, like color or font style indicated across several boxes.
This is kind of a boundary case here, and there is room for interpretation.

OK, but nesting level is something that you can evaluate on the fly.

There is still a misunderstanding, then. Box attributes are indeed supposed to be computed and altered in boxing routines. That is normal.

Also, because the way BoxExpressions are built, associating a level to a BoxExpression is not practical.

If you want, maybe I should implement what I suggest, and it will be easier to understand. Or I'll understand why this is not possible. (But if that's the case, we are in a sad position, because I strongly suspect this is the way it should be done, and if done that way, it will reduce the hacky code that is prone to get written when we don't follow conventional patterns.)

Notice for example that a String (which is a singleton object, and a BoxElement) can appear at different levels.

Strings and other atoms don't need nesting levels, they are always one more than the level of their parent. Nesting levels, box dimensions and cumulative properties are only needed in the boxes themselves.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

@rocky, shall we go with this version of #1661?

I was just waiting for you to acknowledge that you understand that for properties of specific boxes, especially inherent (immutable) properties like nesting level, it may be better to store those in the box, leaving options for more user customizable properties, like the number of spaces per indent level, or the color or font style indicated across several boxes.
This is kind of a boundary case here, and there is room for interpretation.

In the back of my mind, I have been thinking that the "self" parameter under mathics.format might be better named box. It has a type basebox or some such thing, to make it clearer that there is a box object around.

This is a different thing: the name self is there just because originally it was a method boxes_to_something of these classes. To reduce the number of changes in the diffs, I keep the name of the parameters until we agree in all these changes. We can rename them in a subsequent step.

BTW, that's another thing that distinguishes this from the eval routines. In Boxing routines, the objectness of boxes is used, and constantly varies as the boxing, forming, and rendering occur. In mathics.eval, the objectness of the caller e.g. Plus, Integrate, is not used. There is a global "evaluation" parameter, but that is mainly used in error messages.

Indeed, these classes works in a more similar way in which Sympy object does. This is because we do not expect complex custom replacement rules. And when we want to use them into an evaluation, we need to convert them back to (evaluable) Expression objects.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

There is still a misunderstanding, then. Box attributes are indeed supposed to be computed and altered in boxing routines. That is normal.

Yes, so?

Also, because the way BoxExpressions are built, associating a level to a BoxExpression is not practical.

If you want, maybe I should implement what I suggest, and it will be easier to understand. Or I'll understand why this is not possible. (But if that's the case, we are in a sad position, because I strongly suspect this is the way it should be done, and if done that way, it will reduce the hacky code that is prone to get written when we don't follow conventional patterns.)

For me, to see your implementation would be useful to understand your point. On the other hand, the pattern I followed here is not so unusual: HTML and the way in which graphics in WL are built follow that pattern.

Notice for example that a String (which is a singleton object, and a BoxElement) can appear at different levels.

Strings and other atoms don't need nesting levels, they are always one more than the level of their parent. Nesting levels, box dimensions and cumulative properties are only needed in the boxes themselves.
OK, but by induction, it happens for all the box classes, which is what I am exploiting here.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

@rocky, maybe I should reformulate the question: the mathml code generated by this patch is good enough to use it in Mathics3? if it is, we can merge this and have the tests inside. Then we can improve the implementation of how this mathml code is generated, contrasting it against the existing tests.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

@rocky, notice that the most laborious thing I had to do in this PR was to update the tests in the YAML file: YAML does not preserve the indentation of the text, unless I put the text as a literal inside quotes. Also, I didn´t find a way to modify parts of a YAML file without spoiling the format, so changes must be done by hand. Maybe you have a better idea on how to store these tests in more readable or automatizable way.

@rocky
Copy link
Member

rocky commented Feb 1, 2026

@rocky, maybe I should reformulate the question: the mathml code generated by this patch is good enough to use it in Mathics3? if it is, we can merge this and have the tests inside. Then we can improve the implementation of how this mathml code is generated, contrasting it against the existing tests.

I'd rather not go down this route, but instead let's see if there is a better foundation to build on.

I'll try to come up with some similar but slightly different code.

@rocky
Copy link
Member

rocky commented Feb 1, 2026

There is still a misunderstanding, then. Box attributes are indeed supposed to be computed and altered in boxing routines. That is normal.

Yes, so?

So the phrase "But nesting level is something that you can evaluate on the fly" is irrelevant. And there is a misunderstanding.

@mmatera
Copy link
Contributor Author

mmatera commented Feb 1, 2026

There is still a misunderstanding, then. Box attributes are indeed supposed to be computed and altered in boxing routines. That is normal.

Yes, so?

So the phrase "But nesting level is something that you can evaluate on the fly" is irrelevant. And there is a misunderstanding.

BoxExpressions are literal objects, so once you create them, they should not be modified, right? Also, the same box expression could be used as parts of other BoxExpressions, so the level is not one of their attributes. The level is something that you compute in the render step, not in the construction of the box expressions. In any case, I think it worth to see your alternative implementation.

@rocky
Copy link
Member

rocky commented Feb 3, 2026

For the most part, the merge conflicts are resolved by changing "self" to "box".

@mmatera mmatera force-pushed the mathml_tweaks_indent branch from d2383b9 to 5bc5469 Compare February 5, 2026 12:10
@mmatera mmatera force-pushed the mathml_tweaks_indent branch from 9ed6424 to b7a4c79 Compare February 6, 2026 21:07
Copy link
Member

@rocky rocky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - note one small change in grammar requested.

Also note in mathml_tweaks, a hardcoded Unicode string was introduced.

Other than this, looks good.


indent_level = options.get("_indent_level", 0)
indent_spaces = " " * indent_level
options["_indent_level"] = indent_level + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is okay since this is an indent level rather than a nesting level.

I fear, though, there may still be a fundamental misunderstanding (or stubbornness?) that will lead to mistakes down the line.

A nesting level is a property of how the expression was written; there is no dispute or variation on that.

An indent level one might imagine being changed according to a user's whim or as an option on MathMLForm.

Irrefutable properties, especially boxing properties, I think would be better put inside the box, and sometimes should be for things that vary between boxes, like nesting and indentation, which is specific to the specific box.

The box property is self here rather than options. A subtle thing to keep in mind when working with options is that in Python, unless the option is passed as a copy, the value seen in the parent can change. But if multiple copies of options are used, we unnecessarily bloat the space and memory management, especially when most of the copies contain essentially the same information.

Acknowledge that you understand this and have considered this, and then we can consider the matter closed here.

# InputForm produce outputs of the form
# InterpretationBox[Style[_String, ...], origin_InputForm, opts___]
assert isinstance(box, StyleBox), f"boxes={box} is not a StyleBox."
assert isinstance(box, StyleBox), f"box={box} are not a StyleBox"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert isinstance(box, StyleBox), f"box={box} are not a StyleBox"
assert isinstance(box, StyleBox), f"box={box} is not a StyleBox"

There is only one single StyleBox. Right?

@@ -44,9 +44,15 @@ def eval_mathmlform(expr: BaseElement, evaluation: Evaluation) -> BoxElementMixi
# #convert_box(boxes)
query = evaluation.parse("Settings`$UseSansSerif")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticing now that it was a mistake to add $UseSansSerif.

WMA uses CurrentValue[$DefaultFrontEnd, DefaultFontProperties].

I guess for now, we have to live with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants