
MathML tweaks#1661

Closed
mmatera wants to merge 31 commits into master from mathml_tweaks

Conversation

Contributor

@mmatera mmatera commented Jan 29, 2026

This PR makes some tweaks to how MathML output is rendered.

  • Add line breaks, following WMA usage.
  • Handle InterpretationBox more carefully so that it works with InputForm and OutputForm.
  • Remove the OutputForm[s_String] format rule.
  • Update format-tests-WMA.yaml with more tests compared against WS.
  • Update format-tests.yaml accordingly.

"\u2146",
"\u301a", # [[
"\u301b", # ]]
"\u00d7", # \[Times]
Member

Why can't this be picked up from character tables?

Contributor Author

Good question. I have to check whether we are already doing that.

"\u2062",
"\u222b",
"\u2146",
"\u301a", # [[
Member

Recently, RightDoubleBracket and LeftDoubleBracket were tagged as operators. I suspect that these can now be picked up from the character tables.
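
For illustration only, here is a minimal sketch of deriving the operator set from a table instead of hard-coding it. `CHARACTER_TABLE`, its `is_operator` field, and `operator_characters` are hypothetical names. The actual Mathics-Scanner tables may be organized differently.

```python
# Hypothetical stand-in for a character table. The real tables live in
# Mathics-Scanner, and their layout is not shown in this thread.
CHARACTER_TABLE = {
    "LeftDoubleBracket": {"unicode": "\u301a", "is_operator": True},
    "RightDoubleBracket": {"unicode": "\u301b", "is_operator": True},
    "Times": {"unicode": "\u00d7", "is_operator": True},
    "Alpha": {"unicode": "\u03b1", "is_operator": False},
}


def operator_characters(table):
    """Collect the Unicode character of every entry tagged as an operator."""
    return {
        entry["unicode"]
        for entry in table.values()
        if entry.get("is_operator")
    }


# The set that the hard-coded list in the diff would otherwise spell out.
OPERATOR_CHARS = operator_characters(CHARACTER_TABLE)
```

With something like this, tagging a new character as an operator in the table would be enough for the renderer to see it, with no second list to keep in sync.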

_options.update(options)
options = _options
return "<mfrac>%s %s</mfrac>" % (
return "<mfrac>\n%s\n %s\n</mfrac>" % (
Member

Is there a typographic difference here, or is this simply to make human comprehension easier when looking at the text?

Contributor Author

It is just for human readability, and it helps when comparing with WMA output, which splits the output into lines. Also, WMA uses indentation for the different levels, which we do not support right now. For example:

In[1]:= a^b/c//MathMLForm                                                       

Out[1]//MathMLForm= <math>
                     <mfrac>
                      <msup>
                       <mi>a</mi>
                       <mi>b</mi>
                      </msup>
                      <mi>c</mi>
                     </mfrac>
                    </math>

while in Mathics3

In[1]:= a^b/c//MathMLForm
Out[1]//MathMLForm= <mfrac>
                    <msup>
                    <mi>a</mi> 
                    <mi>b</mi>
                    </msup>
                     <mi>c</mi>
                    </mfrac>

Contributor Author

Having the output split in lines, we can add the indentation later, or remove it from the WMA output for doing the comparison.
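
A minimal sketch of the second option, stripping the indentation before comparing. `normalize_mathml` is a hypothetical helper, not part of Mathics3. It assumes the only differences to ignore are leading and trailing whitespace, blank lines, and an outer `<math>` wrapper.

```python
def normalize_mathml(text):
    """Drop indentation, blank lines, and an outer <math> wrapper."""
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    if lines and lines[0] == "<math>" and lines[-1] == "</math>":
        lines = lines[1:-1]
    return lines


# The WMA sample from this thread, indented one space per level.
wma_output = """<math>
 <mfrac>
  <msup>
   <mi>a</mi>
   <mi>b</mi>
  </msup>
  <mi>c</mi>
 </mfrac>
</math>"""

# The Mathics3 sample from this thread, with line breaks but no
# consistent indentation.
mathics_output = """<mfrac>
<msup>
<mi>a</mi>
<mi>b</mi>
</msup>
 <mi>c</mi>
</mfrac>"""

same = normalize_mathml(wma_output) == normalize_mathml(mathics_output)
```

Comparing line lists rather than raw strings keeps test failures readable, since a mismatch points at a single element.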

Member

OK, got it. I was not opposed to it; I just wondered about the intention. (BTW, adding a comment to this effect will keep others from wondering about this.)

Contributor Author

The question is where to put the comment. Maybe at the head of the file?

Member

I did realize that indentation is something that can be done on the fly. The question is whether we want to do it. I had my doubts about adding the line breaks because, unlike the WMA graphical interface, we use the MathML form to render output. So adding complexity and length to the output makes it harder for the browser to show the output.

On the other hand, adding the line breaks makes the comparison between Mathics3 and WMA results easier. So I proposed an intermediate format.

It's okay to discuss approaches before coding and PRs.

Contributor Author

OK, so, thoughts?

Member

BoxExpression should have an integer "nesting_level" attribute, and that value times the string value of an indent is what prefaces the tags after a "\n".

INDENT_SPACES = "  "  # or "\t" or whatever
(f"\n{self.nesting_level * INDENT_SPACES}").join(...)

Contributor Author

OK, but the first thing is that we want to indent the MathML code. One way to achieve that is to pass the render functions an argument that tracks the indentation level of the context. If we agree that it is a good idea, I can implement it.
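
A rough sketch of the argument-passing idea, assuming nothing about the real render functions. `render_tag`, `leaf`, and `node` are hypothetical helpers, and the indentation level is threaded through as a plain argument.

```python
# One space per level, matching the WMA sample quoted earlier in this thread.
INDENT_SPACES = " "


def render_tag(tag, children, indent_level=0):
    """Render <tag> around its children, one element per line."""
    pad = INDENT_SPACES * indent_level
    # Each child is a callable that renders itself at the level it is given.
    body = "\n".join(child(indent_level + 1) for child in children)
    return f"{pad}<{tag}>\n{body}\n{pad}</{tag}>"


def leaf(tag, text):
    """Return a renderer for a leaf element such as <mi>a</mi>."""
    return lambda indent_level: f"{INDENT_SPACES * indent_level}<{tag}>{text}</{tag}>"


def node(tag, *children):
    """Return a renderer for an element that contains other elements."""
    return lambda indent_level: render_tag(tag, children, indent_level)


# a^b/c, as in the MathMLForm example from this thread.
expr = node("mfrac", node("msup", leaf("mi", "a"), leaf("mi", "b")), leaf("mi", "c"))
mathml = expr(0)
```

The upside of this shape is that no state lives on the boxes themselves. The cost is that every render function has to accept and forward the extra argument.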

Member

@rocky rocky Jan 29, 2026

If it is set in a BoxExpression object, MathML rendering can pick up this attribute using its self parameter.
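
A small sketch of the attribute-based variant. `Box` is a hypothetical stand-in, not the real BoxExpression, and `set_nesting` is an assumed helper that records the depth on each box before rendering.

```python
INDENT_SPACES = "  "  # or "\t" or whatever


class Box:
    """Hypothetical stand-in for BoxExpression; not the Mathics3 class."""

    def __init__(self, tag, *children, text=""):
        self.tag = tag
        self.children = children
        self.text = text
        self.nesting_level = 0

    def set_nesting(self, level=0):
        """Record the depth on this box and, recursively, on its children."""
        self.nesting_level = level
        for child in self.children:
            child.set_nesting(level + 1)

    def to_mathml(self):
        """Render using only self.nesting_level, with no extra argument."""
        pad = self.nesting_level * INDENT_SPACES
        if not self.children:
            return f"{pad}<{self.tag}>{self.text}</{self.tag}>"
        inner = "\n".join(child.to_mathml() for child in self.children)
        return f"{pad}<{self.tag}>\n{inner}\n{pad}</{self.tag}>"


root = Box("msup", Box("mi", text="a"), Box("mi", text="b"))
root.set_nesting()
mathml = root.to_mathml()
```

Here the render signature stays unchanged, at the price of a pass over the tree (or bookkeeping at construction time) to keep `nesting_level` correct.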

for line in text.split("\n"):
outtext += render("<mtext>%s</mtext>", line)
return outtext
return "".join(
Member

This is where getting the proper indent level would be done.

This was referenced Feb 1, 2026
rocky and others added 6 commits February 3, 2026 09:54
Also: remove "_" from non-private classes. Go over render docstrings.
Initialize:
  * BoxExpression.boxes
  * GraphicsBox.boxwidth
  * GraphicsBox.boxheight
  * GraphicsBox.boxes

Make sure to convert sympy.Float to Python float
mmatera and others added 3 commits February 7, 2026 10:07
Co-authored-by: R. Bernstein <rocky@users.noreply.github.com>
Co-authored-by: R. Bernstein <rocky@users.noreply.github.com>
Member

rocky commented Feb 7, 2026

@mmatera What the test expects is ⅆ (U2146), but the named character is encoded as 𝑑 (U0001D451).

Notes in the YAML say that the latter is what we want.

So do we adjust the test expectation or change to use 2146, which I think we decided looks worse?
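
To make the two candidates concrete, here is a quick check with Python's standard `unicodedata` module. The dictionary labels are only for illustration.

```python
import unicodedata

# The two characters under discussion.
CANDIDATES = {
    "test expectation": "\u2146",
    "named character DifferentialD": "\U0001D451",
}

# Official Unicode names and code points for each candidate.
names = {label: unicodedata.name(char) for label, char in CANDIDATES.items()}
code_points = {label: f"U+{ord(char):04X}" for label, char in CANDIDATES.items()}

for label in CANDIDATES:
    print(f"{label}: {code_points[label]} {names[label]}")
```

The first is a double-struck letter and the second a mathematical italic letter, so the two are visually distinct glyphs rather than two encodings of the same one.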

This code does not follow the comment about how this turns into an invisible space. When this was \u2146, that might have been true, but for DifferentialD we use U0001D451.

Contributor Author

mmatera commented Feb 7, 2026

Superseded by #1665

@mmatera mmatera closed this Feb 7, 2026
@mmatera mmatera deleted the mathml_tweaks branch February 7, 2026 19:09
Contributor Author

mmatera commented Feb 7, 2026

> @mmatera What the test expects is ⅆ (U2146),

I would keep this one (ⅆ). Let's adjust it in another round.

> but the named character is encoded as 𝑑 (U0001D451).
>
> Notes in the YAML say that the latter is what we want.
>
> So do we adjust the test expectation or change to use 2146, which I think we decided looks worse?

Member

rocky commented Feb 7, 2026

> @mmatera What the test expects is ⅆ (U2146),
>
> I would keep this one (ⅆ). Let's adjust it in another round.

We don't have a named character for this. How does this get into the output then?

Contributor Author

mmatera commented Feb 7, 2026

> @mmatera What the test expects is ⅆ (U2146),
>
> I would keep this one (ⅆ). Let's adjust it in another round.
>
> We don't have a named character for this. How does this get into the output then?

Maybe '\[DifferentialD]' should be associated with 'ⅆ' in Mathics-Scanner.

Member

rocky commented Feb 7, 2026

> @mmatera What the test expects is ⅆ (U2146),
>
> I would keep this one (ⅆ). Let's adjust it in another round.
>
> We don't have a named character for this. How does this get into the output then?

As I wrote before:

> Notes in the YAML say that the latter is what we want.

Contributor Author

mmatera commented Feb 7, 2026

In that case, OK, let's update the renders to use the U0001D451 character.

Member

rocky commented Feb 8, 2026

> In that case, OK, let's update the renders to use the U0001D451 character.

That's already done automatically. That's why we got the test failure.

2 participants