Skip to content

Comments

sgf: fix issues with pandanet files#148

Merged
ale64bit merged 3 commits intoale64bit:mainfrom
benjaminpjones:lower-props
Dec 13, 2025
Merged

sgf: fix issues with pandanet files#148
ale64bit merged 3 commits intoale64bit:mainfrom
benjaminpjones:lower-props

Conversation

@benjaminpjones
Copy link
Contributor

@benjaminpjones benjaminpjones commented Nov 18, 2025

From discussion on #112

  • CoPyright instead of CP
    • Fix: allow lowercase in props, but remove them from the output
    • I used Sabaki (MIT License) tests and source as inspiration as it seems to do a good job handling this.
  • Whitespace after the initial (;
    • Fix: allow it. Generally, whitespace between tokens shouldn't affect an SGF
  • OS[] field after the game tree
    • Fix: allow arbitrary characters after the game tree, and ignore them

@ale64bit
Copy link
Owner

I don't think the OS[] thing is non-standard. It's just a (admittedly, useless) property within the node and it's grammar conforming. Current parser seems to handle it just fine.

In code review, it was pointed out that this was already working as expected.
Confirmed by updating the test.
@benjaminpjones
Copy link
Contributor Author

benjaminpjones commented Nov 19, 2025

Good catch! I think it threw me off because it's not one of the fields described at https://www.red-bean.com/sgf/. But, to your point, it conforms to the grammar, and the gameTree parser was already working. I reverted the changes to gameTree and updated the test to remove the junk, but still test for the trailing OS[] field.


Parser<String> propIdent() => (uppercase() & uppercase().star()).flatten();
Parser<String> propIdent() => (letter() & letter().star()).flatten().map((s) {
// Extract only uppercase letters (e.g., "CoPyright" -> "CP", "GM" -> "GM")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the rationale for this? 🤔 should we not just extract the whole property identifier? (i.e. CoPyright) wdyt?

Copy link
Contributor Author

@benjaminpjones benjaminpjones Dec 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can choose what we do here, but rationale is threefold:

  • lowercase is technically illegal
  • CP is an official documented field for SGF
  • this is what Sabaki does

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically illegal

That ship has sailed, hasn't it? 😆
But OK, I think that's fair enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha, right 😆 I guess what I'm trying to say is that we can kind of do whatever we want because Panda is wrong, and CP seems like the more useful field to extract (if at some point headers are shown in some user-friendly way)

@ale64bit ale64bit merged commit 84fb9cb into ale64bit:main Dec 13, 2025
1 check passed
@benjaminpjones benjaminpjones deleted the lower-props branch December 13, 2025 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants