Return lineno:column alongside error messages #18

Open
oisanjaya wants to merge 3 commits into tjol:main from oisanjaya:error_with_lineno
@oisanjaya oisanjaya commented Feb 2, 2026

This PR adds line number and column information to error messages.

In this PR I declared two static int variables in parser.c, current_line_number and current_column_number, both initialized to 1.

On every iteration of the kdl_parser_next_event() loop, current_column_number is incremented.

Every time the parser finds a KDL_TOKEN_NEWLINE, current_line_number is incremented and current_column_number is reset to 1.

Finally, inside _set_parse_error(), I allocate a new string and format it with snprintf, prepending current_line_number and current_column_number to the final error message.
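
For illustration, the formatting step could look roughly like the sketch below. This is not the code from the PR: format_error_with_location is a hypothetical helper, and the "line:column: message" layout is an assumption. It uses the two-pass snprintf idiom, where a first call with a NULL buffer only measures the required length.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the PR: returns a newly allocated
 * "line:column: message" string, or NULL on failure. The caller frees it. */
static char *format_error_with_location(int line, int column, const char *msg)
{
    /* First pass: with a NULL buffer and size 0, snprintf only reports
     * how many characters the formatted string needs. */
    int needed = snprintf(NULL, 0, "%d:%d: %s", line, column, msg);
    if (needed < 0) return NULL;

    char *out = malloc((size_t)needed + 1); /* +1 for the terminating NUL */
    if (out == NULL) return NULL;

    /* Second pass: write the message into the exactly sized buffer. */
    snprintf(out, (size_t)needed + 1, "%d:%d: %s", line, column, msg);
    return out;
}
```

Measuring first avoids guessing a buffer size, at the cost of formatting the message twice.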

@oisanjaya oisanjaya marked this pull request as ready for review February 2, 2026 12:24
src/parser.c Outdated
_fallthrough_;
case KDL_TOKEN_NEWLINE:
case KDL_TOKEN_SEMICOLON:
current_line_number++;
Owner

I'm guessing this will give incorrect line numbers after any semicolon

Owner

tjol commented Feb 2, 2026

Hi! Thanks for giving this a go!

Without having tested it, I would assume that this sometimes gives incorrect line numbers (particularly when there are semicolons involved) and usually gives incorrect column numbers (since the code appears to assume that 1 token = 1 character, which is normally not true)

I think that the tokenizer would have to keep track of (and return) the line and column for each token for the numbers to be correct.
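
A rough sketch of that idea, for illustration only. Every name here (pos_token, pos_tokenizer, next_token) is hypothetical, the token rules are drastically simplified compared to real KDL, and columns count bytes rather than Unicode characters. The point is only that the position is recorded at the first character of each token, that tokens can span several characters, and that a semicolon does not start a new line.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types, not the real tokenizer structs. */
typedef struct {
    const char *start; /* first character of the token */
    size_t len;        /* token length in bytes */
    int line;          /* 1-based line of the first character */
    int col;           /* 1-based column (in bytes) of the first character */
} pos_token;

typedef struct {
    const char *cur; /* next unread character */
    int line;
    int col;
} pos_tokenizer;

static void pos_tokenizer_init(pos_tokenizer *t, const char *input)
{
    t->cur = input;
    t->line = 1;
    t->col = 1;
}

/* Consume one character and update the position.
 * Only '\n' starts a new line; ';' counts as an ordinary column. */
static void advance(pos_tokenizer *t)
{
    if (*t->cur == '\n') {
        t->line++;
        t->col = 1;
    } else {
        t->col++;
    }
    t->cur++;
}

static bool is_blank(char c) { return c == ' ' || c == '\t'; }
static bool is_separator(char c) { return c == '\n' || c == ';'; }

/* Fetch the next token; returns false at end of input. */
static bool next_token(pos_tokenizer *t, pos_token *out)
{
    while (is_blank(*t->cur)) advance(t);
    if (*t->cur == '\0') return false;

    /* Record the position before consuming anything. */
    out->start = t->cur;
    out->line = t->line;
    out->col = t->col;

    if (is_separator(*t->cur)) {
        advance(t); /* newline and semicolon are one-character tokens */
    } else {
        while (*t->cur != '\0' && !is_blank(*t->cur) && !is_separator(*t->cur))
            advance(t);
    }
    out->len = (size_t)(t->cur - out->start);
    return true;
}
```

With the input "node 1; other\nlast", the token "other" is reported at line 1, column 9, and "last" at line 2, column 1.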

@oisanjaya
Author

OK, I'll admit my last PR was a half-assed effort. Here's my updated PR.

current_column_number and current_line_number are initialized to 1 in kdl_create_string_tokenizer() and kdl_create_stream_tokenizer().

current_column_number is incremented on every _tok_get_char() call.

current_column_number is decremented again in these cases:

  • In _remove_inital_bom()
  • In the kdl_pop_token() loop, when:
    • A whitespace run ends (overshoot, step back one character)
    • The current character will be consumed by _pop_string(), i.e. it is ", r, or #
    • _pop_string() returns KDL_TOKENIZER_OK (overshoot, step back one character)
    • The current character will be consumed by _pop_word()
  • In _pop_word(), when end_of_word is reached (overshoot, step back one character)
  • Inside _pop_string():
    • After determining the string type (raw/hash/regular)
    • After counting hashes
    • After counting quotes
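
To illustrate the overshoot cases in the list above, here is a small sketch of one possible variant. It is not what the PR does: instead of decrementing the column by hand at each call site, it saves the whole position before every read and restores it when the character turns out not to belong to the token. All names (cursor, cursor_mark, scan_word) are hypothetical, and columns count bytes.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical cursor over a NUL-terminated input string. */
typedef struct {
    const char *input;
    size_t offset; /* index of the next unread byte */
    int line;      /* 1-based */
    int col;       /* 1-based, counted in bytes */
} cursor;

/* Snapshot of a position, taken before a read that may overshoot. */
typedef struct {
    size_t offset;
    int line;
    int col;
} cursor_mark;

static cursor_mark cursor_save(const cursor *c)
{
    cursor_mark m = { c->offset, c->line, c->col };
    return m;
}

static void cursor_restore(cursor *c, cursor_mark m)
{
    c->offset = m.offset;
    c->line = m.line;
    c->col = m.col;
}

/* Read one byte and update line/column; returns -1 at end of input. */
static int cursor_get(cursor *c)
{
    unsigned char ch = (unsigned char)c->input[c->offset];
    if (ch == '\0') return -1;
    c->offset++;
    if (ch == '\n') {
        c->line++;
        c->col = 1;
    } else {
        c->col++;
    }
    return ch;
}

/* Consume a run of alphanumeric bytes and return its length. The byte that
 * ends the run is read and then "un-read" by restoring the saved position. */
static size_t scan_word(cursor *c)
{
    size_t start = c->offset;
    for (;;) {
        cursor_mark before = cursor_save(c);
        int ch = cursor_get(c);
        if (ch < 0) break; /* end of input: nothing to undo */
        if (!isalnum(ch)) {
            cursor_restore(c, before); /* overshoot: step back */
            break;
        }
    }
    return c->offset - start;
}
```

Restoring a saved position also covers the case where the overshot character is a newline, which a plain column decrement would get wrong.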

current_column_number is reset to 1 and current_line_number is incremented whenever a newline is found:

  • Inside the kdl_pop_token() loop
  • Inside _pop_string()

The kdl_token struct now includes line_no and col_no to store the token's location.

_set_parse_error() in parser.c now takes the current token as an argument, so that the error message can include the location of the problematic token.
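
A minimal sketch of that shape, under stated assumptions: sketch_token and sketch_parser are hypothetical stand-ins rather than the real kdl_token and parser structs, only the line_no and col_no field names come from the description above, and the fixed-size buffer is an illustrative choice, not necessarily how the PR stores the message.

```c
#include <stdio.h>

/* Hypothetical stand-in for a token that carries its own location. */
typedef struct {
    int line_no; /* 1-based line where the token starts */
    int col_no;  /* 1-based column where the token starts */
} sketch_token;

/* Hypothetical stand-in for the parser's error state. */
typedef struct {
    char error[128]; /* fixed-size buffer: longer messages are truncated */
    int has_error;
} sketch_parser;

/* Build the error message from the offending token's location instead of
 * from counters kept inside the parser. */
static void sketch_set_parse_error(sketch_parser *p, const sketch_token *tok,
                                   const char *msg)
{
    snprintf(p->error, sizeof p->error, "%d:%d: %s",
             tok->line_no, tok->col_no, msg);
    p->has_error = 1;
}
```

Passing the token means the reported location is wherever that token started, no matter how far the tokenizer has read ahead since.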

Reverted from the previous version:

  • Removed current_column_number and current_line_number from parser.c, as they are now handled in tokenizer.c

I feel like I'm still missing something, so feel free to point it out.
