Question regarding mark_utf8 #35

@MartinKies

In write.output.solution (create_ps.r) you have

out.txt = mark_utf8(out.txt)

I am unsure about its purpose. This line sometimes causes errors when it runs before my function "fix.parser.inconsistencies", because its result is incompatible with the stringr package, e.g. with stringr::str_length().

Commenting out the line fixes the error, and the resulting solution looks fine to me (in particular regarding umlauts). Perhaps the following code makes my point clearer:

> fix.parser.inconsistencies("Test ü")
[1] "Test ü"
> mark_utf8("Test ü")
[1] "Test \xfc"
> str_length("Test ü")
[1] 6
> str_length("Test \xfc")
[1] 6
> str_length(mark_utf8("Test ü"))
[1] NA
Warning message:
In stri_length(string) :
  invalid UTF-8 byte sequence detected; try calling stri_enc_toutf8()
> fix.parser.inconsistencies(mark_utf8("Test ü"))
[1] "Test �"

I am a bit wary of whether commenting out the line is the way to go, because I do not fully understand its purpose. Maybe I found an error in mark_utf8 itself, as str_length("Test \xfc") does work?
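
A possible explanation, as a minimal base R sketch: it assumes (not verified against the package source) that mark_utf8 only relabels the string as UTF-8 without converting its bytes, and that "Test ü" is stored in Latin-1 in my session. The variable names x, marked and converted are made up for illustration. Under those assumptions, the lone byte 0xFC is declared to be UTF-8 although it is not a valid UTF-8 sequence, which would match the stri_length warning above. Converting instead of relabelling gives a valid string.

```r
# Build the bytes of "Test ü" in Latin-1 explicitly, so the example
# does not depend on the locale of the R session.
x <- rawToChar(c(charToRaw("Test "), as.raw(0xfc)))
Encoding(x) <- "latin1"

# Assumed behaviour of mark_utf8: only the encoding label changes,
# the underlying bytes stay the same.
marked <- x
Encoding(marked) <- "UTF-8"
validUTF8(marked)     # FALSE: 0xFC on its own is not valid UTF-8

# Converting re-encodes the bytes (0xFC becomes 0xC3 0xBC).
converted <- enc2utf8(x)
validUTF8(converted)  # TRUE
nchar(converted)      # 6
```

If this is indeed what happens, then replacing the relabelling with a real conversion (e.g. enc2utf8, or stri_enc_toutf8 as the warning suggests) might be safer than commenting out the line, but I have not tested that in create_ps.r.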
