Question regarding mark_utf8

In `write.output.solution` (create_ps.r) you have:

out.txt = mark_utf8(out.txt)

I am unsure about its purpose. This line sometimes leads to errors when it runs before my function `fix.parser.inconsistencies`, because its result is incompatible with the stringr package, e.g. with stringr::str_length().

Commenting out the line fixes the error, and the resulting solution looks fine to me (in particular regarding umlauts). Perhaps the following code makes my point clearer:

> fix.parser.inconsistencies("Test ü")
[1] "Test ü"
> mark_utf8("Test ü")
[1] "Test \xfc"
> str_length("Test ü")
[1] 6
> str_length("Test \xfc")
[1] 6
> str_length(mark_utf8("Test ü"))
[1] NA
Warning message:
In stri_length(string) :
  invalid UTF-8 byte sequence detected; try calling stri_enc_toutf8()
> fix.parser.inconsistencies(mark_utf8("Test ü"))
[1] "Test �"

I am a bit wary about whether commenting out the line is the way to go, because I do not fully understand its purpose. Maybe I found an error in mark_utf8 itself, as str_length("Test \xfc") does work?
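
A minimal base R sketch of what might be going on, under one loudly stated assumption: that mark_utf8 only sets the encoding flag and does not convert any bytes. I have not verified this against the package source, so `mark_only` below is a hypothetical stand-in, not the real function. If the assumption holds, a latin1 "ü" (the single byte 0xFC) gets declared as UTF-8 without being valid UTF-8, which would match the stringi warning above.

```r
# Hypothetical stand-in for mark_utf8(): ASSUMED to only set the encoding
# flag without converting any bytes (not confirmed from the package source).
mark_only <- function(x) {
  Encoding(x) <- "UTF-8"
  x
}

# "ü" stored as the single latin1 byte 0xFC, as a non-UTF-8 session may hold it.
latin1_txt <- "Test \xfc"

# Marking alone leaves the byte untouched, so the string is not valid UTF-8.
marked <- mark_only(latin1_txt)
validUTF8(marked)      # FALSE: a lone 0xFC is not a valid UTF-8 sequence

# Converting the bytes first yields a valid UTF-8 string.
converted <- iconv(latin1_txt, from = "latin1", to = "UTF-8")
validUTF8(converted)   # TRUE
nchar(converted)       # 6
```

If this is indeed the cause, converting before marking (for example with `iconv()` or `enc2utf8()`) might be an alternative to commenting out the line, but whether that fits the intent of `write.output.solution` is for the maintainer to judge.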
