Skip to content

Conversation

@leroycep
Copy link

If a the msgpacked portion of an LXMessage uses strings (fixstr tags 0xa0 to 0xbf, and probably str_8 (0xd9), str_16 (0xda), and str_32 (0xdb) as well) instead of byte arrays, LXMF will happily unpack them but then crash when trying to decode the string. For example:

0090: 94 CB 41 D9 EF DB 0D 00  00 00 AC 48 65 6C 6C 6F  ..A........Hello
00a0: 2C 20 6C 78 6D 66 21 A0  C0                       , lxmf!..       

(Side note, is there a reason to LXMF uses byte arrays in the msgpack instead of strings? Like, does msgpack not specify UTF-8 as the string format or something?)

This PR changes unpack to have logic similar to LXMessage.__init__, where it will always decode it into UTF-8 byte array based on the type.

@markqvist
Copy link
Owner

Content and titles should never be strings, but always binary data in UTF-8 encoding. To support any language and character set, LXMF only supports UTF-8 for parts that are intended for human readability.

This should have been made more explicit, though, and I think the right way to go about this is to not accept messages where the byte arrays of title and content cannot be decoded as UTF-8. It should definitely not crash, of course ;)

As you can see in set_content_from_string() and set_title_from_string(), the passed string values are encoded to UTF-8, and then set on LXMessage instance. These helper methods exist for message creation. Over the wire, there should never be any string data types in those parts of the message-packed data structure.

@leroycep
Copy link
Author

Quoting the msgpack spec:

  * **String** extending Raw type represents a UTF-8 string

It seems like msgpack defines the string type to mean a UTF-8 encoded byte array. That implies that the implementation should be switched to use only msgpack strings and not byte array, no?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants