Incorrectly packed strings #2

@mindplay-dk

I recently tried to pack a binary-form UUID with this library, and for certain specific UUIDs, it would fail.

I just realized why: you're packing PHP strings as msgpack strings, but msgpack strings are UTF-8 Unicode strings, and the PHP string type is just binary data.

Some byte sequences are not valid UTF-8, so packing/unpacking will fail.
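To illustrate, here is a minimal sketch showing that whether a binary UUID is also valid UTF-8 depends entirely on its bytes. The two UUID values and the `isValidUtf8()` helper are made up for this example and are not part of any library.

```php
<?php
// Hypothetical helper: an empty pattern with the /u modifier makes PCRE
// validate the subject as UTF-8; preg_match() returns false if it is not.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// Made-up 16-byte UUIDs in binary form.
// 0xC3 opens a two-byte UTF-8 sequence, but 0x28 is not a continuation byte.
$invalid = hex2bin('c3284a1e9f0b4d6e8a7cf1d2e3b4a596');
// Every byte is below 0x80, so this one happens to be valid UTF-8.
$valid = hex2bin('00112233445566770011223344556677');

var_dump(isValidUtf8($invalid)); // bool(false)
var_dump(isValidUtf8($valid));   // bool(true)
```

This is why only certain specific UUIDs trigger the failure.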

I believe PHP strings should be packed as the binary type in msgpack.

At least that's what the PECL extension does.

The problem, of course, arises when you know you have Unicode strings and the msgpack recipient on the other end is not a PHP script and expects text to be encoded as strings.

Since there's no Unicode string type in PHP, the rybakit/msgpack package actually goes so far as to use a UTF-8 detection/validation pattern, which must have detrimental performance implications.
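For reference, a rough sketch of what detection-based packing could look like. This is not the rybakit/msgpack implementation; `packString()` is a hypothetical function, and only the format header bytes (fixstr, str 8/16/32, bin 8/16/32) come from the MessagePack specification.

```php
<?php
// Hypothetical sketch: pick the msgpack str or bin family by checking
// whether the PHP string is valid UTF-8.
function packString(string $value): string
{
    $length = strlen($value);

    // The /u modifier makes preg_match() fail on invalid UTF-8.
    if (preg_match('//u', $value) === 1) {
        if ($length < 32) {
            return chr(0xa0 | $length) . $value;           // fixstr
        }
        if ($length <= 0xff) {
            return "\xd9" . chr($length) . $value;         // str 8
        }
        if ($length <= 0xffff) {
            return "\xda" . pack('n', $length) . $value;   // str 16
        }
        return "\xdb" . pack('N', $length) . $value;       // str 32
    }

    if ($length <= 0xff) {
        return "\xc4" . chr($length) . $value;             // bin 8
    }
    if ($length <= 0xffff) {
        return "\xc5" . pack('n', $length) . $value;       // bin 16
    }
    return "\xc6" . pack('N', $length) . $value;           // bin 32
}

echo bin2hex(packString('abc')), "\n";      // a3616263
echo bin2hex(packString("\xc3\x28")), "\n"; // c402c328
```

Note that the resulting type depends on the data: a binary value that happens to be valid UTF-8 is still packed as a string, and every string pays for a validation pass.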

I guess there's no way to do this "right" in PHP, but if you're going to encode PHP strings as strings, at least the readme should probably note that binary strings aren't supported?

Alternatively, you could try the slightly faster UTF-8 string detection "hack" I used here.
