This repository was archived by the owner on Nov 28, 2024. It is now read-only.

Fix filename encoding for attachments #11

@ivy

Short summary: Non-ASCII filenames should be encoded following RFC 2047 in the name parameter of the Content-Type and RFC 2231 in the filename parameter of the Content-Disposition. For example, an attachment named 今日は世.txt might include the following headers:

Content-Type: text/plain; charset=UTF-8; name="=?UTF-8?Q?=E4=BB=8A=E6=97=A5=E3=81=AF=E4=B8=96.txt?="
Content-Disposition: attachment; filename*=UTF-8''%E4%BB%8A%E6%97%A5%E3%81%AF%E4%B8%96.txt

Any proposed fix should first be tested against a few major email clients.


Non-ASCII filenames are garbled when attaching files to messages, as originally reported in go-gomail#66. Some of the proposed solutions (go-gomail#83) solve this by adding charset=UTF-8 to the Content-Type header. That may work for some clients, but almost certainly by coincidence, and I expect it to cause more issues later for certain clients and attachments.

Digging into the IETF standards, I found that the core MIME RFCs do not seem to specify how to encode non-ASCII characters in filenames. Take RFC 2183, section 2.3, for example:

Current [RFC 2045] grammar restricts parameter values (and hence
Content-Disposition filenames) to US-ASCII. We recognize the great
desirability of allowing arbitrary character sets in filenames, but
it is beyond the scope of this document to define the necessary
mechanisms.

This thread on the ietf-smtp mailing list seems to have the answer:

Finally, if you have to include filename information, either put it in a
filename= parameter or both a filename= and name= parameter. Never ever use
just a name= parameter because that opens you up to gratuitous interpretation
of the part using some disposition value you didn't intend. (I note in passing
that this is what Thunderbird now does, with the added nuance of using
nonstandard RFC 2047 encoding for the name= parameter and standard RFC 2231
encoding for the filename= parameter.)
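
Before testing against real email clients, a quick round-trip check can confirm that both parameters decode to the same filename. The sketch below uses Go's standard `mime` package; `decodeFilenames` is a hypothetical helper, and the header values are the ones this issue proposes. `mime.ParseMediaType` decodes the RFC 2231 `filename*` value itself, while the RFC 2047 encoded-word in `name` has to be decoded separately with `mime.WordDecoder`.

```go
package main

import (
	"fmt"
	"mime"
)

// decodeFilenames returns the filename carried in the Content-Type name
// parameter (RFC 2047) and in the Content-Disposition filename parameter
// (RFC 2231), both decoded to UTF-8.
func decodeFilenames(contentType, disposition string) (string, string, error) {
	_, ctParams, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", "", err
	}
	// ParseMediaType leaves encoded-words untouched, so decode them here.
	name, err := new(mime.WordDecoder).DecodeHeader(ctParams["name"])
	if err != nil {
		return "", "", err
	}
	// For filename*=charset''value, ParseMediaType stores the decoded
	// result under the plain "filename" key.
	_, cdParams, err := mime.ParseMediaType(disposition)
	if err != nil {
		return "", "", err
	}
	return name, cdParams["filename"], nil
}

func main() {
	ct := `text/plain; charset=UTF-8; name="=?UTF-8?Q?=E4=BB=8A=E6=97=A5=E3=81=AF=E4=B8=96.txt?="`
	cd := `attachment; filename*=UTF-8''%E4%BB%8A%E6%97%A5%E3%81%AF%E4%B8%96.txt`

	name, filename, err := decodeFilenames(ct, cd)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("name parameter:    ", name)
	fmt.Println("filename parameter:", filename)
	fmt.Println("match:", name == filename)
}
```

This only shows that one parser accepts the headers. It does not replace checking how actual mail clients display the attachment.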
