IPA tokenizer does not recognize pre-aspiration and pre-nasalization #41

@Anaphory

I just tried segments.Tokenizer()(x, ipa=True) on some data containing pre-aspirated and pre-nasalized consonants, and was surprised that a subsequent pyclts.TranscriptionSystem('bipa') call complained about a large number of undefined segments. Apparently segments does not know to associate , and with the following sound, and instead appends them to the preceding vowel. (A similar problem exists with pre-aspirated consonants, but in that case I understand that distinguishing between pre- and post-aspiration is beyond the complexity segments wants to provide.)
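
To make the behaviour concrete, here is a minimal sketch of a post-processing step that moves a trailing pre-nasalization mark from the end of one token to the start of the next. It is not part of segments or pyclts: the function name reattach_prenasalization, the set of marks (ᵐ, ⁿ, ᵑ), and the sample token list are all assumptions made for illustration, and pre-aspiration is deliberately left out because it cannot be told apart from post-aspiration this way.

```python
# Hypothetical workaround, not an API of segments or pyclts.
# Assumption: pre-nasalization is written with one of these superscript marks.
PRENASAL_MARKS = ("ᵐ", "ⁿ", "ᵑ")


def reattach_prenasalization(tokens):
    """Move a trailing pre-nasalization mark onto the following token.

    `tokens` is a list of already tokenized segments, for example the
    result of splitting a space-separated tokenizer output.
    """
    out = list(tokens)
    # Stop one short of the end: a mark on the final token has no
    # following sound to attach to, so it is left untouched.
    for i in range(len(out) - 1):
        tok = out[i]
        # Only strip the mark if something else remains in the token.
        if len(tok) > 1 and tok.endswith(PRENASAL_MARKS):
            out[i] = tok[:-1]
            out[i + 1] = tok[-1] + out[i + 1]
    return out


# Assumed input shape: the mark has been glued to the preceding vowel.
fixed = reattach_prenasalization(["a", "m", "aᵐ", "b", "a"])
print(" ".join(fixed))
```

A step like this would run between the tokenizer call and the transcription system lookup; whether the reattached segments are then recognized depends on the transcription data itself.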
