Support for Mistral and Mixtral? #33

@xingyaoww

Hi there,

Is there any plan to support Mistral and Mixtral?

For Mistral, I think it should be essentially GPT-2 with a few small tweaks (e.g., a different activation function and sliding window attention), which might make it easier to support. I'm not sure whether Mixtral would be more complicated, since it is a mixture-of-experts (MoE) model.
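
To illustrate the sliding window attention difference, here is a minimal sketch comparing the two attention masks. It is only an illustration under my own assumptions, not code from this repo or from Mistral's implementation: the function names are hypothetical, and I assume the window size counts the current token.

```python
def causal_mask(seq_len):
    """Full causal mask (GPT-2 style): position i may attend to every j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sliding_window_mask(seq_len, window):
    """Causal mask restricted to the most recent `window` positions.

    Assumption: the window includes the current position i, so position i
    may attend to positions i - window + 1 .. i.
    """
    return [[i - window < j <= i for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    # Print the sliding window mask: "x" = may attend, "." = masked out.
    for row in sliding_window_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

With a window at least as large as the sequence length, the two masks are identical, so this tweak alone should only matter for sequences longer than the window.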
