Training a new language (Latin script) #146

@jheinecke

I'm looking for documentation on how to add a new language written in Latin script: Wolof.

  • How much data do I need?
  • How should I prepare the data? Since OCR for the Latin alphabet already works well, is it enough to train only a language model (i.e. the distribution of characters), or do I need to train from scratch (i.e. on images of text and their transcriptions)?

Can anybody point me to the documentation?
Thanks
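
For illustration, a "distribution of characters" of the kind mentioned above could be estimated from plain Wolof text roughly as follows. This is only a minimal Python sketch and is not tied to any particular OCR toolkit; the function name `char_distribution` and the sample string are hypothetical placeholders, and a real corpus would be far larger.

```python
from collections import Counter


def char_distribution(text: str) -> dict[str, float]:
    """Return the relative frequency of each non-whitespace character.

    Hypothetical helper for illustration; not part of any OCR library.
    """
    # Count every character except whitespace.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return {}
    # Most frequent characters first.
    return {ch: n / total for ch, n in counts.most_common()}


# Placeholder sample; substitute a real Wolof text corpus here.
sample = "Nanga def? Maa ngi fi rekk."
dist = char_distribution(sample)

for ch, freq in dist.items():
    print(f"{ch!r}: {freq:.3f}")
```

Whether such statistics are sufficient, or whether line images with transcriptions are required, is exactly what the question above asks.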
