I'm trying to find documentation on how to add a new language written in Latin script: Wolof.
- How much data do I need?
- How should I prepare the data? Since OCR for the Latin alphabet already works well, is it enough to train only a language model (i.e. the distribution of characters), or do I need to train from scratch (i.e. on images of text and their transcriptions)?
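
To make the first option concrete, by "distribution of characters" I mean roughly the statistics computed below. This is only a Python sketch of what I have in mind, not something taken from any OCR tool's documentation. The function name `char_distribution` and the short sample string are made up for illustration; a real run would read a large Wolof text corpus instead.

```python
from collections import Counter


def char_distribution(text: str) -> dict[str, float]:
    """Return the relative frequency of each non-whitespace character."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        # Empty or whitespace-only input has no distribution.
        return {}
    return {ch: n / total for ch, n in counts.items()}


# Illustrative sample only (assumption); replace with text from a real corpus.
sample = "ñoo ko bokk"
dist = char_distribution(sample)

# Print characters from most to least frequent.
for ch, freq in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(f"{ch!r}: {freq:.3f}")
```

If something like this is all the engine needs, plain text would be enough and I would not have to collect images with transcriptions.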
Can anybody point me to the documentation?
Thanks