Conversation
@g-hano, did you train with Turkish data?
Yes, I did. I had to update many lines, but I was finally able to train on 4 T4s. Extra training on 46k audio samples made the model produce clearer outputs.
Hello, first of all, thank you for supporting the Turkish TTS model. When I reviewed your code, I noticed that you convert text to lowercase during normalization. Unfortunately, a plain lower() converts uppercase "I" to "i", which is incorrect for Turkish, where the lowercase of "I" is the dotless "ı". As a solution, the following change is needed: text.replace("I", "ı").lower(). Since you are using lowercase text, I recommend trying ytu-ce-cosmos/turkish-base-bert-uncased.
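For reference, a minimal sketch of the lowercasing fix suggested above. The helper name turkish_lower is hypothetical and not taken from the PR. It also maps the dotted capital "İ" (U+0130), which the one-line suggestion does not cover: Python's str.lower() turns it into "i" followed by a combining dot (U+0307) rather than a plain "i".

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish casing rules for the two capital I letters.

    Hypothetical helper for illustration; not the function used in the PR.
    """
    # Dotted capital I (U+0130) -> plain "i". Without this step, str.lower()
    # would produce "i" plus a combining dot above (U+0307).
    text = text.replace("\u0130", "i")
    # Dotless capital "I" -> dotless small i (U+0131), as suggested in the review.
    text = text.replace("I", "\u0131")
    # All remaining letters follow the default Unicode lowercase mapping.
    return text.lower()


if __name__ == "__main__":
    print(turkish_lower("IŞIK"))
    print(turkish_lower("İSTANBUL"))
```

The replacements have to run before lower(), since lower() would otherwise already have turned "I" into "i".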
Handle Turkish character 'I' and change tokenizer to ytu-ce-cosmos/turkish-base-bert-uncased
Thank you for your recommendations. I updated the code.
Thanks for the great contribution, @g-hano. I'm trying to test it, but I couldn't find a way to do so. Is there any way to test it?
Is there a Hugging Face Space for testing, or any sample output?
I created turkish.py and turkish_bert.py to support the Turkish language. Used dbmdz/bert-base-turkish-cased as the tokenizer.