Support Whisper training with Google Cloud buckets #70
Open
huwenjie333 wants to merge 18 commits into main from whisper_gcp
+1,258
−66
Commits (18, all by huwenjie333):
38f72f0 init
9532576 fixes to start training
e4e269c training updates
5dad82f clean up dataset load and notebook; add gpu metrics
c02b14d support current huggingface_load
4597e15 add gcs_key_path
79d7fa5 update SALT_LANGUAGE_TOKENS_WHISPER
ba89074 add all gcs datasets
043ee6f update datasets in yaml by script
8899665 fix empty folder; fix download_datasets_in_parallel; fix long multili…
91fea72 multi-thread for valid dataset; skip matching
78cea35 valid max_examples_per_dataset 50
78b95bc train with script; epoch=2; fix dataset max 50;
398087f add back data aug; calculate train steps by epoch; eval show progress
8947be5 reduce eval max exmaple to 20; max_steps: 8000; add eval predict log;…
c1dac1a disable augment_audio_noise
8fb5250 move scrits and configs to sunbird-speech repo
ddfc940 fix augment_audio_noise error
This looks good. A further improvement for later: when the source is an ASR/audio dataset whose format already matches, skip the generator entirely and just load the Hugging Face datasets and concatenate them. That should reduce the CPU bottleneck and could improve GPU utilisation.
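A rough sketch of what that suggestion could look like, not the PR's actual code. It assumes the Hugging Face `datasets` library, and the target column names (`audio`, `text`) and both helper names are hypothetical, since the real schema is not shown here. Datasets that already expose the target columns are concatenated directly with `concatenate_datasets`; anything else is returned so it can still go through the existing generator path. Note that `concatenate_datasets` expects matching features, so audio columns with different sampling rates would probably need casting first.

```python
from typing import Iterable, Sequence

# Hypothetical target schema; the column names the PR actually uses are an assumption.
TARGET_COLUMNS = ("audio", "text")


def format_matches(column_names: Iterable[str],
                   target: Sequence[str] = TARGET_COLUMNS) -> bool:
    """Return True when a dataset already exposes every target column."""
    return set(target).issubset(column_names)


def concatenate_matching(dataset_list, target: Sequence[str] = TARGET_COLUMNS):
    """Concatenate datasets whose format already matches, bypassing the generator.

    Returns (combined, leftovers): `combined` is one datasets.Dataset, or None if
    nothing matched; `leftovers` are the datasets that still need preprocessing.
    """
    # Imported lazily so the schema check above works without the library installed.
    from datasets import concatenate_datasets

    matching, leftovers = [], []
    for ds in dataset_list:
        if format_matches(ds.column_names, target):
            # Keep only the shared columns so the feature schemas line up.
            matching.append(ds.select_columns(list(target)))
        else:
            leftovers.append(ds)

    combined = concatenate_datasets(matching) if matching else None
    return combined, leftovers
```

With this split, only the `leftovers` would need per-example Python processing, which is where the CPU cost of a generator tends to come from.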