Could you release a guide on writing single-GPU (high/low-VRAM) inference code for the audio-driven model? I assume loading the audio model through sample_gpu_poor.py with torch.load(strict=False) won't make sense at all. I tried converting the code from sample_batch.py for the audio-driven model, taking sample_gpu_poor.py as a reference, assisted by Sonnet 4. The code executed correctly and the generation took the expected time, but the output was nothing convincing.
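For what it's worth, `strict` is not an argument of `torch.load` at all; it belongs to `Module.load_state_dict`, where `strict=False` silently tolerates missing/unexpected keys instead of raising. A minimal sketch of that behavior (the module and key names here are hypothetical stand-ins, not the repo's actual model):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for an audio-driven model whose checkpoint
# lacks the audio-specific layers.
class AudioBranch(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)
        self.audio_head = nn.Linear(4, 2)  # absent from the checkpoint below

model = AudioBranch()

# Pretend checkpoint saved from a model without the audio head.
ckpt = {"proj.weight": torch.zeros(4, 4), "proj.bias": torch.zeros(4)}

# strict=False returns the mismatches instead of raising an error --
# silently-missing weights like these can produce exactly the symptom
# described above: code runs, timing is normal, output is garbage.
result = model.load_state_dict(ckpt, strict=False)
print(sorted(result.missing_keys))
```

If the audio-conditioning weights end up in `missing_keys`, the model runs on randomly initialized layers, which would explain an unconvincing output despite a normal-looking run.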