Audio missing when using read_video() with video_reader backend #3890

@prabhat00155

Description

  • Read entire video
from torchvision import get_video_backend, set_video_backend
from torchvision.io import read_video

video_path = "data/WUzgd7C1pWA.mp4"
set_video_backend('video_reader')
print(f'set backend: {get_video_backend()}')

visual, audio, info = read_video(video_path, pts_unit='pts')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)

visual, audio, info = read_video(video_path, pts_unit='sec')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)
---
Visual: torch.Size([327, 256, 340, 3]) Audio: torch.Size([0, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
Visual: torch.Size([327, 256, 340, 3]) Audio: torch.Size([0, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
  • Read video from start_pts
video_path = "data/WUzgd7C1pWA.mp4"
set_video_backend('video_reader')
print(f'set backend: {get_video_backend()}')

visual, audio, info = read_video(video_path, start_pts=1001, pts_unit='pts')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)

visual, audio, info = read_video(video_path, start_pts=0.0333667, pts_unit='sec')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)
---
set backend: video_reader
Visual: torch.Size([326, 256, 340, 3]) Audio: torch.Size([0, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
Visual: torch.Size([326, 256, 340, 3]) Audio: torch.Size([0, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
  • Read video from start_pts to end_pts
video_path = "data/WUzgd7C1pWA.mp4"
set_video_backend('video_reader')
print(f'set backend: {get_video_backend()}')

visual, audio, info = read_video(video_path, start_pts=1001, end_pts=2002, pts_unit='pts')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)

visual, audio, info = read_video(video_path, start_pts=0.0333667, end_pts=0.1001000, pts_unit='sec')
print('Visual:', visual.shape, 'Audio:', audio.shape, info)
---
set backend: video_reader
Visual: torch.Size([2, 256, 340, 3]) Audio: torch.Size([0, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
Visual: torch.Size([3, 256, 340, 3]) Audio: torch.Size([3072, 1]) {'video_fps': 29.970029830932617, 'audio_fps': 48000.0}
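
A rough arithmetic check of the numbers above, as a sketch only. It assumes the video stream's time base is 1/30000 s, which is inferred from the report pairing start_pts=1001 with 0.0333667 seconds and is not confirmed by the file itself. The helper names `pts_to_sec` and `expected_audio_samples` are hypothetical and not part of torchvision. Under that assumption, end_pts=2002 corresponds to about 0.0667 s rather than 0.1001 s, so the two calls in the last example may not cover the same window, which could account for the 2 versus 3 video frames. It does not explain why the audio tensor is empty in the other calls.

```python
from fractions import Fraction

# Assumption: video time base of 1/30000 s, inferred from the report
# (start_pts=1001 is paired with start_pts=0.0333667 seconds).
VIDEO_TIME_BASE = Fraction(1, 30000)
AUDIO_FPS = 48000  # taken from the 'audio_fps' value printed in the report


def pts_to_sec(pts, time_base=VIDEO_TIME_BASE):
    """Convert a presentation timestamp to seconds (hypothetical helper)."""
    return float(pts * time_base)


def expected_audio_samples(start_sec, end_sec, audio_fps=AUDIO_FPS):
    """Approximate audio sample count for a time window (hypothetical helper)."""
    return round((end_sec - start_sec) * audio_fps)


# Window requested by the pts_unit='pts' call, converted to seconds.
start_sec = pts_to_sec(1001)
end_sec = pts_to_sec(2002)
print("pts window in seconds:", start_sec, end_sec)

# Rough sample counts one might expect for each window at 48 kHz.
print("pts call:", expected_audio_samples(start_sec, end_sec))
print("sec call:", expected_audio_samples(0.0333667, 0.1001000))
```

For the pts_unit='sec' window this gives roughly 3200 samples, which is in the same range as the 3072 samples actually returned (3072 is 3 × 1024, so the decoder may be returning whole audio frames, though that is a guess).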

cc @bjuncek
