-
Notifications
You must be signed in to change notification settings - Fork 3.2k
fix(dataset): Fix reindexing bug for videos on splits #2548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR attempts to fix a floating point precision issue where video frames could end up in the wrong split during dataset operations. The approach changes from timestamp-based reindexing to frame-index-based reindexing. However, there is a critical bug in the implementation.
Key Changes:
- Modified
_keep_episodes_from_video_with_avto work with frame indices instead of timestamps - Changed
_copy_and_reindex_videosto extractdataset_from_index/dataset_to_indexinstead of timestamp values - Updated function signatures and logic to use integer frame indices throughout
Critical Issue Found:
The code incorrectly uses global dataset frame indices (dataset_from_index, dataset_to_index) as if they were local frame positions within individual video files. Since video files can contain multiple episodes concatenated together, the global indices don't correspond to actual frame positions in the video file. This will cause the wrong frames to be extracted.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
mkiefel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Sometimes during split operations the frame timestamp floating precision leads to frame ending up in the wrong split. This changes fixes the issues by directly working with frame indices instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
This is fixing a bug. Is there anything else that I need to start the review process? |
Sometimes during split operations the frame timestamp floating precision leads to frame ending up in the wrong split.
This changes fixes the issues by directly working with frame indices instead.
Closes #2547.
How it was tested
I ran the dataset test creation test script before and after the change and verified that the commit indeed fixed the issue.
How to checkout & try? (for the reviewer)