Inspired by Track-Anything, TrackGPT lets users detect and track objects in videos using text prompts. It is built on GroundingDINO, DetGPT, Segment Anything and XMem. By leveraging DetGPT, TrackGPT interprets natural-language instructions to segment objects of interest in video frames. Given a text instruction, TrackGPT finds the specified object and tracks it throughout the video.
- [2023-05-15] We made TrackGPT public!
Running the code requires at least one 40GB GPU or two 32GB GPUs.
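As a quick sanity check before running anything, the sketch below tests whether a machine meets the GPU requirement. It is not part of TrackGPT: the function names `meets_gpu_requirement` and `detect_gpu_memory_gb` are hypothetical, and it assumes "two 32GB GPUs" means at least two devices with roughly 32GB each. Because drivers usually report slightly less than the nominal capacity, it also assumes a 1GB tolerance. Detection relies on PyTorch, if installed.

```python
from typing import List


def meets_gpu_requirement(mem_gb: List[float], tolerance_gb: float = 1.0) -> bool:
    """Return True if the GPUs satisfy 'one 40GB GPU or two 32GB GPUs'.

    tolerance_gb is an assumption: reported memory is typically a little
    below the nominal card size, so we allow a small shortfall.
    """
    has_one_40 = any(m >= 40 - tolerance_gb for m in mem_gb)
    num_32 = sum(1 for m in mem_gb if m >= 32 - tolerance_gb)
    return has_one_40 or num_32 >= 2


def detect_gpu_memory_gb() -> List[float]:
    """Query visible CUDA devices via PyTorch; return [] if unavailable."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_properties(i).total_memory / 1024**3
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    mem = detect_gpu_memory_gb()
    print("GPU memory (GiB):", [round(m, 1) for m in mem])
    print("Meets requirement:", meets_gpu_requirement(mem))
```

If the check fails on a multi-GPU machine, confirm that `CUDA_VISIBLE_DEVICES` is not hiding any of the devices.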
This section is to be done.
This repository is released under the BSD 3-Clause License.