This is ongoing research and is only about half done.
This research considers two tasks, temporal super-resolution and recovery of missing frames:
Real World Ground Truth | Video with low temporal resolution | Recovered Video |
---|---|---|

Original Video | Input Video with missing frames | Recovered Video |
---|---|---|
Temporal super-resolution at a rate of 4 (green frames are given, red frames are missing)
Methods | Running | Boxing | Clapping | Waving |
---|---|---|---|---|
Ground Truth | ||||
Ours | ||||
Niklaus et al. | ||||
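
To make the rate-4 setting concrete, here is a minimal sketch of which frame indices would be given and which would have to be recovered. It assumes that a rate of 4 means one frame out of every four is kept, starting from frame 0; the function name `split_frames` and the 9-frame clip length are hypothetical and chosen only for illustration.

```python
def split_frames(num_frames, rate=4):
    """Split frame indices into given and missing sets.

    Assumption: every `rate`-th frame (0, rate, 2*rate, ...) is given,
    and all frames in between are missing.
    """
    given = [i for i in range(num_frames) if i % rate == 0]
    missing = [i for i in range(num_frames) if i % rate != 0]
    return given, missing


given, missing = split_frames(9, rate=4)
print(given)    # [0, 4, 8]
print(missing)  # [1, 2, 3, 5, 6, 7]
```

Under this assumption, three frames have to be recovered between each pair of consecutive given frames.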
Given the first 5 frames and the last 5 frames, we can interpolate the 10 frames in the middle
(green frames are given, red frames are missing)
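
The input layout for this interpolation task can be sketched as a boolean mask over a 20-frame clip. This is only an illustration of the setup described above, not the recovery method itself; the helper name `middle_gap_mask` and its parameter names are hypothetical.

```python
def middle_gap_mask(num_head=5, num_missing=10, num_tail=5):
    """Build a per-frame mask: True marks a given frame, False a missing one."""
    return [True] * num_head + [False] * num_missing + [True] * num_tail


mask = middle_gap_mask()
missing_idx = [i for i, is_given in enumerate(mask) if not is_given]

print(len(mask))     # 20
print(missing_idx)   # [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```

With 0-based indexing, frames 0 to 4 and 15 to 19 are given, and frames 5 to 14 are the ones to interpolate.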