forked from cvondrick/vatic
vatic - Video Annotation Tool from Irvine, California
vatic is an online video annotation tool for computer vision research that
crowdsources work to Amazon's Mechanical Turk. Our tool makes it easy to build
massive, affordable video data sets.
This document will describe how to install and use vatic. If you want to modify
vatic, please read DEVELOPERS after reading this document.
== INSTALLATION ===============================================================
Note: vatic has only been tested on Ubuntu with the Apache 2.2 HTTP server and
a MySQL server. This document describes installation on this platform;
however, it should work on any operating system and with any server.
--- Download ------------------------------------------------------------------
You can download and extract vatic from our website. Note: do NOT run the
installer as root.
$ wget http://mit.edu/vondrick/vatic/vatic-install.sh
$ chmod +x vatic-install.sh
$ ./vatic-install.sh
$ cd vatic
--- HTTP Server Configuration -------------------------------------------------
Open the Apache configuration file. On Ubuntu, this file is located at:
/etc/apache2/sites-enabled/000-default
If you do not use Apache on this computer for any other purpose, replace the
contents of the file with:
WSGIDaemonProcess www-data
WSGIProcessGroup www-data
<VirtualHost *:80>
    ServerName vatic.domain.edu
    DocumentRoot /path/to/vatic/public
    WSGIScriptAlias /server /path/to/vatic/server.py
    CustomLog /var/log/apache2/access.log combined
</VirtualHost>
updating ServerName with your domain name, DocumentRoot with the path to
the public directory in vatic, and WSGIScriptAlias to vatic's server.py file.
If you do use Apache for other purposes, you will have to set up a new virtual
host with the correct document root and script alias, as shown above.
Make sure you have the mod_headers module enabled:
$ sudo cp /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled
After making these changes, restart Apache:
$ sudo apache2ctl graceful
--- SQL Server Configuration --------------------------------------------------
We recommend creating a separate database specifically for vatic:
$ mysql -u root
mysql> create database vatic;
The next section will automatically create the necessary tables.
--- Setup ---------------------------------------------------------------------
Inside the vatic directory, copy config.py-example to config.py:
$ cp config.py-example config.py
Then open config.py and make changes to the following variables in order to
configure vatic:
signature     Amazon Mechanical Turk AWS signature (secret access key)
accesskey     Amazon Mechanical Turk AWS access key (access key ID)
sandbox       If true, put into MTurk sandbox mode. For debugging.
localhost     The local HTTP address: http://vatic.domain.edu/ so it
              matches the ServerName in Apache.
database      Database connection string: for example,
              mysql://user:pass@localhost/vatic
geolocation   API key from ipinfodb.com for geolocation services
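As a rough illustration, a filled-in config.py might look like the sketch
below. The variable names come from the list above; every value is a
placeholder, and the real config.py-example may contain other settings that
are not shown here.

```python
# Hypothetical config.py values; replace every placeholder with your own.
signature = "YOUR_AWS_SECRET_ACCESS_KEY"        # AWS secret access key
accesskey = "YOUR_AWS_ACCESS_KEY_ID"            # AWS access key ID
sandbox = True                                  # stay in the MTurk sandbox while debugging
localhost = "http://vatic.domain.edu/"          # should match ServerName in Apache
database = "mysql://user:pass@localhost/vatic"  # connection string for the vatic database
geolocation = "YOUR_IPINFODB_API_KEY"           # API key from ipinfodb.com
```

Keeping sandbox enabled until a full test run succeeds avoids publishing
real, paid tasks by accident.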
After saving your changes, you can then initialize the database:
$ turkic setup --database
Note: if you want to reset the database, you can do this with:
$ turkic setup --database --reset
which will require confirmation to reset in order to prevent data loss.
Finally, you must also allow vatic to access turkic, a major dependency:
$ turkic setup --public-symlink
== ANNOTATION =================================================================
Before you continue, you should verify that the installation was correct. You
can verify this with:
$ turkic status --verify
If you receive any error messages, it means the installation was not complete
and you should review the previous section.
--- Frame Extraction ----------------------------------------------------------
Our system requires that videos are extracted into JPEG frames. Our tool can
do this automatically for you:
$ mkdir /path/to/output/directory
$ turkic extract /path/to/video.mp4 /path/to/output/directory
By default, our tool will resize the frames to fit within a 720x480 rectangle.
We believe this resolution is ideal for online video viewing. You can change
resolution with options:
$ turkic extract /path/to/video.mp4 /path/to/output/directory \
    --width 1000 --height 1000
or
$ turkic extract /path/to/video.mp4 /path/to/output/directory \
    --no-resize
The tool will maintain aspect ratio in all cases.
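To estimate the frame size you will get, the sketch below fits a frame inside
a bounding rectangle while keeping its aspect ratio. It is not vatic's actual
code: the function name fit_within is made up for this example, and whether
vatic enlarges frames that are already smaller than the rectangle is an
assumption.

```python
def fit_within(width, height, max_width=720, max_height=480):
    """Scale (width, height) to fit inside the rectangle, keeping aspect ratio."""
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_width / float(width), max_height / float(height))
    return int(round(width * scale)), int(round(height * scale))


print(fit_within(1920, 1080))  # a 16:9 frame is limited by its width
print(fit_within(640, 480))    # a 4:3 frame that already fits the height
```

With the default 720x480 rectangle, a 1920x1080 frame comes out as 720x405.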
--- Importing a Video ---------------------------------------------------------
After extracting frames, the video can be imported into our tool for
annotation. The general syntax for this operation is:
$ turkic load identifier /path/to/output/directory Label1 Label2 LabelN
where identifier is a unique string that you will use to refer to this video,
/path/to/output/directory is the directory of frames, and LabelX are class
labels that you want annotated (e.g., Person, Car, Bicycle). You can have as
many class labels as you wish, but you must have at least one.
When a video is imported, it is broken into small segments typically of only a
few seconds. When all the segments are annotated, the annotations are merged
across segments because each segment overlaps another by a small margin.
The above command specifies all of the required arguments, but many optional
settings are available as well. We recommend using these options.
MTurk Options
--title         The title that MTurk workers see
--description   The description that MTurk workers see
--duration      Time in seconds that a worker has to complete the task
--lifetime      Time in seconds that the task is online
--keywords      Keywords that MTurk workers can search on
--offline       Disable MTurk and use for self annotation only
Compensation Options
--cost               The price advertised to MTurk workers
--per-object-bonus   A bonus in dollars paid for each object
--completion-bonus   A bonus in dollars paid for completing the task
Qualification Options
--min-approved-percent   Minimum percent of tasks the worker must have
                         approved before they can work for you
--min-approved-amount    Minimum number of tasks that the worker must
                         have completed before they can work for you
Video Options
--length        The length of each segment for this video in frames
--overlap       The overlap between segments in frames
--use-frames    When splitting into segments, use only the frame intervals
                specified in this file. Each line should contain a
                start frame, followed by a space, then the stop frame.
                Frames outside the intervals in this file will be
                ignored.
--skip          If specified, request annotations only every N frames.
--blow-radius   When a user marks an annotation, blow away all other
                annotations within this many frames. If you want to
                allow the user to make fine-grained annotations, set
                this number to a small integer, or 0 to disable. By
                default, this is 5, which we recommend.
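To get a feel for how --length and --overlap interact, the sketch below splits
a video into overlapping segments. It only illustrates the idea: the function
name, the example values, and the exact boundary handling are assumptions, and
vatic's own splitting may differ in detail.

```python
def split_into_segments(total_frames, length, overlap):
    """Return inclusive (start, stop) frame ranges that cover the video."""
    if length <= overlap:
        raise ValueError("length must be greater than overlap")
    if total_frames <= 0:
        return []
    step = length - overlap  # how far each new segment advances
    segments = []
    start = 0
    while True:
        stop = min(start + length - 1, total_frames - 1)
        segments.append((start, stop))
        if stop >= total_frames - 1:
            break
        start += step
    return segments


for start, stop in split_into_segments(total_frames=100, length=40, overlap=10):
    print("%d-%d" % (start, stop))
```

For 100 frames this yields the segments 0-39, 30-69, and 60-99. Each pair of
neighbors shares 10 frames, which is what makes merging across segments
possible later.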
You can also specify temporal attributes that each object label can take on.
For example, you may have a person object with attributes "walking", "running",
or "sitting". You can specify attributes the same way as labels, except you
prepend a ~ to the text, which binds the attribute to the previous label:
$ turkic load identifier /path/to/output/directory Label1 ~Attr1A ~Attr1B \
    Label2 ~Attr2A ~Attr2B ~Attr2C Label3
In the above example, Label1 will have attributes Attr1A and Attr1B, Label2
will have attributes Attr2A, Attr2B, and Attr2C, and Label3 will have no
attributes. Specifying attributes is optional.
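The binding rule can be sketched in a few lines of Python. This is not how
turkic parses its arguments internally; parse_labels is a hypothetical helper
that only mirrors the rule described above.

```python
def parse_labels(args):
    """Map each label to the list of ~attributes that follow it."""
    labels = {}
    current = None
    for arg in args:
        if arg.startswith("~"):
            if current is None:
                raise ValueError("attribute %r has no preceding label" % arg)
            labels[current].append(arg[1:])  # strip the leading ~
        else:
            current = arg
            labels.setdefault(current, [])
    return labels


args = ["Label1", "~Attr1A", "~Attr1B",
        "Label2", "~Attr2A", "~Attr2B", "~Attr2C", "Label3"]
print(parse_labels(args))
```

An attribute that appears before any label has nothing to bind to, so the
sketch rejects it.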
--- Gold Standard Training ---------------------------------------------------
It turns out that video annotation is extremely challenging and most MTurk
workers lack the necessary patience. For this reason, we recommend requiring
workers to pass a "gold standard" video. When a new worker visits the task,
they will be redirected to a video for which the annotations are already known.
In order to move on to the true annotations, the worker must correctly annotate
the gold standard video first. We have found that this approach significantly
improves the quality of the annotations.
To use this feature, import a video to be used as the gold standard:
$ turkic load identifier-train /path/to/frames Label1 Label2 LabelN \
    --for-training --for-training-start 0 --for-training-stop 500 \
    --for-training-overlap 0.5 --for-training-tolerance 0.1 \
    --for-training-mistakes 1
You can also use any of the options described above. Explanations for the new
options are as follows:
--for-training             Specifies that this video is gold standard
--for-training-start       Specifies the first frame to use
--for-training-stop        Specifies the last frame to use
--for-training-overlap     Percent overlap that worker's boxes must match
--for-training-tolerance   Percent that annotations must agree temporally
--for-training-mistakes    The number of completely wrong annotations
                           allowed. We recommend setting this to a small,
                           nonzero integer.
After running the above command, it will provide you with a URL for you to
input the ground truth annotation. You must make this ground truth annotation
as carefully as possible, as it will be used to evaluate future workers.
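To build intuition for --for-training-overlap, the sketch below computes one
common overlap measure between a worker's box and a ground truth box:
intersection over union. Whether vatic uses exactly this measure is an
assumption; the function name and the example boxes are made up for
illustration.

```python
def box_overlap(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # Width and height of the intersection, clamped at zero when disjoint.
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / float(union) if union > 0 else 0.0


truth = (0, 0, 10, 10)
worker = (5, 0, 15, 10)
print(box_overlap(truth, worker) >= 0.5)  # would this box pass a 0.5 threshold?
```

Under this measure, a box shifted by half its width scores only about 0.33,
so a threshold of 0.5 is stricter than it may first appear.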
You can now specify that a video should use a gold standard video:
$ turkic load identifier /path/to/output/directory Label1 Label2 LabelN \
    --train-with identifier-train
When a not-yet-seen worker visits this video, they will now be redirected to
the training video and be required to pass the evaluation test first.
--- Publishing Tasks ---------------------------------------------------------
When you are ready for the MTurk workers to annotate, you must publish the
tasks, which will allow workers to start annotating:
$ turkic publish
You can limit the number of tasks that are published:
$ turkic publish --limit 100
Running the above command repeatedly will launch tasks in batches of 100. You can
also disable all pending tasks:
$ turkic publish --disable
which will "unpublish" tasks that have not yet been completed.
If you have videos that are offline only, you can see their access URLs with
the command:
$ turkic publish --offline
Note: for the above command to work, you must have loaded the video with the
--offline parameter as well:
$ turkic load identifier /path/to/frames Person --offline
--- Checking the Status ------------------------------------------------------
You can check the status of the video annotation server with the command:
$ turkic status
This will list various statistics about the server, such as number of jobs
published and how many are completed. You can get even more statistics by
requesting additional information from Amazon:
$ turkic status --turk
which will output how much money is left in your account, among other
statistics.
When all the videos are annotated, the last line will read:
Server is offline.
--- Retrieving Annotations ---------------------------------------------------
You can get all the annotations for a video with the command:
$ turkic dump identifier -o output.txt
which will write the file "output.txt" where each line contains one
annotation. Each line contains 10+ columns, separated by spaces. The
definitions of these columns are:
1    Track ID. All rows with the same ID belong to the same path.
2    xmin. The top left x-coordinate of the bounding box.
3    ymin. The top left y-coordinate of the bounding box.
4    xmax. The bottom right x-coordinate of the bounding box.
5    ymax. The bottom right y-coordinate of the bounding box.
6    frame. The frame that this annotation represents.
7    lost. If 1, the annotation is outside of the view screen.
8    occluded. If 1, the annotation is occluded.
9    generated. If 1, the annotation was automatically interpolated.
10   label. The label for this annotation, enclosed in quotation marks.
11+  attributes. Each column after this is an attribute.
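If you want to read this text format from your own scripts, a minimal Python
sketch follows. It is not part of vatic: the names Annotation, parse_line, and
group_by_track are made up for this example, and it assumes that track IDs and
frame numbers are integers and that coordinates are plain numbers.

```python
import shlex
from collections import defaultdict, namedtuple

Annotation = namedtuple(
    "Annotation",
    "track_id xmin ymin xmax ymax frame lost occluded generated label attributes",
)


def parse_line(line):
    """Parse one line of the text dump into an Annotation."""
    fields = shlex.split(line)  # also strips the quotation marks around the label
    if len(fields) < 10:
        raise ValueError("expected at least 10 columns, got %d" % len(fields))
    xmin, ymin, xmax, ymax = (float(v) for v in fields[1:5])
    return Annotation(
        track_id=int(fields[0]),
        xmin=xmin, ymin=ymin, xmax=xmax, ymax=ymax,
        frame=int(fields[5]),
        lost=fields[6] == "1",
        occluded=fields[7] == "1",
        generated=fields[8] == "1",
        label=fields[9],
        attributes=fields[10:],
    )


def group_by_track(lines):
    """Collect annotations into paths keyed by track ID."""
    tracks = defaultdict(list)
    for line in lines:
        if line.strip():
            annotation = parse_line(line)
            tracks[annotation.track_id].append(annotation)
    return tracks


sample = [
    '0 10 20 110 220 5 0 0 1 "Person" "walking"',
    '0 12 22 112 222 6 0 1 0 "Person"',
    '1 300 40 380 90 5 0 0 0 "Car"',
]
for track_id, path in sorted(group_by_track(sample).items()):
    print(track_id, path[0].label, [a.frame for a in path])
```

The sample lines are invented for the example. To use real data, pass the
lines of output.txt to group_by_track instead.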
By default, the above command will not attempt to merge annotations across
shot segments. You can request merging with the command:
$ turkic dump identifier -o output.txt --merge --merge-threshold 0.5
The --merge-threshold option is optional. It is a number between 0 and 1
that represents how much the paths must agree in order to merge. 1 specifies a
perfect match and 0 specifies no match. In practice, 0.5 is sufficient. Merging
is done using the Hungarian algorithm.
You can also scale annotations by a factor, which is useful for when the
videos have been downsampled:
$ turkic dump identifier -o output.txt -s 2.8
or force it to fit within a max dimension:
$ turkic dump identifier -o output.txt --dimensions 400x200
The command can also output to many different formats. Available formats are:
--xml       Use XML
--json      Use JSON
--matlab    Use MATLAB
--pickle    Use Python's Pickle
--labelme   Use LabelMe video's XML format
--pascal    Use PASCAL VOC format, treating each frame as an image
The specifications for these formats should be self-explanatory.
--- Visualizing Videos -------------------------------------------------------
You can preview the annotations by visualizing the results:
$ turkic visualize identifier /tmp --merge
which will write frames with the bounding boxes drawn on them to /tmp, using
the frame number as the file name. The visualization will contain some meta
information that can help you identify bad workers. You can remove this meta
information with the option:
$ turkic visualize identifier /tmp --merge --no-augment
If you want to make a video of the visualization (e.g., with ffmpeg), it is
useful to renumber the frames so that they start counting at 0 and do not
have any gaps:
$ turkic visualize identifier /tmp --merge --renumber
If you wish to display the class label and their attributes next to the box,
specify the --labels option:
$ turkic visualize identifier /tmp --labels
--- Compensating Workers -----------------------------------------------------
When you are ready, you can compensate workers:
$ turkic compensate --default accept
which will pay all workers for all outstanding tasks. We strongly recommend
paying all workers regardless of their quality. You should attempt to pay
workers at least once per day.
--- Quality Control ----------------------------------------------------------
The gold standard does a "pretty good" job of weeding out bad workers.
Nonetheless, there will always be bad workers that we must identify and
invalidate. Our tool provides a method to sample the annotations provided by
workers, which you can then manually verify for correctness:
$ turkic sample /tmp
which by default will pick 3 random videos that the worker has completed,
pick 4 random frames from each of those videos, and write visualizations to
files in /tmp. You can tweak the number of videos and the number of frames with
the options:
$ turkic sample /tmp --number 3 --frames 4
Moreover, you can restrict the sample to work done since a certain date:
$ turkic sample /tmp --since "yesterday"
The filename will follow the format of WORKERID-JOBID.jpg. Once you have
identified a malicious worker, you can block them, invalidate ALL of their
work, and respawn their jobs with the command:
$ turkic invalidate workerid
The following options are also available:
--no-block     invalidate and respawn, but don't block
--no-publish   block and invalidate, but don't respawn
You can also invalidate and respawn individual jobs with the command:
$ turkic invalidate --hit hitid
Furthermore, if you have found a small mistake in a video and want to make
the correction yourself, you can start an annotation session initialized with
the MTurk workers' annotations:
$ turkic vet identifier
$ turkic vet identifier frame
where identifier is the identifier for the video and frame is the frame number
at which the error occurs. In most cases, this command will return one URL for you
to make the corrections. If it outputs two URLs, it means the frame number
occurs in two overlapping segments, and so you may have to make changes to both
of the segments. You can also omit the frame argument, in which case it will
output all URLs for that video.
If you want to find the HIT id, assignment ID, or worker ID for a particular
video, specify the --ids parameter to the vet command:
$ turkic vet identifier --ids
$ turkic vet identifier frame --ids
which will print a list of all the IDs for the video. If the corresponding segment
has been published and completed, it will list three strings: the HIT ID,
assignment ID, and the worker ID. If the job has been published but not
finished, it will just list the HIT ID. If the job has not yet been published,
it prints "(not published)".
--- Listing all Videos -------------------------------------------------------
You can retrieve a list of all videos in the system with:
$ turkic list
If you want just the videos that have been published:
$ turkic list --published
If you want just the videos that have been worked on:
$ turkic list --completed
If you instead want the videos that are used for gold standard:
$ turkic list --training
Finally, if you just want to count how many videos are in the system, use the
--count option, in combination with any of the above:
$ turkic list --count
$ turkic list --published --count
--- Managing Workers ---------------------------------------------------------
You can list all known workers with the command:
$ turkic workers
which will dump every worker with the number of jobs they have completed. You
can also use this command to block and unblock workers:
$ turkic workers --block workerid
$ turkic workers --unblock workerid
You can also search for workers by the first few letters of their ID:
$ turkic workers --search A3M
--- Deleting a Video ---------------------------------------------------------
You can delete a video at any time with:
$ turkic delete identifier
If the video has already been annotated (even partially), this command will
warn you and abort. You can force deletion with:
$ turkic delete identifier --force
which will REMOVE ALL DATA AND CANNOT BE UNDONE.
== WORKAROUNDS ================================================================
If you get the error:
error: each element of 'ext_modules' option must be an Extension instance
or 2-tuple
then follow these steps (we will implement a workaround at a later date):
1) Edit /usr/lib/python2.7/distutils/command/build_ext.py
2) Find line 356, where it reads:
for i, ext in enumerate(extensions):
3) After that line, add a new line that just says "continue", so that
it looks like:
    for i, ext in enumerate(extensions):
        continue
        if isinstance(ext, Extension):
            continue                # OK! (assume type-checking done
                                    # by Extension constructor)
4) Save the file
5) Go to the pyvision directory and run: $ sudo python setup.py install
== REFERENCES =================================================================
When using our system, please cite:
Carl Vondrick, Deva Ramanan, Donald Patterson. "Efficiently Scaling Up
Video Annotation with Crowdsourced Marketplaces" European Conference on
Computer Vision (ECCV) Crete, Greece, September, 2010.
== FEEDBACK AND BUGS ==========================================================
Please direct all comments and report all bugs to:
Carl Vondrick
Thanks for using our system!