This section outlines the steps in running experiments with our platform.
Before spending any money, you should verify that everything works in the MTurk sandbox.
Set MTURK_SANDBOX = True in server/config/settings_local.py.
This will use https://workersandbox.mturk.com instead of https://www.mturk.com/.
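For reference, the relevant line in settings_local.py looks like this (the rest of the file is your own deployment settings):

```python
# server/config/settings_local.py
# When True, MTurk tasks are dispatched to the sandbox
# (https://workersandbox.mturk.com) instead of the live marketplace.
MTURK_SANDBOX = True
```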
The following commands will dispatch tasks to MTurk:
./manage.py mtconfigure
./manage.py mtconsume
See Commands for documentation of these commands.
Navigate to https://workersandbox.mturk.com and find your task on the sandbox marketplace.
Try submitting some results and check that they show up in the admin submission view (http://YOUR_HOSTNAME/mturk/admin/submission/).
To expire all experiments, run:
./manage.py mtexpire '.*'
This section outlines the steps for running paid experiments with our platform.
The following commands will dispatch tasks to MTurk:
./manage.py mtconfigure
./manage.py mtconsume
Note that if all input objects have received the specified number of labels, then no work will be dispatched. See Commands for documentation of these commands.
Watch users by either setting up Google Analytics or viewing the server log:
tail -f run/gunicorn.log
It will take about 10 minutes before you see the first submissions.
There are two ways to review submissions:
Automatically approve all submissions. When using both tutorials and sentinels, I find that the proportion of high-quality submissions is high enough to approve all workers. While some bad work sneaks by, I find it is not worth rejecting, since workers get upset and dislike the uncertainty.
To approve all submissions, set MTURK_AUTO_APPROVE = True in server/config/settings_local.py. Approvals are then handled by celery, which can have a long delay. Workers like seeing instant approvals (I found that my submission rate increased by 50-100%), so it is worth running
./manage.py mtapprove_loop '.*'
while the experiment is running to automatically approve everything as quickly as possible. The argument is a regular expression on the Experiment slug (human-readable ID).
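For intuition, the pattern behaves like Python's re module matching against each slug; '.*' matches every experiment, while a more specific pattern targets a subset. The slug names below are hypothetical, and whether the command uses re.match or re.search semantics is an assumption:

```python
import re

# Hypothetical experiment slugs; your own slugs will differ.
slugs = ['segment_material', 'segment_object', 'name_material']

# '.*' matches every slug:
assert all(re.match(r'.*', s) for s in slugs)

# A narrower pattern selects only the segmentation experiments:
matched = [s for s in slugs if re.match(r'segment_.*', s)]
print(matched)  # ['segment_material', 'segment_object']
```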
Manual review. Unfortunately I haven’t had time to update the admin interface to have approve/reject buttons (since I always approve all submissions). You can manually approve/reject by opening a Python shell on the server (./scripts/django_shell.sh) and running the command:
MtAssignment.objects.get(id='ID').approve(feedback='Thank you!')
or
MtAssignment.objects.get(id='ID').reject(feedback='You made too many mistakes.')
where ID is the assignment ID.
I find that quality tends to be consistent within a worker, so you could write a loop that iterates over known good workers and approves their submitted assignments:
from mturk.models import MtAssignment

GOOD_WORKER_MTURK_IDS = [ ... ]
asst_qset = MtAssignment.objects.filter(
    status='S', worker__mturk_worker_id__in=GOOD_WORKER_MTURK_IDS)
for asst in asst_qset:
    try:
        asst.approve(feedback='Thank you!')
    except Exception:
        # MTurk may not yet recognize the assignment as submitted;
        # it is safe to retry these later.
        pass
See mturk.models.MtAssignment for more assignment-related methods.
Note that approve/reject commands have a high chance of failing. The Amazon MTurk server takes a while to recognize that a certain assignment is ready for approval. The above scripts take this into account, so don’t worry about lots of errors in the celery logs regarding approvals.
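The retry behavior can be sketched as a generic helper; this is an illustration of the pattern, not the platform's actual code, and the names are made up:

```python
import time

def call_with_retry(fn, attempts=5, delay=1.0, backoff=2.0):
    """Call fn(), retrying with exponential backoff on any exception.

    MTurk can take a while to recognize that an assignment is ready
    for approval, so the first few attempts are expected to fail.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(delay)
            delay *= backoff

# Usage sketch (inside the Django shell):
# call_with_retry(
#     lambda: MtAssignment.objects.get(id='ID').approve(feedback='Thank you!'))
```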
You can grant bonuses to assignments in the Python shell (./scripts/django_shell.sh) with the command:
MtAssignment.objects.get(id='ID').grant_bonus(price=0.10, reason='You did a great job')
where ID is the assignment ID.
Note that workers are promised small bonuses for completing feedback; this is handled automatically by the mturk.models.MtAssignment.approve() method.
To expire all experiments, run:
./manage.py mtexpire '.*'
OpenSurfaces stores a local copy of the status of each HIT and Assignment. To make sure that local data is synchronized, run:
./manage.py mtsync
To print your Amazon account balance to the console, run:
./manage.py mtbalance
If an experiment uses CUBAM to aggregate binary answers, run this to update all labels:
./manage.py mtcubam
Warning: this will take several hours to run if you have millions of labels.
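For intuition, the simplest alternative to CUBAM is a plain majority vote over each object's binary answers; CUBAM improves on this by also modeling per-worker reliability and bias. A minimal majority-vote baseline (illustrative only, not what mtcubam runs):

```python
from collections import defaultdict

def majority_vote(labels):
    """Aggregate (object_id, answer) pairs by simple majority.

    labels: iterable of (object_id, bool) worker answers.
    Returns {object_id: aggregated_bool}. Ties resolve to False here,
    whereas CUBAM would weigh votes by inferred worker reliability.
    """
    votes = defaultdict(list)
    for obj_id, answer in labels:
        votes[obj_id].append(answer)
    return {obj: sum(v) > len(v) / 2 for obj, v in votes.items()}

labels = [('a', True), ('a', True), ('a', False),
          ('b', False), ('b', False), ('b', True)]
print(majority_vote(labels))  # {'a': True, 'b': False}
```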
To add your own experiment, see Extending the system.