I just took the new MTurk test and will probably fail, as I didn't really think about what the ratings should be in comparison to the examples. A couple of things:
-How often can I take the test? Is it a once you fail, tough luck thing?
Things I think would be useful to add:
Engagement level - one thing we are all told is that engagement is very important. A few of the Hubs I read ranked well based on the examples but did not engage me - therefore I would not have read them had I stumbled across them. This seems to be missing from the test, and it could differentiate a very good but unread hub from a very good and engaging hub...
HP are using MTURK to quality check hubs - they've just made it much harder to qualify as a rater:
... Which is a good thing, right? Because these are OUR Hubs they're looking at!
I just pinged Paul and Fawntia to see if it's possible to re-take it.
So you read a Hub and rate it and you get 5 cents? Am I missing something?
I think once we get approved through MTurk and pass the test, yes -- it looks like the compensation depends on our accuracy. I looked at the compensation chart earlier and there appears to be a sliding scale based on accuracy score.
The five cents is a base. There is a bonus system in place worth substantially more per 10 Hubs rated.
As it stands, one does only have one shot, so to those who have not taken it yet, be sure to give it your full focus!
I'm dooooooomed! I'm a failure........
...seriously, I think you really need to add a note that this is a one-off test. It took me five to ten 'Hits' to get used to the process - by then it was way too late to go back and re-evaluate my methods. I feel there should be a practice group where you can find out how good or bad your ratings are prior to taking the test.
Very disappointed that this happened - I'm not saying that I would have passed - but had I had a chance to really look at some examples compared to how I rated them prior to taking the test, then I may have been able to adapt to become the type of rater you wanted.
Very sucky way to start the day! I also note that there may be a bug in your system as I think I also was able to re-evaluate some of the Hubs this morning...
Ouch! If no one passed it as Paul said, there definitely needs to be more practice, and more than one try of the test. I recommend three different qualification tests, so that people can take one test, get their score and learn what they did wrong, and then take the next one.
Okay, color me stupid, but I'm still not sure what the benefit is. A person reads an article and rates it and gets paid a base rate of 5 cents, and a bonus of up to $2.00 per 10 articles, or 20 cents an article, so can earn up to 25 cents per article? Even assuming someone can read an article and rate it in 10 minutes and can read 6 articles an hour, that's $3 an hour? So why are we wanting to do this?
My experience is that you could probably read and rate an article in two to three minutes - so that makes a potential of 20 to 30 an hour - max 7.50 - yes minimum wage - but the whole idea is to help improve the quality of HP - so if I do this for an hour a day I get a little pocket money and I help remove a lot of the garbage - which in theory should help my own articles (assuming they have not been removed for being garbage!).
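A quick check of the arithmetic in the two posts above, as a rough sketch only. It assumes the figures quoted in the thread (a 5-cent base and a top bonus of $2.00 per 10 articles) and that a rater earns the top bonus on every batch; the function name and constants are made up for illustration and are not HubPages' actual pay rules.

```python
# Figures as quoted in the thread; treat them as assumptions, not official rates.
BASE_CENTS = 5          # base pay per rated article
MAX_BONUS_CENTS = 200   # top bonus per batch of 10 articles
BATCH_SIZE = 10


def max_hourly_cents(articles_per_hour: int) -> int:
    """Upper bound on hourly pay in cents, assuming the top bonus on every batch."""
    per_article = BASE_CENTS + MAX_BONUS_CENTS // BATCH_SIZE  # 5 + 20 = 25 cents
    return articles_per_hour * per_article


# Reading speeds mentioned in the thread: 6, 20 and 30 articles an hour.
for rate in (6, 20, 30):
    print(f"{rate} articles/hour -> ${max_hourly_cents(rate) / 100:.2f}")
```

At 20 to 30 articles an hour this gives $5.00 to $7.50, which matches the "max 7.50" figure above; at 6 articles an hour the same rates give $1.50.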
You don't have to do this of course...
I have some suggestions on improving the Mturk rating system (which will also help the HubHopper system)
1. Have fewer categories. Right now, you have to rate on a scale of 1-10 in substance, organization, and mechanics. That's 30 choices of ratings on one article! Of course there is going to be disagreement. Writing is somewhat subjective, especially since the topics are not limited, and this many choices means that it is pretty much impossible to actually agree 100% on all 30 choices. You are more likely to get more accurate scores if you make the rating system simpler. I recommend limiting it to a scale of 1-5. Personally I wouldn't break out the substance, organization and mechanics into different categories. Just make sure that the training emphasizes that those are the key factors. It will take some getting used to and lots of practice, since scoring may feel harder at first because it is hard to decide how to weigh the various factors.
2. Complete the descriptions. Right now, there are only descriptions on 5 of the ratings on each of the three scales - the even numbered ratings. The rest of them say for example "Exhibits characteristics between a 2 and a 4". Of course, if you take my advice in #1, you won't have to do this one.
3. Have more examples. Right now, there are only examples on 5 of the ratings - the odd numbered ratings. Since the descriptions and the examples are on different ratings, it is hard to really get a good handle on what the criteria are exactly. We need, at a minimum, an example at each score point. I rate articles for a living, and we generally have more than three examples at each score point - a low, a medium, and a high to see the whole range at each score point.
When I rated hubs, I tended to have more odd numbered ratings because I was comparing them to my examples.
4. Provide an email address or an FAQ to get answers. I now see that you can flag hubs while rating them, but there are no instructions that tell you that is an option.
5. Provide feedback. When something is rated incorrectly, let the rater know, so that s/he can score more accurately in the future.
By the way, Hubbers, I think these examples are very useful for Hubbers even if they don't participate in rating hubs. They give you a clear idea of the different levels of quality, so you can get a feel for how to improve your own hubs.
Thank you for your suggestions.
1. We have found that people are more consistent in their ratings when they rate with the three scales rather than with a single scale. The reason seems to be that grammar (for example) matters more to some people than to others. Having the three separate scales removes some of these rater-dependent preferences. For this reason, we aren't likely to switch back to a single scale.
Edit: Of course, we do not expect perfect agreement between your ratings and the panel's ratings. If you are off by no more than 1 or 2 points consistently, then that is about as good as is humanly possible.
2. I agree that we could and should improve the descriptions in the rating guide.
3. Good point. We hope to have more examples available in the future.
4. I am working on the FAQ today. It will probably be available next week. In the meantime, you are welcome to contact us through Mechanical Turk if you have specific MTurk-related questions.
5. We actually do provide occasional feedback on our regular HITs, just not on the screening test. We don't want to tell people the correct answers on the screening test because of the danger that the answers could be shared.
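To make the "off by no more than 1 or 2 points" remark in point 1 concrete, here is a minimal sketch of how a rater could compare their own scores with a panel's on the three scales. The scale names come from the thread; the function, the sample numbers, and the use of a mean absolute difference are assumptions for illustration, since the thread does not say how HubPages actually computes accuracy.

```python
# The three rating scales named in the thread.
SCALES = ("substance", "organization", "mechanics")


def mean_abs_difference(rater: dict, panel: dict) -> float:
    """Average distance, in rating points, between a rater and the panel."""
    return sum(abs(rater[s] - panel[s]) for s in SCALES) / len(SCALES)


# Hypothetical ratings for one Hub on the 1-10 scales.
rater = {"substance": 7, "organization": 5, "mechanics": 8}
panel = {"substance": 6, "organization": 6, "mechanics": 6}

print(mean_abs_difference(rater, panel))
```

Here the rater is off by 1, 1 and 2 points, so the average difference falls between 1 and 2, which is the range described as about as good as is humanly possible.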
Thank you, Fawntia, for considering my requests and providing the FAQ. In #20, you stated reasons for reading the whole article. I would add that sometimes the article may get better or worse as you go along, so that would be another reason to read the whole thing.
I see you provided advertising terms and glossary. How are Mturkers affected by Impressions and CPM?
Sure, I will mention that an article can change in quality in the middle.
The MTurk FAQ is part of the main HubPages FAQ, and the advertising glossary is actually a separate section in the main FAQ. It is not related to Mechanical Turk. Sorry for the confusion!
Fawntia, can you add something to the FAQ to make it clear that only US and Indian residents can receive payment via MTurk? Not everyone is aware. Thank you.
I wasn't aware, myself. I just took a look at Amazon's FAQ and it appears that workers outside of the U.S. and India can earn on Mechanical Turk, but they must receive payment in the form of an Amazon.com gift certificate. I will add something to the FAQ. Thanks for pointing this out.
Yes, they will send you a gift certificate, but it's useless. I made that mistake with the Amazon Associates program. If you order something to be sent to the UK the postage eats up the money, so I thought I'd try ordering an ebook, but they just transfer you to Amazon.uk, where you can't spend the Amazon.com certificate. Fortunately, they took the certificate back and put the money back in my account, and I got a cheque instead. MTurk don't have the cheque option though.
That seems rather silly. I'll email our contact at Mechanical Turk and ask her about this. Thanks again for bringing this issue to our attention.
To be fair, Paul E did mention that it was for US residents only (didn't mention Indians!) when the system first came in, but a lot of people have probably forgotten (if they read the thread in the first place). His comment is about halfway down the page.
Judi Bee - not totally useless - you can buy something on Amazon.com and send it to me, and I'll get my Dad to bring it back to the UK when he visits - honest!
I think I would be good at it but I can't get an Amazon account because I live in Illinois. What's with that?
Of course, one could also argue that the reason nobody is passing the test is because the "quality standards" it defines do not match the criteria used by the majority of people to define the true quality of an article...
I heard from someone doing this, who found the average hourly income generated was well under $5 an hour.
Apparently a penalty is imposed on raters who work too fast. This kicks in regardless of whether their ratings match the arbitrary ratings assigned by a panel, which are used as the standard for defining raters' accuracy scores. (Not matching the arbitrary ratings imposes a further penalty).
Thus, an individual's reading speed will not necessarily show a positive correlation with income earned.
Personally, should I ever qualify, I wouldn't be doing it for the money - I'd be 'trying' to help get rid of the crap that is holding us all down. It's a long long road and one that also should include the 'free' hopper - but I cannot sit back and complain about the crap without trying to do something!
Will it work? Who knows - at the very least it'll remove some of the more current crap - the older crap should in theory disappear from Google - assuming Google is right and they really do care about quality and engagement.
I find it as a nice distraction from Facebook - and if I get paid $5 to waste away an hour - then that's a few more beers for me at the end of the month!
One other thought - how many writers on hub pages actually earn more than $5 an hour with their writing!!!!
That's not correct.
I can generally do between 20 (in a house of distractions) to 40 (in a quiet house) an hour. I'm generally working at top bonus so five an hour is on the low and ten an hour is high. My bonuses don't change with speed.
This is inaccurate. If your ratings are poor (because you are going too fast or for any other reason), then your accuracy will be low. We do not take time spent into account directly, but it is hard to give good ratings if you are going too fast.
Furthermore, the panel ratings are not "arbitrary." Three members of the panel independently rate a Hub, and only if all three members closely agree is the Hub deemed officially rated. A majority of the time, all three members of the panel closely agree on their ratings, so it is far from arbitrary.
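The panel rule described above (three independent ratings, counted only when all three closely agree) could be sketched roughly as follows. The thread does not define "closely agree", so the one-point tolerance, the function name, and the sample ratings are all assumptions.

```python
def panel_agrees(ratings: list[tuple[int, int, int]], tolerance: int = 1) -> bool:
    """Return True if the panel members are within `tolerance` points on every scale.

    Each tuple holds one member's (substance, organization, mechanics) ratings.
    The tolerance is an assumed stand-in for "closely agree".
    """
    # zip(*ratings) regroups the numbers by scale instead of by panel member.
    return all(max(scale) - min(scale) <= tolerance for scale in zip(*ratings))


# Hypothetical panel ratings for two Hubs.
close_panel = [(6, 7, 5), (7, 7, 6), (6, 6, 5)]
split_panel = [(2, 7, 5), (7, 7, 6), (6, 6, 5)]

print(panel_agrees(close_panel))
print(panel_agrees(split_panel))
```

Under this assumed rule, the first Hub would count as officially rated and the second would not, because the substance scores differ by five points.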
A lot of us used to hop and flag hubs for free because we wanted to help the site.
As was admitted yesterday in another thread, flagging hubs for low quality now has no effect whatsoever.
Because of that fact, plus the other negative changes that have happened here, I do not intend to hop or flag anything again while the current situation persists.
Respect needs to flow in both directions.
As for earning for writing, you know as well as I do that people write here for a variety of reasons. I have no idea of what range of earnings is generated per hour of writing hubs. I would imagine it is hard to quantify, given the different lifespans of different hubs and the number of views each might attract during its lifespan.
Moreover, it is one thing to write something under one's own steam about a subject of one's choice. It is a completely different thing to perform a repetitive task to meet requirements imposed from outside.
The first activity can be viewed as a hobby and as pleasure, as well as being a potential income source. The second is contract work. My going rate for doing contract work to other people's specifications is $40/hour minimum, not $5/hour.
Different people are different. Some hubbers are from countries where even one US dollar could go a long way.
It is my understanding MTurk only allows US residents to sign up at present.
Aw, what a pity Simey! Obviously you aren't qualified enough to check for crap like this;
http://ghaziabadguide.hubpages.com/hub/ … f-Internet
Here's the first paragraph;
"Internet is one of the most sought after things in the world. In the present day scenario, almost every person takes help of internet even for the smallest thing to find. Be it a pizza to order or movie tickets to book, or even finding a match for a girl or boy, internet has become the strongest tool in today’s world."
I've done some research and the MTurk quality rater was a blind Botswanan Aardvark by the name of Barry.
Simone - what is going on here? Find the tester and fire him.
If this test is so hard, how come there is still so much cr*p getting featured?
OK, they changed policy again at MTurk.
Yippee, they pay in rupees! So all those whose Indian aunty hubs are eventually deleted will be able to make up the lost income by rating our hubs!
Just what we need.
Other non-US members are paid in Amazon.com gift vouchers, which are more or less useless given the postage charges to mail from the US to other countries.
We can actually get paid by cheque - in dollars though. The bank does charge a fee to work out the exchange (seems to vary from bank to bank - I think mine charge £8, I've heard that some charge £15).
Entirely take your point about the rupees though.
That's strange. Seems the MTurk FAQ is out of date, because it says:
"If you are a US based Worker, you can choose to disburse your earnings to your U.S. bank account or to an Amazon.com gift certificate balance. Workers in India have the option of receiving bank checks denominated in Indian Rupees or disbursing their earnings to an Amazon.com gift certificate. All other international Workers can only disburse to an Amazon.com gift certificate."
https://www.mturk.com/mturk/help?helpPa … r#how_paid
I've misunderstood (it happens a lot) - I was basing what I wrote on the Amazon Associates programme, entirely possible that MTurk do it differently, I've no experience of their operation.
You're British Judi, so you can't participate in the Hogfest anyway. We're not good enough apparently...
Well Guv'nor, seeing as 'ow we all talks like what Dick Van Dyke did in Mary Poppins, we ain't much use anyhow - I just wishes I could talk proper like...
Canadians don't seem to be good enough either!
It is interesting to speculate on why HP needs to spend so much money on a system that will rate the hubs out of 100, when all they need to do is to set a minimum standard ‘pass mark’. Here’s my guess.
=>Google wants to deliver good quality useful pages for its readers. Part of its rating of pages to be displayed includes an algorithm (algo) for ’G quality’. While it is an algo, it uses humans to refine, test and develop this algo.
=>HP wants to simulate the G system. It has its own algo and a human based ranking system (QAP). The concept is that if HP can rank pages in a similar way to G, it will be able to ‘filter’ out poor quality pages and better meet the needs of what G wants. G changes ‘what it wants’ regularly, so by having QAP with its three elements HP can not only adjust the overall QAP score for a page but also recalculate it using a different formula. This means that HP will ideally always have a ranking for every page that mirrors the G ranking. It also means that it can adjust the ‘pass mark’ for being ‘Featured’ as G changes ‘What it wants’.
=> HP wants to be able to rate the quality of pages so it can present the best of the best on the topic pages etc.
What are the potential problems with this approach?
=>HP has a ‘cream will always float to the top’ philosophy. It pushes ‘Stellar Hubs’ because it believes that such hubs will rank better with G. While this is generally true, there are many reasons why fabulous Stellar hubs, HOTD, AP hubs, Exclusives and ‘High Quality’ hubs fail to get traffic – which has nothing to do with ‘quality’ or ‘content’. This is related to the ‘topic’ and choice of the keywords for the targeted niche. It is also related to the ‘reputation’ of the author and the ‘reputation’ of the site on which it is published. If the article is written in a saturated topic area with a title/words that can’t compete it will be extremely difficult for an article to get traffic no matter how good its quality.
=> Vast numbers of articles are being written which have virtually no hope of getting traffic - they have great quality but they fail (no hits).
What are the Alternatives?
In my humble opinion, HP should let G sort out what is good quality – which it does anyway through its page ranking and position on the SERPs.
=> HP should refine its algo as the QAP – do it all electronically and refine it based on what has been successful in terms of traffic. To the existing algo should be added a spelling and grammar checker / rating score system such as Grammarly. Then add a rating for number of words (500-1000 ideal), at least 3 images, at least 2 non-advert capsules (whatever it wants as a minimum). The formula could be revised as G changes what it wants. This algo could be applied to ALL hubs (which is not the case with the existing QAP - it is too expensive to do for old hubs). The focus is defining a PASS MARK for what HP wants. Traffic is the real measure of success (G loves me) – let traffic dictate the order in which the pages appear for the topics. Having an electronic algo for ranking pages is better, as it means it can be run on all the pages and upgraded as things change.
=> HP should use its vast database of information on keyword popularity to devise a Title Tuner that works in a predictive way to suggest better keyword choices. This would be focused on suggesting changes to the Title at Inception, when an author is entering keywords into the title. The problem with Exclusives is that very few of them are competitive. To ask people to spend 3-4 hours writing a hub on a title which has a high risk of failure is a tragedy. HP has a huge database about keyword popularity, successful keywords etc. Surely HP could use its staff to create a list of Exclusives that are COMPETITIVE. It could also be used to display a list of related keywords as suggestions to the author.
Conclusion: HP should shift its focus from ‘QAP cream will always rise to the top’ to redefining what CREAM is (it is MORE than QAP quality).
=> Quality that meets the ‘Pass Mark’ using an algo approach for spelling and grammar
=> 500-1500 words, 3 or more images, 2 or more capsules (non-commercial), + others
=> A Competitive title for an available keyword niche that has real traffic potential.
=> Traffic (after a minimum of 2-3 months)
=> Revise the hubscore as discussed previously so hubbers can see their scores in this new system and work on improving traffic and revising their titles etc.
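As a rough illustration of the automated ‘pass mark’ proposed in the list above, the countable criteria (500-1500 words, 3 or more images, 2 or more non-commercial capsules) could be checked like this. The `Hub` class and its field names are hypothetical, and the spelling, grammar, title and traffic criteria are left out because the post gives no thresholds for them.

```python
from dataclasses import dataclass


@dataclass
class Hub:
    """Hypothetical summary of a hub; field names are made up for this sketch."""
    word_count: int
    image_count: int
    non_commercial_capsules: int


def meets_pass_mark(hub: Hub) -> bool:
    """Check only the countable criteria from the list above."""
    return (
        500 <= hub.word_count <= 1500
        and hub.image_count >= 3
        and hub.non_commercial_capsules >= 2
    )


print(meets_pass_mark(Hub(word_count=900, image_count=4, non_commercial_capsules=2)))
print(meets_pass_mark(Hub(word_count=300, image_count=4, non_commercial_capsules=2)))
```

Because the thresholds sit in one function, they could be revised in one place if the site changed what it wanted, which is the flexibility the post argues for.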
I guess I don't yet qualify -- maybe I haven't hopped enough Hubs since becoming active?
I tried to take it today after getting the approval through MTurk. I guess I'll have to try again the next time the opportunity comes round.
We let 30 people take the test a day.
It can be done well in a couple of minutes per hub. We bumped the pay, and that change should make it out to the site soon. I think $10 to $15 an hour will be doable for accurate folks.
The ratings done on HP also get counted. We may build some feedback so people can see how they're doing on HP as well.
When I tried to take the test it indicated that I did not qualify. I have hopped Hubs here on a fairly regular basis -- I just don't know whether I have satisfied the required accuracy score of 30? The feedback would help -- right now I'm totally at sea.
I wish they would do it somewhere other than MTurk. For some odd reason, MTurk will not approve me. I don't understand why.
I just took the test. We'll see what happens - I looked at the examples and tried to rate the sample hubs as carefully as I could. It takes some time to look at each hub, though the really bad ones are obvious straight off the mark.
....and I've just re-read your comment, Paul, and realized that I was being oblivious. Thanks.
This is an interesting thread. MTurk is something I have not heard of. If some hubbers can't ever avail (for geographical reasons) of whatever rewards might stem from Amazon membership or application approval for payments, then how can the test uniformly apply to all people on HubPages? Forgive me if I got this wrong, but some hubbers are surely disqualified from the test?
How is it that MTurk / Amazon is the only way of measuring such a content rating from a reviewer? There must be libraries, universities and institutions that have such a method. Do International Standards Organisations not have a review system? Wow.
At the center of all of this is perhaps an urgent call to hubbers to hop some new hubs.
I do hop hubs and I suppose I'll still do so, despite being effectively barred from MTurk. Can't imagine that creating a situation where some Hubbers can apply to be paid whilst others can't will make the idea of Hub Hopping more attractive though.
What if QAP scores don't align with what Google really wants, and rates as 'good quality' in the SERPS?
'To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
And by opposing end them.'
Cut the QAP to a CRqAP filter ! IMO