How To Improve The Quality Process Of Hubs
Introduction
This hub is inspired by the current controversy surrounding the "featured" and "un-featured" hub statistics being made public by HubPages. This is indirectly related to the quality assessment process of HubPages. My contention is that the current rating process is deficient and could use some improvement. It unfairly places certain hubs in the "un-featured" category, causing distress among some hubbers. For the record, I do not think this stat should be made public. I do have some suggestions on how to improve the quality assessment process.
-November 2015
Background
To learn about the current quality assessment process, I read the link below, which describes the current HubPages process. I must commend the staff at HubPages for creating the site in the first place and for building a working platform for writers. The process as currently implemented is not bad, and considering the limited staff, it is probably the best that could be done as a first attempt. Still, I think it can be improved. Here is how I perceive the problem and its solution.
What Would I Do?
Based on the current information available, HubPages has a staff of only 23. There are approximately 55,000 hubbers and almost 800,000 hubs, with more being generated every day. The problem is how to ensure quality of content across the whole spectrum of HubPages, and, more importantly, with minimum human intervention.
After thinking about this for a while, here is what I propose if I were given the task of designing a quality assessment process. Notice that much of this exists in HubPages today. My additions are the improvements I recommend:
- Use an AI algorithm for automatic processing and checking
- Use a feedback loop to incrementally improve the system
- Two phases of operation: an initial assessment and a maintenance mode
- Provide feedback to authors to guide them
- Include a small human component to help the AI learn
- Inject some randomness into the process (to simulate the real world)
Details On AI
Use AI algorithms to validate the content. I envision 8 stages of checking, some easy and some hard.
Stage 1 - Check the title/URL for uniqueness and a valid character set. (easy)
Stage 2 - Check the category selection to make sure it is the best suited.
Stage 3 - Check all text modules for proper spelling and mechanics of writing. (easy)
Stage 4 - Check the content for duplication with existing web pages (automated search).
Stage 5 - Validate all images for sufficient resolution and for being royalty free or original content.
Stage 6 - Check all links to make sure they are valid and not spam sites. (easy)
Stage 7 - Make sure the Amazon promo modules are relevant to the topic and within limit. (easy)
Stage 8 - Check that the content of the hub is "context relevant" to the topic. (hardest to do, in my opinion)
After the initial process, a score can be generated from 1 to 100, as sketched below. The hubs will be placed in three initial bins: Featured (70-100), Needs Work (40-69), and Rejected (0-39, for violations).
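To make this concrete, here is a minimal sketch in Python of how the eight stage checks might be combined into a score and binned. The stage weights and pass fractions are my own assumptions for illustration; only the 70/40 bin boundaries come from the proposal above.

```python
# Minimal sketch of the staged scoring pipeline described above.
# All weights are assumed for illustration only.

STAGE_WEIGHTS = {
    "title_url":     10,  # Stage 1: unique title/URL, valid characters
    "category":      10,  # Stage 2: best-suited category
    "spelling":      15,  # Stage 3: spelling and mechanics
    "duplication":   15,  # Stage 4: duplicate-content search
    "images":        10,  # Stage 5: resolution and rights
    "links":         10,  # Stage 6: valid, non-spam links
    "promo_modules": 10,  # Stage 7: relevant, within-limit Amazon modules
    "relevance":     20,  # Stage 8: content relevant to topic (hardest)
}

def overall_score(stage_results):
    """stage_results maps each stage name to a pass fraction in [0, 1]."""
    return round(sum(STAGE_WEIGHTS[s] * stage_results.get(s, 0.0)
                     for s in STAGE_WEIGHTS))

def bin_for(score):
    if score >= 70:
        return "Featured"
    if score >= 40:
        return "Needs Work"
    return "Rejected"

# Example: a hub that passes most checks but is weaker on images and relevance.
results = {"title_url": 1.0, "category": 1.0, "spelling": 1.0,
           "duplication": 1.0, "images": 0.5, "links": 1.0,
           "promo_modules": 1.0, "relevance": 0.5}
score = overall_score(results)
print(score, bin_for(score))  # 85 Featured
```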
Feedback Loop
The feedback loop is a key component, and it implies a time sequence. Data will be collected after the initial phase of assessment and fed back to iterate the process on a regular basis, perhaps weekly. Items to be fed back may include the traffic count, human input (if any, to be discussed later), a random element (explained later), and the Google page rank.
The importance of the feedback is to simulate the "learning" process of human beings. An AI system is only "intelligent" if it can learn by trial and error and try different solutions.
Another important feedback input should be Google's page rank. Once a hub is published and featured, the Google web crawler should find it in a few days. It then assigns a rank to that hub, and the hub is indexed in Google's vast search database. No one really knows how the algorithm works, and it is periodically updated to incrementally improve the search results. This rank is an important data point, along with actual traffic (page views), in determining the quality of the hub.
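Here is a minimal sketch of what one weekly feedback iteration might look like. The signal names (weekly_views, search_rank, editor_score) and the weight given to each signal are hypothetical assumptions of mine, not anything HubPages uses.

```python
import random

# Sketch of a weekly feedback iteration. The weighting of each
# signal is an assumption made purely for illustration.

def weekly_update(score, weekly_views, search_rank, editor_score=None):
    """Nudge the current score using traffic, search rank, optional
    human input, and a small random element; clamp to 0-100."""
    adjustment = 0.0
    adjustment += min(weekly_views / 100.0, 5.0)       # traffic signal, capped
    if search_rank is not None and search_rank <= 10:  # first-page result
        adjustment += 3.0
    if editor_score is not None:                       # occasional human rating
        adjustment += (editor_score - score) * 0.2     # pull toward the editor
    adjustment += random.uniform(-1.0, 1.0)            # small random element
    return max(0.0, min(100.0, score + adjustment))

# Example: a featured hub with steady traffic and a first-page ranking.
print(weekly_update(score=72, weekly_views=350, search_rank=8))
```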
Two Phases
This assessment process has two phases: an initial phase, which determines the starting "score," and a maintenance phase, where the "score" may rise or fall based on time-sensitive data such as viewer traffic. The data may include not only the first access click but also time spent on the page, whether there were any user comments, and whether any sales were made as a result of viewing the page.
Let me address the problem of how to deal with stale pages. Currently, HubPages and other sites put a value on webpages that are updated on a regular basis. The implication is that a hub updated by its creator must have some improvement, however small. That may be true for some hubs, but the majority of hubs that I create require few updates. When they do, I fix errors or broken links or add the occasional piece of new information. It seems to me the frequency of edits to a hub should not figure into its general quality assessment, as the sketch below illustrates.
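Here is a small sketch of a maintenance-phase engagement score that deliberately leaves the last-edit date out of the calculation. The field names and weights are assumptions on my part, meant only to show that quality can be measured from reader behavior alone.

```python
# Sketch of a maintenance-phase engagement signal on a 0-100 scale.
# Note that the hub's last-edit date is deliberately not an input.

def engagement_score(avg_seconds_on_page, comments, sales, weekly_views):
    if weekly_views == 0:
        return 0.0
    depth = min(avg_seconds_on_page / 60.0, 1.0) * 50        # up to 50 pts
    interaction = min(comments / weekly_views, 0.05) * 400   # up to 20 pts
    conversion = min(sales / weekly_views, 0.01) * 3000      # up to 30 pts
    return depth + interaction + conversion

# A "stale" but well-read hub can still score respectably:
print(engagement_score(avg_seconds_on_page=90, comments=4,
                       sales=1, weekly_views=400))
```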
This is also directly related to the current debate over "featured" vs. "un-featured" hub stats. Some hubbers feel it is unfair to penalize a hub for being stale even though it gets great traffic and its content is excellent. This is especially true for people who are prolific and publish hundreds of hubs. Why should they be required to "update" their hubs periodically just to stay featured?
Feedback To Authors
The next point deals with providing meaningful feedback to hubbers so that they can produce better content. It should not be a guessing game as to why a hub is featured or not. The data is readily available as part of the automated processing. Provide this information to the hubber so that they know why their hub was rated as it was: for example, text too short, too many lists, poor image resolution, too many Amazon promo products, and so on.
Any information fed back to the hubber will lead to better quality hubs in the future and/or improvements to existing hubs.
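A sketch of how the automated check results might be turned into author-facing messages follows. The check names and thresholds are illustrative assumptions, not HubPages' actual rules.

```python
# Sketch: turn automated check results into author-facing feedback.
# The thresholds below are invented for the example.

FEEDBACK_RULES = [
    (lambda h: h["word_count"] < 700,
     "Text is too short; consider expanding to at least 700 words."),
    (lambda h: h["list_modules"] > 3,
     "Too many list modules; convert some lists into prose."),
    (lambda h: h["min_image_width"] < 600,
     "One or more images are low resolution; use larger images."),
    (lambda h: h["amazon_modules"] > 2,
     "Too many Amazon promo modules; keep only the most relevant."),
]

def author_feedback(hub):
    return [message for check, message in FEEDBACK_RULES if check(hub)]

hub = {"word_count": 450, "list_modules": 5,
       "min_image_width": 800, "amazon_modules": 1}
for line in author_feedback(hub):
    print("-", line)   # flags the short text and the excess lists
```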
A second important feedback to authors is the dashboard, with important statistics on all the hubs an author has created. The current HubPages dashboard is good but has some deficiencies.
First, the page view count per hub includes the visits of the author. This distorts the number. Any time the author visits the page, either to update it or to reply to comments, the visit is counted in the total. The "page view count" should only include other readers who have clicked on the page.
Second, I believe a good quality metric is the Google page rank, and it should be included in the dashboard. I would use 50 as the upper limit; anything greater than 50 belongs in the "no rank" category. A new column in the dashboard could list the rank number. This will help the hubber choose future hub titles that produce a higher ranking, with 1-10 being the goal. For lack of a better term, you could call this the SEO rank.
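Here is a minimal sketch of both dashboard fixes, assuming a hypothetical view log that records each visitor's account id.

```python
# Sketch of the two dashboard fixes described above. The record
# layout and field names are illustrative assumptions.

def reader_view_count(views, author_id):
    """Count only views that did not come from the hub's author."""
    return sum(1 for v in views if v.get("visitor_id") != author_id)

def seo_rank_label(position):
    """Show the search position when it is 50 or better, else 'no rank'."""
    if position is None or position > 50:
        return "no rank"
    return str(position)

views = [{"visitor_id": "author42"},  # the author editing the hub
         {"visitor_id": None},        # an anonymous reader
         {"visitor_id": "reader7"}]
print(reader_view_count(views, author_id="author42"))  # 2
print(seo_rank_label(7), "/", seo_rank_label(120))     # 7 / no rank
```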
Human Input
Unfortunately or fortunately, no current AI system has reached the sophistication needed to replace an experienced human editor. If it had, many HubPages staff would be out of a job.
What I am suggesting is minimal human input on a small number of random hubs. An experienced human editor can rate these sample hubs using the same criteria programmed into the AI system and provide the ratings as guidelines to the AI system: a template, if you will, to help the system learn. This plays an important role in the feedback portion mentioned earlier. Part of intelligence is the ability to learn by trial and error and also by imitation. Given a few samples, over time the AI system will be able to detect a pattern that "teaches" it to recognize good writing, as the sketch below suggests.
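As a rough illustration, here is a sketch of how a handful of editor-rated hubs could calibrate the automated stage weights by simple gradient descent. The training samples, feature names, and learning rate are all invented for the example.

```python
# Sketch: calibrate automated stage weights against a small sample of
# editor-rated hubs, using plain gradient descent on squared error.

def calibrate(samples, weights, lr=0.001, epochs=500):
    """samples: list of (stage_results dict, editor_score) pairs."""
    for _ in range(epochs):
        for features, target in samples:
            predicted = sum(weights[k] * features[k] for k in weights)
            error = predicted - target
            for k in weights:
                weights[k] -= lr * error * features[k]  # gradient step
    return weights

# Two hypothetical editor ratings used as the "template" to learn from.
samples = [
    ({"spelling": 1.0, "relevance": 0.9}, 88),  # editor: strong hub
    ({"spelling": 0.6, "relevance": 0.3}, 35),  # editor: weak hub
]
weights = {"spelling": 40.0, "relevance": 40.0}
print(calibrate(samples, weights))  # weights shift toward the editor's view
```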
[Image: a Chinese chess board]
Random Element
Lastly, the random element is also a key to designing a robust system. Let me give a short example to illustrate why it is needed.
Years ago, when the Palm Pilot was a hot technology gadget, I found a free game app that plays Chinese chess. This game is similar to traditional chess, with a board and different pieces governed by rules on their moves. Chinese chess is easier than traditional chess because of its limited rules and can be mastered by almost anyone. When I started to play the game, it would beat me every time, and I thought the designer had done a pretty good job of implementing the winning algorithm. As I played more and more, I was able to detect its "rule" base. On one occasion, I finally won the game. Unfortunately, that was when I realized the designer had limited the algorithm to a single response in any given position, so I was able to beat it every time after that. Needless to say, I stopped playing.
What was the point of that story? The designer of the game did not provide a "random" element. To make the game more robust, it needs to be able to play different "equally viable" moves during the game. That way, the same game is rarely repeated.
How does this apply to the QC process of HubPages? Just like the game I described, when rating a hub there needs to be a random element, however small, in the algorithm. After all, even two experienced human editors will produce different ratings of the same hub. That is because we are dealing with a subjective assessment of a page; there is no single correct answer. In fact, one way to test the AI system is to give it the same page with one small, insignificant change. If it produces a small difference in the assessment score, that is a good sign. A sketch of this idea follows.
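This is a minimal sketch of the random element, with the jitter magnitude assumed; it adds a small perturbation to the deterministic score and then checks that two assessments of essentially the same page differ slightly rather than wildly.

```python
import random

# Sketch of the random element: add a small jitter to the deterministic
# score. The jitter magnitude (plus or minus 2 points) is assumed.

def assessed_score(deterministic_score, jitter=2.0):
    return deterministic_score + random.uniform(-jitter, jitter)

original = assessed_score(82.0)
tweaked = assessed_score(82.0)   # same page with one insignificant change
print(abs(original - tweaked) < 5.0)  # expect a small, bounded difference
```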
Current Limitation of AI
I've written about the limits of artificial intelligence in other hubs. Let me summarize what I think is still a work in progress.
1. In the image area, AI is very poor at finding relevance to the subject matter. A human can view an image and within seconds determine its "quality" and relevance to the subject. That is human intelligence. It can detect whether the image is upside down, reversed, or "wrong" based on prior knowledge. It can detect subtle features such as a cartoon being satire or caricature. It can spot imperfections due to processing or intentional deceit. All of these are very hard for an AI system to do, not to mention the vast computing power required. By the way, this problem is not solved by ever faster computers. More computing power does not lead to smarter solutions, and in some cases, to any solution at all.
2. AI is also poor at identifying the relevance of text. It is easy to detect poor grammar or incomplete sentences, but an AI cannot determine whether the text is relevant to the main topic. Nor can it judge how artistic or creative the text is and how it would be perceived by a human reader. This is very easy to verify: I can come up with a paragraph of total gibberish that a human editor would recognize in two seconds but an AI system might pass as "good." On the other hand, a poet can write a novel poem that a human editor would rate as great while an AI system rejects it as "poor."
That's the good news for all experienced editors: your jobs are safe for now. The Turing test was proposed decades ago as a test of whether a machine's responses can pass for a human's. I think we are still far from passing it, despite the claims of many intelligent people. Time will tell. AI can be a great tool to help us, but it can't replace us.
Summary
Let me summarize by repeating my original point. I think the current quality assessment process is good but needs improvement. Ideally, a human editor would review each hub; given limited resources, AI is a good alternative, but it is not perfect and needs work. HubPages can do a lot to help all hubbers improve the quality of their content, which will eventually help all of us. Our goals are aligned: we all want to be more successful. Any tool, assistance, or feedback that helps should be implemented and welcomed.
Privacy is the most important factor. Many of these statistics are useful to the hubber but should not be made public. They have little meaning to the casual reader and should be kept private.
Thanks for reading my humble opinion.
Current Description Of HubPages Quality Process
- Featured Hubs and the Quality Assessment Process
The Quality Assessment Process determines which of these Hubs end up being showcased on Hubs and Topic Pages and made available to search engines. These Hubs are known as Featured Hubs.