So What Exactly Is CAPTCHA?

If you’ve been using the Web for a while, you’ve probably run into an annoying mini-test called CAPTCHA. It comes in many forms: frustratingly hard-to-read squiggly words, a check box that sometimes brings up a grid of pictures and asks you to pick the ones containing some object (e.g. a car or a sign), and even simple math problems. But why, oh why, do we have to deal with this insanity?

When CAPTCHA first started out, it typically consisted of letters that were stretched, distorted, squished or color-blotched, which you had to decipher before you could post a comment, create an account or buy a ticket. It’s not a hard test, but sometimes, because of the distortion, you wonder if it’s a letter “i” or an “l,” or if it’s a zero or a capital letter “O.” Sometimes you wonder if it’s an “m” or two “n’s.” But most of the time, you “pass” the test easily, or maybe after a try or two.

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. It’s also known as a type of Human Interaction Proof (HIP). Its goal is to create a test that humans can pass easily, but machines or computers can’t.

Why do the makers of websites need to know if I’m human or not?

The short answer is: to prevent spam (and not the yummy kind that comes in a can). CAPTCHA deters hackers and spammers from abusing online services or conducting unethical activities online. Websites need to make sure that a user is human because automated programs can be used for:

  • Signing up for hundreds of email accounts for spamming
  • Spamming blogs and news articles with fake and repetitive comments
  • Scraping email addresses from websites to use in spam attacks
  • Swaying online polls by automatically submitting hundreds of false responses, for example in online voting or political surveys

Verifying that the website is interacting with a human can stop many of these automated attacks. A bot or auto-fill program that fails the CAPTCHA test is blocked from submitting the request.

CAPTCHA can stop people who use bots to game the system or to exploit weaknesses in the computers running the site. If they are not stopped, their actions can affect millions of users and websites. CAPTCHA guards against the bots of spammers and other computer underworld “creatures.”

Who invented CAPTCHA?

The “T” in CAPTCHA stands for the Turing test, named after Alan Turing, often called the father of modern computing. Turing proposed the test as a way to examine whether machines could think (or appear to think) like humans, and whether people would notice. The Turing test is a classic imitation game. An interrogator asks two participants a series of questions. One of them is a human, while the other is a machine. The interrogator can’t see or hear the participants, so the answers are the only clue to which is which. If the interrogator can’t figure out which of the participants is the machine based on the responses, the machine passes the test.

CAPTCHA was first invented in 1997 by two groups working in parallel. These groups created what became the most common type of CAPTCHA: the one that asks users to type the letters or digits shown in a distorted image. Since the test is administered by a computer (instead of a human, as in a standard Turing test), a CAPTCHA is sometimes described as a reverse Turing test.

The term CAPTCHA was coined in 2000 at Carnegie Mellon University by Luis von Ahn, Manuel Blum, Nicholas J. Hopper and John Langford. This team first described CAPTCHAs in a paper published in 2003, which brought them much of the coverage on the subject. By their definition, a CAPTCHA is any program that can distinguish humans from computers, which covers many different types of tests.

What are the different types of CAPTCHA?

Not all CAPTCHAs ask you to type in some text. Over time, the computers of spammers and other bad guys of the internet have become smarter, and people have not. As the old tests became easier for computers to solve, CAPTCHAs had to change, and they often got harder for users in the process.

The goal of CAPTCHA is to generate a test that humans can easily pass but machines can’t. It is important for a CAPTCHA application to show a different challenge each time, because if every user gets the same one, it won’t take long before a spammer spots the form, deciphers the letters or digits, and programs a bot to type the correct answer automatically.
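To make the “different challenge each time” idea concrete, here is a minimal Python sketch. It is only an illustration, not how any particular CAPTCHA product works: the function names, the alphabet and the default length are all assumptions, and the step that turns the string into a distorted image is left out entirely.

```python
import secrets

# Assumption: leave out characters that are easy to confuse once distorted
# (I/L/1 and O/0), the same look-alikes mentioned earlier in this article.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"


def new_challenge(length: int = 6) -> str:
    """Return a fresh, unpredictable string to show to one visitor."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def is_correct(expected: str, typed: str) -> bool:
    """Check the visitor's answer, ignoring case and surrounding spaces."""
    return typed.strip().upper() == expected.upper()
```

A site would store the result of `new_challenge()` on the server side for each visitor and call `is_correct()` when the form comes back. Because every visitor gets a new random string, a bot cannot simply replay an answer it has seen before.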

The most common type of CAPTCHA is the one mentioned earlier, which asks you to type distorted words or alphanumeric strings to show that you’re really a human.

Usually, CAPTCHA relies on visual tests or patterns. There are versions of CAPTCHA that ask users to trace certain shapes found in photographs. We humans can look at an image and pick out patterns in it more easily than a computer can, because computers lack the sophisticated perception humans have when it comes to processing visual data.

Some CAPTCHAs ask the reader to interpret a short passage of text. This contextual kind of CAPTCHA quizzes readers and tests their comprehension skills. Computers may pick up key words, but they aren’t good at understanding what those words actually mean, especially once the words are combined into phrases or sentences.

However, not all CAPTCHAs rely on visual tests or patterns. Not all Web users can see clearly, and people with visual impairments can be shut out by image-only tests. For instance, an older person with an eye condition may find it hard to decipher a set of distorted words. Some CAPTCHA apps therefore offer audio tests, which present users with a series of spoken letters or numbers. Some of these recordings include background noise and others use a distorted voice, to help thwart voice recognition software.

Von Ahn has said that there are probably hundreds of different kinds of CAPTCHAs. One of the biggest and best-known CAPTCHA applications is reCAPTCHA, which is owned by Google. You’ve most probably encountered this one. This next evolution of CAPTCHA technology tries to guess whether a session was initiated by a human or a bot by examining behavior when pages load. When it gets suspicious, it presents one of two tests: the box where you need to “click here to prove you’re human,” or a visual puzzle based on Google Images. The visual puzzle requires you to click all the images that match a description, such as “Select all images with cars,” “Select all images that match this one,” or “Select all squares with street signs. If there are none, click skip.”

On average, people spend 9 seconds solving a reCAPTCHA, and 92% of them get it right. Every now and then, though, you may encounter a CAPTCHA with an image or sound so distorted that it becomes time-consuming to decipher (and you might get frustrated, saying “I just wanted to load that page!”). This is why many CAPTCHA applications offer an option to generate a new CAPTCHA and try again.
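The “new CAPTCHA and try again” option can be sketched in a few lines of Python. This example uses the simple math problem style of test mentioned at the start of the article. The class name `CaptchaSession` and its methods are hypothetical, made up for this illustration, and a real service would do considerably more.

```python
import secrets


class CaptchaSession:
    """Hypothetical per-visitor session holding one simple math challenge."""

    def __init__(self) -> None:
        self.question = ""
        self._answer = 0
        self.refresh()

    def refresh(self) -> str:
        """Throw away the current challenge and create a new one."""
        a = secrets.randbelow(9) + 1  # random number from 1 to 9
        b = secrets.randbelow(9) + 1
        self.question = f"What is {a} + {b}?"
        self._answer = a + b
        return self.question

    def submit(self, typed: str) -> bool:
        """Check an answer; a wrong answer automatically gets a new challenge."""
        try:
            ok = int(typed.strip()) == self._answer
        except ValueError:
            ok = False  # not a number at all
        if not ok:
            self.refresh()
        return ok
```

A visitor who can’t make out the challenge would trigger `refresh()` from a “try another one” button. Replacing the challenge after a wrong answer is a design choice in this sketch: it keeps a bot from guessing over and over at the same question.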

Who uses CAPTCHA on their sites?

One of the most common applications of CAPTCHA is protecting online polls. If votes aren’t filtered, the results can be rigged. One example happened in 1999, when Slashdot published a poll asking users to choose the graduate school with the best computer science program. Students from MIT and Carnegie Mellon created bots to vote repeatedly for their respective schools. Those two schools received thousands of votes, while the other universities got only a few hundred each. If bots can vote freely like this, we can never trust online poll results, and CAPTCHA works to prevent that.

Sign-up forms often use CAPTCHAs, especially for e-mail services like Gmail, Yahoo! Mail and Hotmail. Since these services are free, spammers might take advantage of them by using bots to create hundreds of spam mail accounts.

Ticket-selling websites like TicketMaster also use CAPTCHA apps, which help prevent ticket scalpers from buying tickets in bulk. Without CAPTCHA, a scalper can use a bot to place hundreds or thousands of ticket orders in a matter of seconds, leaving legitimate customers without tickets. Once the event sells out, the scalpers resell the tickets above face value. CAPTCHA can’t prevent scalping altogether, but it makes it difficult to scalp tickets on a large scale.

Message boards and contact forms that let visitors write to the Web administrators use CAPTCHA to prevent an avalanche of spam messages. It won’t stop a user from posting a rude or harassing message, but it can prevent bots from posting messages automatically.