Posted: June 11, 2020

Historically, CAPTCHA (“Completely Automated Public Turing test to tell Computers and Humans Apart”) challenges have been used to distinguish human web-browsing activities from automated “robot” activity.  To combat increasingly sophisticated bots, CAPTCHA developers have had to continually increase the complexity of these challenges to prevent them from being bypassed.  Over time, these challenges have become frustratingly difficult for humans to solve, while simultaneously, bots have become sophisticated enough that they can often bypass these challenges by emulating human behavior.  We have reached a point in the advancement of technology where CAPTCHA challenges are no longer effectively serving the purpose for which they were created.  They are blocking humans, while sophisticated bots can effectively pass through.  And at this juncture, increasing the complexity of these challenges will only further exacerbate the problem.  In short, CAPTCHA (at least as a preventative control), has outlived its usefulness.  One might also consider Google’s deployment of reCAPTCHA v3 (the latest version) as a silent concession of this fact – given that v3 no longer uses traditional CAPTCHA challenges as a preventative control, but instead, silently evaluates the likelihood of bot activity (in a best-effort fashion), and then encourages app developers to implement adaptive logic based on their own defined thresholds.  This article will address some of the reasons why traditional CAPTCHA challenges are becoming an obsolete control for combatting the ever-increasing threat of bot activity, and alternative ways developers can use adaptive logic to continue to protect their applications and users.

A Brief History of CAPTCHA

In the early 2000s, CAPTCHA technology was developed with the intention of stopping automated spam activity.  Early CAPTCHAs were plagued with design and implementation flaws that often made them easy to bypass (such as inclusion of the answer in the HTML source code, client-side-only enforcement, etc.).  In 2007, researchers at Carnegie Mellon University created the first version of reCAPTCHA, which Google acquired in 2009 and which quickly became the de facto source for quality CAPTCHA software.  The first version of reCAPTCHA required the user to solve word/character identification problems.  These words were presented with visual modifications that kept them easy for people to recognize, but more difficult for a computer system to identify.

Initially, this seemed to effectively solve the problem.  Though some well-funded criminal organizations managed to circumvent these controls by outsourcing the work to cheap human labor (so-called “CAPTCHA farms”), there were few readily accessible options to programmatically bypass them.  However, with ever-increasing technological advancements, the years that followed saw a fast-paced game of “cat and mouse” between service providers and cyber-criminals.

Early breakthroughs in OCR (“Optical Character Recognition”) machine learning technology began making it easier for automated systems to analyze images of text and interpret them.  As OCR technology improved, the words and character sequences in CAPTCHA challenges became more complex and more visually distorted – to such an extent that they became difficult even for humans to solve.

Eventually, image challenges were also introduced, but these were often confusing as well, due to ambiguity and questions of semantics.  Consider the common CAPTCHA request to “Select all squares with traffic lights” – with no clarification as to what exactly is implied by the phrase “with traffic lights”.  Is it referring to the entire structure (including the pole)?  Is it just referring to the lights themselves, or the encasements which hold the lights?  Does the entire light encasement structure need to be inside the square for it to be considered “with traffic lights”?  Ironically, it was the strictly human tendency to over-think problems that actually made it more difficult for people to prove they were human.

CAPTCHA challenges grew so notoriously difficult that they became a long-running joke on the Internet.  People’s general annoyance with CAPTCHAs has inspired countless viral memes and other satirical content, including an entire series from cartoonist Mark Parisi (which is definitely worth checking out).

Finally, Google resurfaced with a new solution called “reCAPTCHA v2”.  This second iteration was intended to accomplish the same task (determine whether an interaction was initiated by a human or a robot – and then deny robots access), but would do so by simply presenting an interactive checkbox.  Only if the service suspected foul play would it require the user to solve the now-infamous CAPTCHA challenges.

And while reCAPTCHA v2 was a great advancement in terms of balancing the user experience against the effectiveness of the control, it still shared one fundamental flaw with the first version.  Because both versions 1 and 2 were designed to operate as preventative controls (to stop bots from interacting with the service), they both attempted to make a strictly binary evaluation, classifying web interactions as either robot or human – with no ground in between.  This strict binary classification required the CAPTCHA software to define an unwavering line of separation between perceived bot and human activity.  While at face value this may seem like an easy distinction, the nature of modern web browsing has made it anything but.

Not Quite Human / Not Quite Robot

The truth is, our online identities are far more complex than strictly human.  The “thing” that is interacting with websites online is a confluence of human (yourself) and non-human elements, including:

  • Your Device – The device you use to access the Internet becomes an inherent extension of your online identity, as it is constantly communicating with and interacting with other systems on the Internet (even when you are not browsing the web or actively using applications). Some users may also root or jailbreak their devices and implement system-level modifications that change how the device interacts with web servers online.  Additionally, the unique configuration of your device can alter the way that it interacts with other systems online.  For example, device profiles, such as those commonly installed by MDM (“Mobile Device Management”) solutions, define how your device will interact.
  • Your Client – When you browse the Internet with a web client (such as a web browser), it is easy to forget that a significant amount of the network interaction happens “under the hood”, handled by automated processes executed on your behalf. Internet users also frequently install web-browser plug-ins or extensions, which manipulate and sometimes explicitly automate tasks for users.  Well-known examples include extensions that automatically compare prices as you shop on online retail sites, or automatically find coupon codes for whatever you are considering purchasing.  Additionally, infections from malware, “Man-in-the-Browser” attacks, or third-party browser modifications may also change the way your browser interacts.
  • Your Network – Network infrastructure sitting between your device and the web server may also modify the interactions between the two. Common examples include VPNs and routing technology which can define how data is transmitted, and web proxy services that can manipulate connections and/or the contents of those communications.

This combination of yourself and all of these other non-human elements (augmenting technologies) makes your online interactions not explicitly human and not explicitly robot, but rather a confluence of the two.  And this increasingly large “grey area” makes it difficult for reCAPTCHA v2 to make the appropriate classification in all cases.  As such, it is not uncommon for legitimate users to get flagged for further inspection…where they once again have to suffer through a gauntlet of infuriatingly difficult puzzle challenges.

The Bots Can Pretend to Be Human

While CAPTCHA challenges have continued to be a source of frustration for humans, bot developers are continuing to find more and more creative ways to get past these challenges.  Among these, the most common techniques for bypassing CAPTCHA challenges include:

  • Failure to Enforce on the Server Side – Some implementations of CAPTCHA “fail open”. That is to say, if a bot strips the CAPTCHA-related callbacks from the client-side source code before interacting with the site, it can effectively eliminate the need to even solve the CAPTCHA. Such implementations undermine the entire process by making validation discretionary for bots, but mandatory for (non-technically-sophisticated) humans.
  • Machine Learning Image Classification – There have been multiple proof-of-concepts over the years to demonstrate how the power of machine learning image classification algorithms can be used to automate the solution of CAPTCHA challenges.
  • Audio Speech Recognition – reCAPTCHA v2 also has an audio puzzle that can be used as an alternative to the visual challenges, for users who may be visually impaired. Multiple proof-of-concepts have demonstrated how speech recognition software can be used to consistently solve these challenges as well.  A recently documented proof-of-concept can be found on the Sociosploit blog.
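The first weakness above is entirely avoidable.  The `siteverify` endpoint below is Google’s documented server-side verification API; the helper functions and their names are illustrative – a minimal sketch of “fail closed” enforcement, not production code:

```python
import json
import urllib.parse
import urllib.request

# Google's published server-side verification endpoint
SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"

def verify_token(secret_key: str, token: str) -> dict:
    """POST the client-supplied reCAPTCHA token to Google's
    siteverify endpoint and return the parsed JSON result."""
    data = urllib.parse.urlencode(
        {"secret": secret_key, "response": token}
    ).encode()
    with urllib.request.urlopen(SITEVERIFY_URL, data=data) as resp:
        return json.load(resp)

def is_valid_submission(verification: dict) -> bool:
    """Server-side enforcement: only proceed when Google has
    confirmed the token.  A stripped or missing token never passes."""
    return verification.get("success") is True
```

Because the decision is made on the server, a bot that simply removes the CAPTCHA widget (or its callbacks) from the page never obtains a valid token, and the request is rejected rather than waved through.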

Shifting the Paradigm

With the release of reCAPTCHA v3, Google has once again demonstrated its ability to adapt to an ever-evolving challenge.  reCAPTCHA v3 has abandoned the binary classification of incoming communications (as “bot” or “not”) in favor of evaluation on a spectrum.  It has also solved the problem of detrimental user experiences by eliminating CAPTCHA challenges altogether.  Never again will you need to decipher what closely resembles italicized hieroglyphics.  This new iteration does away with the “gate-keeper” concept and no longer attempts to explicitly determine whether the user is a human or a machine.  Instead, developers can submit any interaction on their site for analysis, and the reCAPTCHA v3 service returns a score between 0.0 and 1.0 that defines the likelihood of bot activity (with lower scores indicating likely bots).  This non-binary scale accounts for the modern complexities of technologically augmented web browsing.  It also discards the previously held “one size fits all” fallacy by allowing developers to determine, based on their application’s unique risk profile, what actions should be taken at their own defined thresholds.  But most importantly, this new solution can enhance an organization’s security monitoring capabilities by providing a new and useful dataset to correlate against other activities observed on the perimeter.
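The threshold-driven adaptive logic described above might be sketched as follows.  The threshold values and action names here are illustrative assumptions, not Google recommendations – per Google’s documentation, a score near 1.0 indicates a very likely legitimate interaction, and a score near 0.0 a very likely bot:

```python
def choose_action(score: float,
                  allow_threshold: float = 0.7,
                  challenge_threshold: float = 0.3) -> str:
    """Map a reCAPTCHA v3 score (1.0 = very likely human,
    0.0 = very likely a bot) to an application-defined response.

    The threshold values are placeholders; each application should
    tune them to its own risk profile and observed traffic."""
    if score >= allow_threshold:
        return "allow"      # treat as human; proceed normally
    if score >= challenge_threshold:
        return "step-up"    # e.g. require MFA or email verification
    return "block"          # deny or rate-limit, and log the event
```

A login form might “step up” to multi-factor authentication on a middling score, while a comment form might silently queue the submission for moderation – the same score, different actions, each tuned to the risk of the interaction.  Logging every score alongside the action taken also yields the correlation dataset mentioned above.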

With ongoing negative user experiences (even with the “easy on humans” approach of reCAPTCHA v2) and the continual discovery of new workarounds by bots…it was apparent that a paradigm shift was once again needed.  And this is precisely what reCAPTCHA v3 accomplished!  reCAPTCHA v3 seems less like a new version of reCAPTCHA and more like an entirely new service.  But given the modern complexities of bot warfare, it is exactly the solution that we needed (…at least for now).


This blog was written by Justin Hutchens, Consulting Services Practice Lead at Set Solutions.