Yes, But How Does It Know? How AI Is Used to Detect AI-Generated Content

There is an old joke about the invention of the vacuum flask, what we commonly refer to as the Thermos. It involved a student who presented the assignment on what they thought was the most important scientific invention ever. The student was of an ethnic group of the joke tellers’ choice. Since I am relating the joke, I will use the ethnic version of my name, Teodor.

Teodor was presenting THE MOST IMPORTANT SCIENTIFIC INVENTION EVER to his science class. Teodor chose the Thermos. He presented the miracle of this invention to the class, going over the history of the Thermos and how it would keep hot things hot and cold things cold.

After the presentation, the teacher and class were unimpressed.

“What’s the big deal?” someone asked.

“Yeah!” someone else chimed in. “All it does is keep hot things hot and cold things cold. That’s no big deal!”

Teodor was taken aback by this reaction.

“Ha! ‘All’ it does is keep hot things hot and cold things cold, but how does it know..?”

That gets us to the point of the blog post. Shortly after the introduction of ChatGPT and other AI writing tools, AI tools to check if something was written by AI were introduced. (It may have been the same afternoon.)  How does an AI tool “know” something is AI generated?

Here are some of the ways AI can “tell” text is written by AI.

Predictability

Yet another personal anecdote.

I read a lot of books. My wife and I enjoy watching movies and series on streaming services. She often tells me to STOP telling her what is going to happen next. How do I know? One way is Chekhov’s gun. Simply stated, if a gun is hanging above a door in Act 1 of a three-act play, it should be used in the third act. Read enough books and watch enough movies, and you see the plot and where it is going. AI-generated text tends to be predictable in the same way.

Perplexity and Burstiness

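That’s one way AI can tell AI. How predictable is the text? Are there any surprises? Not just what the text says, but how predictable are the sentence structure and length? These are known as perplexity and burstiness. Perplexity measures how surprising the word choices are, and burstiness measures how much the sentence structure and length vary.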
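For the curious, here is a toy sketch of the perplexity idea in Python. It is an illustration, not how GPTZero or any real checker works: real tools score text with a large language model, while this stand-in uses simple word counts from a reference text. The function names and the add-one smoothing are my own assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def unigram_perplexity(reference, sample):
    """Perplexity of `sample` under a word-count model built from `reference`.

    Assumption: add-one smoothing, so unseen words get a small nonzero probability.
    Lower values mean the sample was easier to predict.
    """
    tokens = tokenize(sample)
    if not tokens:
        raise ValueError("sample must contain at least one word")

    ref_counts = Counter(tokenize(reference))
    vocab_size = len(set(ref_counts) | set(tokens))
    total = sum(ref_counts.values())

    log_prob = 0.0
    for tok in tokens:
        p = (ref_counts[tok] + 1) / (total + vocab_size)
        log_prob += math.log(p)

    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob / len(tokens))


if __name__ == "__main__":
    reference = "the cat sat on the mat the cat sat on the mat"
    print(unigram_perplexity(reference, "the cat sat on the mat"))
    print(unigram_perplexity(reference, "purple thermos gulag archipelago"))
```

The first sample reuses words the model has already seen, so it scores lower (more predictable) than the second, which is full of surprises.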


AI tends to write sentences of similar length. You won’t find it writing John Galt’s speech from Ayn Rand’s novel Atlas Shrugged, where the longest sentence contains about 2,600 words. Check some AI-written content yourself. You are unlikely to find any long, intricate arguments.
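Sentence-length variation is easy to measure yourself. The sketch below uses one possible definition of burstiness, the standard deviation of sentence lengths divided by their mean. That definition, the function names, and the naive sentence splitter are assumptions for illustration; detection tools may define burstiness differently.

```python
import re
import statistics


def sentence_lengths(text):
    """Split on ., ! and ? (a naive splitter) and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Assumed definition: standard deviation of sentence lengths divided by their mean.

    0.0 means every sentence has the same length; larger values mean more variation.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


if __name__ == "__main__":
    uniform = "The cat sat down. The dog ran off. The bird flew up."
    varied = (
        "Stop. I told my wife exactly what would happen next in the movie, "
        "and she was not amused. Again."
    )
    print(burstiness(uniform))
    print(burstiness(varied))
```

Text where every sentence is the same length scores zero. Mix one-word sentences with long ones and the score climbs.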

Randomness and Creativity

Like this blog! This is certainly random and may have pushed the boundaries of creativity. AI doesn’t have a wife to watch movies with to relate humorous anecdotes in a blog post. Probably not, anyway.

AI-written content can also be repetitive, like talking to an aging relative. Telling the same stories over and over again. “Okay Grandma, I gotta go…”
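Repetition can be roughed out in a few lines, too. This sketch counts how many three-word sequences in a text occur more than once. The metric and the name `repeated_ngram_ratio` are hypothetical, chosen for illustration rather than taken from any particular checker.

```python
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Fraction of n-word sequences in `text` that occur more than once.

    0.0 means no sequence repeats; values near 1.0 mean the text keeps
    telling the same story.
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Count every occurrence of any sequence that shows up at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


if __name__ == "__main__":
    print(repeated_ngram_ratio("okay grandma i gotta go okay grandma i gotta go"))
    print(repeated_ngram_ratio("the thermos keeps hot things hot"))
```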

Context and Consistency

AI-written content may make sense in small portions but can lack consistency on a large scale. It can lack a deeper understanding of a topic. Like reading the CliffsNotes to The Gulag Archipelago. You really won’t get the context.

Errors

Sentences that are just plain wrong. We all have seen images created by AI. “Something just ain’t right.” Hands with eight fingers, misplaced ears, and a lack of reflection in shiny surfaces. Disturbing misspellings. The same is true for the written word. AI can make mistakes that a human would catch before publishing. The longer the content, the more likely that errors will occur.

AI is a great tool for researching and drafting content, but don’t fall into the trap of copying and pasting it directly into a blog post. Be creative! That’s what you are being paid for – your brain. Use it!

The important thing to remember is that AI knows AI. Detection tools are trained on large amounts of both human-written and AI-generated text, so they learn to recognize the patterns that machines tend to leave behind.

I ran this blog through an AI checker, GPTZero, and this is the result. (I must be 7% AI, myself!)

[Image: pie chart of the GPTZero result]

Does Google rank AI-generated content lower (or higher)?

That’s the name of the game, right? Well, Google uses AI to create content for its own products, so it would be more than a little nasty if it pushed AI-generated content down. Conversely, if it ranked AI-generated content higher, no one would write original content about anything, ever again. 

What Google, Bing, and other search engines care about is the quality of the content. Quality is measured in terms of its relevance and usefulness. The same rules for human-generated content apply to AI. It needs to follow the Google Search Guidelines (or Bing Webmaster Guidelines).

What matters:

  • E-E-A-T. Experience. Expertise. Authoritativeness. Trustworthiness.
  • Presentation. How is the content presented and experienced? Is it readable? How fast does the page load? Is the page navigable? What is the mobile experience?
  • Engagement. Is the content share-worthy?
  • Freshness. Content has an expiration date! Whatever you wrote three years ago will not be ranked as high as something newer. It can’t be written back when Pluto was still a planet.
  • Originality. Do something creative, for goodness’ sake! This is where AI will fall down. Do some brainstorming and generate some new ideas and new ways of saying something that has been said before.

If you can accomplish the above and more with AI, go for it. But remember, fortune (and Google) favor the bold.
