Can You Get Penalized for AI Content? (With Cleanup Tips)
AI content generators are sweeping through all sorts of industries. Bloggers are using them to create AI-written content. Some companies are using them to replace artists and graphic designers. The SFF magazine Clarkesworld closed submissions due to a flood of AI-generated content. It's a sea change that threatens to upend the entire content creation industry and leave millions of content creators and artists struggling more than they already are.
So, of course, thousands of businesses are looking to invest whole-hog into it. It's powerful, cost-effective, and frankly, it's impressive.
Less than a year ago, Google's John Mueller used strong language against AI-generated content, calling it "spam."
Then, Google quietly removed their section about it and replaced it with language about low-quality SEO content while releasing their own AI with a bumpy launch.
With so much uncertainty, some people are getting nervous. What will they do if they're using an artificial intelligence to generate blog content, but Google comes along and says they won't allow that content to rank? Will there be an AI penalty? Will their site be deindexed? How can they recover?
The convenience of AI is seductive, and I can't blame anyone for falling for it, especially with the vague and contradictory information in the SEO community.
We're going to see a lot more of this in the coming years. Let's analyze the situation.
Is There an AI-Generated Content Penalty?
First, you can (for now) put your fears to rest.
While Google has penalties against low-quality content, which can include AI-generated and other automatically generated content, there is no penalty for using AI in itself.
The trouble is AI generation has a lot of issues as it currently stands, and those issues mean the content you generate using a tool like Jasper AI or ChatGPT can fall squarely into the automatically generated, over-optimized, or low-quality content buckets.
When I asked ChatGPT if it can replace human content writers, it told me this:
Here are a few more issues with it:
- AI tends to generate broadly surface-level content that offers nothing new or exciting to bring to a discussion. That's the sort of thin, fluffy content that Google doesn't like to see ranking highly in their results.
- AI also tends to produce content that looks very similar to other content it has produced. While it's all technically unique – it passes Copyscape checks, if nothing else – it's going to, more and more, create a background noise of same-feeling content across the board. It will take someone who knows how the artificial intelligence works to get it to generate content in varying styles and compositions to make it more unique, and a lot of the businesses at risk of penalties aren't going to be the ones doing that.
- One of the most significant issues with AI content is that it doesn't have a fact-checking ability. AI can confidently tell you completely incorrect statements, and there's no way you can know the difference if you aren't checking it all yourself.
- Remember, a considerable element of Google's SERPs these days is E-A-T – Expertise, Authoritativeness, and Trustworthiness. AI doesn't have the expertise, it doesn't have authority, and it doesn't have trust. You need external verification and fact-checking for everything it says, and you need to attach your name and reputation to it. Attaching your name, face, and credentials to content that was generated with AI isn't a good idea. The writing style, accuracy, and tone will be all over the place, and it will send mixed messages to both search engines and visitors.
AI content can get you in trouble with Google in three ways: by saying false things that you publish uncritically, by attempting to over-optimize and outmaneuver the search algorithms to game the system, and by producing fluffy, surface-level writing that comes across as thin content.
Can Google Tell If Your Content is AI-Generated?
This one is a tricky question to answer.
In the past, it was a definitive yes because AI was very primitive, and it was trivial for Google to look at its massive index, compare content to itself and other content, and reverse-engineer the patterns behind it.
These days, though, it's a lot murkier.
Detecting AI-generated content is a huge issue. Multiple instances of students using AI to do their homework for them have led OpenAI, the company behind the leading AI content tool ChatGPT, to release its own detection tool based on its intimate knowledge of how its own system works. You can bet Google's engineers are looking very closely at that tool in developing their own versions.
Currently, there's no indication that Google is detecting and penalizing AI content. You'd know when they roll out a major algorithm update to address it, and I'll be covering the topic with great interest when it rolls around.
Whether Google can detect AI-generated content is a separate question from what their core search algorithm does with it.
Will Google Penalize and Detect AI-Generated Content?
This one is an even trickier question.
Honestly? I don't think so.
They've been up-front in talking about it publicly. Danny Sullivan said this:
"We haven't said AI content is bad. We've said, pretty clearly, content written primarily for search engines rather than humans is the issue. That's what we're focused on. If someone fires up 100 humans to write content just to rank, or fires up a spinner, or an AI, same issue.
I'm not sure if AI is how we would go about making titles, headlines, or descriptions, but I do think it's a great way to gather ideas or inspiration. We would strongly discourage blindly using AI as a means to 'make your job easier.'"
Using AI, machine learning, or even just automatic systems to streamline parts of your content creation process is fine. It's always been fine:
- Using something like Clearscope or MarketMuse to give you machine-driven suggestions on SEO tips to use is perfectly acceptable.
- Using a WordPress plugin to automatically generate image alt text based on the image file name has always been fine.
- Look at directories like Yelp that rank highly for millions of local results; those pages are entirely auto-generated content pages, but those are still helpful content pages to people searching for local results.
Meanwhile, using an article spinner or a bot to whip up a few hundred or thousand variations on different blog posts with different keywords infused throughout them isn't acceptable and never has been. It doesn't matter how sophisticated the bot is or how natural its output looks.
I believe that Google will learn how to identify AI-generated content and include it as part of its content quality analysis. I don't believe it will be a ranking factor on its own; rather, it will be rolled into something like Panda as a content quality evaluation factor, and probably a negative one.
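Again, Google doesn't care if you use AI generators to help you create content, as long as that content is high quality, factually correct, properly sourced, and written for people rather than search engines.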
Let's face it, most people using artificial intelligence content generators aren't doing those things. After all, AI is supposed to save you time and do the work for you; how many people are putting hard work into ensuring their AI content is high quality, citing correct information, linking to sources, and so on? That's a lot of effort.
I don't believe there will be an "AI Content Detected" Google penalty in the near future. I think Google will raise the stakes for what constitutes high-quality content and that spitting out AI-created content without significant work to improve it will be the new cutoff for what you need to exceed to rank.
You Published a Lot of AI Content: How Can You Recover?
Suppose you're a company or blogger, and you published a lot of content generated by AIs.
In that case, you might wonder how you can recover from any Google search penalties, algorithmic hits, a significant drop in search rankings, or even future-proof your website to be resilient.
In other words: how can you clean up this mess?
I'm not judging you. Plenty of people use shady techniques to get a site off the ground these days, whether PBNs, spun content, link exchanges, or whatever. They use these techniques to get off the ground, then revamp and disavow them when they reach a tipping point. It's shady, but if it works, it works. Until it doesn't.
Google has also been rolling out several versions of their latest updates, the Helpful Content Updates. These hit a lot of AI-generated content sites, not because their content was AI-generated, but because it wasn't helpful and it was low-effort. I wrote a guide on this and how to recover if you were hit by the Helpful Content Update here.
Using AI-generated content might be fine for now, but you might want to review it to ensure it doesn't come back to bite you. That means cleaning it up. That means a full content audit.
1. Make a List of All AI-Generated Content
You probably have a decent idea of which posts on your site were created using AI tools. If you don't, you'll want to make a list of all of the articles you've published since the AI tools started to gain traction, which could be a lot, depending on your publication schedule.
It's better to cast a wide net or even do a full site content audit rather than just an AI content audit.
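If your site is large, a short script can build that first list for you. The following is only a minimal sketch: it assumes you can export your posts with a title and a publish date, and the `posts` data, the field names, and the cutoff date are all hypothetical placeholders you would swap for your own.

```python
from datetime import date

# Hypothetical cutoff: use the date you (or your writers) started using AI tools.
AI_TOOLS_CUTOFF = date(2022, 11, 30)

def shortlist_for_audit(posts, cutoff=AI_TOOLS_CUTOFF):
    """Return posts published on or after the cutoff, oldest first."""
    candidates = [p for p in posts if p["published"] >= cutoff]
    return sorted(candidates, key=lambda p: p["published"])

# Made-up example data standing in for an export from your CMS.
posts = [
    {"title": "Old evergreen guide", "published": date(2021, 5, 3)},
    {"title": "Quick listicle", "published": date(2023, 1, 15)},
    {"title": "Product roundup", "published": date(2022, 12, 9)},
]

audit_list = shortlist_for_audit(posts)
for post in audit_list:
    print(post["published"], post["title"])
```

Setting the cutoff earlier than you think you need is the scripted version of casting a wide net.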
2. Test Content Against Checking Tools
You can run your content through several different tools to see if it would trip flags. Copyscape can't detect AI web pages (at least not yet; they might release their own AI checker eventually), but other tools are available. Harvard helped build GLTR for a previous generation of language models, and it can show you how obvious or non-obvious your content may be. OpenAI is looking into "watermarks" for ChatGPT's output, and there's a growing industry of companies releasing tools to detect AI with varying levels of success.
"What about false positives?" you may ask.
Well, ask yourself this: if a human wrote content that is indistinguishable from GPT-3 AI-generated content, shouldn't you fix that too?
3. Determine the Right Move for Each Post
Any post that passes your checks and includes unique, valuable, well-written, and well-developed information can be kept as-is.
Any post that doesn't pass the checks needs some kind of action.
You generally have three options.
- You can keep the article as-is and hope it doesn't hurt you down the line.
- You can edit the post and improve it to something that no longer resembles AI content.
- You can remove the post entirely under the assumption that it isn't worth keeping.
These guidelines will be pretty contextual.
For example, if you're creating a reference document or glossary with fixed information – that is, facts that don't change – it might be flagged as AI content simply because the AI would output the same thing. That kind of content is fine to keep since there's no natural way to alter hard facts to be more "unique."
On the other hand, if you publish an opinion piece and that piece is flagged as AI content, you will have a lot more trouble with it.
Another thing I recommend is looking very, very closely at any of these posts and making sure that any claim, statement, or fact it cites is actually true. Remember, AI doesn't have fact-checking in its design yet.
4. Nuke Bad Posts, Improve Good Posts
Take action! Decide what you're going to do, and put it into practice. Improving low-quality AI-written posts won't be easy, but a talented writer can take a piece of content and write something better based on it, so you can do it eventually.
You just might need to pay for the services of a good writer or editor. Sort your articles into two categories: some posts won't be worth saving, and others target topics with substantial traffic and have the potential to perform well if they're written properly.
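If it helps to make the sorting explicit, the decisions from steps 3 and 4 can be written down as a small rule. This is just one possible sketch: the function name, its four yes/no inputs, and the order of the rules are my own assumptions for illustration, and you should adjust them to your own judgment.

```python
def decide_action(passes_checks, facts_verified, is_fixed_reference, has_traffic_potential):
    """Map the audit answers for one post to 'keep', 'improve', or 'remove'."""
    if not facts_verified:
        # Unverified claims need attention no matter how the post reads.
        return "improve" if has_traffic_potential else "remove"
    if passes_checks or is_fixed_reference:
        # Glossaries and other fixed-fact pages can stay even if a detector flags them.
        return "keep"
    # Flagged posts are only worth rewriting when the topic can pay off.
    return "improve" if has_traffic_potential else "remove"

# Example: a flagged opinion piece on a topic with real search traffic.
print(decide_action(passes_checks=False, facts_verified=True,
                    is_fixed_reference=False, has_traffic_potential=True))
```

The "keep it and hope" option is deliberately left out of this sketch; if you choose that route for a post, it's a judgment call rather than something a rule should decide for you.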
How to Use AI Writing Tools Properly
I'm pessimistic about AI-generated content, but I admit it has a few uses. You just need to use it properly and not in an exploitative way.
- DO: Use AI to brainstorm ideas and ways of approaching a topic you might not have thought of.
- DON'T: Use AI to generate whole blog posts for you.
- DO: Use AI to give you an idea of an outline or structure for new content.
- DON'T: Use AI to populate that entire outline.
- DO: Use AI to speed up content writing, like basic descriptions, ad groups, and simple, factual content that you can verify.
- DON'T: Use AI to draw conclusions for you.
- DO: Hire a good content marketing company instead of using the same AI tools everyone is going to use and end up with bland content that looks like everything else out there and performs poorly.
AI content generation tools and new technologies are going to get better and better over time. I'm equally sure that, without massive improvements to the underlying technology, we're not going to reach a point where AI can deliver factual accuracy, logical thinking, and sound conclusions without a lot of human oversight.
The way that AI content generation works is purely math and word association, and that's still a long way from how the human brain works.
Put those human brains to work!