Are WordPress Attachment and Tag Pages Killing Your Rankings?
WordPress generates a lot of little system pages and sub-pages, and they can show up in search results for various reasons, including accidental visibility caused by third-party plugins.
One such example that I’ve personally had to deal with is Yoast. Now, don’t get me wrong, Yoast is an excellent plugin. It does a ton of very beneficial things for your site and gives you a wealth of options for controlling your SEO data. It’s just that, now and then, something goes haywire, and it can hurt your SEO before you find it and fix the issue.
What happened to me (and millions of other users) was that attachment pages were suddenly added to my sitemap after a Yoast plugin update. See, when you upload a file to WordPress, that file is stored in your database much the same way a blog post is, as an entry with the "attachment" post type. The file is given its own little page on your site. Normally, that page is hidden, invisible to both users and search engines. It’s only there to host the file. Unfortunately, now and then the attachment post type will lose that invisibility, and Google will sweep through and index every one of those pages. Then, you’re left with hundreds or even thousands of thin, low-quality pages with very little content or duplicate content. We all know how much that can hurt.
To be clear, Yoast didn’t generate these pages; WordPress creates them by default. A Yoast update just accidentally changed a setting and added them to my sitemap, making them easy for Google to find and index, and that became a problem I had to solve. It’s a problem you’ll need to solve as well, either to fix it if it’s currently happening to you or to prevent it from happening in the future.
It’s not difficult to fix these issues - the hard part is determining if they’re happening.
So, what should you look for, and how can you fix it?
Look for Any Issues
The easiest way to identify whether or not you have this issue is to perform a Google site search and look for those kinds of pages. There are two ways to do this.
For smaller sites, go to Google and just do a blank site search. Type in "site:yourdomain.com" and see what comes up. If all you see are your main pages, your blog posts, your landing pages, and so on, you’re probably fine. If, on the other hand, you see a bunch of attachment pages that have very little metadata and seem to be centered around a single uploaded file, you’ve got a problem.
You might need to click through a few pages of results; obviously, your real content is going to rank better than your contentless pages. That’s why the second option exists.
Go to your site and find a post that was published relatively recently, and find an image you added to it. Look for the file name for the image, and search for that image, again with a site search. Something like "site:yourdomain.com image name or keyword". This will run a search on your site for all instances of the image name or keyword you specified, which should pull up the blog post the image is contained within. If it also pulls up a page with nothing but the image on it, you have a problem.
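Both searches are just the site: operator plus optional keywords, so they’re easy to script if you want to check several image names in a row. Here’s a minimal Python sketch of that idea. The helper name site_search_url, the example.com domain, and the image name are placeholders I made up for illustration, and the function only builds the search URL; it doesn’t fetch or parse any results.

```python
from urllib.parse import urlencode


def site_search_url(domain, keywords=""):
    """Build a Google search URL restricted to one domain.

    `domain` and `keywords` are placeholders; pass your own values.
    """
    query = f"site:{domain} {keywords}".strip()
    return "https://www.google.com/search?" + urlencode({"q": query})


# A blank site search, then a search for one image's file name.
print(site_search_url("example.com"))
print(site_search_url("example.com", "blue-widget-photo"))
```

Open the printed URLs in a browser and look for results that contain nothing but the image.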
You may have noticed that the title of this post refers to attachment pages and another kind of page, the tag pages.
Tag pages are similar, in that they’re system pages that generally have little content of their own and are often better hidden from the search engines. They serve the same kind of purpose as category pages; they’re navigation aids that show related content. The difference is in how many tags you’ll have compared to categories.
Tag pages can be more useful and relevant than attachment pages, but they might also be worth hiding. It depends on what you want to do with them. Some people decide to convert from tags to categories. Some people choose to boost up their tag pages. Some people find they don’t hurt as much as the attachment pages, and just leave them in place. The choice is yours, of course.
Personally, I like category pages more. In my experience, people don’t use the tags enough to justify their visibility, and categories work just as well. Every audience is different, though, so if your audience actually uses your tag pages, keep them around or beef up your tag pages to improve their layout, user experience, and value.
My advice is to pick one; use either tags or categories. In my opinion, they’re redundant. Blogs have pagination (page 1, 2, 3...), tag pages, and category pages, and they're all lower value pages that just list content that exists in other areas of your site.
Decide What to Do With the Pages
There are, in my view, three things you can do with tag or attachment pages. You can handle them differently depending on what your goals are.
The first option is to hide the pages from the search engines. Let’s say your blog has 100 posts on it, and every post has three images in it. That’s 100 pages of content on your site, plus a dozen for the homepage, landing pages, about page, and so on. It’s also 300 attachment pages. To Google, roughly 73% of your site (300 of 412 pages) is thin content. That’s not great!
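If you want to run the same arithmetic for your own site, here’s a quick Python sketch. It assumes every image produces exactly one attachment page and that "a dozen" means 12 other pages; the function name thin_page_share is just something I picked for the example.

```python
def thin_page_share(posts, images_per_post, other_pages):
    """Return the fraction of all pages that are attachment pages."""
    attachment_pages = posts * images_per_post
    total_pages = posts + other_pages + attachment_pages
    return attachment_pages / total_pages


# The example above: 100 posts, 3 images each, 12 other pages.
share = thin_page_share(posts=100, images_per_post=3, other_pages=12)
print(f"{share:.0%} of the site is attachment pages")  # prints 73%
```

Plug in your own post and image counts to see how lopsided your index could get.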
So, why not just hide the pages? The choice here is to go in and apply the meta noindex tag to all of your attachment pages. Noindex tells Google not to include those pages in the index, and is commonly used for system pages. It’s not going to hurt your site to do this, unless those pages were somehow drawing in positive value, which they almost definitely aren’t.
If all you want to do is noindex your attachment pages, it’s pretty easy. All you need to do is add a little bit of code to your theme’s PHP files. Go to your theme editor, find the header.php file, and open it up.
Add a short conditional to it, inside the head section and not in the middle of another code block: check WordPress’s is_attachment() function and, if it returns true, output a robots meta tag with its content set to noindex.
This should simply insert a noindex tag on any attachment page, without affecting other pages on your site. If this doesn’t work, you can try out a different code snippet in your functions.php file. Instructions for that are here.
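Whichever snippet you use, it’s worth confirming that the tag actually shows up on attachment pages and nowhere else. Below is a rough Python sketch of one way to check, using only the standard library. The class and function names are my own, and the two HTML strings are made-up stand-ins for the source of an attachment page and a normal post; in practice you’d paste in or download the real page source.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page sources for illustration only.
attachment_page = '<html><head><meta name="robots" content="noindex, follow" /></head></html>'
blog_post = "<html><head><title>A post</title></head></html>"
print(has_noindex(attachment_page), has_noindex(blog_post))
```

You want True for attachment pages and False for the posts and pages you expect to rank.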
Alternatively, you can approach the problem from Google’s end instead. Go to your Google Search Console account and find the URL Parameters page under the crawl settings. Google will show a list of parameters they’ve seen on your site, which can be things like utm_source or attachment_id. That second one is the one you’re looking for. Simply click on the edit button for that parameter and choose the option saying the parameter changes or narrows page content. In the box that appears, check "No URLs" and save your settings. This tells Google to ignore the pages on your site that use attachment_id in the URL. Keep in mind that Google retired the URL Parameters tool in 2022, so this route may no longer be available in your account.
The second option is to redirect the attachment pages to the pages the attachment is used in. This is the general best option. You don’t have to worry about accidentally adding a noindex tag to pages you want indexed, and you don’t have to worry about an update breaking your code.
Unfortunately, this is also the option that can get you in trouble if you’re relying on a plugin that breaks. This is what happened to me with Yoast. In Yoast, you can open the Search Appearance area and, under "Media", set the "redirect attachment URLs to the attachment itself" toggle to "Yes". This makes every attachment page redirect to the media file itself instead of serving a thin page. Quick, easy, simple, and effective.
There are other plugins you can use to do the same thing, if you don’t want to rely on something like Yoast not updating. Keeping Yoast up to date is important, but something like Attachment Pages Redirect is a simple fire-and-forget solution you won’t have to update unless WordPress changes the way their code works and breaks it.
The third option is to beef up the pages and make them more valuable to users. After all, just because the pages don’t have content on them by default doesn’t mean they can’t have more content on them, right?
This is most common with category or tag pages, where you might want to add a bunch of information about the category or tag in general, to introduce your perspectives on it and link to some of your more evergreen posts. Give your page a nice design, and try to use as few tags as possible. I see some blogs with (no exaggeration) 3,000 tags, which is way too many. If you have 10-20 tags that you use regularly, you aren’t going to be watering down your overall content quality too much with a small handful of pages.
Attachment pages, by default, typically have some information about the image. It’s the same information you plug in when you upload it; the alt text, the description, the photographer credits, and so forth. Generally this is useful for photographers sharing their portfolio work, but less useful for everyone else. You can, however, add more content to the image page. This can make it more relevant for Google image search, though it’s more work too.
I don’t really recommend doing this, because of the work involved for the minimal benefit you get out of it. The redirect is generally going to solve your problems and won’t take a lot of extra time.
There’s technically a fourth option, which is to go through and delete all the attachment pages. Sometimes, people who don’t really know what they’re doing come up with exotic solutions to easy problems. This is one such solution. You can go through your entire site and manually delete every attachment page WordPress has generated. The attachments still exist in your database, so you don’t lose the embedded information, but you will lose the other data attached to them. It works, but it’s not an ideal solution.
Make Sure to Remove Pages from Your Sitemap
One of the biggest problems with attachment pages isn’t that they exist, it’s that Google is told to check them frequently. Sometimes your site will generate a sitemap, and sometimes that sitemap will add the attachment pages to the list of pages on the site. It makes sense; they’re pages that exist and have content on them, so why wouldn’t they be on the sitemap?
Unfortunately, this is how Google finds those thousands of pages with little to no content on them and indexes them, making your site as a whole look worse. Simply removing those listings from the sitemap doesn’t make Google forget about them. When you implement your redirects or noindex tags, that’s when you need to go into your sitemap and remove the attachment entries.
It’s easy enough; XML is a pretty simple language. Just do a search for every page with the attachment_id parameter in the URL and remove the XML entry for that page, then reupload the sitemap. Google will find it soon enough and adjust their expectations accordingly.
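If your sitemap is a hand-maintained file with a lot of entries, you can script that search-and-remove step. Here’s a minimal Python sketch, assuming a standard sitemap that uses the sitemaps.org namespace and attachment URLs that carry the attachment_id parameter described above. Attachment pages with pretty permalinks won’t be caught by it. The function name and the example.com URLs are placeholders of mine.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def strip_attachment_urls(sitemap_xml):
    """Return the sitemap XML without entries whose URL has attachment_id."""
    ET.register_namespace("", SITEMAP_NS)  # keep output free of ns0: prefixes
    root = ET.fromstring(sitemap_xml)
    for url in root.findall(f"{{{SITEMAP_NS}}}url"):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="")
        if "attachment_id" in parse_qs(urlparse(loc.strip()).query):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


# A tiny made-up sitemap with one real post and one attachment page.
sample = (
    f'<urlset xmlns="{SITEMAP_NS}">'
    "<url><loc>https://example.com/my-post/</loc></url>"
    "<url><loc>https://example.com/?attachment_id=42</loc></url>"
    "</urlset>"
)
cleaned = strip_attachment_urls(sample)
print(cleaned)
```

Save the cleaned output as your new sitemap file, and keep a copy of the original in case you need to compare them later.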
More often than not, though, your plugin is adding your attachment pages to your sitemap for you, in which case you'll need to find those settings in your SEO plugin.
Make Sure Your SEO Plugin Isn’t Restoring the Pages
Depending on the SEO plugin you use, you might have to check to see if it’s changing your settings or restoring your pages when you don’t want it to. I mentioned already that I had Yoast do this to me once, though I suspect it was just messing up a configuration file when it updated.
Other SEO plugins that control your attachment pages can do the same thing. If they break or if they reset their settings, your attachment pages can become visible all at once, and you’ll have to fix the problem again. You can see an example in the above image, which is the All In One SEO Plugin.
All this really means is just keeping an eye on your sitemap and your site search results. If your traffic drops unexpectedly, if your rankings take a dip, or if your number of indexed pages skyrockets out of nowhere, take a look and see if these system pages are the problem.
Consider a 410 Gone Flag
If you aren’t redirecting your attachment pages, and you simply want them gone, you can consider implementing an HTTP status code called 410 GONE. We’re all familiar with the HTTP status code 404 Not Found, where a page is broken or missing. Google might keep a 404 page in their index for a while, waiting to see if the page comes back, and will eventually remove it if it doesn’t. A 410 is a more permanent and intentional directive.
It tells Google, "hey, this page is gone and it isn’t coming back, so you should remove it from the index," whereas a 404 page may just be temporarily broken and fixed soon. Google can take months to remove 404 pages from their index, but they tend to remove 410 pages much more quickly.
Adding this status code is pretty simple. You can follow the instructions on this StackExchange post to do it in a few minutes. Make sure to adjust the target pages away from the contact and about examples they use, though!
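If your server happens to run Apache with mod_alias enabled (an assumption on my part; your host may differ), one common way to return a 410 is a "Redirect gone" line per path in your .htaccess file. Typing hundreds of those by hand is tedious, so here’s a small Python sketch that generates them from a list of URLs. The function name and the example.com URLs are hypothetical. It only works for attachment pages that have their own path, because this directive matches on the URL path and ignores query strings like ?attachment_id=42.

```python
from urllib.parse import urlparse


def gone_rules(urls):
    """Return one Apache `Redirect gone` line per unique URL path."""
    paths = {urlparse(url).path for url in urls}
    paths.discard("")   # never emit a rule for a bare domain
    paths.discard("/")  # or for the homepage
    return [f"Redirect gone {path}" for path in sorted(paths)]


# Made-up attachment page URLs for illustration.
rules = gone_rules([
    "https://example.com/my-post/photo-1/",
    "https://example.com/my-post/photo-2/",
    "https://example.com/my-post/photo-1/",
])
print("\n".join(rules))
```

Review the generated lines before pasting them in, since a rule also matches any URL that starts with the same path.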
Alternatively, if you're using WordPress, you can check out the 410 for WordPress plugin. It hasn't been updated in a while, but there isn't a whole lot to the plugin so it shouldn't break anything. We're still using it and it works just fine.
Personally, I find the redirect to be the better option, and it doesn’t need a response code like this. A 410 GONE is only useful if you want to delete a page and make sure it is removed promptly from search engines.
Submit Your URLs for Removal
When you remove pages from visibility, Google’s index won’t reflect that immediately. They have a bit of delay to make sure that the page isn’t coming back. If you want to make sure they are removed faster, you can use Google’s "remove outdated content" tool. You can find this tool in the search console here. Simply plug in a list of URLs you want removed, and Google will remove them.
There’s not really a huge reason to do this unless you’re on a time crunch where every minute of lost time matters. If that’s the case, go right ahead. Google will eventually – within a few days, usually – remove content it can no longer find, though, so it’s not a huge priority if you don’t want to use the tool.
Have you had to deal with attachment or tag pages showing up in the index when you don’t want them to? What caused the issue in your case, and how did you deal with it?