James Parsons is the founder and CEO of Content Powered, a content creation company. He’s been a content marketer for over 10 years and writes for Forbes, Entrepreneur, Inc, and many other publications on blogging and website strategy.
The Importance of The XML Sitemap Priority and Changefreq Tags
We all know by now that the XML sitemap is a powerful tool for SEO. A sitemap is exactly that: a map of your site. It shows all of the unique destinations a visitor can arrive at, from your homepage to your most hidden public subpage. While the average user never looks at or even knows that a sitemap exists, Google certainly does.
Submitting your sitemap to Google via the Search Console is one of the first SEO steps to cross off your list after you’ve launched your site. This submission tells Google that your site exists, gives them a list of all of your public pages, and gives them a location to check to see when new content is posted or old content is updated.
Sitemaps are so important that the functionality is built into major SEO plugins like Yoast. Creating and distributing a sitemap is high on every SEO checklist. At the same time, however, it’s often considered a fire-and-forget tool. You create a sitemap, it’s updated automatically when you publish new content, you submit it to Google, and that’s it. You never have to look at it again except during audits, to make sure it still exists.
Most people, then, don’t think any deeper about their sitemaps. Fewer still bother to learn what all of the attributes tagged on each URL are. Two of those attributes (Priority and Changefreq) are interesting enough that I would like to talk about them today.
Let’s dig into it!
Sitemap Priority Attribute
The priority attribute is an optional attribute you can add to a sitemap. Google won’t penalize you for not having it; it’s simply a way to give Google a little bit more data about the URLs on your sitemap. So what does the priority tag do?
Priority, as the name implies, rates how important a piece of content is relative to the other pages on your site. It’s a decimal value between 0.0 and 1.0, usually written with a single digit after the decimal point. A priority of 0.0 indicates content that is low priority, not useful, or not kept up to date. A priority of 1.0 indicates that the content is important to your site. The scale typically looks a little like this:
- 0.0 – 0.3: Old news posts, outdated guides, irrelevant pages you nevertheless don’t want to delete, merge, or update.
- 0.4 – 0.7: Articles, blog posts, category pages, FAQs, system pages. The bulk of your site’s content falls into this range.
- 0.8 – 1.0: Extremely important content, such as your homepage, major category pages, product pages, and subdomain indexes.
It’s unclear how Google treats content that does not have a priority tag assigned to it, though the sitemap protocol itself defines 0.5 as the default. Given the scale, it’s reasonable to assume that most content sits at around 0.5, with older and less useful content slipping lower, and certain high-importance content assigned higher numbers.
Essentially, Google will assign priority levels to your content if you don’t list any values yourself. You can choose values of your own to give Google and other search engines an idea of what you consider important on your site.
Some people might be tempted to set the priority of every page on their site to 1.0, to try to gain some extra value out of Google. Unfortunately for those people, Google is much too smart to fall for this kind of trick, and will happily disregard the values that you set.
The general advice here is to set your primary pages to higher priority levels. Less important or less visible pages don’t need a high priority, and it may confuse search engines if you assign them one. Your homepage generally has the highest priority out of all of your pages.
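To make that advice concrete, here is a minimal sketch of how priority values could be written into a sitemap with Python’s standard library. The URLs and the priorities assigned to them are made-up examples, and the helper names (url_entry, build_sitemap) are hypothetical; your SEO plugin almost certainly does something more elaborate.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def url_entry(loc: str, priority: float) -> ET.Element:
    """Build one <url> element with <loc> and <priority> children."""
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0, got {priority}")
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # One digit after the decimal point, e.g. "0.5".
    ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


def build_sitemap(pages: dict[str, float]) -> str:
    """Serialize a {url: priority} mapping into sitemap XML."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in pages.items():
        urlset.append(url_entry(loc, priority))
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: homepage highest, ordinary post in the middle,
# an old archive page near the bottom of the scale.
EXAMPLE_PAGES = {
    "https://example.com/": 1.0,
    "https://example.com/blog/some-post": 0.5,
    "https://example.com/news/2012-archive": 0.2,
}

sitemap_xml = build_sitemap(EXAMPLE_PAGES)
print(sitemap_xml)
```

The range check is there because a value outside 0.0 to 1.0 isn’t valid under the sitemap protocol; everything else about which page gets which number is a judgment call.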
Sitemap Changefreq Attribute
The Changefreq attribute (which stands for change frequency) is another hint you can give Google (or used to be able to give; see below), this time to indicate how often a page’s content might change. A homepage could change every day. The homepage of a site like Forbes changes multiple times an hour. A higher Changefreq indicates that a page is likely to change more often. Conversely, a lower Changefreq indicates the page isn’t likely to change often, if at all.
Changefreq can have one of seven different values attached to it.
1. Always: This means the page is constantly changing with important, up-to-the-minute updates. A subreddit index page, a stock market data page, and the index page of a major news site might use this tag.
2. Hourly: The page is updated on an hourly basis, or thereabouts. Major news sites, weather sites, and active web forums might use this tag.
3. Daily: The page is updated with new content on average once a day. Small web forums, classified ad pages, daily newspapers, and daily blogs might use this tag for their homepage.
4. Weekly: The page is updated around once a week with new content. Product info pages with periodically updated pricing information, small blogs, and website directories might use this tag.
5. Monthly: The page is updated around once a month; maybe more, maybe less. Category pages, evergreen guides with updating information, and FAQs often use this tag.
6. Yearly: The page is rarely updated, but may receive updates once or twice a year. Many static pages, such as registration pages, About pages, and privacy policies fall into this category.
7. Never: The page is never going to be updated. Old blog entries, old news stories, and completely static pages fall into this category.
The attribute essentially lets Google know roughly how often they should be checking any given page to see if there are changes.
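If you wanted to pick one of those seven values automatically, a rough sketch might look like the following. In the sitemap file itself the values are written in lowercase. The cut-off points in suggest_changefreq are my own illustrative assumptions, not anything Google or the sitemap protocol prescribes, and both function names are hypothetical.

```python
# The seven values the sitemap protocol accepts, written in lowercase.
CHANGEFREQ_VALUES = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)

# (maximum days between updates, value to use). Assumed thresholds.
_THRESHOLDS = (
    (1 / 24, "hourly"),
    (1, "daily"),
    (7, "weekly"),
    (31, "monthly"),
    (366, "yearly"),
)


def normalize_changefreq(value: str) -> str:
    """Lowercase and validate a changefreq value."""
    cleaned = value.strip().lower()
    if cleaned not in CHANGEFREQ_VALUES:
        raise ValueError(f"not a valid changefreq value: {value!r}")
    return cleaned


def suggest_changefreq(days_between_updates: float) -> str:
    """Map an observed update interval to the closest realistic value.

    "always" is deliberately never suggested; it is reserved for pages
    that change on every request.
    """
    if days_between_updates <= 0:
        raise ValueError("days_between_updates must be positive")
    for limit, label in _THRESHOLDS:
        if days_between_updates <= limit:
            return label
    return "never"


print(suggest_changefreq(7))     # a blog that posts once a week
print(suggest_changefreq(1000))  # a page that hasn't changed in years
```

Feeding it the real gap between your updates, rather than the gap you wish you had, is the whole point.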
As far as I know, there are no penalties for using a higher frequency tag than is necessary for the pages on your website. If you tell Google that your About page changes daily, but you only change it about once per year, Google will simply stop checking when they realize the content hasn’t changed the last dozen times they’ve checked. They may also start to ignore that tag in your sitemap entirely.
For larger publications, like Forbes (mentioned above), hourly makes more sense. They want breaking news to be discovered quickly and their site has new content every hour.
Set your Changefreq to something realistic. Setting it too high will confuse search engines and may result in them crawling your site too aggressively, even when there aren’t any changes. If you publish a blog post per week, setting your blog homepage to weekly or daily makes sense, but if you set it to hourly, they’ll have to check your blog 168 times (24 hours × 7 days) before they see a difference in your page content.
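The arithmetic behind that number is simple enough to sketch. This assumes, purely for illustration, that a crawler follows your declared Changefreq to the letter, which real search engines don’t necessarily do; the function name is hypothetical.

```python
# Hours covered by one crawl at each declared frequency.
HOURS_PER_INTERVAL = {
    "hourly": 1,
    "daily": 24,
    "weekly": 24 * 7,
}


def checks_per_change(declared: str, hours_between_updates: float) -> float:
    """How many crawls happen per real content change, assuming the
    crawler visits exactly as often as the declared changefreq says."""
    return hours_between_updates / HOURS_PER_INTERVAL[declared]


ONE_WEEK_IN_HOURS = 24 * 7  # a blog that publishes once a week

for declared in ("hourly", "daily", "weekly"):
    print(declared, checks_per_change(declared, ONE_WEEK_IN_HOURS))
```

For a weekly blog, that works out to 168 checks per change at hourly, 7 at daily, and 1 at weekly.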
The Truth About Sitemap Attributes
The real truth is, you don’t have to use either one of these tags. It might not be a good idea to use them at all.
Some sites recommend using both of these tags to coerce Google into checking your site frequently enough to catch and index new content and new changes as soon as possible after they happen.
While this is an admirable goal, it’s simply not effective anymore, for two reasons.
1. First: Google doesn’t need this information anymore. One of the other sitemap attributes is the timestamp of the last time the content was updated. Google maintains, in their index, a list of your pages, their content, and the last time the content was indexed. If the last time they indexed the page was before the last time your sitemap says it was updated, they check it again. Simple, easy, and already implemented on every sitemap out there.
2. Second: Google doesn’t need your help. They’ve even said that many smaller sites may not need a sitemap at all. According to their developer section, you might not need a sitemap if:
- Your site is “small”. By small, we mean about 500 pages or less on your site. (Only pages that you think need to be in search results count toward this total.)
- You’re on a simple site hosting service like Blogger or Wix. If your site is on a service that helps you set up a site quickly with pre-formatted pages and navigation elements, your service might create a sitemap for your site automatically, and you don’t need to do anything. Search your service’s documentation for the word “sitemap” to see if a sitemap is generated automatically, or if they recommend creating your own (and if so, how to submit a sitemap on your hosting service).
- Your site is comprehensively linked internally. This means that Google can find all the important pages on your site by following links starting from the homepage.
- You don’t have many media files (video, image) or news pages that you need to appear in the index. Sitemaps can help Google find and understand video and image files, or news articles, on your site, if you want them to appear in Google Search results. If you don’t need these results to appear in Image, Video, or News results, you might not need a sitemap.
Of course, there’s no reason not to have a sitemap. It helps Google find pages that might have slipped through the cracks, and it absolutely won’t hurt you unless you’re trying to do some shenanigans with it.
Interestingly, I was going to list another reason. From what I heard and believed, Google no longer needs to use the Googlebot to index the web; they have Chrome, and can simply pull changed data from users who browse. Rand Fishkin ran a poll about whether or not people believed this was true back in 2017:
In your opinion, does Google’s Search Engine use Chrome data to discover and crawl web pages?
— Rand Fishkin (@randfish) March 11, 2017
Except, as it turns out, this isn’t exactly true. I know – I’m just as shocked as you are. I would assume the Google Panopticon would use every source of data it could for everything it could, especially since they removed their “don’t be evil” tagline from their site. And yet, during a test run by Perficient which sent users to a page that had no links, no presence in a sitemap, and no other connections but the users arriving, the tests didn’t trigger indexation. Google didn’t know about the pages even after 27 users visited them.
Now, it’s possible that Google has changed and does now use that data (since this test was run in 2017), and it’s possible that Google doesn’t index new pages, but can discover them, or can use data from users to identify changes in a page that was already indexed. It’s also possible that the volume of visitors was simply too small for Google to care; only 27 visitors in a spike and then nothing isn’t very compelling. None of these were tested. If anyone wants to test them (or work with me to test them), feel free to drop me a line!
Truth be told, all of this is circumstantial, but that’s fine. We also have direct word on this subject from John Mueller of Google.
Back in 2015, during one of his regular video hangouts, John was asked a question.
“Does priority and frequency matter in a sitemap? If not, how can we tell Google to crawl specific pages on daily or high priority?”
He answered by explaining that Google no longer pays much attention to those attributes.
“Priority and change frequency doesn’t play that much of a role with Sitemaps anymore.
This is something where we’ve tried various things but essentially, if you have a sitemap file and you are using it to tell us about the pages that were changed or updated, it is much better to just specify the time stamp directly so that we can look into our internal systems and say we haven’t crawled since this date, therefore, we should crawl again.
And just crawling daily doesn’t make much sense if your content doesn’t change. So that is something where we see a lot of sites that give us this information in the sitemap, they said it changes daily or weekly, and we look in our database and it hasn’t changed in a month or years…
So what I’d recommend is using the timestamp.”
In other words, these attributes increase the burden on webmasters to specify an attribute or pair of attributes for every page (manually, no less), which is always a bad thing. They can easily be falsified, wasting Google’s time until they learn better. And the timestamp method is both simpler and more effective.
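The timestamp comparison described above can be sketched in a few lines. To be clear, this is not how Google’s internal systems actually work; it’s a toy model, assuming a crawler keeps a simple record of when it last fetched each URL. The sitemap contents, the crawl dates, and the function names are all invented for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(text: str) -> datetime:
    """Parse a lastmod value, treating dates without a zone as UTC."""
    parsed = datetime.fromisoformat(text.strip())
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def urls_to_recrawl(sitemap_xml: str, last_crawled: dict[str, datetime]) -> list[str]:
    """Return URLs whose lastmod is newer than the last recorded crawl,
    plus any URL that has never been crawled at all."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc is None or lastmod is None:
            continue  # nothing to compare against
        crawled = last_crawled.get(loc.strip())
        if crawled is None or parse_lastmod(lastmod) > crawled:
            stale.append(loc.strip())
    return stale


EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-10</lastmod></url>
  <url><loc>https://example.com/about</loc><lastmod>2023-01-15</lastmod></url>
</urlset>"""

# Hypothetical crawl log: both pages were last fetched on 1 May 2024.
EXAMPLE_INDEX = {
    "https://example.com/": datetime(2024, 5, 1, tzinfo=timezone.utc),
    "https://example.com/about": datetime(2024, 5, 1, tzinfo=timezone.utc),
}

print(urls_to_recrawl(EXAMPLE_SITEMAP, EXAMPLE_INDEX))
```

Only the homepage comes back, because it’s the only page modified after the last crawl. No Priority or Changefreq value is needed to reach that decision.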
So, there you have it: the truth about the Priority and Changefreq attributes for sitemaps. They exist, and you can use them if you want, but they don’t matter as much as you’d think. There’s no reason to go out of your way to use them. There’s no reason to go out of your way to remove them if they’re added to your sitemap automatically.
Google isn’t the only search engine that checks your sitemap, so if you have your Changefreq values set too high, you could have dozens of search engines hammering your server and slowing it down. It’s still something you should tune, but it isn’t going to make or break your website.
The only thing I would watch out for is duplicate attributes in your sitemap entries. If, for example, you have two instances of Priority for each entry, you’re going to have an error in your sitemap parsing in the Search Console. Those errors can hurt your indexation, so it’s better to avoid the problem altogether.
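If you’d rather check for this yourself than wait for an error report, a small script can do it. This is a minimal sketch that assumes the sitemap is already loaded as a string and only looks at the four core tags; find_duplicate_tags is a hypothetical name, and the sample sitemap is made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CORE_TAGS = ("loc", "lastmod", "changefreq", "priority")


def find_duplicate_tags(sitemap_xml: str) -> dict[str, list[str]]:
    """Map each URL to the core tags that appear more than once in its entry."""
    root = ET.fromstring(sitemap_xml)
    problems = {}
    for url in root:
        # Count direct children by their namespace-qualified tag name.
        counts = Counter(child.tag for child in url)
        dupes = [
            name for name in CORE_TAGS
            if counts[f"{{{SITEMAP_NS}}}{name}"] > 1
        ]
        if dupes:
            loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="(missing loc)")
            problems[loc.strip()] = dupes
    return problems


EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <priority>1.0</priority>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <priority>0.5</priority>
  </url>
</urlset>"""

print(find_duplicate_tags(EXAMPLE_SITEMAP))
```

An empty result means none of the core tags are repeated within any single entry.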
What do you think? What has your experience been playing with these values? Any interesting findings that you’d like to share with us? Let us know in our comments section below – we reply to all comments and are looking forward to hearing from you!