What a sitemap is and how to create one

A sitemap is one of the main files used to help search engines crawl a website. It lists some or all of the site’s URLs, so Googlebot and other crawlers can discover them from a single place. A sitemap is especially useful on larger websites, although smaller sites can also benefit from having one.

What an XML sitemap is and what it is used for

An XML sitemap is a file that contains information about the content on a website, such as pages, images, videos, or news content. In most cases, it is simply a list of the most important URLs on the site.

Search engines use this file to discover and crawl those URLs more efficiently. A sitemap can also include extra information, such as the last modification date of a page or its alternate language versions.
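
As a rough illustration of that structure, the following Python sketch builds a minimal XML sitemap with a `<loc>` and an optional `<lastmod>` for each URL. The function name `build_sitemap` and the example.com URLs are assumptions made for the example; the namespace is the one defined by the sitemaps.org protocol. Alternate language versions are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # Dates use the W3C format, for example YYYY-MM-DD.
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs, for illustration only.
print(build_sitemap([
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/blog/", None),
]))
```

Each `<url>` entry needs only a `<loc>`; the `<lastmod>` element is optional and is simply skipped here when no date is known.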

Including your key URLs in the sitemap makes it more likely that search engines will find and crawl them. Once the sitemap has been submitted in Google Search Console, it also gives you a convenient way to review how those URLs are performing. A sitemap helps with discovery, but on its own it does not guarantee crawling or indexing.

Do you need a sitemap for your website?

If a website does not have a sitemap, search engines mainly have to rely on internal links to discover its pages. On a small site with a solid internal linking structure, that may not be a major issue. On larger sites, or on sites with weak architecture, a sitemap becomes much more valuable because it helps search engines find important pages more efficiently.

In practice, almost any website can benefit from having an XML sitemap. It becomes especially important for large websites, news sites, websites with lots of images or videos, or sites with orphan pages or weak internal linking. It is also useful for helping search engines discover newly published or recently updated URLs more quickly.
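
To make the idea of orphan pages concrete, here is a small Python sketch. It assumes you already have a map of internal links (the hypothetical `links` dictionary) and a full list of known pages, for example from a crawl and a CMS export. Pages that cannot be reached by following links from the homepage are the ones a crawler would struggle to discover without a sitemap.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every page reachable from `start` by following internal links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def orphan_pages(all_pages, links, start="/"):
    """Return known pages that no chain of internal links leads to."""
    return sorted(set(all_pages) - reachable_pages(links, start))


# Hypothetical site structure: page -> pages it links to.
links = {
    "/": ["/blog/", "/about/"],
    "/blog/": ["/blog/post-1/"],
    "/landing/": ["/"],  # links out, but nothing links to it
}
all_pages = ["/", "/blog/", "/about/", "/blog/post-1/", "/landing/"]

print(orphan_pages(all_pages, links))
```

In this example `/landing/` is the only orphan, so it is exactly the kind of URL that benefits most from being listed in the sitemap.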

How to create an XML sitemap

There are many ways to create an XML sitemap. One common option is to use a crawling tool that generates the sitemap after scanning the site.

Another is to rely on a CMS plugin. In WordPress, for example, plugins such as Yoast SEO or Rank Math can generate the sitemap automatically and let you decide which URLs should be included or excluded.

If you only want to include a small, fixed list of important URLs, you can even create the sitemap manually. But if you want it to update automatically as new pages are published, it makes more sense to use a plugin or another dynamic solution that works with your CMS.

A sitemap should only include the URLs you actually want search engines to crawl and index. In practice, that usually means canonical URLs that return a 200 status code. Redirected URLs and URLs with 4xx errors should be left out. Each sitemap can contain up to 50,000 URLs and can be up to 50 MB uncompressed, so larger sites need to split their URLs across multiple sitemap files and list those files in a sitemap index file.
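
The rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the page records (dictionaries with `url`, `status`, and `canonical` keys) are hypothetical and would come from your own crawl data, and the sketch checks the 50,000-URL limit but not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def eligible_urls(pages):
    """Keep URLs that return 200 and are their own canonical."""
    return [
        page["url"]
        for page in pages
        if page["status"] == 200 and page["canonical"] == page["url"]
    ]


def split_into_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split a URL list into chunks that respect the per-file URL limit."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that points at each individual sitemap file."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical crawl data, for illustration only.
pages = [
    {"url": "https://example.com/", "status": 200, "canonical": "https://example.com/"},
    {"url": "https://example.com/old", "status": 301, "canonical": "https://example.com/old"},
    {"url": "https://example.com/gone", "status": 404, "canonical": "https://example.com/gone"},
    {"url": "https://example.com/a?ref=x", "status": 200, "canonical": "https://example.com/a"},
]
print(eligible_urls(pages))
```

Only the homepage survives the filter here: the other entries are a redirect, a 404, and a URL whose canonical points elsewhere.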

Submitting a sitemap through Google Search Console

Once the XML sitemap has been created, it can be submitted through Google Search Console so Google can start processing it more easily. To do that, go to the Sitemaps section in Search Console and submit the sitemap URL. After that, Search Console can also be used to review errors and check how many submitted URLs are being processed and indexed.

It is also a good idea to reference the sitemap in the robots.txt file by adding a Sitemap line with the full sitemap URL, since crawlers often check that file and can discover the sitemap there.
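
For reference, the robots.txt entry is a single `Sitemap:` line followed by the full sitemap URL. The Python sketch below, which uses the standard library's `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later), shows one way to check which sitemaps a robots.txt file declares. The robots.txt content and the example.com URL are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemaps_from_robots(ROBOTS_TXT))
```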

Differences between an XML sitemap and an HTML sitemap

Although XML is the most common sitemap format, it is not the only one. Search engines can also work with other formats, such as RSS, Atom, or plain text. A website can also have an HTML sitemap.

The main difference between an XML sitemap and an HTML sitemap is that the XML version is primarily intended for search engines, while the HTML version is visible to users. Both can help with discovery, but the HTML sitemap can also be useful for visitors, especially on large websites with more complex architecture.

Another difference is that XML sitemaps can include extra metadata such as the last modification date, while HTML sitemaps usually do not. On the other hand, HTML sitemaps are often linked from the footer and can support internal linking by making more pages accessible from a single place.
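
By way of contrast with the XML format, an HTML sitemap is just a page of ordinary links. The following Python sketch renders one as a simple list; the function name `build_html_sitemap` and the page list are assumptions made for the example, and a real page would usually group the links by section.

```python
from html import escape


def build_html_sitemap(pages):
    """Render (url, title) pairs as an HTML list of links for visitors."""
    items = [
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Hypothetical pages, for illustration only.
print(build_html_sitemap([
    ("/blog/", "Blog"),
    ("/services/", "Services"),
    ("/contact/", "Contact"),
]))
```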

In short, a sitemap helps search engines discover the most important URLs on a website more easily and keep up with new or updated content. It does not replace good internal linking or guarantee indexing, but it remains one of the most useful technical SEO files for supporting crawl discovery.


About me

Raúl Revuelta

Digital marketing consultant specialized in SEO, CRO, and digital analytics. On this blog, I share content about these areas and other topics related to digital marketing, always with a practical, business-focused approach. You can also find me on LinkedIn and X.
