The Essentials of Working With XML Sitemaps

When promoting a website, it helps to be able to look through all of its available pages without extra hassle. This is why the notion of a “sitemap” exists: it is an XML file that lists all important links and can rank them by priority. Once the file is submitted to search engines, it makes it easier for them to pick out the relevant pages for search queries, which speeds up processing considerably.

A proper sitemap has to meet several basic requirements:

  1. A file that is meant to interact with search engines must be in XML format and nothing else. The alternative HTML sitemaps, which are also widely known, are made for visitors and do not influence rankings.

  2. A single sitemap.xml can list up to 50 thousand links. Beyond that, the map has to be split into several files.

  3. The final XML file must be no larger than 50 MB uncompressed (older guidance set the limit at 10 MB). Larger files are not accepted by the systems and have to be split as well.

  4. The encoding must be UTF-8 so that programs can read the file properly.
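The requirements above can be checked roughly before a file is submitted. The following is a minimal sketch, not a full validator: the function name check_sitemap_limits and its parameters are hypothetical, and the default limits (50,000 links, 50 MB uncompressed) are assumptions taken from the current sitemap protocol that can be changed if a particular system is stricter.

```python
import xml.etree.ElementTree as ET

def check_sitemap_limits(data: bytes,
                         max_urls: int = 50_000,
                         max_bytes: int = 50 * 1024 * 1024) -> list[str]:
    """Return a list of problems found in a raw sitemap file (empty if none)."""
    problems = []
    if len(data) > max_bytes:
        problems.append(f"file is larger than {max_bytes} bytes")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Nothing else can be checked reliably if the encoding is wrong.
        problems.append("file is not valid UTF-8")
        return problems
    root = ET.fromstring(data)
    # Count <url> entries, ignoring the namespace part of each tag name.
    urls = [el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "url"]
    if len(urls) > max_urls:
        problems.append(f"more than {max_urls} links")
    return problems
```

An empty list means that none of the three limits is violated; it says nothing about whether the listed pages are the right ones.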

A sitemap does not need to include every single page. List only the pages whose content is relevant to searches. Pages that serve as transitions or only present additional information can be left out without regret. This keeps the file quick to read and limits it to links that actually matter in search.

The Basic Ways to Find a Website's Sitemap

There are several basic ways to start working with XML sitemaps, either directly or through supporting plugins, which are especially helpful for finding mistakes. The first step is to find the sitemap itself, or the location registered for it on the server. Keep in mind that there may be more than one file, and that the file name may contain additional words.

  • One of the most popular methods is to look up the exact location in the robots.txt file, which is available at https://sitename.com/robots.txt. This address is fixed, so it works for any website that has such a file. The sitemap's location appears in a line like “Sitemap: https://sitename.com/sitemap.xml” or similar. Some details may vary, but this is the most common form.

  • The robots.txt step can be skipped by trying the direct link https://sitename.com/sitemap.xml. It usually works for standard websites that do not have many pages or several language versions. Otherwise, there are usually several maps divided by category, language, and so on.

  • Search engines, when used properly, can also bring up the sitemap quickly. Two operators are needed: “site:” to specify the source and “filetype:” to restrict results to the XML format, for example “site:sitename.com filetype:xml”. As a rule, the results show the sitemaps stored on the server.

Some downloadable plugins, especially the ones for identifying issues, find the file automatically once the full address of the website is entered. After one or two steps, which differ from tool to tool, the sitemap can be loaded into the tool and checked for possible mistakes.
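The robots.txt and search operator methods from the list above can be sketched in a few lines. This is an illustration only: the helper names sitemaps_from_robots and sitemap_search_query are hypothetical, and the sketch works on robots.txt text that has already been downloaded rather than fetching it.

```python
def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Collect the URLs from every 'Sitemap:' line of a robots.txt file."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

def sitemap_search_query(domain: str) -> str:
    """Build the search engine query that combines both operators."""
    return f"site:{domain} filetype:xml"

robots = """User-agent: *
Disallow: /admin
Sitemap: https://sitename.com/sitemap.xml
Sitemap: https://sitename.com/sitemap_en.xml
"""
print(sitemaps_from_robots(robots))
print(sitemap_search_query("sitename.com"))
```

Because a website may register more than one sitemap, the function returns a list rather than a single address.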

Finding the Sitemap in the WordPress Environment

With site builders such as WordPress, the most popular one, the sitemap can be managed with installed plugins. Since many developers offer their own products, there are plenty of well-known and brand-new options suited to different purposes. Some users want multifunctional software, while others need one plugin per task and do not plan to use it much.

Most users choose one of the following plugins:

  • All in One SEO pack;

  • Google Sitemap;

  • Google XML Sitemaps;

  • Simple Sitemap;

  • Sitemap by click5;

  • Yoast SEO.

Each of these tools can create sitemaps and edit them according to preferences or needs. Inclusion is set in the plugin menu, where the user marks whole categories as “shown in the search engine” or adds separate pages by full link. The file usually ends up at sitename.com/sitemap.xml or sitename.com/sitemap_index.xml.

Looking for the Sitemap on the Shopify Platform

One definite advantage of the Shopify platform is that the XML file is generated automatically and is available at sitename.com/sitemap.xml. Right after creation, the map is already structured properly, so there is little to worry about.

On the other hand, easy generation comes with difficulties when the included content has to be adjusted. Shopify does not filter links, so it lists all categories, some of which may be irrelevant. Removing or adding pages means editing the .liquid files, which requires some preparation in advance.

The Next Steps After Sitemap Searches

Creating a working XML sitemap from scratch

If the search did not reveal an existing sitemap, generating one is not difficult. The process can be automated with dedicated applications or plugins, or the file can be written by hand if there are not too many pages and enough knowledge to do so. Keeping the requirements above in mind, make sure the file contains the following parts:

  1. Document declaration. Here the creator states the XML version, the encoding, and the parent tag (<urlset>) under which the URLs are grouped.

  2. URL entries. Each consists of a <url> wrapper with the page location (<loc>) as the main element; additional settings are applied to each entry individually.

  3. Last modification date (<lastmod>). It is optional but highly recommended, as it speeds up later scanning.

  4. Priority (<priority>). Rates pages according to their role, on a scale from 0.0 (lowest) to 1.0 (highest).

  5. Change frequency (<changefreq>). Tells the search engine how often the page usually changes, from hourly to yearly.
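The parts listed above can be assembled with a short script. This is a minimal sketch using Python's standard library; the function name build_sitemap, the dictionary layout of each page, and the sample addresses and dates are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemap protocol on the parent tag.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages: list[dict]) -> bytes:
    """Build sitemap XML from dicts with 'loc' and optional extra settings."""
    urlset = ET.Element("urlset", xmlns=NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional elements are written only when they are provided.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # The XML declaration carries the version and the UTF-8 encoding.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)

xml_bytes = build_sitemap([
    {"loc": "https://sitename.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://sitename.com/about", "priority": 0.5},
])
print(xml_bytes.decode("utf-8"))
```

Only the location is mandatory in each entry, which is why the other three settings are treated as optional keys.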

Once the sitemap is compiled, the next step is to verify that the file is valid. Even seemingly minor issues can hurt performance later and leave one wondering why the website is unpopular. The most frequent mistakes made while creating or editing an XML sitemap are:

  • listing other sitemap files inside a regular sitemap. The entries must be pages of the specific website, not a map of other maps. If the number of links requires several XML files, there is no need to cross-link them; links to sitemap files belong only in a separate Sitemap index file;

  • leaving out the full form of the link. The protocol (HTTP or HTTPS) and the www. part, if the website uses it, are obligatory, because versions of a page with and without them can appear in search as separate addresses. All URLs must therefore be absolute and correspond exactly to the actual website;

  • exceeding the limit on file size or on the number of included pages. With more than 50 thousand links, the system treats the sitemap as invalid and too heavy to read. It is safer to divide the website content into several thematic categories or to make a separate map for each language;

  • uploading an empty file just for the sake of having one. Only the content of the XML sitemap influences search results, so an empty file does nothing and the source is simply ignored.
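The link limit from the list above is usually handled by splitting the pages into groups and listing the resulting files in a sitemap index. The sketch below shows one way to do it; the helper names chunk and build_index and the file names sitemap-1.xml, sitemap-2.xml, and so on are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def chunk(urls: list[str], size: int = 50_000) -> list[list[str]]:
    """Split a list of page URLs into groups no larger than `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_index(sitemap_urls: list[str]) -> bytes:
    """Build a sitemap index that lists sitemap files only, never pages."""
    index = ET.Element("sitemapindex", xmlns=NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)

# 120,000 example pages end up in three groups, one per sitemap file.
pages = [f"https://sitename.com/page-{n}" for n in range(120_000)]
groups = chunk(pages)
index_xml = build_index(
    [f"https://sitename.com/sitemap-{n}.xml" for n in range(1, len(groups) + 1)]
)
```

Keeping pages in the regular sitemaps and sitemap files in the index avoids the "map of other maps" mistake described in the first point.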

The errors above are critical and prevent the sitemap from being used. Once no other issues are visible at first glance, the finished XML file can be uploaded to the root directory of the website. Any further notifications can be handled with specialized applications or with the built-in tools of the search consoles.
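Some of the critical errors can be caught with a quick script before the upload. The following is a rough sketch, assuming the file is available as raw bytes; the function name critical_errors and the wording of the messages are hypothetical, and passing this check does not guarantee that a search engine will accept the file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def critical_errors(data: bytes) -> list[str]:
    """Look for an empty file, nested sitemaps, and non-absolute links."""
    if not data.strip():
        return ["file is empty"]
    root = ET.fromstring(data)
    is_urlset = local_name(root.tag) == "urlset"
    errors, pages = [], 0
    for entry in root:
        kind = local_name(entry.tag)
        if kind == "sitemap" and is_urlset:
            errors.append("another sitemap is listed among the pages")
        elif kind == "url":
            pages += 1
            loc = next((c.text or "" for c in entry
                        if local_name(c.tag) == "loc"), "")
            parts = urlparse(loc.strip())
            # An absolute link needs both a protocol and a host name.
            if parts.scheme not in ("http", "https") or not parts.netloc:
                errors.append(f"not an absolute URL: {loc!r}")
    if is_urlset and pages == 0:
        errors.append("sitemap lists no pages")
    return errors
```

A link such as "/about" or "www.sitename.com/about" is reported because it lacks the protocol, while "https://www.sitename.com/about" passes.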

Adding the sitemap to the search engine software

After preparing the sitemap or obtaining its link, the next step is to submit it to the search engines the website is going to work with. In most cases, Google Search Console is a must because of its international reach and ease of use. It also lets the user follow statistics regularly and see how changes affect the website's performance. In addition, it offers tools for discovering and fixing errors for better optimization. The issues it most commonly helps with are:

  1. Cutting off redirects: to avoid the long way to the designated page, the search engine warns about links that redirect. The warning can be ignored if a redirect is absolutely necessary, but in most cases redirects in a sitemap are unwelcome.

  2. Deleting broken links: when updates remove pages, the console detects entries that point to pages that no longer exist and warns about them. It is recommended to fix these, or at least to check them manually, so that the rest of the sitemap is not compromised.

  3. Finding irrelevant links that waste the crawl budget: the system shows which sitemap links match their canonical URL, which do not, and which have no canonical URL set, for further review and a decision on whether they are needed.

  4. Fixing tag issues: the text in links or content may contain a mistake or use non-standard tagging that systems cannot recognize. It is better to check such issues manually and leave them as they are if they are only flagged as possible problems.

  5. Correcting dates of publication or update: dates have to follow a single standard format (W3C Datetime, such as YYYY-MM-DD) to be accepted by the system. Otherwise they become dead weight in the XML sitemap and make the links look less current to search engines.

  6. Shortening links: some sources reject addresses longer than 1024 characters, and the sitemap protocol itself requires each URL to be shorter than 2,048 characters. Addresses should therefore be short but informative, appealing to both search engines and visitors.

  7. Unacceptable symbols in sitemap tags: whether through a simple mistake or incorrect data compilation, links sometimes include unreadable characters. The same group covers separate elements with a low match rate, which the system points out automatically.

  8. Working with media content: photos and videos can be included in sitemaps, but this requires advanced knowledge or help from a professional. Notifications for such files are roughly twice as likely as for general links and tags.

The list changes regularly as technologies and accepted tools evolve.
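Two of the issues in the list, date format and link length, are easy to check locally before the console reports them. The sketch below is a simplification: the helper names are hypothetical, the date check accepts only the full-date and date-with-time forms of W3C Datetime, and the default length limit of 2,048 characters is an assumption taken from the sitemap protocol that can be lowered for stricter systems.

```python
import re
from datetime import datetime

# YYYY-MM-DD, optionally followed by a time and a time zone designator.
_LASTMOD = re.compile(
    r"^(\d{4}-\d{2}-\d{2})"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$"
)

def is_valid_lastmod(value: str) -> bool:
    """Check that a <lastmod> value looks like a W3C Datetime."""
    match = _LASTMOD.match(value)
    if not match:
        return False
    try:
        # Reject impossible calendar dates such as February 30.
        datetime.strptime(match.group(1), "%Y-%m-%d")
    except ValueError:
        return False
    return True

def is_short_enough(url: str, limit: int = 2048) -> bool:
    """Check that a link is shorter than the given character limit."""
    return len(url) < limit

print(is_valid_lastmod("2024-01-15"))   # True
print(is_valid_lastmod("15.01.2024"))   # False
```

Dates that fail the check can be converted to the YYYY-MM-DD form before the sitemap is regenerated.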

Once the sitemap has been checked to a satisfactory state, it is usually re-read automatically and does not need to be uploaded again. However, some sources work better if the fixed version is submitted again as a new one, which speeds up the status update. This also helps uncover additional problems in parts that were previously skipped because of broken links.

The best option in such scenarios, even when using sitemap generators, is to run manual checks with the help of specialized services. They mostly focus on bringing the XML file up to universal standards: they show the errors, suggest solutions, and save the progress. After that, the upload goes much more smoothly and gives satisfactory results without waiting in long queues for additional online checks. Many specialists recommend organizing the process this way, especially when working on a project alone.

With the material above, the reader will be able to work with sitemaps independently, understanding the principles, the frequent issues, and the possible solutions. There should be no more confusion about why such a file is needed, why it uses this format, how it is structured, or how to find one in the website's root directory. Because the sitemap plays an important role in promoting a website, it is essential to learn the basic notions of the topic together with the rules that apply to it. As a result, the website becomes more relevant to search engines, which shows in growing visit statistics from both new and regular visitors.