Before you start promoting your website in search engines, conduct a thorough internal audit to identify and eliminate possible errors.
We collected recommendations from different SEO specialists and prepared a step-by-step guide to a technical website audit.
Here you will find only practical tips, with links to the official search engine sources. 30 paragraphs, a few more steps, and your DIY technical website analysis is ready!
1. Test and check your website for errors
It is recommended to scan your website for errors with the help of the following programs:
While doing so, check not only the website pages but also the scripts, styles, and images. Make sure the crawler uses the mobile or desktop User-Agent recommended by Google.
2. Structure your project
Website structure analysis helps you understand the logic of the resource and select the landing pages that can generate more search traffic. Based on the assembled structure, you can prepare recommendations for clean URL settings, automatically generate the website's semantic core, and create a comprehensive commercial content plan.
It is more convenient to structure your project in an Excel file.
3. Set up and implement clean URLs
Clean URLs are human-readable addresses that are user-friendly and briefly describe the web page content.
They should:
- be logically structured;
- be static or pseudo static;
- contain clear and concise section titles;
- contain a natural number of keywords.
Therefore, it is better to eliminate the following on your website:
- dynamically generated web pages;
- pages whose URLs are uninformative for users.
For big websites, provide URL patterns of the form {Domain}/{Section}/{Subsection}.
For small websites, specify the URL of every web page.
URLs for multilingual websites
Multilingual versions of websites are very often implemented without changing the URL. If the second language version is useful for website promotion, you need to set up static pages for it. The best option is to keep the language versions in separate website directories (e.g. /fr/, /en/, etc.).
URLs for products
URLs must be generated automatically for all products. Make sure the addresses contain no unnecessary symbols. Only the following are allowed:
- Latin letters;
- lowercase only;
- punctuation marks such as the hyphen.
Spaces are not allowed.
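The rules above can be sketched as a small slug generator. This is only an illustration: the function name product_slug is hypothetical, and allowing digits is an assumption, since the list above mentions only Latin letters and hyphens.

```python
import re
import unicodedata


def product_slug(name: str) -> str:
    """Turn a product name into a lowercase, hyphen-separated URL segment."""
    # Decompose accented letters and drop everything outside ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of characters other than a-z and 0-9 with one hyphen.
    # Keeping digits is an assumption, not a rule from the text above.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower())
    return slug.strip("-")


print(product_slug("Café Table (Large)"))
```

Spaces, brackets, and accents never reach the URL, so the generated address stays within the allowed character set.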
URLs for useful filter pages
All useful website filters (those capable of generating organic traffic) should have static URLs and direct links pointing to these pages.
Do not forget to make static URLs for pagination pages, articles, categories, and subcategories.
Dynamic URLs for useless filter pages
Generate dynamic pages when the user can select any of the following:
- Two or more filters from the same selection;
- Two or more filters from different selections;
- A filter that is incapable of generating traffic.
In robots.txt, create a directive for Bing that keeps the dynamic parameter of filter combinations out of the index. For Google, give the dynamic pages either a canonical tag pointing to the main category or a noindex, follow robots meta tag. Also exclude the dynamic parameter in the crawl parameter settings of Search Console.
This reduces the load on the website while Google is crawling it.
4. Configure https server
It is best to create the task for setting up HTTPS at the very beginning of a technical audit, because the links in the sitemap and inside the code depend heavily on a correct HTTPS implementation. You can read more here.
5. Optimize images
Image optimization includes the following stages:
- Alt and title setup
Set the alt="" and title="" attributes for images that do not have them yet.
- Alt and title removal from technical images
For images that are part of the website layout and design, leave the alt text empty (alt="") and remove the title attribute.
- Uninformative alt and title removal
Remove or replace uninformative alt="" and title="" values on images.
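The first stage can be partly automated. The sketch below uses only Python's standard html.parser module to list images that have no alt attribute at all; the class and function names are hypothetical, and it assumes you already have the page HTML as a string.

```python
from html.parser import HTMLParser


class ImageAltAudit(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An empty alt="" counts as present; only a missing attribute is reported.
        if "alt" not in attributes:
            self.missing_alt.append(attributes.get("src", ""))


def images_missing_alt(html: str) -> list[str]:
    audit = ImageAltAudit()
    audit.feed(html)
    return audit.missing_alt


page = '<img src="a.jpg"><img src="b.jpg" alt="Pump"><img src="c.png" alt="" />'
print(images_missing_alt(page))
```

Deciding whether an existing alt text is informative still needs a human review.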
6. Optimize external outbound links
This task is relevant for projects that have a large number of external outbound links. You can identify them with services such as Ahrefs and Xenu.
Collect all the outbound links in a separate file, and ask a programmer to add rel="nofollow" to these links.
7. Optimize elements of SEO functionality
At this stage, optimize the sorting pages and close the website search pages.
Sorting optimization
Tag all the sorting pages with <meta name="robots" content="noindex, follow" />.
It is necessary to specify the follow parameter rather than nofollow.
Closing website search pages
Remove the website search pages from the index by closing them in the robots.txt file and removing the search directory using Google Search Console.
8. Optimize the code
To speed up page processing by search engines and browsers, analyze and optimize the code.
Transfer the script from the page to an external file
If the page code contains massive volumes of scripts, transfer them from the page to an external file. It will reduce the volume of a main code and will make it simpler for the search engines to scan your website.
Transfer the CSS from the page to an external file
If you find out that the page code contains massive volumes of CSS elements, it is also necessary to transfer them from a page to an external file. It will optimize the code and speed up the website processing by search robots.
Combine the .js files into one file
If the main code contains many references to files with the .js extension, combine them into one file (if possible). Your main goal is to reduce the number of .js files that have to be downloaded. This reduces the page load time and increases the number of pages a search engine crawler can scan per day.
Combine the .css files into one file
If the main code contains many references to files with the .css extension, combine them into one file (as in the previous paragraph). This reduces the number of search engine requests to the server and increases the number of pages a search robot can scan per day.
Check the hidden text
Review the code of the different page types to identify hidden text. If you find any, remove the hidden text or make it visible.
Delete non-informative elements from the code
The code of the main website might contain text elements that should not be ranked. For example:
- The titles of authorization and registration forms, such as “Forgot your password?”;
- The titles of callback forms, such as “Order a callback”;
- Any other technical elements that are not visible but are present in the main code.
Move such text elements to an external file.
Transfer additional files to another domain or subdomain
If there are additional scripts, styles and images files, transfer them to another domain or subdomain.
9. Optimize the load speed of your website
To identify the elements of your website that need to be sped up, you can use the following tools:
Group all the problematic areas into one file, and ask the programmer to fix them.
10. Add microdata
This paragraph shows the examples of standard markup elements that can apply to almost any project. Before posting these elements, you can verify if they are correct with the Google Structured Data Testing Tool.
Logo markup
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Organization",
"url": "http://site.com/",
"logo": "http://site.com/logo.jpg"
}
</script>
Navigation layout
<div xmlns:v="http://rdf.data-vocabulary.org/#">
<span typeof="v:Breadcrumb">
<a href="http://…" rel="v:url" property="v:title">Name of nesting 1</a>
</span>
<span typeof="v:Breadcrumb">
<a href="http://…" rel="v:url" property="v:title">Name of nesting 2</a>
</span>
<span typeof="v:Breadcrumb">
<a href="http://…" rel="v:url" property="v:title">Name of nesting 3</a>
</span>
<span typeof="v:Breadcrumb">
<a href="http://…" rel="v:url" property="v:title">Name of nesting 4</a>
</span>
</div>
Reviews markup
<div typeof="v:Review-aggregate" xmlns:v="http://rdf.data-vocabulary.org/#">
<span property="v:itemreviewed">Product Reviews [Product Name]</span>
<span rel="v:rating">
<span typeof="v:Rating">
<span property="v:average">average value</span>
of
<span property="v:best">best value</span>
</span>
</span>
based
<span property="v:votes">number of ratings</span> assessments.
<span property="v:count">number of customer reviews</span> customer reviews.
</div>
Price range categories layout
Suitable for categories of online stores.
<script type="application/ld+json">
{
"@context": "http://schema.org/",
"@type": "Product",
"name": "NAME OF CATEGORY",
"offers": {
"@type": "AggregateOffer",
"priceCurrency": "USD",
"lowPrice": "MINIMUM PRICE",
"highPrice": "MAXIMUM PRICE",
"offerCount": "QUANTITY OF GOODS IN SECTION"
}
}
</script>
“Contact details” markup
<div class="vcard">
<div class="adr">
<p><span class="locality">CITY</span>
<span class="street-address">STREET</span></p>
</div>
Phones:
<span class="tel">+1 (800) 469-92-69</span>
<span class="tel">+1 (800) 469-92-68</span>
</div>
Links to social networks profiles
<script type="application/ld+json">
{ "@context" : "http://schema.org",
"@type" : "Organization",
"name" : "BRAND NAME",
"url" : "http://www.mysite.com/",
"sameAs" : [ "https://www.facebook.com/page_URI",
"https://www.instagram.com/page_URI/"]
}
</script>
Social networks layout
Add extra markup for social networks.
An example of OpenGraphProtocol for social networks
<meta property="og:url" content="https://www.ning.com/" />
<meta property="og:title" content="Create your own social network with the best community website builder – NING" />
<meta property="og:description" content="Ning – is the largest online community building platform in the World ★ Create your own social network in a matter of minutes ⚡️ Take your 14 days trial" />
<meta property="og:image" content="https://cdn.ning.com/wp-content/themes/ning/assets/img/ui/white/img_main.jpg" />
<meta property="og:type" content="website" />
<meta property="og:site_name" content="NING" />
<meta property="og:locale" content="en_US" />
An example of Google+ layout
<meta itemprop="url" content="https://www.ning.com/">
<meta itemprop="name" content="NING" />
<meta itemprop="description" content="Ning – is the largest online community building platform in the World ★ Create your own social network in a matter of minutes ⚡️ Take your 14 days trial">
<meta itemprop="image" content="https://cdn.ning.com/wp-content/themes/ning/assets/img/ui/white/img_main.jpg">
An example of Twitter markup
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:site" content="@Ning">
<meta name="twitter:title" content="Ning">
<meta name="twitter:description" content="Ning – is the largest online community building platform in the World ★ Create your own social network in a matter of minutes ⚡️ Take your 14 days trial">
<meta name="twitter:image" content="https://cdn.ning.com/wp-content/themes/ning/assets/img/ui/white/img_main.jpg">
11. Set up interlinking
There are basic and custom interlinking rules.
Basic interlinking rules
Basic interlinking includes:
- Creating breadcrumbs (if they do not exist yet);
- Adding links to additional subcategories or generated filter pages;
- Adding links to accessories;
- Adding links to related products;
- Generating additional links to pagination pages.
Complex rules
Complex rules are non-standard solutions for improving website ranking. More complex schemes are developed for specific projects, taking into account the website's size, its structure, and the competitors' linking logic.
12. Check your website for pop-ups
To ensure that Google does not lower the value of your pages, get rid of elements that negatively affect user interaction with your website. There are several types of pop-ups.
Pop-up
Pop-ups open in new windows. Using them is considered to be against the Google Search rules. That is why many browsers, Chrome in particular, block such elements automatically. Users also perceive such pop-ups negatively.
Overlay
Pop-up overlays are visual elements that appear in the same browser window. This can be an e-mail newsletter subscription form, a course registration form, and similar items. From Google’s point of view, they cause fewer problems than pop-ups. However, they can still cause difficulties, especially on mobile devices.
Modal window
Modal dialog windows are interactive windows, such as Lightbox windows that show images in detail. This type of pop-up assumes that the action is performed inside the new window, not in the content below it. Thus, subscription forms that appear over the content and block interaction with it can be considered modal windows. Their use is not a problem as long as they are not related to spam, advertising, or anything else that worsens the user experience.
Interstitial ad
Interstitial ads are full-screen ads that cover the whole page during the transition between pages. A classic example of an interstitial ad is the Forbes.com format. When visiting the publication for the first time, the user sees the following message:
“Welcome! The sponsor of a day is Brawndo. Brawndo has what the plants crave”.
This kind of interstitial ad scares off many people, since they have to wait before reading an article. Google also takes a negative view of such elements.
13. Optimize your website for mobile devices
Main website adaptation
Create a technical assignment for programmers to adapt the website for mobile devices:
- Create adaptive design;
- Create a mobile version.
Mobile and desktop versions comparison
Compare the mobile version with the desktop version and check that they correspond to each other (if the mobile version is moved to a separate domain, subdomain, or directory).
Ensure that the following elements match those of the desktop version:
- Meta tags;
- Content;
- Markup;
- Technical pages (pagination, sorting, etc.);
- Product characteristics and descriptions.
One of the important conditions of mobile-first indexing is 100% correspondence between the two versions.
Linking the mobile application to the website
For Google to understand that the website and the mobile application are linked, add rel="alternate" to the HTML code of the website. Then use the sitemap.xml file to specify the link and define how the search results are displayed.
In this documentation you can find the rules for generating android-app://com.example.android/example/gizmos addresses and for placing links for indexing applications on Google.
After the website pages are indexed, users will see buttons in the Google search results that link to the relevant applications installed on their smartphones.
AMP web pages set up
The technology of accelerated mobile pages (AMP) is based on open source code. These pages are stored in a special Google cache, which provides faster loading. Here, you can read in detail about the logic of creating AMP pages.
14. Set up redirects
A professional SEO audit of a website also includes a full check of redirects. Let’s have a look at the most common ones.
301 redirect: uppercase to lowercase
Set up a 301 redirect from uppercase to lowercase URLs for all pages. If the uppercase pages already have a canonical tag pointing to the lowercase pages, the redirect is not needed.
301 redirect: www to non-www
The website should be accessible under one domain. Therefore, configure a 301 redirect from the www version to the non-www version (or vice versa).
301 redirect URLs with or without a trailing slash
Each page should be accessible at one URL, preferably with a slash at the end. If you find a problem, configure a 301 redirect from the version without the trailing slash to the version with it.
301 redirect for pages with and without extension
https://mywebsite.com/index.htm
Pages may also open with the .php, .html, .htm, .aspx, or .asp extensions. Check whether your pages are available with these extensions and, if they are, configure a 301 redirect to the main URL.
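The four rules above (lowercase, one host, trailing slash, no index file) can be combined into one function that computes where a given URL should redirect. This is a sketch under stated assumptions: the function name, the choice of the non-www version as the main one, and the list of index file names are all illustrative, and it treats every page as directory-style, so it is not suitable for file URLs such as PDFs.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of index file names; adjust to what your server exposes.
INDEX_FILES = ("index.htm", "index.html", "index.php")


def redirect_target(url: str) -> str:
    """Return the URL that the given address should 301-redirect to."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):  # assumption: non-www is the main version
        host = host[4:]
    path = parts.path.lower()
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


print(redirect_target("https://www.MyWebsite.com/Landing"))
print(redirect_target("https://mywebsite.com/index.htm"))
```

If the function returns its input unchanged, the URL is already in its main form and needs no redirect.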
301 redirect from IP
http://194.35.16.75/
If the website is duplicated at its IP address, configure a 301 redirect from the duplicate to the main website.
301 redirect: duplicate paginations
If you go to the second page of a listing, return to the first page, and find that its URL differs from the original one, the pagination is not set up properly. This is what it looks like:
https://mywebsite.com/landing/page-1/
In this case, configure a 301 redirect and merge the pages.
Check for redirect chains
If you find redirect chains (one page redirects to a second one, which redirects to a third) or loops (two pages redirect to each other), configure a single correct 301 redirect to the final page.
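Chains and loops are easy to spot once the redirects are exported into a table. The sketch below assumes a hypothetical mapping of source path to target path (for example, taken from a crawler export) and follows it without sending any requests.

```python
def follow_redirects(start: str, redirects: dict[str, str]) -> tuple[list[str], bool]:
    """Return the URLs visited from start and whether a loop was found."""
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, True  # we came back to a URL already visited
        seen.add(current)
    return path, False


# Hypothetical redirect table: source path -> target path.
redirects = {"/a/": "/b/", "/b/": "/c/", "/x/": "/y/", "/y/": "/x/"}
print(follow_redirects("/a/", redirects))
print(follow_redirects("/x/", redirects))
```

A result with more than two URLs and no loop is a chain: the first URL should redirect straight to the last one.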
301 redirect for the 404 page with external links
If the website has a link history, check it for external links, for example with the Ahrefs service.
Configure 301 redirects to similar pages for all 404 pages that receive external links.
15. Set up canonical pages
Dynamic pages canonical
If a page is duplicated by adding GET parameters to the end of the URL, add a canonical tag that points to the base page without parameters.
An example
The page of the same event is available at the addresses:
https://mysite.com/events/?id=123
To remove the duplicates, specify the rel="canonical" attribute on all dynamic pages.
For example, for the web pages:
https://mysite.com/events/?*
where * means any sequence of symbols, you need to add the code:
<link rel="canonical" href="https://mysite.com/events/" />
The same setting should be made for all dynamic web pages of the project.
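A template can produce this tag by dropping the query string from the requested address. A minimal sketch, assuming that every GET parameter creates a duplicate (the function name is hypothetical):

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url: str) -> str:
    """Build a canonical tag that points to the URL without its query string."""
    parts = urlsplit(url)
    # Assumption: all GET parameters and fragments are irrelevant to the content.
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(base, quote=True)}" />'


print(canonical_link("https://mysite.com/events/?id=123"))
```

If some parameters do change the content (for example, a page number), they have to be kept instead of stripped.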
Duplicate pages canonical
Online stores quite often contain duplicate products that differ only in color. As a result, almost identical product pages appear on the website. In this case, choose a single page to display, and point the rel="canonical" attribute of the duplicates to it.
Self-referential canonical
Add a rel="canonical" attribute to all web pages.
Example: <link rel="canonical" href="http://www.site.com/landing/" />, where landing is the static landing page on which the canonical is placed.
16. Set up the 404 error
Checking the 404 error response
A non-existent page should return the 404 status code, not 200. The best way to check it is to use Google's Fetch as Google tool or the Webconfs.com service.
When interchanging directories locations
If a web page still opens and returns the 200 status code after you swap directories or separate words in its URL, set up a 404 server response.
Example.
There is a web page: site.com/animals/elephants/.
Let’s swap the words “animals” and “elephants”. If the page site.com/elephants/animals/ returns the 200 status code, set up a 404 server response for it. This problem most often occurs when you select two filters in a catalog and swap their positions. Therefore, always check swapped filter positions.
When deleting intermediate directories
If a page still opens and returns the 200 status code after you delete intermediate directories or single words, set up a 404 server response for it.
An example:
There is a web page: site.com/animals/elephants/.
Let’s delete the “animals” directory. If the page site.com/elephants/ returns the 200 status code, set up a 404 server response for it.
When adding entities
If a page opens and returns the 200 status code after you add arbitrary directories or single words, do the same as in the previous two cases.
Example:
There is a web page: site.com/animals/elephants/.
If you add an arbitrary directory, as in site.com/animals/elephants/elephants, and the page returns the 200 status code, set up a 404 server response for it.
The same rule applies to non-existent language directories. That is, if the website has no other language versions, check whether the system generates such versions automatically. To do this, add /fr/, /en/, /eu/ after the main domain address. If the page returns the 200 status code although the language version does not actually exist, set up a 404 server response for it.
When adding elements to the primary address
If a page opens and returns the 200 status code after you add arbitrary words to existing directories, set up a 404 server response for it.
Example:
There is a web page: site.com/animals/elephants/.
If you add numbers to the directory, as in site.com/animals/elephants/444, and the page returns the 200 status code, set up a 404 server response for it.
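The checks in this section can be prepared in bulk. The sketch below only builds the test URLs for one existing page; it sends no requests, and the function name and result keys are hypothetical. Each generated URL is then requested with any HTTP client and is expected to return 404.

```python
from urllib.parse import urlsplit, urlunsplit


def probe_urls(url: str, extra: str = "444") -> dict[str, str]:
    """Build URLs that should answer 404 if the site handles bad paths correctly."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    def build(segs: list[str]) -> str:
        path = "/" + "/".join(segs) + "/" if segs else "/"
        return urlunsplit((parts.scheme, parts.netloc, path, "", ""))

    probes = {}
    if len(segments) >= 2:
        # Swap the last two directories.
        swapped = segments[:-2] + [segments[-1], segments[-2]]
        probes["swapped"] = build(swapped)
        # Delete the intermediate directory before the last one.
        probes["parent_removed"] = build(segments[:-2] + [segments[-1]])
    if segments:
        # Repeat the last directory as an arbitrary extra entity.
        probes["duplicated"] = build(segments + [segments[-1]])
    # Append an arbitrary element to the address.
    probes["suffix_added"] = build(segments + [extra])
    return probes


for kind, test_url in probe_urls("https://site.com/animals/elephants/").items():
    print(kind, test_url)
```

Before reporting a problem, exclude generated URLs that happen to match real pages of the site.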
17. Set up language switching elements
If the website has several language versions, set the correct hreflang attributes for the web pages.
Example #1. The website has three language versions (English, Spanish, and French) that differ from each other and are located in directories (folders). The main page site.com is in English and already offers a choice of language versions. Set up the following:
http://site.com/
<link rel="alternate" hreflang="en" href="http://site.com/" />
<link rel="alternate" hreflang="es" href="http://site.com/es/" />
<link rel="alternate" hreflang="fr" href="http://site.com/fr/" />
http://site.com/es/
<link rel="alternate" hreflang="en" href="http://site.com/" />
<link rel="alternate" hreflang="es" href="http://site.com/es/" />
<link rel="alternate" hreflang="fr" href="http://site.com/fr/" />
http://site.com/fr/
<link rel="alternate" hreflang="en" href="http://site.com/" />
<link rel="alternate" hreflang="es" href="http://site.com/es/" />
<link rel="alternate" hreflang="fr" href="http://site.com/fr/" />
Do the same for the landing pages, but specify the destination URLs.
http://site.com/landing/
<link rel="alternate" hreflang="en" href="http://site.com/landing/" />
<link rel="alternate" hreflang="es" href="http://site.com/es/landing/" />
<link rel="alternate" hreflang="fr" href="http://site.com/fr/landing/" />
http://site.com/es/landing/
<link rel="alternate" hreflang="en" href="http://site.com/landing/" />
<link rel="alternate" hreflang="es" href="http://site.com/es/landing/" />
<link rel="alternate" hreflang="fr" href="http://site.com/fr/landing/" />
http://site.com/fr/landing/
<link rel="alternate" hreflang="en" href="http://site.com/landing/" />
<link rel="alternate" hreflang="es" href="http://site.com/es/landing/" />
<link rel="alternate" hreflang="fr" href="http://site.com/fr/landing/" />
The landing page names may differ, but it is important to specify them correctly. For example, the “About” page can be named /about/ in the English version and /acerca-de/ in the Spanish version.
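Since the tags follow one pattern, they can be generated from a single mapping of language code to URL. In this sketch the function name and the mapping are illustrative assumptions. It emits one tag per version, including the page's own entry, which Google's hreflang guidelines ask for, so the same set can be placed on every language version of the page.

```python
from html import escape


def hreflang_links(versions: dict[str, str]) -> list[str]:
    """Build one <link rel="alternate"> tag per language version."""
    return [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in versions.items()
    ]


# Hypothetical mapping for the "About" page mentioned above.
about_page = {
    "en": "http://site.com/about/",
    "es": "http://site.com/es/acerca-de/",
}
for tag in hreflang_links(about_page):
    print(tag)
```

Keeping the mapping in one place also covers the case where the page names differ between languages.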
Example #2. If the website has several regional versions but uses one language for all the countries (e.g. English for the USA and Canada), cross-link the versions and add the canonical tag.
http://site.com/
<link rel="alternate" hreflang="en-US" href="http://site.com/" />
<link rel="alternate" hreflang="en-CA" href="http://site.com/ca/" />
<link rel="canonical" href="http://site.com/ca/" />
http://site.com/ca/
<link rel="alternate" hreflang="en-US" href="http://site.com/" />
<link rel="alternate" hreflang="en-CA" href="http://site.com/ca/" />
<link rel="canonical" href="http://site.com/" />
There is no need to use the canonical tag for web pages with different language versions.
If the language versions of the website are hosted on subdomains rather than in directories, specify the subdomains in the tags (using the href="http://en.site.com/" parameter).
18. Block search indexing
Combination of product filters
To simplify the first stage of website promotion, close all filter combinations from indexing. At later stages, analyze the potential traffic to combination pages and open only the useful pages for indexing.
Technical pages
Close the authorization, login, and other pages that contain no useful content for search engines.
Blank pages that will be filled in
If the website contains blank pages that will be filled with content later (e.g. technical pages, categories, subcategories), close them from indexing temporarily with the help of meta tags.
19. Create a robots.txt file
For Bing, create a separate rule block that starts with User-agent: bingbot. Exclude all the pages that should not be visible to Bing.
For Google, create a rule block that starts with User-agent: Googlebot (for the main bot) and User-agent: Googlebot-Mobile (for the mobile bot).
Use the “Fetch as Google” feature to see how Google sees your website, and use the Allow: directive to open the style, script, image, and font files for crawling.
Then, specify Host and Sitemap:
- Host: link to the main website version;
- Sitemap: link to the sitemap file.
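Before publishing the file, you can check how the rules are read with Python's standard urllib.robotparser module. The robots.txt content below is a made-up example (the /search/ and /static/ paths are assumptions, and bingbot is used as Bing's crawler token). Note that this parser uses simple prefix matching, so it approximates, but does not exactly reproduce, how each search engine interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
ROBOTS_TXT = """\
User-agent: bingbot
Disallow: /search/

User-agent: Googlebot
Allow: /static/
Disallow: /search/

Sitemap: https://site.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Search pages are closed, static files stay open for crawling.
print(parser.can_fetch("Googlebot", "https://site.com/search/pumps"))
print(parser.can_fetch("Googlebot", "https://site.com/static/app.css"))
print(parser.can_fetch("bingbot", "https://site.com/search/pumps"))
print(parser.site_maps())
```

Running the same checks after every robots.txt change helps catch a rule that accidentally closes styles, scripts, or important pages.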
20. Set up nesting configuration
Now, check the nesting level of all pages that can generate traffic. It is best to place pages that target high-frequency queries close to the main page or to link to them through cross-references.
Creating a task for setting up an HTML sitemap
If important pages should be reachable by a search engine within two clicks but are currently reachable only within three, create a task for setting up an HTML sitemap. Build a tree-like sitemap with links to the pages that are important for SEO. The link to the HTML sitemap should be located in the website footer or the bottom menu.
21. Set up pagination
For correct indexing and to avoid problems with duplicate content, your pagination pages need the rel="next" and rel="prev" attributes in addition to the rel="canonical" attribute. In this way, you tell the search engine that the website content is duplicated because of the specifics of its internal structure.
An example of pagination setup.
There is a website http://www.site.com/landing/. To include the pagination elements in the existing website structure, add the following lines to the HTML code of the first page (between the <head>…</head> tags):
<link rel="canonical" href="http://www.site.com/landing/" />
<link rel="next" href="http://www.site.com/landing/page-2/" />
Then, add the following lines to the HTML code of the second page (between the <head>…</head> tags):
<link rel="canonical" href="http://www.site.com/landing/page-2/" />
<link rel="prev" href="http://www.site.com/landing/" />
<link rel="next" href="http://www.site.com/landing/page-3/" />
On all subsequent pages except the last one, add the same code as for the second page, changing only the page numbers in the rel="next" and rel="prev" parameters (the parameters to be changed are indicated by the # symbol).
<link rel="canonical" href="http://www.site.com/landing/page-#/" />
<link rel="prev" href="http://www.site.com/landing/page-#/" />
<link rel="next" href="http://www.site.com/landing/page-#/" />
On the last page, paste the following code (between the <head>…</head> tags), specifying the corresponding page numbers instead of the # symbol.
<link rel="canonical" href="http://www.site.com/landing/page-#/" />
<link rel="prev" href="http://www.site.com/landing/page-#/" />
Note that the first page should contain only rel="next", and all the pages from the second to the second-to-last should contain both rel="next" and rel="prev". The last page should contain only rel="prev".
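The same rules can be written as a small template helper. This is a sketch rather than a ready implementation: the function name is hypothetical, and it assumes the URL scheme from the example above, where the first page lives at the base URL and page N at page-N/.

```python
def pagination_links(base_url: str, page: int, last_page: int) -> list[str]:
    """Return the canonical, prev, and next tags for one pagination page."""
    if not 1 <= page <= last_page:
        raise ValueError("page must be between 1 and last_page")

    def page_url(number: int) -> str:
        # The first page has no page-N suffix.
        return base_url if number == 1 else f"{base_url}page-{number}/"

    links = [f'<link rel="canonical" href="{page_url(page)}" />']
    if page > 1:  # every page except the first points back
        links.append(f'<link rel="prev" href="{page_url(page - 1)}" />')
    if page < last_page:  # every page except the last points forward
        links.append(f'<link rel="next" href="{page_url(page + 1)}" />')
    return links


for tag in pagination_links("http://www.site.com/landing/", 2, 5):
    print(tag)
```

Linking page 2 back to the base URL instead of page-1/ also avoids the duplicate first page described in the redirects section.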
The templates for pagination should be as simple as possible, without additional words like “buy”, “price”, “online store”.
22. Add meta tags
Now, work on the meta tag templates.
Configuring templates for different page types
If your website contains a large number of pages of different types, apply templates to generate the title and description. When creating templates, consider the following:
- Templates should produce unique meta tags for each page;
- All pages must fit the template (that is, after a template is applied, all pages must have readable meta tags);
- The length of the generated meta tags should not exceed the limits recommended by search engines;
- It is desirable to include the vendor codes (SKUs) of products.
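The requirements above can be checked automatically once a template is drafted. In the sketch below the template text, the field names, and the 60-character limit are all assumptions chosen for illustration; substitute the limits you actually follow.

```python
from collections import Counter

# Hypothetical title template and length limit.
TITLE_TEMPLATE = "{name}, SKU {sku} | {shop}"
MAX_TITLE_LENGTH = 60


def render_titles(products: list[dict[str, str]]) -> list[str]:
    """Apply the template to every product."""
    return [TITLE_TEMPLATE.format(**product) for product in products]


def audit_titles(titles: list[str]) -> dict[str, list[str]]:
    """Report titles that are too long or not unique."""
    counts = Counter(titles)
    return {
        "too_long": [t for t in titles if len(t) > MAX_TITLE_LENGTH],
        "duplicates": sorted(t for t, n in counts.items() if n > 1),
    }


products = [
    {"name": "Circulation pump A", "sku": "100", "shop": "Example Shop"},
    {"name": "Circulation pump A", "sku": "101", "shop": "Example Shop"},
]
titles = render_titles(products)
print(titles)
print(audit_titles(titles))
```

In this example the two products share a name, and only the vendor code keeps their titles unique.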
Eliminating missing meta tags errors
Write meta tags for pages that lack them, according to the website crawl data and Search Console data.
Eliminating short meta tags errors
Write meta tags for pages with short descriptions, according to the website crawl data.
Eliminating duplicate meta tags errors
The templates should solve the problem of duplicate meta tags. That is, they must contain all the necessary variables so that, once applied, every page gets a unique title and description.
Deleting the keywords attribute
The keywords meta tag no longer has any value for website ranking, so you can delete it.
23. Add headers <h1> – <h6>
When conducting an SEO audit of a website, it is also important to make sure the headers work properly.
Formation of correct h1-headings for pages
Check that h1 headers exist and are informative. If they do not exist, create them. If h1 headers exist but are not informative, change them.
An example of an incorrect header.
If you open the page of circulation pumps and see the heading “Circulation”, it is incorrect. The search engine may misinterpret the header, because it is unclear what this word relates to.
Forming templates for product filter pages
When you open product filter pages for indexing, create unique headers for them. To do this, create templates for all the product filter pages that automatically generate unique first-level headings. If necessary, you can also create templates for h2 headers.
Elimination of non-informative <h1> – <h6> headers
Select different types of pages (main, category, subcategory, filter, product, articles, news, etc.), and check each type with the Chrome plug-in. If there are uninformative headers in the h1 – h6 tags, replace the heading tags with elements styled through CSS.
Arranging the headers in the correct order
The website should have the correct heading structure: first h1, then h2, h3, etc. If this rule is not met, create a task for a programmer to fix it.
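The heading order can also be checked with a short script instead of a browser plug-in. This sketch uses Python's standard html.parser; the names are hypothetical, and treating a skipped level (for example, h1 followed directly by h3) as a problem is an assumption about what "correct order" means here.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of every h1-h6 tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html: str) -> list[str]:
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    problems = []
    if 1 not in levels:
        problems.append("no h1 heading")
    if levels and levels[0] != 1:
        problems.append(f"first heading is h{levels[0]}, not h1")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:  # a level was skipped
            problems.append(f"h{previous} is followed by h{current}")
    return problems


print(heading_problems("<h2>Pumps</h2><h1>Catalog</h1><h3>Specs</h3>"))
```

An empty result means the structure passed these checks; whether each heading is informative still has to be reviewed by hand.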
24. Create and set up a sitemap.xml file
A sitemap is an obligatory attribute of multi-page, actively promoted web resources. A correctly created sitemap makes page indexing faster and more accurate.
A sitemap file is an XML file that lists only the URLs of the site that you want indexed, together with the metadata associated with each URL (frequency of changes, its priority within the website, etc.).
The sitemap should contain links only to important unique pages, each listed once (that is, the address of a particular page should not be repeated), taking into account the selected main mirror.
You can read more about the sitemap format in Google Console Help or on Sitemaps.org.
If the site has several subdomains, create a separate sitemap for each of them. If the site has several language versions, optimize the sitemap for each of them.
Checking the sitemap presence
If the sitemap is not available at the standard sitemap.xml address, the link to it can be found as follows:
- Enter this query in Google: site:mysite.com inurl:sitemap;
- Check the path to the sitemap in the robots.txt file;
- Check the “Crawl – Sitemaps” tab in Google Webmasters.
If none of these ways helps to find a sitemap, you have to create it.
A task for creating a multi-level sitemap
If the site has more than 50,000 pages, you must create a multi-level sitemap. The logic is to create a common sitemap that contains links to all the other sitemaps, each with a maximum of 50,000 pages.
Implementation example
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
  <sitemap>
    <loc>https://www.site.com/static/sitemap/www.site.com/-1.xml</loc>
    <lastmod>2017-12-28</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.site.com/static/sitemap/www.site.com/-2.xml</loc>
    <lastmod>2017-12-28</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.site.com/static/sitemap/www.site.com/-3.xml</loc>
    <lastmod>2017-12-28</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.site.com/static/sitemap/www.site.com/-4.xml</loc>
    <lastmod>2017-12-28</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.site.com/static/sitemap/sitemap_page_brand.xml</loc>
    <lastmod>2017-12-28</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.site.com/static/sitemap/sitemap_page_filter.xml</loc>
    <lastmod>2017-12-28</lastmod>
  </sitemap>
</sitemapindex>
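An index like this is usually generated, not written by hand. The sketch below shows one possible way to do it with the Python standard library; the function names and the child file names in the usage note are hypothetical, and the optional xsi attributes are left out for brevity.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # limit of the Sitemaps.org protocol


def split_into_sitemaps(urls, limit=MAX_URLS_PER_FILE):
    """Split a flat URL list into chunks that fit into one sitemap file each."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def build_sitemap_index(child_locations, lastmod):
    """Return the XML text of a sitemap index pointing at the child sitemaps."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in child_locations:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = location
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")
```

For example, 120,000 URLs are split into three chunks; write each chunk to its own file (say, sitemap-1.xml to sitemap-3.xml) and pass those three addresses to build_sitemap_index.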
A task to create an image sitemap
Here, you can read about image sitemaps.
Prioritizing and Scanning Frequency
It is important to include the following information into the sitemap:
- last modification date of the page (this parameter helps Google understand which pages should be reviewed again);
- page scanning frequency;
- scanning priority compared to other pages.
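In the sitemap format, these three values map to the lastmod, changefreq and priority elements of each url entry. A minimal sketch of building one entry, assuming the helper name url_entry (ours) and a sample page address:

```python
import xml.etree.ElementTree as ET

# Values allowed for <changefreq> by the Sitemaps.org protocol.
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def url_entry(loc, lastmod, changefreq, priority):
    """Build one <url> element with the three optional metadata fields."""
    if changefreq not in CHANGEFREQ_VALUES:
        raise ValueError(f"unknown changefreq: {changefreq}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    entry = ET.Element("url")
    ET.SubElement(entry, "loc").text = loc
    ET.SubElement(entry, "lastmod").text = lastmod  # W3C date, e.g. 2017-12-28
    ET.SubElement(entry, "changefreq").text = changefreq
    ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return entry
```

Validating the values before writing them helps to catch typos such as "week" instead of "weekly", which would make the entry invalid.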
Check whether pages are closed from indexing
Export all the sitemap pages into one file and load them into Screaming Frog. The program will show the server response codes of these pages and check whether they carry attributes that prohibit indexing or a canonical pointing to another page. If any pages have one of these problems, replace them in the sitemap with working ones. It is better to create an automatically updated sitemap that contains only pages that return a 200 code, are available for indexing, and have a self-referencing canonical.
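The filtering rule for an automatically updated sitemap can be sketched as follows. The record layout (the url, status, robots and canonical keys) is a hypothetical format chosen for this example, not the real column names of a crawler export, so map your own data to it first.

```python
def sitemap_ready(pages):
    """Keep URLs that return 200, are indexable and have a self-referencing canonical."""
    keep = []
    for page in pages:
        # A page without a canonical tag is treated as pointing to itself.
        canonical = page.get("canonical") or page["url"]
        if (
            page["status"] == 200
            and "noindex" not in page.get("robots", "").lower()
            and canonical == page["url"]
        ):
            keep.append(page["url"])
    return keep
```

Everything that does not pass the filter is a candidate for the task list: redirects, error pages, noindex pages and pages whose canonical points elsewhere.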
25. Analyze the website structure
Analyze the compliance of the assembled structure with the Bing and Google recommendations for generating URLs. Create technical assignments for programmers to fix all errors.
26. Analyze the page index
Go through the directories and analyze the index. Identify those pages that fall into the supplemental index. Create a task to fix the errors.
The basic commands that help in analyzing the index are the following:
- info: (shows if there is a landing page in the index);
- inurl: (shows a separate directory or a group of pages of a single type);
- -inurl: (excludes individual directories or groups of same-type pages from the whole index).
To search only on your site, start the query with the site: operator followed by the name of your website.
An example: site:www.site.com inurl:blog
In Google, it’s best to look at the index for individual directories. This way, you can find the pages that the search engine does not show when you analyze the entire site.
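When a site has many directories, the queries can be composed in bulk and pasted into the search box one by one. A small sketch; the function name index_query is ours, and it only builds the query string without sending any requests.

```python
def index_query(domain, include=None, exclude=()):
    """Compose a search query from the site:, inurl: and -inurl: operators."""
    parts = [f"site:{domain}"]
    if include:
        parts.append(f"inurl:{include}")
    # Every excluded directory gets its own -inurl: operator.
    parts.extend(f"-inurl:{item}" for item in exclude)
    return " ".join(parts)
```

Calling index_query("www.site.com", include="blog") gives the example query above, and passing exclude=["blog", "news"] shows the rest of the index without these two directories.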
27. Use Google Search Console
Use Google Search Console to analyze website errors.
It is recommended to check the following:
Indexing status
Compare the number of useful pages on the site with the number of pages in the index. If there is a significant discrepancy, identify the reasons and fix the problem.
Scanning statistics
If the number of pages scanned per month is less than the total number of useful pages, find out the reason and prepare a task to solve the problem.
Structured data
Eliminate errors in the structured data markup of the website. It is better to check this point after the programmer has implemented the markup.
HTML optimization
Analyze the causes and find the patterns of inconsistencies in meta tags. Create a task to fix these errors.
Targeting by language or country
Be sure to set the relevant promotion region for generic domains (.com, .net, .org, etc.), including their subdomains and individual folders of the main site.
Mobile usability
If the website is available for indexing, then it’s best to use Google Webmaster to check the browsing experience on mobile devices. If the website is closed for indexing, check it with PageSpeed Insights or a Chrome plug-in.
Blocked resources
Ensure that this tab lists no scripts, styles or other files that are directly responsible for the functionality or visual appearance of the website. If you find any, open them to crawlers by editing the robots.txt file.
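The same check can be run locally before the report is updated. The sketch below relies on urllib.robotparser from the Python standard library; the function name blocked_resources and the sample paths are ours. Note that this parser follows the original robots.txt rules and does not understand Google-style wildcards (* and $), so treat the result as a first approximation.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the resource URLs that robots.txt forbids for the given crawler."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]
```

Pass it the robots.txt body and the list of script, style and image URLs collected from your pages; every URL in the result needs an allowing rule or the removal of the blocking one.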
Scanning problems
Analyze existing errors, identify patterns and create a task to eliminate these errors. The most common errors are 500 (the server failed to process the request) and 404 (non-existent pages that are still linked from the website code).
Googlebot
It is important that Googlebot sees the website the same way users see it. If there are elements closed for indexing, edit the robots.txt file.
Sitemap files
Add a sitemap and check that it works correctly. All pages in the sitemap should be open for indexing and return a 200 server response code. Test the sitemap and analyze any errors. If some sitemap pages are missing from the website index, export these pages to a file and look for patterns in the errors. You can also compare the number of pages in the sitemap with the number of them in the index. If fewer pages are indexed than submitted, find out why and fix the problem.
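Looking for patterns among the missing pages is easier when they are grouped by section. A minimal sketch, assuming you have two plain lists of URLs (from the sitemap and from the index); the function name missing_by_section is ours, and the "section" is simply the first path segment of the URL.

```python
from collections import Counter
from urllib.parse import urlsplit


def missing_by_section(sitemap_urls, indexed_urls):
    """Count sitemap URLs absent from the index, grouped by first path segment."""
    missing = set(sitemap_urls) - set(indexed_urls)
    counts = Counter()
    for url in missing:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # The home page has no segments and is counted under "/".
        counts["/" + segments[0] if segments else "/"] += 1
    return counts
```

A section with an unusually high count is the first place to look for a common cause, such as a template with noindex or a wrong canonical.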
URL parameters
Make dynamic sorting pages, filter pages and similar ones unavailable for scanning.
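To see which crawled URLs fall under this rule, you can flag those that carry sorting or filtering parameters. This is a sketch only: the parameter names in BLOCKED_PARAMS are hypothetical placeholders, so replace them with the ones your website really uses.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameter names; adjust to the ones used on your website.
BLOCKED_PARAMS = {"sort", "order", "filter"}


def has_blocked_params(url, blocked=BLOCKED_PARAMS):
    """Return True if the URL query contains a sorting or filtering parameter."""
    query = urlsplit(url).query
    # keep_blank_values=True also catches parameters with empty values (?sort=).
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(name.lower() in blocked for name, _ in pairs)
```

URLs for which the function returns True are the ones to close from scanning.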
Security issues
Analyze all the pages for security issues and a possibility of hacking.
28. Additional aspects
In addition to the above, you can include a number of additional tasks in the technical audit of your site.
Same IP websites
Check which websites share an IP address with your website. If the same IP is used by sites that distribute spam, openly sell links, cover banned topics, or belong to affiliate or satellite networks, buy a dedicated IP address for your project.
Making the text unavailable for copying
Add the oncopy="return false" attribute to the <body> tag. It disables copying content from the browser when scripts are enabled, which is said to reduce the amount of content copying by 26%.
Setting up favicon links
If the website does not have a favicon, create a task for programmers to add one. The .ico format is the safest choice for this image, because all browsers support it.
29. Optimize the website subdomains
If you find subdomains on the website, fully optimize them as well. For each subdomain, set up an individual sitemap (sitemap.xml) and robots.txt file, and create redirect and canonical rules. The only difference is that you have to look at how the pages are generated and locate the duplicate content.
30. Learn more about optimizing your website
Here you can find 45 tips on how to create a successful website.
This checklist will help you analyze the optimization of your website on your own.