Are Your URLs Accessible to Search Engines?
Organic search engine traffic is one of the leading sources of visitors to most websites, climbing to 53% of all site traffic in 2019. But many site owners don’t realize that search engines can have trouble finding their pages and recommending them to searchers for purely technical reasons. It’s important to check that the URLs of your most important pages are visible to search engines. Using the right keywords, metadata, and tags helps search engines index your pages.
Search engines send bots to crawl as many web pages as possible and add them to their index. These bots request and download URLs much as browsers do, but with some key differences. As a result, a search engine may not fully read a page that humans can visit normally, and that page won’t get indexed. Search engines also have a ‘crawl budget’ that limits how many pages they crawl on one site, so it’s essential to focus their attention on your most important pages and keep them from spending crawl budget on extraneous pages.
Avoid confusion with duplicate URLs
Sites may use multiple URLs for the same or similar pages, and search engines can’t always tell them apart and direct users to a single version. Several tools exist to tell search engines what purpose a page serves and which page is the ‘definitive’ version that they should index. The <link rel="canonical"> element, also called the canonical link, plays a major role in steering search engines from non-canonical pages toward the version that is intended to be indexed. Placing it in the <head> of each alternate version, as well as on the canonical page itself, with its href set to the canonical URL, points search engines to that URL.
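As a rough illustration of the paragraph above, the following Python sketch pulls the canonical URL out of a page's HTML using only the standard library. The names CanonicalFinder and find_canonicals, and the example.com URL, are made up for this example. For brevity it does not check that the tag sits inside the <head>.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before testing.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return a list of canonical hrefs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical alternate version of a page that points at its canonical URL.
sample_page = """
<html><head>
  <title>Blue widgets</title>
  <link rel="canonical" href="https://www.example.com/widgets/blue">
</head><body>...</body></html>
"""

print(find_canonicals(sample_page))
```

An empty list means the page declares no canonical at all, and a list with more than one entry means it declares several.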
Sites should check that they use canonicals for all potential versions of their pages, such as mobile versions, HTTP and HTTPS versions, www vs. non-www versions, foreign-language versions, and more. They should also ensure that no page carries more than one canonical; otherwise search engines may ignore them all. Other ways to direct search engine bots to the canonical page include listing the right URL in sitemaps and creating 301 redirects from alternate pages, though too many redirects can cut into the crawl budget.
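One way to approach this check is to list the scheme and host variants of a page and compare the canonical each one declares. The sketch below assumes the canonical hrefs have already been collected for each variant; url_variants, audit_canonicals, and the example.com URLs are hypothetical, and it covers only the HTTP/HTTPS and www/non-www cases.

```python
from urllib.parse import urlsplit, urlunsplit


def url_variants(url):
    """Return the https/http and www/non-www variants of a URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare_host = host[4:] if host.startswith("www.") else host
    variants = []
    for scheme in ("https", "http"):
        for candidate_host in ("www." + bare_host, bare_host):
            variants.append(
                urlunsplit((scheme, candidate_host, parts.path, parts.query, ""))
            )
    return variants


def audit_canonicals(declared, expected):
    """Flag variants whose canonical is missing, duplicated, or wrong.

    declared maps each variant URL to the list of canonical hrefs found on it.
    """
    problems = {}
    for url, canonicals in declared.items():
        if len(canonicals) == 0:
            problems[url] = "missing canonical"
        elif len(canonicals) > 1:
            problems[url] = "multiple canonicals"
        elif canonicals[0] != expected:
            problems[url] = "points elsewhere"
    return problems


# Made-up audit data: what each variant of one page declares.
expected = "https://www.example.com/widgets/blue"
variants = url_variants(expected)
declared = {
    variants[0]: [expected],
    variants[1]: [expected],
    variants[2]: [],
    variants[3]: [expected, "https://example.com/widgets/blue"],
}

for url, problem in audit_canonicals(declared, expected).items():
    print(url, "->", problem)
```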
Allow pages to be indexed
Another reason pages don’t appear in search results is that they are set to block search engine bots from finding and indexing them. This usually happens unintentionally, so it’s necessary to review pages to see if they have been given the wrong setting. The robots meta tag in a page’s head can include a noindex directive that tells search engines not to index the page. Additionally, the robots.txt file lists URL paths on the site and tells automated bots which ones they may crawl and which they may not. Both should be reviewed to make sure pages are not blocked from indexing unintentionally.
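Both checks can be sketched with Python's standard library: urllib.robotparser evaluates robots.txt rules, and html.parser can read the robots meta tag. The function names, the sample robots.txt rules, and the example.com URLs are assumptions for this example. A real audit would fetch the live robots.txt and page HTML.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and (attr_map.get("name") or "").lower() == "robots":
            content = attr_map.get("content") or ""
            self.directives.update(
                token.strip().lower() for token in content.split(",") if token.strip()
            )


def has_noindex(html):
    """True if the page's robots meta tag contains a noindex directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


def blocked_by_robots(robots_txt, url, user_agent="*"):
    """True if the given robots.txt text disallows crawling of url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents for illustration.
sample_robots = """
User-agent: *
Disallow: /admin/
Disallow: /cart
"""

print(blocked_by_robots(sample_robots, "https://www.example.com/admin/login"))
print(blocked_by_robots(sample_robots, "https://www.example.com/widgets/blue"))
print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))
```

A page that is meant to rank should pass both checks: it should not be disallowed in robots.txt, and it should not carry noindex.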
Use the correct metadata
One way many websites fail to get noticed by search engines is through neglect of metadata. Every web page has the opportunity to embed additional information that search engines can analyze, so there’s no reason to not use it. Page titles and descriptions that match search keywords will lead searchers to the right page. Many sites use the same generic page title on many pages; give them unique titles instead. Create a meta description that summarizes the page and uses keywords; check that no pages are missing the description.
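The title and description checks described above could be automated along these lines. This is a minimal sketch that assumes the HTML of each page is already available as a string; HeadMetadata, audit_metadata, and the sample pages are invented for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadMetadata(HTMLParser):
    """Extracts the <title> text and the meta description from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr_map.get("name") or "").lower() == "description":
            self.description = (attr_map.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_metadata(pages):
    """pages maps URL -> HTML. Returns (duplicate_titles, missing_descriptions)."""
    urls_by_title = defaultdict(list)
    missing_descriptions = []
    for url, html in pages.items():
        meta = HeadMetadata()
        meta.feed(html)
        urls_by_title[meta.title.strip()].append(url)
        if not meta.description:
            missing_descriptions.append(url)
    duplicate_titles = {
        title: urls for title, urls in urls_by_title.items() if len(urls) > 1
    }
    return duplicate_titles, missing_descriptions


# Made-up pages: two share a generic title, one lacks a description.
pages = {
    "/": '<head><title>Acme Store</title>'
         '<meta name="description" content="Shop widgets."></head>',
    "/widgets/blue": "<head><title>Acme Store</title></head>",
    "/widgets/red": '<head><title>Red widgets | Acme Store</title>'
                    '<meta name="description" content="Red widgets in three sizes."></head>',
}

duplicate_titles, missing_descriptions = audit_metadata(pages)
print(duplicate_titles)
print(missing_descriptions)
```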
Images and videos on pages can include several kinds of metadata. Alt text, which appears when the image can’t be displayed, also provides useful information for image searches, and so do captions. Filenames should also be meaningful rather than strings of letters and numbers. Headings in H1 to H6 tags also show how pages are organized.
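A short Python sketch of how these points might be checked on a single page: it lists images with no alt text and prints the heading outline. The class name and sample HTML are hypothetical. Note that an empty alt attribute can be a deliberate choice for purely decorative images, so flagged images need a manual look.

```python
from html.parser import HTMLParser


class MediaAndHeadingAudit(HTMLParser):
    """Records images lacking alt text and the page's H1-H6 outline."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.images_missing_alt = []
        self.outline = []  # list of (level, heading text)
        self._current_level = None
        self._current_text = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img" and not (attr_map.get("alt") or "").strip():
            self.images_missing_alt.append(attr_map.get("src") or "")
        elif tag in self.HEADINGS:
            self._current_level = int(tag[1])
            self._current_text = []

    def handle_data(self, data):
        if self._current_level is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._current_level is not None:
            text = "".join(self._current_text).strip()
            self.outline.append((self._current_level, text))
            self._current_level = None


# Made-up page fragment: the second image has no alt text.
sample_html = """
<h1>Blue widgets</h1>
<img src="/img/blue-widget-front.jpg" alt="Front view of a blue widget">
<h2>Sizes</h2>
<img src="/img/IMG_0042.jpg">
"""

audit = MediaAndHeadingAudit()
audit.feed(sample_html)
print(audit.images_missing_alt)
for level, text in audit.outline:
    print("  " * (level - 1) + text)
```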
Validating accessible URLs for search engines
Running a weekly scan of your website can show whether you have missing, duplicated, or misused canonicals, directives, and metadata. A scan can also point out how to fix these issues so you can take action. You can ask Google to recrawl your pages after changes have been made to ensure that the right versions are in its index. Pages that are configured correctly will be more accessible to search engines, bringing more traffic to your content.