You’ve built a site and you want to add one or more PDF documents to improve the user experience. Maybe you want to offer your brochures for easy download (in a format that can’t easily be edited), or a full version of your paper catalogue to save on sending copies through the post. Whatever the reason, you don’t only want users to view them, you want search engines such as Google to be able to index them too! This way, your PDF pages stand a chance of appearing in the search results for matching queries, which can mean more click-throughs directly to those pages!
PDFs are indexable by search engines, but there are some steps you need to take to get the best performance from these pages. By their very nature, PDFs have limitations when compared to HTML pages, and whilst they can reach the top of the search results, following best practice gives them the best possible chance of doing so!
• Make sure they are clearly linked from other pages on your website, and included in any sitemaps you have in place. This doesn’t only apply to PDFs but to all pages: a clear website structure helps search engines to crawl and index each page, and to return them as ranking pages when users make relevant queries.
• Make sure you include metadata. It’s surprising how many webmasters write meta content for regular site pages, but don’t think to add it for PDFs. In a PDF, this lives in the document properties (such as Title and Subject) rather than in HTML meta tags. Search engines can use this information, the Title in particular, to work out what a document is about and how to present it in the results, so if you don’t add it to your PDFs, you’re missing a trick! Make sure the content you write is original and describes the PDF in question without being too long. Think of it as your “ad”: it doesn’t need to be an essay, and a couple of descriptive lines (preferably with a USP in the description) will suffice.
• If your PDFs don’t contain links to the rest of the website, add them in. Search engines will use these links to help them crawl related pages, and they’re good for users as well, as they make navigation a lot easier. A lot of PDFs suffer from a lack of navigation, forcing users to press the Back button in their browser rather than using links to keep browsing. That’s frustrating for users, and search engines don’t like it either (they’re designed to crawl forwards, not backwards).
• Remember that whilst search engines can index images, they can’t read them, certainly not to the extent they can read plain text. If your PDF is image heavy, try to add more relevant and unique text, to give it a better chance of being indexed properly and classed as an important page of your site. Also, whilst tagged PDFs can carry alternative text for images, it is used far less consistently than Alt text on the web, so images within a PDF are unlikely to be as well optimised as those on an HTML page.
• Make sure the filename of the PDF makes sense, as this will form part of the URL the PDF opens at. Use something simple and descriptive, such as /our-online-catalogue, rather than a mix of letters and numbers that makes no sense to anybody except your developers!
• Keep the file size of your PDF documents as small as you can. This will help them to load more quickly, and page speed is something that Google particularly cares about, having made it part of its ranking algorithm back in 2010.
• If you have a PDF that contains the same information as an HTML page on the site, you need to consider the duplicate content implications. Whilst it can make sense to offer both HTML and PDF versions of the same content, search engines will usually show only one version, and large amounts of duplicated content can work against you. To avoid this, you can either make the content unique or, if this doesn’t fit with your strategy, keep Google away from the PDF altogether by using methods such as the robots.txt file to disallow bots from crawling it.
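Some of the checks above are easy to automate. As a rough sketch in Python (standard library only), the helpers below build a readable filename from a document title, list the PDFs in a folder that are over a size limit, and test whether a robots.txt rule would block a given PDF. The function names, the example.com URLs and the file paths are all made up for illustration, and there is no official size limit for PDFs, so pick whatever threshold suits your own site.

```python
import re
import unicodedata
from pathlib import Path
from urllib.robotparser import RobotFileParser


def pdf_slug(title: str) -> str:
    """Turn a document title into a short, readable PDF filename."""
    # Strip accents, then keep only runs of ASCII letters and digits.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    if not words:
        raise ValueError("title contains no usable characters")
    return "-".join(words) + ".pdf"


def oversized_pdfs(folder: Path, max_bytes: int) -> list[str]:
    """Return the names of PDFs in `folder` that are larger than `max_bytes`."""
    return sorted(
        p.name for p in folder.glob("*.pdf") if p.stat().st_size > max_bytes
    )


def is_blocked(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Check whether the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Hypothetical robots.txt that keeps bots away from one duplicate PDF.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /downloads/duplicate-brochure.pdf
"""

if __name__ == "__main__":
    print(pdf_slug("Our Online Catalogue"))
    print(is_blocked(
        EXAMPLE_ROBOTS,
        "https://www.example.com/downloads/duplicate-brochure.pdf",
    ))
```

Bear in mind that a robots.txt rule only asks well-behaved bots not to crawl the file, so it is worth re-running a check like `is_blocked` whenever the file is edited, to be sure you haven’t shut out a PDF you do want indexed.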