John Mueller commented that Google automatically converts PDFs and similar document types into HTML format for indexing and ranking purposes.
For those who are active in PDF SEO, this won’t be a surprise. Google has converted PDFs into HTML for quite some time, and included a link to the HTML version directly in the search results. So while you may have what you think is an awesome PDF, your users might actually prefer the HTML version and click this link instead.
Do note that for larger files, Google will not convert the entire PDF document into HTML. So important content within the PDF may simply not be indexed because of the PDF's size.
And there’s a lot of evidence that while PDF files can rank very well, they tend to do so for queries where the searcher is looking for something that is typically a PDF, such as a product manual.
If you have a large number of important PDFs indexed that you want to rank well, it is worth considering whether keeping that content within a PDF is the best solution for your users as well. For example, PDFs are hard to open and read on many mobile devices. And PDFs are often much larger than the corresponding HTML version of the page would be, which is also a limitation on slower connections.
PDFs aren’t the only file type that Google converts to HTML for indexing. Google also does it for .doc files (Word documents), .xls files (spreadsheets) and other similar non-HTML content types.
Jennifer Slegg