In this post, I explain what Google means when it refers to a “document”.
What is a Document?
According to various patents, a “document” is any machine-readable and machine-storable work product.
More broadly, according to Google’s Gary Illyes, a document can be:
“[…] any content that Google Search is able to index at the moment” (Search Off the Record)
Examples of Documents
The patents list several examples of documents:
- HTML web pages
- Blog posts
- Image files
- Video files
- Google Docs
- Websites
- Combinations of files
- Newsgroup postings
- Web advertisements
- Yellow pages entries
- Scanned books
- Electronic versions of printed text
- One or more files with embedded links to other files
Most Common Document
The most common type of document is a web page.
Documents Indexable by Google
If we rely on Gary Illyes’ definition, in which a document is any content that Google can index, then Google’s list of indexable file types tells us what can count as a document:
- Adobe Portable Document Format (.pdf)
- Adobe PostScript (.ps)
- Google Earth (.kml, .kmz)
- GPS eXchange Format (.gpx)
- Hancom Hanword (.hwp)
- HTML (.htm, .html, other file extensions)
- Microsoft Excel (.xls, .xlsx)
- Microsoft PowerPoint (.ppt, .pptx)
- Microsoft Word (.doc, .docx)
- OpenOffice presentation (.odp)
- OpenOffice spreadsheet (.ods)
- OpenOffice text (.odt)
- Rich Text Format (.rtf)
- Scalable Vector Graphics (.svg)
- TeX/LaTeX (.tex)
- Text (.txt, .text, other file extensions), including source code in common programming languages:
  - Basic source code (.bas)
  - C/C++ source code (.c, .cc, .cpp, .cxx, .h, .hpp)
  - C# source code (.cs)
  - Java source code (.java)
  - Perl source code (.pl)
  - Python source code (.py)
- Wireless Markup Language (.wml, .wap)
- XML (.xml)
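To make the list easier to use in an audit, here is a minimal Python sketch that checks whether a URL ends with one of the extensions listed above. The function name `has_listed_extension` and the constant `INDEXABLE_EXTENSIONS` are my own, not anything Google provides. It is only a rough heuristic: as the list itself notes, HTML and text can be served under other file extensions, and a matching extension says nothing about whether Google will actually index the page.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the list above (assumption: matching on the
# extension alone is good enough for a quick first pass).
INDEXABLE_EXTENSIONS = {
    ".pdf", ".ps", ".kml", ".kmz", ".gpx", ".hwp", ".htm", ".html",
    ".xls", ".xlsx", ".ppt", ".pptx", ".doc", ".docx", ".odp", ".ods",
    ".odt", ".rtf", ".svg", ".tex", ".txt", ".text", ".bas", ".c", ".cc",
    ".cpp", ".cxx", ".h", ".hpp", ".cs", ".java", ".pl", ".py", ".wml",
    ".wap", ".xml",
}


def has_listed_extension(url: str) -> bool:
    """Return True if the URL path ends with an extension from the list."""
    path = urlparse(url).path  # drops the query string and fragment
    suffix = PurePosixPath(path).suffix.lower()  # case-insensitive match
    return suffix in INDEXABLE_EXTENSIONS


if __name__ == "__main__":
    for url in ("https://example.com/report.PDF", "https://example.com/photo.jpg"):
        print(url, has_listed_extension(url))
```

URLs without an extension, such as most web pages, return `False` here even though they are usually HTML, so treat a negative result as "unknown" rather than "not indexable".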
Which Patents Mention Documents?
Most patents related to search engines mention documents at some point, typically with definitions such as these:
|Term|Definition|
|---|---|
|Document|Any machine-readable and machine-storable work product|
|Link|Any reference to or from a document|
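The two definitions above can be pictured as a small data structure. The following is only an illustrative sketch: the class names, the fields (`doc_id`, `content`, `outlinks`, `inlinks`) and the helper `add_link` are hypothetical and do not come from any patent or from Google.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """Any reference to or from a document."""
    source: str  # hypothetical: identifier of the document containing the link
    target: str  # hypothetical: identifier of the document being referenced


@dataclass
class Document:
    """Any machine-readable and machine-storable work product."""
    doc_id: str          # hypothetical identifier, for example a URL
    content: bytes = b""  # the stored bytes: HTML, PDF, image, and so on
    outlinks: list[Link] = field(default_factory=list)  # references from this document
    inlinks: list[Link] = field(default_factory=list)   # references to this document


def add_link(source: Document, target: Document) -> Link:
    """Record one link on both the source and the target document."""
    link = Link(source.doc_id, target.doc_id)
    source.outlinks.append(link)
    target.inlinks.append(link)
    return link
```

For example, `add_link(page_a, page_b)` stores the same `Link` in `page_a.outlinks` and `page_b.inlinks`, which mirrors the "to or from" wording of the definition.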
- Systems and methods for determining document freshness
- Search Off the Record
- Changing a rank of a document by applying a rank transition function
- Updating search engine document index based on calculated age of changed portions in a document
We have now covered what Google means when it refers to a document.
SEO Strategist at Tripadvisor, ex- Seek (Melbourne, Australia). Specialized in technical SEO. Writer in Python, Information Retrieval, SEO and machine learning. Guest author at SearchEngineJournal, SearchEngineLand and OnCrawl.