We live in a world where the answer to any question you could imagine is just a few keystrokes away. Are avocados a vegetable? Huh, they’re actually a fruit. What is the real name of a hashtag? An octothorpe, cool. Both useless and useful information can be found, but how does Google Search actually work?
Google is used to conduct 8.5 billion searches per day. That’s roughly 98,000 queries sent every single second (8.5 billion divided by the 86,400 seconds in a day), and over 80% of people use Google at least three times a day. However, many of us have never given a second thought to how it actually works.
Well, it’s not all that complicated if you take a bird’s-eye view of things and don’t dive into the incredibly complicated world of web and software design.
A Google search works in three stages: crawling, indexing, and serving results. It starts with crawling.
The first step of the process takes place before you’ve even searched for anything.
Google uses a program called Googlebot, also called a spider because it crawls the web (get it?). It has the arduous task of finding out what pages actually exist on the internet. There is no central registry of all web pages, so Googlebot is constantly looking for new and updated pages to add to its ever-growing list of known pages.
Using an algorithmic process, Googlebot determines which sites to crawl, how often, and how many pages to fetch from each site. Apart from sitemaps (lists of pages that site owners can submit to Google manually), this process is automatic, and Googlebot crawls hundreds of thousands of pages every day.
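To make the "spider" idea a little more concrete, here is a minimal sketch of link-following discovery. It is not how Googlebot is actually built: the toy web in `TOY_WEB`, the `crawl` function, and the `max_pages` budget are all made up for illustration, and no real network requests are made.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to the URLs that page links to.
TOY_WEB = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/c"],
    "https://example.com/b": ["https://example.com/"],
    "https://example.com/c": [],
}


def toy_fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return TOY_WEB.get(url, [])


def crawl(seeds, fetch_links, max_pages=100):
    """Breadth-first discovery: follow links from known pages to find new ones."""
    known = set(seeds)      # every URL discovered so far
    queue = deque(seeds)    # URLs waiting to be crawled
    crawled = []            # URLs in the order they were visited
    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        crawled.append(url)
        for link in fetch_links(url):
            if link not in known:  # only queue pages we have not seen before
                known.add(link)
                queue.append(link)
    return crawled


order = crawl(["https://example.com/"], toy_fetch_links)
print(order)
```

Starting from a single known page, the sketch finds every page reachable by links. The `max_pages` cap is a crude stand-in for the "how many pages to crawl for each site" decision described above.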
The second stage, indexing, is where the majority of Search Engine Optimization work comes in. Not all of it, but a hefty portion.
Now that Google knows a page exists, it tries to understand and determine what the page is actually about. It will analyze titles, alt attributes, content, images, video, and more when deciding what the purpose of a page is.
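As a rough illustration of what "analyzing titles and alt attributes" can look like, the sketch below pulls a few of those elements out of a page using Python's standard `html.parser` module. The `PageSignals` class and the sample HTML are hypothetical, and Google's real analysis is far more involved than this.

```python
from html.parser import HTMLParser


class PageSignals(HTMLParser):
    """Collects the title, h1/h2 headings, and image alt text from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []
        self.alt_texts = []
        self._current = None  # tag whose text is currently being captured
        self._buffer = []     # text chunks seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "h2"):
            self._current = tag
            self._buffer = []
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            if tag == "title":
                self.title = text
            else:
                self.headings.append(text)
            self._current = None


# Hypothetical page used only for this example.
SAMPLE_HTML = """
<html><head><title>Are Avocados a Fruit?</title></head>
<body>
<h1>Avocado Facts</h1>
<img src="avocado.jpg" alt="A halved avocado">
<p>Avocados are botanically a fruit.</p>
</body></html>
"""

signals = PageSignals()
signals.feed(SAMPLE_HTML)
print(signals.title, signals.headings, signals.alt_texts)
```

This is also why descriptive titles, headings, and alt text matter: they are among the easiest parts of a page for a program to read.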
To decide whether a page is canonical (the version that can be shown in search results, rather than a duplicate), Google clusters pages of similar content together for a “Battle Royale” of sorts. All the pages in a cluster are compared to each other so the one that is most representative of the group can be selected.
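Google does not publish how it groups similar pages, so the following is only a sketch of the general idea under some simple assumptions: pages are compared by the overlap of their words (Jaccard similarity), grouped greedily, and the "winner" of each group is the page most similar to the rest. The function names, the `threshold` value, and the sample pages are all invented for this example.

```python
def jaccard(a, b):
    """Share of words two pages have in common (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_pages(pages, threshold=0.6):
    """Greedy grouping: a page joins the first cluster whose first member it resembles."""
    clusters = []
    for url, text in pages.items():
        words = set(text.lower().split())
        for cluster in clusters:
            if jaccard(words, cluster[0][1]) >= threshold:
                cluster.append((url, words))
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([(url, words)])
    return clusters


def pick_canonical(cluster):
    """Pick the page with the highest total similarity to the rest of its cluster."""
    def total_similarity(member):
        return sum(jaccard(member[1], other[1])
                   for other in cluster if other is not member)
    return max(cluster, key=total_similarity)[0]


# Hypothetical pages: three near-duplicates and one unrelated page.
pages = {
    "https://example.com/avocado": "avocados are a fruit not a vegetable",
    "https://example.com/avocado?ref=home": "avocados are a fruit not a vegetable at all",
    "https://example.com/avocado-print": "avocados are a fruit not a vegetable print version",
    "https://example.com/hashtag": "the hashtag symbol is called an octothorpe",
}

canonicals = [pick_canonical(cluster) for cluster in cluster_pages(pages)]
print(canonicals)
```

Here the three avocado pages end up in one group and the plain `/avocado` URL is chosen to represent them, while the hashtag page stands alone in its own group.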
At this point, Google also collects “signals” about the page, including the language of the page, the country the content is local to, the page’s usability, and more.
Indexing Is Not Guaranteed!
Because appearing in a Google search is free, you have to earn it. There are a few things that can prevent your page from being indexed.
Low Content Quality
While Google does hide the technical details (so the system can’t be gamed), it is pretty open about what constitutes quality content:
- Meet Technical Requirements
- Follow Spam Policies
- Be helpful and reliable
- Use headings when you can and include commonly searched terms in them
Google Can’t Read It
The third and final stage is serving results. Once your page has been crawled and indexed, users may be served that page when they enter a query. Google uses a completely different algorithm here to make sure it serves content relevant to the person searching for it.
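Google's real ranking algorithm is not public and weighs far more than matching words, but the basic shape of "look up the query in the index and return the most relevant pages first" can be sketched in a few lines. Everything here (the `score` and `serve` functions, the tiny index, the word-counting relevance measure) is an assumption made for illustration.

```python
from collections import Counter


def score(query, text):
    """Count how often the query's words occur in the page text."""
    counts = Counter(text.lower().split())
    return sum(counts[word] for word in set(query.lower().split()))


def serve(query, index, limit=3):
    """Return up to `limit` matching URLs, most relevant first."""
    scored = [(score(query, text), url) for url, text in index.items()]
    # Drop pages that do not match at all, then sort by score, highest first.
    matches = [item for item in scored if item[0] > 0]
    ranked = sorted(matches, key=lambda item: -item[0])
    return [url for _, url in ranked[:limit]]


# Hypothetical index of already crawled and indexed pages.
index = {
    "https://example.com/avocado": "avocados are a fruit and avocados grow on trees",
    "https://example.com/hashtag": "the hashtag symbol is called an octothorpe",
    "https://example.com/fruit": "a fruit develops from the flower of a plant",
}

results = serve("avocados fruit", index)
print(results)
```

The avocado page comes first because it mentions both query words, the general fruit page comes second, and the hashtag page is not returned at all.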