How Search Engines Work: Crawling, Indexing, And Ranking

Winter quarter is finished! I just turned in my final project (JavaScript) and my brain is oozing onto the floor. Anyway, this quarter I took Intro To JavaScript, Systems Analysis, and a catchall I’m calling “Joomla and Introduction To SEO”. Interestingly, this last class featured Amazon Web Services (AWS) heavily, and I’ll write about that, too. For today, I’ll explore more of my thoughts on SEO.
  • Crawling and Indexing
    • Crawling is when a robot, for instance Googlebot, explores a webpage. One of the key problems with crawlers is simply whether they can find your page. There are some ways to nudge Google in this regard. One valuable way: through Google Search Console you can submit a sitemap or ask Google to crawl a specific page. A related problem: keeping crawlers away from pages you don’t want showing up in search, such as special landing pages, promo-code pages, pages for A/B tests and so forth. The tool for this is the robots.txt file, which tells bots which parts of the site they may and may not crawl. Web developers also need to consider the crawl budget, which is the average number of pages Googlebot will crawl on your site before moving on to another site. A clean information architecture and sitemaps help the crawler spend that budget on the pages that matter.
    • Indexes are where search engines store and process the information found while crawling the internet. Just because a site can be crawled does not mean it will be indexed, or indexed the way you want. Robots can be instructed on this within the page’s meta tags, for example with a robots meta tag whose content is noindex.
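To make the difference between the two controls concrete, here is a minimal sketch in Python using only the standard library. The robots.txt contents, the example.com URLs, and the helper names (load_rules, meta_directives) are all made up for illustration; a real crawler does far more than this.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site (example.com).
ROBOTS_TXT = """\
User-agent: *
Disallow: /promo/
Disallow: /ab-test/
Sitemap: https://example.com/sitemap.xml
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def meta_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


rules = load_rules(ROBOTS_TXT)

# Crawl control: is a bot allowed to fetch these (hypothetical) URLs?
print(rules.can_fetch("Googlebot", "https://example.com/promo/spring-sale"))
print(rules.can_fetch("Googlebot", "https://example.com/blog/seo-notes"))

# Sitemaps listed in robots.txt (site_maps() needs Python 3.8+).
print(rules.site_maps())

# Index control: what does the page's robots meta tag ask for?
page = '<head><meta name="robots" content="noindex, follow"></head>'
print(meta_directives(page))
```

The promo URL is blocked from crawling while the blog URL is not, and the sample page asks to be left out of the index. Note that a bot has to be able to crawl a page in order to see its noindex tag, which is why the two tools are usually not stacked on the same page.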
  • Matching Queries to Content
    • The goal of the search engine: provide the most relevant results to user queries. Its algorithm uses several signals to determine that relevance. One of the main ones is backlinks. Links from other sites signal authority on the subject matter. PageRank is the part of Google’s core algorithm that analyzes the quality of those links. Natural backlinks, the kind other sites add on their own, are the most valuable. Other elements that Google considers: page clicks, time on the page, the bounce rate (percentage of user visits where they only viewed one page), and pogo-sticking (where the user goes right back to the Search Engine Results Page after visiting the target site). Additionally, for localized content, Google uses these elements to rank results: relevance, distance, and prominence. Things like Google Reviews and citations are the biggest influencers of prominence.
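The idea behind PageRank, that a link passes along some of the linking page's authority, can be sketched in a few lines of Python. This is the simplified textbook version (power iteration with a damping factor), not Google's production algorithm, and the four-page site and the names below are invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: links maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}

        # Every page splits its current rank equally among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)

        rank = new_rank
    return rank


# Hypothetical site: everything links to "home"; nothing links to "orphan".
LINKS = {
    "home": ["blog", "about"],
    "blog": ["home"],
    "about": ["home"],
    "orphan": ["home"],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page:8s}{score:.3f}")
```

In this toy graph "home" comes out on top because three pages link to it, while "orphan" lands at the bottom because it has no backlinks at all. That is the same intuition as above: pages earn authority from the pages that point at them.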