Rumors that Apple was crawling the web had everyone buzzing earlier this week. The crawler began showing up in server logs in the middle of October, coming from Apple’s IP range (NetName APPLE-WWNET) with the user agent “Mozilla/5.0 (compatible; Fetcher/0.1)”, and it appears to be written in Go.
The crawler does request robots.txt and is still actively crawling. However, it fetches only the HTML and not the CSS, JavaScript or image files. While many crawlers don’t download CSS or JavaScript, the fact that it was not fetching image files was interesting.
Of course, it had everyone speculating that Apple might be getting into the search engine game, despite its current partnerships to provide search for iOS and Siri. With Apple’s following and control of the iOS default search, the company could definitely make some inroads with its own search engine.
Here is the example log file from Jan Moesen.
17.147.18.35 - - [06/Nov/2014:08:10:29 +0100] "GET /robots.txt HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:10:30 +0100] "GET /robots.txt HTTP/1.1" 200 403 "http://www.catenacycling.com/robots.txt" "Go 1.1 package http"
17.147.18.35 - - [06/Nov/2014:08:10:36 +0100] "GET / HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:10:40 +0100] "GET /en HTTP/1.1" 200 93350 "http://www.catenacycling.com" "Go 1.1 package http"
17.147.18.35 - - [06/Nov/2014:08:11:03 +0100] "GET /robots.txt HTTP/1.1" 200 403 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:11:05 +0100] "GET /en/ride-the-world/routes HTTP/1.1" 200 6379 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:11:09 +0100] "GET /en/ride-the-world/climbs HTTP/1.1" 200 6841 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:11:14 +0100] "GET /en/calendar/events HTTP/1.1" 200 9333 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
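If you want to check your own access logs for the same visitor, a rough sketch like the one below may help. It assumes your server writes the common "combined" log format seen in the excerpt above, that the crawler keeps using the two user agent strings shown there, and that it comes from Apple's 17.0.0.0/8 address block. The function names are made up for illustration, and none of this is an official way to identify the crawler.

```python
import ipaddress
import re

# Combined log format: ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: the crawler comes from Apple's 17.0.0.0/8 block.
APPLE_NET = ipaddress.ip_network("17.0.0.0/8")

# Assumption: the two user agent strings seen in the log excerpt.
CRAWLER_AGENTS = ("Fetcher/0.1", "Go 1.1 package http")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_apple_fetcher(entry):
    """True if the request came from the Apple block with one of the crawler agents."""
    try:
        ip = ipaddress.ip_address(entry["ip"])
    except ValueError:
        return False  # the first field was a hostname or malformed address
    return ip in APPLE_NET and any(a in entry["agent"] for a in CRAWLER_AGENTS)


def fetcher_paths(lines):
    """Collect the paths requested by the suspected crawler, in log order."""
    paths = []
    for line in lines:
        entry = parse_line(line)
        if entry and is_apple_fetcher(entry):
            paths.append(entry["path"])
    return paths
```

To use it, pass the open log file straight in, for example `fetcher_paths(open("access.log"))`, where the file name is whatever your server uses. Matching on both the IP block and the user agent is deliberate, since a user agent on its own is trivial to spoof.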
But it would appear that the reality isn’t nearly as sexy as the possibility that Apple is laying the groundwork for a brand new search engine. The crawler is apparently used by data scientists who apply statistical analysis and machine learning to improve Siri’s accuracy and performance.
@janmoesen apparently crawer is to improve siri: Using statistical analysis and machine learning to improve performance nd accuracy of Siri
— . (@NeedAEditButton) November 7, 2014
@janmoesen found a bug report mentioning the IP, checked Reporter's profile and saw he was a data scientist at apple mentioning siri.
— Kevin Deamandel (@DeamandelK) November 7, 2014
So while we may see Apple one day jump into the search engine game, this crawler isn’t a part of it at this time.
Jennifer Slegg
Andrew Shotland says
I disagree with your conclusion. To me “using statistical analysis and machine learning to improve SIRI” does not imply Apple is not building a search engine. SIRI is a search engine.