Make favicon scraping work independent of Embedly
Currently, the favicon scraping is dependent on Embedly correctly identifying the site's favicon and returning info about it. The process is something like this:
- Someone submits a link topic
- The link is passed to Embedly's Extract API
- When the response comes back from Embedly's API, it's used by a couple of different processes to get information about the link. One of these pulls information like the word count out of the data, and another one of them uses the favicon_url data to download the favicon for that site.
We should be able to download favicons independently of Embedly, especially since it's not uncommon for Embedly to fail to identify a site's favicon. A generic HTML scraper could probably locate it in the page's HTML fairly easily (or even just try /favicon.ico, since that's the conventional default location).
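As a rough illustration of the generic scraper idea, here is a minimal sketch using only Python's standard library. It assumes the page HTML has already been fetched elsewhere. The names `_IconLinkParser` and `find_favicon_url` are hypothetical and not part of the existing code. It looks for a `<link>` tag whose `rel` includes `icon` and falls back to /favicon.ico if none is found.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _IconLinkParser(HTMLParser):
    """Collect href values from <link> tags whose rel includes "icon".

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.icon_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalize.
        attr_map = {name: (value or "") for name, value in attrs}
        # rel is a space-separated token list, e.g. "shortcut icon".
        rel_tokens = attr_map.get("rel", "").lower().split()
        if "icon" in rel_tokens and attr_map.get("href"):
            self.icon_hrefs.append(attr_map["href"])


def find_favicon_url(html, page_url):
    """Return an absolute favicon URL for the page (hypothetical function).

    Uses the first <link rel="... icon ..."> found in the HTML, and
    otherwise falls back to /favicon.ico on the page's host.
    """
    parser = _IconLinkParser()
    parser.feed(html)
    if parser.icon_hrefs:
        # Resolve relative hrefs against the page's own URL.
        return urljoin(page_url, parser.icon_hrefs[0])
    return urljoin(page_url, "/favicon.ico")
```

The returned URL is only a candidate: the download step would still need to handle a 404 or a non-image response, particularly for the /favicon.ico fallback.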