Google Safe Browsing API

Google has put together a really cool API that lets developers query its database of suspected malware and phishing sites. It's the same API that powers the phishing and malware protection in Firefox, and Twitter now uses it as well. The API only exposes the MD5 hashes of suspected URLs, which is nice. Not only are the values consistent in form (an MD5 hash is always 32 hexadecimal characters), but working with hashes eliminates the hassle of dealing with all sorts of URL formats and makes lookups a lot faster. Because you need a Google developer key, there might be some protection against malware authors being alerted that their hosts have been reported, but given the proliferation of Google accounts I doubt this will be much of a hurdle. A lot of malware hosting sites point back to the same IP anyway.
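To make the hash idea concrete, here is a minimal sketch in Python of hashing a URL with MD5 and looking it up in a local set of hashes. The function names, the `suspected_hashes` set, and the example URL are all made up for illustration. I'm also assuming the URL is hashed exactly as written, whereas the real service likely normalizes URLs before hashing, so treat this as a rough picture rather than the actual lookup procedure.

```python
import hashlib


def md5_hex(url):
    """Return the MD5 digest of a URL as 32 lowercase hexadecimal characters."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical local copy of the list: a set of MD5 hex digests.
# The single entry is computed from a made-up URL purely for illustration.
suspected_hashes = {md5_hex("http://malware.example/payload")}


def is_suspected(url, hashes):
    """Check whether a URL's hash appears in the set of suspected hashes.

    Assumption: the URL is hashed as-is, with no normalization step.
    """
    return md5_hex(url) in hashes


print(len(md5_hex("http://example.com/")))  # every digest has the same length
print(is_suspected("http://malware.example/payload", suspected_hashes))
print(is_suspected("http://example.com/", suspected_hashes))
```

Because every entry is a fixed-length string, the list fits naturally into a set or an indexed database column, which is where the fast lookups come from.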

This functionality would be a great addition to almost any web application. By scanning URLs in user submissions you can quickly spot malware and phishing links. Removing these links helps keep your web application from linking to bad sites and potentially propagating the problem.
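Scanning a submission could look something like the following Python sketch. The regular expression, the `find_flagged_urls` helper, and the sample data are my own assumptions for illustration, not part of the API. The URL pattern is deliberately simple and will miss some links, and URLs are hashed as written, with no normalization.

```python
import hashlib
import re

# Deliberately simple pattern (an assumption): http/https links up to the
# next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_flagged_urls(text, suspected_hashes):
    """Return the URLs in text whose MD5 hash is in suspected_hashes."""
    flagged = []
    for url in URL_PATTERN.findall(text):
        # Drop punctuation that commonly trails a link at the end of a sentence.
        url = url.rstrip(".,;:!?)")
        digest = hashlib.md5(url.encode("utf-8")).hexdigest()
        if digest in suspected_hashes:
            flagged.append(url)
    return flagged


# Made-up data: one "bad" URL and a user comment containing two links.
bad_url = "http://phish.example/login"
suspected = {hashlib.md5(bad_url.encode("utf-8")).hexdigest()}
comment = "Nice post! See http://phish.example/login and http://example.com/about"

print(find_flagged_urls(comment, suspected))  # ['http://phish.example/login']
```

A web application could run a check like this when a comment or post is submitted, then strip or hold back anything that comes back flagged.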

After reading about this neat new capability, I decided to investigate how it could be implemented in PHP. The result is an article I posted. I think there's a lot that could be done with this functionality (Drupal module, anyone?) and I'm looking forward to seeing where it goes.