Identifying Malicious Scripts by User Agent
24 March 2011
I noticed an interesting phenomenon today while reviewing my web server logs. I track requests to URLs that include patterns indicative of directory traversal attacks. The most common of these is an attempt to include the file "/proc/self/environ", which can be used for a remote code execution attack (as described in When is LFI Really ACE?). Most of these hits are probably automated scripts trolling around the internet, looking for servers vulnerable to specific exploits. For instance, they could be searching for Joomla installations with modules enabled that present security issues.
My suspicions were confirmed as soon as I began to look through a database I set up to collect information from my web server logs. I've been collecting data from March 9, 2011 to the present (March 24, 2011), a total of 16 days, or just over two weeks. In terms of internet traffic this isn't a large sample, but even in this small snapshot trends begin to emerge. As with all good honeypots, this data collection is more useful for identifying attacker trends than as an intrusion prevention system.
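As a rough illustration of this kind of collection, here is a minimal Python sketch that pulls the relevant fields out of a single log line and keeps it only if the URL looks suspicious. It assumes the server writes the common "combined" log format; the pattern list and the names LOG_LINE, SUSPICIOUS_PATTERNS, and parse_suspicious are hypothetical and are not taken from my actual setup.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern list for illustration; a real list would be longer.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\.\./"),               # directory traversal sequence
    re.compile(r"/proc/self/environ"),  # common local file inclusion target
]

# Assumes the Apache/nginx "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def parse_suspicious(line):
    """Return the tracked fields for a suspicious request, else None."""
    match = LOG_LINE.match(line)
    if match is None:
        return None  # not a combined-format line
    # Decode percent-escapes once so "%2e%2e%2f" is seen as "../".
    url = unquote(match.group("url"))
    if not any(pattern.search(url) for pattern in SUSPICIOUS_PATTERNS):
        return None
    return {
        "ip": match.group("ip"),
        "time": match.group("time"),
        "url": url,
        "agent": match.group("agent"),
    }
```

Each returned dictionary holds the requested URL, the timestamp, the client IP, and the user agent, and could be written to whatever database is convenient.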
In the 16 days of data collection I tracked 728 potentially malicious requests. For each one I recorded the URL requested, the date and time, the IP of the potential attacker, and the user agent. Of the 728 requests, 32% came from clients whose user agent identified them as "libwww-perl". Libwww-perl is the default user agent sent when the request is made using the Perl LWP library (http://search.cpan.org/dist/libwww-perl/). This indicates that a good chunk of the potentially malicious requests are actually Perl scripts. Notably, these aren't even particularly sneaky scripts, as it's quite easy to change the default user agent of LWP from "libwww-perl" to whatever the programmer wishes (Ref: http://search.cpan.org/dist/libwww-perl/lib/LWP/UserAgent.pm).
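A breakdown like this is straightforward to compute from the recorded user agent strings. The Python sketch below is one way to do it, not the query I actually ran; the function name agent_shares is hypothetical, and collapsing each string to the text before the first "/" is a crude simplification that lumps every "Mozilla/..." browser together.

```python
from collections import Counter


def agent_shares(agents):
    """Map each client family to its fraction of all requests, largest first."""
    # Collapse versions such as "libwww-perl/5.837" to "libwww-perl".
    counts = Counter(
        agent.split("/", 1)[0].strip().lower() or "(empty)"
        for agent in agents
    )
    total = sum(counts.values())
    return {name: count / total for name, count in counts.most_common()}
```

Passing in the user agent column from the tracked requests yields a dictionary ordered from the most to the least common client.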
Given this data, it might be useful for site administrators to review their traffic logs and simply deny requests from any client that identifies itself as "libwww-perl". This is certainly not a foolproof method for safety, but it will offer significant defense against the types of automated attacker scripts that search for known vulnerabilities in web servers. Such a defense will certainly not stop a determined attacker, or even a competent malware programmer, but it will stop the legions of automated bots trolling the internet. The only potential downside to such a strategy would be a denial of service to legitimate users who rely on Perl's LWP library to access resources. It is my suspicion, though, that such a situation is rare.
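Denying these clients is usually done in the web server's own configuration, but the idea can be sketched at the application level as well. The example below is a minimal sketch that assumes the site is served by a Python WSGI application; block_user_agents is a hypothetical helper name, and the prefix match is my own choice for illustration.

```python
def block_user_agents(app, blocked=("libwww-perl",)):
    """Wrap a WSGI app so requests from blocked user agents get a 403."""
    blocked = tuple(prefix.lower() for prefix in blocked)

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        # Prefix match so versioned strings like "libwww-perl/5.837" are caught.
        if any(agent.startswith(prefix) for prefix in blocked):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else passes through to the wrapped application untouched.
        return app(environ, start_response)

    return middleware
```

Wrapping an existing application is then a one-line change, for example application = block_user_agents(application). As noted above, this only filters out scripts that leave the default user agent in place.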