Creating a Darknet Sensor Database

2 May 2012

A darknet is a portion of the internet to which no legitimate traffic should be bound. This traditionally means unallocated IP address space, but it can also include traffic destined for ports on hosts that do not listen on those ports. The idea behind monitoring darknet traffic is that it is a good way to spot automated scans of your network. Scans usually traverse IP space in an ordered fashion, attempting to connect to ports and services based on addressability rather than availability. A good example of this type of behavior is an Nmap scan of an IP range. Typically this type of activity is performed as reconnaissance prior to an attack, the goal being to find systems vulnerable to a known exploit and derive a hit-list for later use. With the proliferation of fast connectivity, many attackers bundle discovery with exploitation and simply attack IP addresses and ports without checking whether there is a machine at the IP address or a service listening on the port. This approach speeds up attacks and is typical of worm activity. The advantage for defenders is that by monitoring traffic bound to closed ports or unallocated IP space, security analysts can observe traffic that is known to be hostile, or at least the result of misconfiguration. By observing trends in darknet activity it is possible to discern the types of services being attacked, or at least those that attackers are interested in.

Creating a darknet sensor often involves placing a network tap at a choke point in the network topology and capturing packets destined for unallocated IP space. The logistical challenges in deploying this type of sensor are significant, and security administrators may find it much more convenient, and feasible, to simply deploy machines on unused IP addresses with all ports closed, or to deploy sensors on existing machines to monitor for traffic destined to closed ports. By leveraging existing infrastructure or plugging in old hardware, it is possible to gain insight into darknet activity without having to adjust or interfere with the network topology. This approach greatly reduces the possibility of an adverse side effect on the network while providing much of the same data.

Building a sensor using a typical Linux machine is quite easy. Iptables is the userspace tool for configuring netfilter, the packet filtering framework in the Linux kernel. It performs firewall-like activity, determining whether packets should be accepted, dropped, or forwarded based on iptables rules. Iptables also offers a couple of handy configuration options that allow packets to be logged. By altering iptables rules to log packets that are dropped, we can effectively turn iptables into a darknet sensor.

Although iptables can log dropped packets to syslog, this is less than optimal for data analysis. The flat file format of syslog, combined with the inclusion of all sorts of data unrelated to iptables, makes parsing syslog a hassle when trying to derive security intelligence data.

Rsyslog, an evolution of the traditional Unix syslog, allows for a lot more flexibility when it comes to the handling of log messages. In addition to being able to segregate log messages into different log files based on regular expression pattern matching, rsyslog allows log messages to be written to a MySQL database. This functionality allows log messages to be inserted directly into a medium conducive to running reports and aggregating the data collected by darknet sensors. The MySQL server can be local or remote to the rsyslog service collecting log messages, which allows multiple sensors to log data in a central MySQL server.

Unfortunately the database format for rsyslog messages is less than helpful. The actual text of the message is stored in a single varchar column called 'Message', which must be parsed using string matching in order to derive useful data. String matching on a varchar field is extremely inefficient. Storing darknet data in tables specifically designed to separate port and IP address information into integer fields that can be indexed allows for much faster and more effective querying and reporting. Fortunately, recent versions of MySQL implement triggers. Database triggers are designed to carry out stored procedures when certain criteria are met. For instance, a trigger can be created so that every time a row is inserted into one table, another row is inserted into a separate table. This allows for dynamic reactions to the introduction of new data. By building a trigger on the table in which rsyslog stores its data, we can parse the 'Message' varchar field into useful chunks and insert those values into carefully controlled columns in a separate table of our own design.
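The trigger idea described above can be tried out without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module, so it runs SQLite rather than MySQL, and the table names, trigger name, and sample message are all hypothetical. It only illustrates the pattern of one insert causing a second, parsed insert; it is not the trigger used later in this article.

```python
import sqlite3

# In-memory SQLite database; SQLite stands in for MySQL purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_events (id INTEGER PRIMARY KEY, message TEXT);
CREATE TABLE parsed_events (id INTEGER PRIMARY KEY, dst_port INTEGER);

-- Hypothetical trigger: for every new raw row, store the DPT= value as an integer.
-- SQLite's CAST keeps the leading digits and ignores the rest of the text.
CREATE TRIGGER raw_events_ai AFTER INSERT ON raw_events
BEGIN
  INSERT INTO parsed_events (dst_port)
  VALUES (CAST(substr(NEW.message, instr(NEW.message, 'DPT=') + 4) AS INTEGER));
END;
""")

# Inserting into the raw table fires the trigger, which fills the parsed table.
conn.execute(
    "INSERT INTO raw_events (message) VALUES (?)",
    ("iptables PROTO=TCP SPT=53469 DPT=23 WINDOW=5840",),
)
parsed_rows = conn.execute("SELECT dst_port FROM parsed_events").fetchall()
print(parsed_rows)
```

The application only ever writes to the raw table; the parsed table is maintained entirely by the database, which is the property the rest of this article relies on.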

Implementing Drop Logs

The first step in building a darknet database is to change the iptables rules so that dropped packets are logged. On Red Hat based systems (CentOS, Fedora, etc.), iptables configuration data is typically stored in /etc/sysconfig/iptables. The following is a typical iptables configuration with logging enabled:

# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:ossec - [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
#-A INPUT -m state --state NEW -m udp -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
#-A INPUT -m udp -p udp --dport 17500 -j REJECT -m comment --comment "Disregard Dropbox clients"
#-A INPUT -m udp -p udp --dport 137 -j REJECT -m comment --comment "Ignore NetBIOS clients"
#-A INPUT -m udp -p udp --dport 138 -j REJECT -m comment --comment "Ignore NetBIOS clients"
-A INPUT -d 224.0.0.251 -j REJECT -m comment --comment "Ignore multicast packets"
-A INPUT -d 224.0.0.1 -j REJECT -m comment --comment "Ignore multicast packets"
-A INPUT -d 10.0.0.7 -m state --state NEW -j LOG --log-ip-options --log-prefix "iptables " -m comment --comment "Log dropped packets"
-A INPUT -j REJECT --reject-with icmp-host-prohibited

-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

This configuration allows connections to tcp port 22, along with a few additional lines that ignore background network traffic. Several further rules, including one that would allow udp port 5353, are present but commented out. Notice the line that refers to the destination (-d) address 10.0.0.7, which is assumed to be the IP address of the sensor. This rule logs every new packet addressed to the sensor that was not matched by an earlier rule, and prepends the string "iptables " (note the trailing space) to the syslog entry. The next line rejects all packets that make it to that point in the rule chain. The prepended string is important because we can use it to quickly identify iptables log messages in syslog using simple string matching.
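To make the role of the log prefix concrete, here is a minimal Python sketch that picks iptables entries out of mixed syslog lines and splits them into KEY=value fields. The function name and both sample lines are hypothetical; the second line only imitates the general shape of an iptables log entry, and real entries carry more fields.

```python
import re

LOG_PREFIX = "iptables "               # must match the --log-prefix value in the rule
FIELD_RE = re.compile(r"(\w+)=(\S*)")  # KEY=value pairs; the value may be empty

def parse_iptables_line(line):
    """Return a dict of KEY=value fields, or None if the line lacks the prefix."""
    if LOG_PREFIX not in line:
        return None
    payload = line.split(LOG_PREFIX, 1)[1]
    return dict(FIELD_RE.findall(payload))

# Hypothetical syslog lines, invented for illustration only.
sample_lines = [
    "May  2 13:56:55 sensor sshd[1234]: Accepted publickey for admin",
    "May  2 13:56:55 sensor kernel: iptables IN=eth0 OUT= SRC=28.63.83.104 "
    "DST=105.123.57.111 LEN=60 PROTO=TCP SPT=53469 DPT=23 WINDOW=5840 SYN",
]
parsed = [parse_iptables_line(line) for line in sample_lines]
print(parsed[1]["SRC"], parsed[1]["DPT"])
```

Lines without the prefix are skipped, which is the same job the rsyslog filter performs later in this article before anything reaches the database.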

Implementing Rsyslog

Rsyslog is typically available as a package on most systems. In order to use rsyslog with MySQL you have to install not only the rsyslog package but also the rsyslog-mysql package. Once this is done you must manually create the data structure in your MySQL database. Check your installed version of rsyslog and then download the corresponding source package from rsyslog.com. For instance, if you are using rsyslog 4.6.3 you can download the source from http://www.rsyslog.com/download/page/files/download/rsyslog/rsyslog-4.6.3.tar.gz. Once downloaded, extract the source and examine the MySQL schema in plugins/ommysql/createDB.sql. You can create the new database by running that script directly against the MySQL server using:

$ mysql -u root -p < createDB.sql

This will create a new database named Syslog with two tables named SystemEvents and SystemEventsProperties. Once it is in place you can modify your rsyslog configuration to send syslog entries containing the "iptables " string specified above to the database. On Red Hat based systems the configuration is found in /etc/rsyslog.conf. In addition to the logging rule, you have to enable MySQL database connectivity by adding a single line in the Modules section of the rsyslog.conf file:

# Add MySQL support
$ModLoad ommysql

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages
if $msg contains 'iptables ' then :ommysql:mysql-database-host.tld,Syslog,db-username,password

Once this is done you can restart rsyslog and your iptables entries should show up in your Syslog database.

Creating the Trigger

The entries in the Syslog database are useful, but they are not easy to query, and queries against them become slow as the volume of data grows. To make more efficient use of the database, it is advisable to create a table named "firewall" with the following definition:

Create table firewall (
  id int auto_increment not null primary key,
  src_ip int unsigned not null,
  dst_ip int unsigned not null,
  src_port int unsigned not null,
  dst_port int unsigned not null,
  proto ENUM('tcp','udp','icmp'),
  received_at datetime
);
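The src_ip and dst_ip columns above hold IPv4 addresses as 32-bit unsigned integers, the form produced by MySQL's INET_ATON() and reversed by INET_NTOA(). As a rough illustration of why that helps, the following Python sketch performs the same conversion with the standard ipaddress module and shows that membership in a subnet becomes a plain numeric range check. The helper names and the 10.0.0.0/24 network are assumptions made for the example.

```python
import ipaddress

def ip_to_int(dotted):
    """Dotted quad to unsigned integer, like MySQL's INET_ATON()."""
    return int(ipaddress.IPv4Address(dotted))

def int_to_ip(value):
    """Unsigned integer back to dotted quad, like MySQL's INET_NTOA()."""
    return str(ipaddress.IPv4Address(value))

sensor_int = ip_to_int("10.0.0.7")

# A subnet is a contiguous integer range, so "is this address in 10.0.0.0/24?"
# turns into a comparison that an index on an integer column can serve.
network = ipaddress.ip_network("10.0.0.0/24")
low = int(network.network_address)
high = int(network.broadcast_address)
in_range = low <= sensor_int <= high

print(sensor_int, int_to_ip(sensor_int), in_range)
```

The same check against a varchar column would need string pattern matching on every row.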

This table stores IP addresses as unsigned integers, which are much easier to query and sort than varchar text columns. The only step remaining is to create a trigger so that every time a row is inserted into the SystemEvents table a new entry is added into the firewall table. The following MySQL statement creates this trigger:

DROP TRIGGER IF EXISTS firewall_ai_trig;
DELIMITER $$

CREATE TRIGGER firewall_ai_trig
  AFTER INSERT ON SystemEvents
  FOR EACH ROW
BEGIN
  DECLARE t_src_ip INT UNSIGNED;
  DECLARE t_dst_ip INT UNSIGNED;
  DECLARE t_src_port INT UNSIGNED;
  DECLARE t_dst_port INT UNSIGNED;
  DECLARE t_proto ENUM('tcp','udp','icmp');

  -- Only parse TCP and UDP entries that carry every field we need, in the
  -- order iptables writes them. Entries without ports are skipped.
  IF NEW.Message REGEXP 'SRC=.* DST=.* PROTO=(TCP|UDP) SPT=.* DPT=' THEN
    -- Each value runs from its KEY= marker to the next space. This works for
    -- both TCP and UDP; UDP entries have no WINDOW= field after DPT=.
    SET t_src_ip = INET_ATON(SUBSTRING_INDEX(SUBSTRING(NEW.Message, LOCATE('SRC=', NEW.Message) + 4), ' ', 1));
    SET t_dst_ip = INET_ATON(SUBSTRING_INDEX(SUBSTRING(NEW.Message, LOCATE('DST=', NEW.Message) + 4), ' ', 1));
    SET t_src_port = CAST(SUBSTRING_INDEX(SUBSTRING(NEW.Message, LOCATE('SPT=', NEW.Message) + 4), ' ', 1) AS UNSIGNED);
    SET t_dst_port = CAST(SUBSTRING_INDEX(SUBSTRING(NEW.Message, LOCATE('DPT=', NEW.Message) + 4), ' ', 1) AS UNSIGNED);
    SET t_proto = LOWER(SUBSTRING_INDEX(SUBSTRING(NEW.Message, LOCATE('PROTO=', NEW.Message) + 6), ' ', 1));

    INSERT INTO firewall SET src_ip = t_src_ip, dst_ip = t_dst_ip, src_port = t_src_port, dst_port = t_dst_port, proto = t_proto, received_at = NEW.ReceivedAt;
  END IF;
END$$

DELIMITER ;

Once this is complete, the trigger becomes operational immediately. New firewall drop events are logged by iptables to rsyslog, rsyslog creates new entries in the MySQL database, and the MySQL trigger parses the data into a format better suited to querying and reporting:

mysql> select id, inet_ntoa(src_ip), inet_ntoa(dst_ip), src_port, dst_port, proto, received_at from firewall;
+----+-------------------+-------------------+----------+----------+-------+---------------------+
| id | inet_ntoa(src_ip) | inet_ntoa(dst_ip) | src_port | dst_port | proto | received_at         |
+----+-------------------+-------------------+----------+----------+-------+---------------------+
|  6 | 28.63.83.104      | 105.123.57.111    |    53469 |       23 | tcp   | 2012-05-02 13:56:55 | 
+----+-------------------+-------------------+----------+----------+-------+---------------------+
1 row in set (0.00 sec)
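Once rows like the one above accumulate, the natural next step is to look for trends, such as which ports attract the most attention. The following Python sketch shows one way that aggregation might look on rows fetched from the firewall table. The rows are invented for illustration (only the first mirrors the row shown above), and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical (source address, destination port, protocol) rows, as they might
# be fetched from the firewall table. These are made-up values, not real data.
rows = [
    ("28.63.83.104", 23, "tcp"),
    ("28.63.83.104", 22, "tcp"),
    ("198.51.100.9", 23, "tcp"),
    ("203.0.113.5", 53, "udp"),
]

def top_ports(rows, limit=3):
    """Count hits per (protocol, port) pair and return the most frequent ones."""
    counts = Counter((proto, port) for _src, port, proto in rows)
    return counts.most_common(limit)

top = top_ports(rows)
for (proto, port), hits in top:
    print(f"{proto}/{port}: {hits}")
```

The same count could be produced inside MySQL by grouping on the proto and dst_port columns; the Python form is used here only to keep the example self-contained.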

Conclusion

Parsing darknet data can be a daunting task. Finding a way to capture darknet data in the absence of network taps can also be challenging. Fortunately, tools that accomplish this quite easily already exist in most Red Hat based Linux distributions. By leveraging the logging extension in iptables it is possible to log dropped packets to syslog. By implementing rsyslog with the MySQL module it is straightforward to send these messages to a MySQL database instead of a log file on the filesystem. This can even be used to accomplish remote logging. Finally, using MySQL triggers we can parse new data on the fly as it arrives and reformat it into a structure that is much easier to use for aggregation, reporting, and common queries.