Installation

You can install Logstash either manually or as an APT package. I recommend the manual installation, because the packaged one confines you to /var/log. If your application writes its logs somewhere else, Logstash will not be able to read them.

Manual installation (recommended)

Be careful: the Logstash version should match the ElasticSearch version for better performance.

Get Logstash from the official website: http://logstash.net/
Download it and unpack it into /opt/

cd /tmp
wget https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
tar xzvf logstash-1.4.2.tar.gz
rm logstash-1.4.2.tar.gz
mv logstash-1.4.2/ /opt/
cd /opt
ln -s /opt/logstash-1.4.2 /opt/logstash
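
The versioned directory plus the /opt/logstash symlink is what makes a later upgrade cheap: you unpack the new version next to the old one and repoint the link. Here is a minimal sketch of that idea in a scratch directory. The helper name activate_version and the version 9.9.9 are made up for the illustration, and nothing here touches /opt.

```shell
#!/bin/sh
# Sketch: versioned install directories plus one stable symlink.
# Works in a temporary prefix; "activate_version" and "9.9.9" are placeholders.

# Point <prefix>/logstash at <prefix>/logstash-<version>.
activate_version() {
    prefix=$1
    version=$2
    # -f overwrites an existing link, -n avoids following the old link target
    ln -sfn "$prefix/logstash-$version" "$prefix/logstash"
}

prefix=$(mktemp -d)
mkdir "$prefix/logstash-1.4.2" "$prefix/logstash-9.9.9"

activate_version "$prefix" 1.4.2
readlink "$prefix/logstash"     # path ending in logstash-1.4.2

activate_version "$prefix" 9.9.9
readlink "$prefix/logstash"     # path ending in logstash-9.9.9

rm -rf "$prefix"
```

Anything that refers to /opt/logstash (such as an init script) keeps working after the switch.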

Create configuration directories

mkdir -p /etc/logstash/conf.d
mkdir /etc/logstash/grok
mkdir /etc/logstash/db
chmod -R 777 /etc/logstash

Create the log file

touch /var/log/logstash.log
chmod -R 777 /var/log/logstash.log

Create an init.d script

cd /etc/init.d
vim logstash.sh

Paste the following content:

#!/bin/sh
### BEGIN INIT INFO
# Provides: logstash
# Required-Start: $remote_fs $syslog
# Required-Stop: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Start daemon at boot time
# Description: Enable service provided by daemon.
### END INIT INFO

. /lib/lsb/init-functions

if [ "$(id -u)" -ne 0 ]; then
	# Plain echo: with /bin/sh (dash on Debian/Ubuntu) "echo -e" prints a literal "-e"
	echo " "
	echo "!!!!!!!!!!!!!!!!!!!!"
	echo "!! Security alert !!"
	echo "!!!!!!!!!!!!!!!!!!!!"
	echo "You need to be root or have root privileges to run this script!"
	echo " "
	exit 1
fi

# Where should Logstash keep track of each file?
export SINCEDB_DIR="/etc/logstash/db"

# Logstash params
name="logstash"
logstash_bin="/opt/logstash/bin/logstash"
logstash_conf="/etc/logstash/conf.d/"
logstash_log="/var/log/logstash.log"
pid_file="/var/run/$name.pid"

start () {
	commandOpts="agent -f $logstash_conf --log ${logstash_log} --verbose"
	log_daemon_msg "Starting $name" "$name"
	if start-stop-daemon --start --quiet --oknodo --pidfile "$pid_file" -b -m --exec $logstash_bin -- $commandOpts; then
		log_end_msg 0
	else
		log_end_msg 1
	fi
}
testConfig () {
	echo "#############################"
	echo " Logstash configuration test"
	echo "#############################"
	command="${logstash_bin} -f $logstash_conf --verbose -t"
	$command
}
stop () {
	log_daemon_msg "Stopping $name" "$name"
	start-stop-daemon --stop --quiet --oknodo --pidfile "$pid_file"
}
status () {
	status_of_proc -p $pid_file "" "$name"
}

case $1 in
	start)
		if status; then exit 0; fi
		start
		;;
	stop)
		stop
		;;
	reload)
		stop
		start
		;;
	restart)
		stop
		start
		;;
	status)
		status && exit 0 || exit $?
		;;
	testConfig)
		testConfig
		;;
	*)
		echo "Usage: $0 {start|stop|restart|reload|status|testConfig}"
		exit 1
		;;
esac
exit 0
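
The status action depends on the pid file that start-stop-daemon writes (option -m). To make clear what such a check boils down to, here is a rough sketch. It is not the implementation of status_of_proc, and the helper name pid_is_running is an assumption of this example.

```shell
#!/bin/sh
# Sketch: "is the daemon alive?" based on a pid file.
# pid_is_running is a hypothetical helper, not part of the LSB init functions.

# Return 0 when the pid file exists, is not empty and names a live process.
pid_is_running() {
    pid_file=$1
    [ -s "$pid_file" ] || return 1      # missing or empty pid file
    pid=$(cat "$pid_file")
    # Signal 0 sends nothing; it only checks that the process exists
    # (and that we are allowed to signal it).
    kill -0 "$pid" 2>/dev/null
}

demo_pid_file=$(mktemp)
echo $$ > "$demo_pid_file"              # the current shell is certainly alive
if pid_is_running "$demo_pid_file"; then
    echo "running"
else
    echo "stopped"
fi
rm -f "$demo_pid_file"
```

A stale pid file (the process died without cleaning up) makes the check return "stopped", which is the case you usually want to detect before a start.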

Make the script executable and create symlinks

chmod +x /etc/init.d/logstash.sh
ln -s /etc/init.d/logstash.sh /usr/bin/logstash

Register application as a service (optional)

cd /etc/init.d
update-rc.d logstash.sh defaults

Automatic installation

Source: http://logstash.net/docs/latest/repositories

Add Logstash repository: see Sources#ELK
Install application

apt-get install logstash logstash-contrib

>> Binaries in /opt/logstash

>> Configuration in /etc/logstash/conf.d/

>> Logs in /var/log/logstash/

Create a folder for logstash to keep track of each file

mkdir -p /etc/logstash/db
chmod -R 777 /etc/logstash/

Add a new environment variable to /etc/profile or /etc/environment

Put:

SINCEDB_DIR=/etc/logstash/db

Apply changes:

source /etc/environment
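
One caveat: sourcing a plain KEY=VALUE file only sets the variable in the current shell. It is not exported, so a process started from that shell would not see SINCEDB_DIR. A minimal sketch of a workaround, using a temporary file instead of the real /etc/environment (the helper name load_env_file is an assumption of this example):

```shell
#!/bin/sh
# Sketch: load KEY=VALUE lines and export them to child processes.
# load_env_file is a hypothetical helper; the file used here is a temp file.

load_env_file() {
    set -a          # allexport: every variable assigned from now on is exported
    . "$1"
    set +a
}

env_file=$(mktemp)
echo 'SINCEDB_DIR=/etc/logstash/db' > "$env_file"

load_env_file "$env_file"
sh -c 'echo "child sees: $SINCEDB_DIR"'

rm -f "$env_file"
```

If you put the variable in /etc/profile instead, writing it as "export SINCEDB_DIR=/etc/logstash/db" gives the same result.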

Configuration

GROK

Grok is used to split a log message into fields.

Grok tools

You can create your own grok patterns and test them with the online debugger: http://grokdebug.herokuapp.com/

Apache2 error

Create configuration file:

vim /etc/logstash/grok/apache2ErrorLog.grok

Put the following content:

HTTPERRORDATE %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}
APACHEERRORLOG \[%{HTTPERRORDATE:timestamp}\] \[%{WORD:severity}\] \[client %{IPORHOST:clientip}\] %{GREEDYDATA:message_remainder}
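
To get a feeling for what APACHEERRORLOG extracts, here is a rough shell approximation with sed. It is only a sketch: the regular expression is a hand-written simplification of the grok pattern, the sample line is invented, and the helper name parse_apache_error is an assumption of this example.

```shell
#!/bin/sh
# Sketch: approximate the APACHEERRORLOG fields with a plain extended regex.
# Prints "timestamp|severity|clientip|message_remainder", or nothing if no match.

parse_apache_error() {
    printf '%s\n' "$1" | sed -nE \
        's/^\[([A-Za-z]{3} [A-Za-z]{3} +[0-9]{1,2} [0-9:]{8} [0-9]{4})\] \[([a-z]+)\] \[client ([^]]+)\] (.*)$/\1|\2|\3|\4/p'
}

# Invented sample line in the classic Apache 2.2 error log layout
parse_apache_error '[Fri Nov 21 12:00:47 2014] [error] [client 192.168.1.10] File does not exist: /var/www/favicon.ico'
```

A line that does not follow this layout prints nothing, which is the shell equivalent of a grok parse failure.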

IpTables

Create configuration file:

vim /etc/logstash/grok/iptables.grok

Put the following content:

NETFILTERMAC %{COMMONMAC:dst_mac}:%{COMMONMAC:src_mac}:%{ETHTYPE:ethtype}
ETHTYPE (?:(?:[A-Fa-f0-9]{2}):(?:[A-Fa-f0-9]{2}))
# IPv6 + v4
IPTABLES %{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME} .* IN=(%{WORD:in_device})? OUT=(%{WORD:out_device})? *(MAC=%{NETFILTERMAC})? \ 
SRC=%{IP:src_ip} DST=%{IP:dst_ip} *(LEN=%{INT:pkt_length})? *(TOS=%{BASE16NUM:pkt_tos})? *(PREC=%{BASE16NUM:pkt_prec})? \ 
*(TTL=%{INT:pkt_ttl})? ID=%{INT:pkt_id} .* *(PROTO=%{WORD:protocol}) SPT=%{INT:src_port} DPT=%{INT:dst_port} \
*(WINDOW=%{INT:pkt_window})? *(RES=%{BASE16NUM:pkt_res})? .* *(URGP=%{INT:pkt_urgp})?
# IPv4 only
IPTABLES_V4 %{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME} .* IN=(%{WORD:in_device})? OUT=(%{WORD:out_device})? *(MAC=%{NETFILTERMAC})? \
SRC=%{IPV4:src_ip} DST=%{IPV4:dst_ip} *(LEN=%{INT:pkt_length})? *(TOS=%{BASE16NUM:pkt_tos})? *(PREC=%{BASE16NUM:pkt_prec})? \
*(TTL=%{INT:pkt_ttl})? ID=%{INT:pkt_id} .* *(PROTO=%{WORD:protocol}) SPT=%{INT:src_port} DPT=%{INT:dst_port} \
*(WINDOW=%{INT:pkt_window})? *(RES=%{BASE16NUM:pkt_res})? .* *(URGP=%{INT:pkt_urgp})?
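
The IPTABLES patterns mostly pick KEY=VALUE pairs out of the kernel log line. A quick way to check a log line by hand is a sed one-liner like the sketch below. It is a simplification that only covers the source, destination, protocol and ports; the sample line and the helper name parse_iptables_line are invented for this example.

```shell
#!/bin/sh
# Sketch: pull SRC, DST, PROTO, SPT and DPT out of a netfilter log line.
# Prints "src dst proto sport dport", or nothing when the line has no ports.

parse_iptables_line() {
    printf '%s\n' "$1" | sed -nE \
        's/^.* SRC=([^ ]+) DST=([^ ]+) .* PROTO=([A-Z]+) SPT=([0-9]+) DPT=([0-9]+).*$/\1 \2 \3 \4 \5/p'
}

# Invented sample line (documentation IP ranges)
parse_iptables_line 'Nov 21 12:00:47 gateway kernel: DROPPED IN=eth0 OUT= MAC=00:11:22:33:44:55:66:77:88:99:aa:bb:08:00 SRC=203.0.113.7 DST=192.0.2.10 LEN=60 TOS=0x00 PREC=0x00 TTL=52 ID=54321 DF PROTO=TCP SPT=51234 DPT=22 WINDOW=29200 RES=0x00 SYN URGP=0'
```

Note how the MAC field holds the destination MAC, the source MAC and the EtherType in one colon-separated string, which is what NETFILTERMAC splits apart.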

Fail2ban

Create configuration file:

vim /etc/logstash/grok/fail2ban.grok

Put the following content:

FAIL2BAN %{TIMESTAMP_ISO8601:timestamp} %{JAVACLASS:criteria}: %{LOGLEVEL:level} \[%{WORD:service}\] Ban %{IPV4:clientip}

Log4j

We use some common Log4j patterns. For each of them it is easy to extract the overall log message:

Log4j pattern:

%d %5p %t %c - %m%n

Grok root pattern:

^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{USERNAME:thread} %{JAVACLASS:logger} - *(%{GREEDYDATA:content})

Log4j pattern:

date{dd.MM.yyyy HH:mm:ss:SSS} [%thread] [%-5p] - %30c{0} - %m%n

Grok root pattern:

TIMESTAMP_RTD %{MONTHDAY}.%{MONTHNUM}.%{YEAR} %{HOUR}:%{MINUTE}:%{SECOND}
LOG4J_FR_PATTERN ^%{TIMESTAMP_RTD:timestamp} \[%{USERNAME:thread}] \[%{LOGLEVEL:level}] - .* (%{WORD:class})? - *(%{GREEDYDATA:content})
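
For the first layout (%d %5p %t %c - %m%n), the split into fields can be tried in a shell before touching Logstash. The sketch below is a hand-written approximation of the grok root pattern, not a translation of it; the helper name parse_log4j_line is an assumption, and the sample is a shortened version of one of the application log lines shown further down this page.

```shell
#!/bin/sh
# Sketch: split a "%d %5p %t %c - %m%n" line into its five fields.
# Prints "timestamp|level|thread|logger|content", or nothing if no match.

parse_log4j_line() {
    printf '%s\n' "$1" | sed -nE \
        's/^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]{12}) +([A-Z]+) +([^ ]+) +([^ ]+) - (.*)$/\1|\2|\3|\4|\5/p'
}

parse_log4j_line '2014-11-21 12:00:47,922 TRACE rabbitmq-cxn-2-consumer com.vehco.rtd.smartcard.service.business.AuthClient - Replying to OBC auth data DONE.'
```

The " +" between the fields is there because %5p pads short levels such as INFO with an extra space.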

Super strong expression

To match multiple cases at once:

%d %5p %t %c - %m%n
%d %5p %t %c{1} - %m%n
%d %5p %c - %m%n
%d %5p %c{1} - %m%n

^\s*%{TIMESTAMP_ISO8601:timestamp}\s*%{LOGLEVEL:level} 
     (?:(%{USERNAME:thread} %{JAVACLASS:logger}|%{USERNAME:thread} %{WORD:logger}|%{JAVACLASS:logger}|%{WORD:logger}))
     (?<content>(.|\r|\n)*)

[!] The previous expression must be written on one single line, with a single space between each block.

VEHCO specific patterns

A generic "content" field is not enough: you need to extract information from it.

Here are some examples:

Logs

2014-11-21 12:00:47,922 TRACE rabbitmq-cxn-2-consumer com.vehco.rtd.smartcard.service.business.AuthClient \ 
   - Replying to OBC auth data DONE. Smart-card --> OBC   |   smartcardId 02951DA314000000
2014-11-21 12:38:26,981 TRACE rabbitmq-cxn-2-consumer com.vehco.rtd.smartcard.service.dao.ampq.JmsTopicListener \
   -  [x] Received message 'startAuthentication' for smart-card: 02667AA314000000, consumer smartcardId: 02667AA314000000
2014-11-21 12:38:27,033 TRACE rabbitmq-cxn-2-consumer com.vehco.rtd.smartcard.service.cardreaderlisthandler.cardreader.ReaderLocker \
   - Terminal: OMNIKEY AG CardMan 3121 02 00 | Smart-card ID: 02667AA314000000 # locked
2014-11-21 12:38:30,920 TRACE rabbitmq-cxn-2-consumer com.vehco.rtd.smartcard.service.cardreaderlisthandler.cardreader.ReaderLocker \
   - Terminal: OMNIKEY AG CardMan 3121 02 00 | Smart-card ID: 02667AA314000000 # unlocked

Grok patterns

LOG_SENTENCE (?:[A-Za-z0-9\s\-><\\/.+*\[\]&%'#]+)*
RTD_TERMINAL_SUFFIX Terminal: %{LOG_SENTENCE:rtd_terminal_id} .* *(Smart-card ID: %{WORD:rtd_smartcard_id}) # %{WORD:rtd_terminal_state}
RTD_AUTH_START_SUFFIX %{LOG_SENTENCE:rtd_action}: %{WORD:rtd_smartcard_id}
RTD_AUTH_DONE_SUFFIX %{LOG_SENTENCE:rtd_action}. *(smartcardId %{WORD:rtd_smartcard_id})?


RTD_TERMINAL ^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{USERNAME:thread} %{JAVACLASS:logger} - %{RTD_TERMINAL_SUFFIX}
RTD_AUTH_START ^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{USERNAME:thread} %{JAVACLASS:logger} - %{RTD_AUTH_START_SUFFIX}
RTD_AUTH_DONE ^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{USERNAME:thread} %{JAVACLASS:logger} - %{RTD_AUTH_DONE_SUFFIX}

Logstash - Multilines

The multiline filter must come before the grok filter!

No space start

A new event must NOT start with a space.

# All lines starting with a space belong to the previous event
multiline {
       pattern => "^\s"
       negate => false
       what => "previous"
}
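
To see what this rule does to a log file, the same merge can be imitated with awk. This is only a sketch of the idea, not how Logstash implements it; the helper name merge_indented_lines and the sample lines are invented, and the merged lines are joined with a visible \n marker so that each event prints on one line.

```shell
#!/bin/sh
# Sketch: glue every line that starts with whitespace onto the previous event.
# Each resulting event is printed on one line, with a literal "\n" marker
# where the original line break was.

merge_indented_lines() {
    awk '
        /^[[:space:]]/ && NR > 1 { event = event "\\n" $0; next }   # continuation
        NR > 1 { print event }                                      # previous event is complete
        { event = $0 }                                              # start a new event
        END { if (NR > 0) print event }
    '
}

# Invented sample: one error with an indented stack trace, then a second event
printf '%s\n' \
    'ERROR something failed' \
    '    at com.example.Demo.run(Demo.java:42)' \
    '    at com.example.Demo.main(Demo.java:7)' \
    'INFO recovered' | merge_indented_lines
```

The four input lines come out as two events.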

Java exceptions

This will make all exceptions belong to the previous event.

# All exceptions belong to the previous event
multiline {
       pattern => "(([^\s]+)Exception.+)|(at:.+)"
       negate => false
       what => "previous"
}

Only LOG4J logs

If you only expect Log4j logs then you know that each line that does NOT start with a %{TIMESTAMP} is NOT a new event.

# All lines that do not start with %{TIMESTAMP} or ' ' + %{TIMESTAMP} belong to the previous event
multiline {
       pattern => "^\s*20[0-9]{2}-"
       negate => true
       what => "previous"
}
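
The "starts with a timestamp" test is easy to try on a real log file with grep before you put it in Logstash. The sketch below anchors the expression to the beginning of the line, so that a date appearing in the middle of a message does not count as a new event. The helper names and the sample lines are invented for this example.

```shell
#!/bin/sh
# Sketch: decide which lines start a new event (optional spaces, then "20YY-").
# starts_new_event and count_events are hypothetical helpers.

starts_new_event() {
    printf '%s\n' "$1" | grep -qE '^[[:space:]]*20[0-9]{2}-'
}

# Count the events in a log read from stdin.
count_events() {
    grep -cE '^[[:space:]]*20[0-9]{2}-'
}

# Invented sample: the second line contains a date, but not at the start
printf '%s\n' \
    '2014-11-21 12:00:47,922 ERROR main com.example.Demo - Something failed' \
    'java.lang.IllegalStateException: released on 2014-11-20' \
    '    at com.example.Demo.run(Demo.java:42)' \
    '2014-11-21 12:00:48,001 INFO main com.example.Demo - Recovered' | count_events
# prints 2
```

Comparing count_events with the number of events that reach ElasticSearch is a cheap sanity check for a multiline setup.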

Logstash - Common services

More information about GeoIP: http://logstash.net/docs/latest/filters/geoip

Grok failure

If your grok expression does not match, the event is tagged with '_grokparsefailure'.

filter {
	# myApplication
	if [type] == "myApp" {
		grok {
                      ...
		}
		# Something went wrong: do something else instead (here, keep the whole message as content)
		if "_grokparsefailure" in [tags] {
			grok {
				patterns_dir => "/etc/logstash/grok"
				match => [
					"message", "(?<content>(.|\r|\n)*)"
				]
			}
		}
	}
}

Apache2

Requirements:

Make sure your logs are in "/var/log/apache2", or adjust the paths
Make sure you are using the COMBINED log format (default in Apache 2.4+)

Logstash configuration extract:

input {
	file {
	    path => [ "/var/log/apache2/access.log", "/var/log/apache2/ssl_access.log", "/var/log/apache2/other_vhosts_access.log" ]
	    type => "apache-access"
	}
	file {
	    path => "/var/log/apache2/error.log"
	    type => "apache-error"
	}
}

filter {
	# ------------------------ Parse services logs into fields ---------------------------
	# APACHE 2
	if [type] == "apache-access" {
		# To process log data (message's content) using some regex or precompiled GROK pattern
		grok {
			match => [ "message", "%{COMBINEDAPACHELOG}"]
		}
		# To extract log's time according to a date pattern
		date {
			match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z"]
		}
		# Extract browser information, if available.
		if [agent] != "" {
			useragent {
				source => "agent"
			}
		}
		if [clientip] != "" {
			geoip {
				source => "clientip"
				target => "apache_clientip"
				add_tag => [ "geoip" ]
			}
		}
	}

	if [type] == "apache-error" {
		grok {
			match => [ "message", "%{APACHEERRORLOG}"]
			# Directory where to find the custom patterns
			patterns_dir => ["/etc/logstash/grok"]
		}
		if [clientip] != "" {
			geoip {
				source => "clientip"
				target => "apache_clientip"
				add_tag => [ "geoip" ]
			}
		}
	}
}

output { 
   ...
}

IpTables

Requirements:

Make sure you are logging dropped packets into a dedicated file. See Firewall log dropped

Logstash configuration extract:

input {
	file {
	    path => "/var/log/iptables.log"
	    type => "iptables"
	}
}


filter {
	# IPTABLES
	if [type] == "iptables" {
		grok {
			match => ["message", "%{IPTABLES}"]
			patterns_dir => ["/etc/logstash/grok"]
		}
		# Default 'geoip' == src_ip. That makes it easy to display the DROPPED INPUT :)
		if [src_ip] != "" {
			geoip {
				source => "src_ip"
				add_tag => [ "geoip" ]
				target => "src_geoip"
			}
		}
		if [dst_ip] != "" {
			geoip {
				source => "dst_ip"
				add_tag => [ "geoip" ]
				target => "dst_geoip"
			}
		}
	}
}

output { 
   ...
}

Fail2ban

Logstash configuration extract:

input {
	file {
	    path => "/var/log/fail2ban.log"
	    type => "fail2ban"
	}
}


filter {
	# Fail2ban
	if [type] == "fail2ban" {
		grok {
			match => ["message", "%{FAIL2BAN}"]
			patterns_dir => ["/etc/logstash/grok"]
		}
		# The FAIL2BAN grok pattern stores the banned address in the 'clientip' field
		if [clientip] != "" {
			geoip {
				source => "clientip"
				add_tag => [ "geoip" ]
				target => "ban_geoip"
			}
		}
	}
}

output { 
   ...
}

Syslog

Logstash configuration extract:

input {
	file {
	    path => [ "/var/log/syslog", "/var/log/auth.log", "/var/log/mail.info" ]
	    type => "syslog"
	}
}


filter {
	# SYSLOG
	if [type] == "syslog" {
		grok {
			match => ["message", "%{SYSLOGBASE}"]
		}
	}
}

output { 
   ...
}

Tomcat

... To be done ...

Logstash (Application logs)

Log4J

input {
	file {
	    path => [ "/home/beta3/catalina.base/logs/vehco/*.log" ]
	    type => "myApp"
	}
}

filter {	
	# All lines that do not start with %{TIMESTAMP} or ' ' + %{TIMESTAMP} belong to the previous event
	multiline {
		pattern => "^\s*20[0-9]{2}-"
		negate => true
		what => "previous"
	}
	
	# myApplication
	if [type] == "myApp" {
		grok {
			patterns_dir => ["/etc/logstash/grok"]
			match => [
				"message", "^\s*%{TIMESTAMP_ISO8601:timestamp}\s*%{LOGLEVEL:level} (?:(%{USERNAME:thread} %{JAVACLASS:logger}|%{USERNAME:thread} %{WORD:logger}|%{JAVACLASS:logger}|%{WORD:logger}))(?<content>(.|\r|\n)*)"
				]
			add_tag => "myApp-log"
		}
		# Something went wrong: keep the whole message as content
		if "_grokparsefailure" in [tags] {
			grok {
				patterns_dir => "/etc/logstash/grok"
				match => [
					"message", "(?<content>(.|\r|\n)*)"
				]
			}
		}
	}
}

output { 
   ...
}

VEHCO specific patterns

Now that you have some specific GROK patterns, you need to update your Logstash configuration.

input {
	file {
	    path => [ "/var/log/vehco/*.log" ]
	    type => "vehco-rtd"
	}
}


filter {
	# VEHCO-RTD
	if [type] == "vehco-rtd" {
		grok {
			patterns_dir => ["/etc/logstash/grok"]
			match => [
				"message", "%{RTD_TERMINAL}",
				"message", "%{RTD_AUTH_START}",
				"message", "%{RTD_AUTH_DONE}"
			]
		}
	}
}

output { 
   ...
}

[!] Note 1:

By default grok breaks on the first match (break_on_match => true): it stops processing after the first pattern that matches, so the order of the patterns matters.
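
The "first match wins" behaviour can be imitated in a few lines of shell. The sketch below uses three simplified regular expressions that stand in for RTD_TERMINAL, RTD_AUTH_START and RTD_AUTH_DONE. They are hand-written approximations, not the grok patterns themselves, and the helper name classify_line is an assumption of this example.

```shell
#!/bin/sh
# Sketch: try the rules in order and stop at the first one that matches.
# Each rule is "NAME|extended regex"; the regexes are simplified stand-ins.

classify_line() {
    line=$1
    for rule in \
        'RTD_TERMINAL| - Terminal: .* Smart-card ID: [0-9A-Za-z_]+ # [a-z]+' \
        'RTD_AUTH_START| - .*: [0-9A-Za-z_]+' \
        'RTD_AUTH_DONE| - .*\. .*smartcardId [0-9A-Za-z_]+'
    do
        name=${rule%%|*}        # text before the first "|"
        regex=${rule#*|}        # text after the first "|"
        if printf '%s\n' "$line" | grep -qE "$regex"; then
            echo "$name"
            return 0            # break on match
        fi
    done
    echo "_grokparsefailure"
    return 1
}

classify_line '2014-11-21 12:38:27,033 TRACE rabbitmq-cxn-2-consumer com.vehco.rtd.smartcard.service.cardreaderlisthandler.cardreader.ReaderLocker - Terminal: OMNIKEY AG CardMan 3121 02 00 | Smart-card ID: 02667AA314000000 # locked'
# prints RTD_TERMINAL
```

In this simplified version the terminal line would also satisfy the RTD_AUTH_START rule (because of "Terminal: OMNIKEY"), so putting the most specific rule first is what gives the right answer.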

[!] Note 2:

You can use generic glob expressions (such as /var/log/vehco/*.log) in the path of the file inputs.

Manual commands

The following commands are only here for my personal reference:

cd /opt/logstash/bin
./logstash -f /etc/logstash/conf.d/ -t --verbose

References

Very good webinar from the ElasticSearch team: http://www.elasticsearch.org/webinars/introduction-to-logstash/?watch=1
Very good blog article: https://home.regit.org/2014/01/a-bit-of-logstash-cooking/
Grok online debugger: http://grokdebug.herokuapp.com/
