Postfix body_checks script

This script downloads two lists of URLs that have been spotted in spam mails and reported to http://spamvertised.abusebutler.com and http://www.spamcop.net.

The idea is that a mail containing one of these URLs must be spam and can therefore be rejected.

Run it as a cron job every 5 minutes (or more often if you like).

Beware: the data is supplied by humans and not all URLs are “dangerous”. I have seen legitimate ones like w3.org and symantec.com on the lists.
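
One way to guard against such false positives is to filter the extracted domains through a local whitelist before the map file is built. The following is only a sketch: the function name filter_whitelisted and the whitelist file are assumptions and are not part of the script further down.

```shell
#!/bin/bash
# Hypothetical helper: drop whitelisted domains from a domain list.
# $1 = file with one domain per line
# $2 = whitelist file, also one domain per line
filter_whitelisted() {
        local domains="$1" whitelist="$2"
        if [ -s "$whitelist" ]; then
                # -F fixed strings, -x whole-line match, -v invert, -i ignore case
                grep -F -x -v -i -f "$whitelist" "$domains"
        else
                # No whitelist (or an empty one): pass everything through
                cat "$domains"
        fi
}
```

With a whitelist containing w3.org and symantec.com, only the remaining domains are printed and can be turned into map entries.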

main.cf/master.cf configuration

Insert the following into main.cf:

body_checks = pcre:/etc/postfix/advanced_block_urls

You may want to run a second smtpd instance for outgoing mail, with the body checks switched off, to save resources on your gateway. Put the following into master.cf:

192.168.1.1:10050 inet n        -        n        -        -  smtpd
      -o receive_override_options=no_header_body_checks

This opens a second smtpd daemon listening on port 10050. Replace 192.168.1.1 with your own IP address. When routing mail from the inside through the gateway, just forward it to port 10050 instead of 25. That's it.

The script

The following script extracts the domain from each URL in the two sources. It discards subdomains and directories, and skips incomplete entries (URLs that were too long and have been shortened, as in www.longdomain/andsomemore…).

I have also tried to handle countries that put commercial domains under a second level, like UK (bbcnews.co.uk) and ZA (www.iol.co.za). If you know of others doing the same, just let me know and I will correct the script to get more useful data out of it.
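
The reduction rule described above can be sketched as a standalone function. This is an illustration under assumptions, not the code used in the script: the names reduce_domain and SECOND_LEVEL are made up here, and only co.uk and co.za are taken from the text.

```shell
#!/bin/bash
# Hypothetical helper: reduce a host name to the domain that should be blocked.
# SECOND_LEVEL lists the suffixes that need one extra label; co.uk and co.za
# come from the text, anything else has to be added by hand.
SECOND_LEVEL="co.uk co.za"

reduce_domain() {
        local host="$1" suffix labels s
        # Last two labels, e.g. "co.uk" or "example.com"
        suffix=$(echo "$host" | awk -F. '{ if (NF >= 2) print $(NF-1) "." $NF; else print $0 }')
        labels=2
        for s in $SECOND_LEVEL; do
                [ "$suffix" = "$s" ] && labels=3
        done
        # Print the last $labels labels (or the whole name if it is shorter)
        echo "$host" | awk -F. -v n="$labels" '{
                start = NF - n + 1
                if (start < 1) start = 1
                out = $start
                for (i = start + 1; i <= NF; i++) out = out "." $i
                print out
        }'
}
```

For example, reduce_domain www.bbcnews.co.uk prints bbcnews.co.uk, while reduce_domain a.b.example.com prints example.com.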

Please let me know if it doesn't work or you have good ideas.

#!/bin/bash

DOWNLOAD_DIRECTORY="/etc/postfix"
ABUSEBUTLER_URL="http://spamvertised.abusebutler.com/spamvertised.php?rep=last24"
SPAMCOP_URL="http://www.spamcop.net/w3m?action=inprogress&type=www"
OUT_FILE="bad_url_list"
OUT_FILE_IP="bad_url_list_ip"
PRE_POSTFIX_MAP_FILE="pre_block_urls"
OUT_FILE_FULL_PATH=$DOWNLOAD_DIRECTORY/$OUT_FILE
POSTFIX_MAP_FILE="advanced_block_urls"
POSTFIX_MAP_FILE_FULL_PATH=$DOWNLOAD_DIRECTORY/$POSTFIX_MAP_FILE

#Remove files from the previous run (-f: no error if they do not exist yet)
rm -f "$OUT_FILE_FULL_PATH"
rm -f "$POSTFIX_MAP_FILE_FULL_PATH"
rm -f "$DOWNLOAD_DIRECTORY/$PRE_POSTFIX_MAP_FILE"
rm -f "$DOWNLOAD_DIRECTORY/$OUT_FILE_IP"

#Get URL that consist of IP addresses and remove any ports
lynx -dump $ABUSEBUTLER_URL |egrep -v -i "uptime|abusebutler|spamvertised"|grep http |cut -c 14- | egrep "[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}" | cut -d: -f1 | sort -u > $DOWNLOAD_DIRECTORY/$OUT_FILE_IP

#Get normal URL without IP addresses
lynx -dump $ABUSEBUTLER_URL |egrep -v -i "uptime|abusebutler|spamvertised"|grep http |cut -c 14- | fgrep -v ... | egrep -v "[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}" | sort -u > $OUT_FILE_FULL_PATH

#Strip down to single domain
for x in $( cat $OUT_FILE_FULL_PATH ); do
        SUBDOMAIN=""
        TLD=`echo -n $x | awk -F "." '{elems=split($0, a); print $elems}'` >/dev/null 2>&1
        DOMAIN=`echo -n $x | awk -F "." '{elems=split($0, a);elems-- ; print $elems}'`  >/dev/null 2>&1
         if [ $DOMAIN.$TLD = co.uk -o $DOMAIN.$TLD = co.za ]; then
                SUBDOMAIN=`echo -n $x | awk -F "." '{elems=split($0, a);elems-- ;elems-- ; print $elems}'`
                DOMAIN=$SUBDOMAIN.$DOMAIN
         fi
        echo $DOMAIN.$TLD >> $DOWNLOAD_DIRECTORY/$PRE_POSTFIX_MAP_FILE
done

#Make mapfile from Abusebutler
for y in $( cat $DOWNLOAD_DIRECTORY/$PRE_POSTFIX_MAP_FILE ); do
        echo "/$y/        REJECT Abusebutler body_checks failed" >> $POSTFIX_MAP_FILE_FULL_PATH
done

#Get URL that consist of IP addresses and remove any ports
lynx -dump -nolist $SPAMCOP_URL | sed -n 's/.*http\:\/\/\(.*\).*/\1/p'| sed 's/\/$//' | cut -d/ -f1 | egrep "[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}" | cut -d: -f1 | sort -u >>  $DOWNLOAD_DIRECTORY/$OUT_FILE_IP

#Get normal URL's without IP addresses
lynx -dump -nolist $SPAMCOP_URL | sed -n 's/.*http\:\/\/\(.*\).*/\1/p'| sed 's/\/$//' | cut -d/ -f1 | fgrep -v ... | egrep -v "[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}" | cut -d: -f1 | sort -u > $OUT_FILE_FULL_PATH

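#Start a fresh domain list so the Abusebutler domains are not added again as SpamCop entries
: > $DOWNLOAD_DIRECTORY/$PRE_POSTFIX_MAP_FILE

#Strip down to single domain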
for z in $( cat $OUT_FILE_FULL_PATH ); do
        SUBDOMAIN=""
        TLD=`echo -n $z | awk -F "." '{elems=split($0, a); print $elems}'`  ##>/dev/null 2>&1
        DOMAIN=`echo -n $z | awk -F "." '{elems=split($0, a);elems-- ; print $elems}'` ## >/dev/null 2>&1
         if [ $DOMAIN.$TLD = co.uk -o $DOMAIN.$TLD = co.za  ]; then
                SUBDOMAIN=`echo -n $z | awk -F "." '{elems=split($0, a);elems-- ;elems-- ; print $elems}'`
                DOMAIN=$SUBDOMAIN.$DOMAIN
         fi

        echo $DOMAIN.$TLD >> $DOWNLOAD_DIRECTORY/$PRE_POSTFIX_MAP_FILE


done

#Make mapfile from SpamCop
for b in $( cat $DOWNLOAD_DIRECTORY/$PRE_POSTFIX_MAP_FILE ); do
        b=`echo -n $b |cut -d/ -f1`  >/dev/null 2>&1
        echo "/$b/        REJECT SpamCop body_checks failed" >> $POSTFIX_MAP_FILE_FULL_PATH
done

#Needs to reload Postfix to read the new file
/usr/sbin/postfix reload >/dev/null 2>&1

FIXME Doesn't handle URLs that consist of an IP address. Those are put in a separate file for now and not used.
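
A possible way to put the IP list to use is to turn each address into a map line of its own, with the dots escaped so that they only match a literal dot. The sketch below is untested against a live Postfix. The function name ip_list_to_pcre and the reject text are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print one pcre map line per IPv4 address read from
# the file given as $1. Empty lines are skipped.
ip_list_to_pcre() {
        local ip escaped
        while read -r ip; do
                [ -n "$ip" ] || continue
                # Escape the dots: 192.0.2.10 becomes 192\.0\.2\.10
                escaped=$(printf '%s' "$ip" | sed 's/\./\\./g')
                # \b on both sides keeps 1.2.3.4 from also matching 1.2.3.45
                printf '/\\b%s\\b/        REJECT IP address body_checks failed\n' "$escaped"
        done < "$1"
}
```

The output could be appended to the map file in the same way as the domain entries, for example ip_list_to_pcre /etc/postfix/bad_url_list_ip >> /etc/postfix/advanced_block_urls.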

FIXME You have to reload Postfix every time the files are updated. This is not feasible. The URLs should be put into a MySQL table instead.
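
Short of moving the data into a database, the number of reloads could be reduced by building the map in a temporary file and reloading only when its content has actually changed. The sketch below assumes a hypothetical function install_if_changed and a temporary file name ending in .new; neither exists in the script above.

```shell
#!/bin/bash
# Hypothetical helper: install a freshly built map only when it differs
# from the current one. $1 = new map (temporary file), $2 = live map.
# Returns 0 if the live map was replaced, 1 if nothing changed.
install_if_changed() {
        local new="$1" live="$2"
        if [ -f "$live" ] && cmp -s "$new" "$live"; then
                # Identical content: discard the new file, keep the live one
                rm -f "$new"
                return 1
        fi
        # A rename within one filesystem is atomic, so readers never see
        # a half-written map
        mv -f "$new" "$live"
        return 0
}

# Example wiring (the .new file name is an assumption):
# if install_if_changed /etc/postfix/advanced_block_urls.new /etc/postfix/advanced_block_urls; then
#         /usr/sbin/postfix reload >/dev/null 2>&1
# fi
```

This also avoids the window in which the map file has been removed but not yet rebuilt.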

FIXME Doesn't handle mistyped URLs consisting of partial IP addresses.
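
Partial addresses could at least be recognised and skipped before they reach the IP file. The check below is a minimal sketch; the function name is_full_ipv4 is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: succeed only for a complete dotted-quad IPv4 address
# (four decimal fields, each 0-255). Partial addresses such as 10.1.2 fail.
is_full_ipv4() {
        local ip="$1" octet
        local IFS=.
        # Exactly four groups of one to three digits
        [[ "$ip" =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]] || return 1
        # IFS=. makes the unquoted expansion split the address into its fields
        for octet in $ip; do
                # 10# forces base 10 so a leading zero is not read as octal
                (( 10#$octet <= 255 )) || return 1
        done
        return 0
}
```

A loop over the IP list could then call is_full_ipv4 "$ip" || continue to drop entries such as 192.0.2 or 192.0.2.256.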