Filtering mail in Exim with Python

Fri 10 October 2014

Why you send spam and how you can quit it

Personally, I find that configuring mail systems correctly is one of the skills that demands the most experience from an ordinary sysadmin in a web hosting environment. This is the place where you have to handle the problems of many users with different programs, different network connections (some ISPs do funny things when they see traffic on port 25, not just block it, and MUAs still often use 25 as the default) and different levels of experience. On the other side, you have to work with different mail providers. Some of them work in a nice, clear way. Others make stupid mistakes, don't follow the same standards as you (like rewriting the domain with SRS when you tell the server to forward all your mail) or can be a great pain in the ass once they classify you as a spammer (especially the big providers, like Gmail or Hotmail).

But you aren't a spammer, so you won't be classified as one, right?

Well... Sometimes you become one even if you don't want to. One cause is viruses that steal your users' passwords and use your server to send spam. Luckily, you can fight that. But sometimes you have to find another solution.

In a game where I help as a sysadmin, we use a mailing list to support users. There is a publicly known support address, players send their rants there, and the mailing list resends each rant to all the moderators, who use different mail providers.

What if a spam bot finds your support address? You will send spam to many different accounts on different servers, which is an easy way of getting into trouble. Of course the moderators don't want to read about viagra or other new deals, so they will train their providers to classify such mail as spam. Great, your moderators are teaching the big providers that mail sent by you is mostly spam. Problems guaranteed.

You could try to filter incoming mail by checking whether it is spam, but this is never perfect, and in this situation you especially don't want false positives.

Luckily, this is a pretty specific situation with a nice solution. You know your players and you know their mail addresses. Sometimes you will get mail from somebody who hasn't become a player yet, but in this specific situation you can always ask them to use a form, right?

Python to the rescue

So you basically need to check in Exim whether the sender address exists in the db. But Exim doesn't give you a ready-made interface for that.
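
The check itself is tiny; the hard part is calling it from the mail server. As a rough sketch of the lookup, here is the same kind of query run against a throwaway in-memory SQLite database. SQLite, the `sender_is_known` helper and the example addresses are assumptions made only so the snippet runs on its own; the `users` table and `mail` column are the ones used in the script further down.

```python
import sqlite3


def sender_is_known(conn, sender):
    """Return True if the lowercased sender address is present in users.mail."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM users WHERE mail = ?", (sender.lower(),)
    )
    return cur.fetchone()[0] > 0


# Throwaway in-memory database standing in for the real player database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (mail TEXT)")
conn.execute("INSERT INTO users (mail) VALUES (?)", ("player@example.org",))

print(sender_is_known(conn, "Player@Example.org"))
print(sender_is_known(conn, "spammer@example.net"))
```

The address is passed as a query parameter instead of being pasted into the SQL string, so a strange sender address can't change the query.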

What can we do? Luckily, Exim allows us to run external scripts, and an ACL like this one will do the work for us:

deny
  condition = ${run{/path/to/filter.py "$sender_address" "$local_part"}{no}{${if eq {$runrc}{1}{yes}{no}}}}
  log_message = Mail blocked by python antispam filter
  message = Your mail wasn't recognized by the system. To contact us, please use the form available at https://address.to.form

We just have to add it at the end of the ACL that runs after the recipient is received (likely acl_check_rcpt), just before the statement that finally accepts the mail.
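
To make the exit-code contract explicit, here is a small Python model of how that condition reads the script's return code. It only illustrates the expansion logic and is not something Exim runs; the function name `acl_denies` is made up for this sketch.

```python
def acl_denies(return_code):
    """Model of the ACL condition: True means the deny statement fires.

    ${run} takes its first branch ({no}) when the command exits with 0.
    Otherwise the second branch compares $runrc with 1.
    """
    if return_code == 0:
        return False           # the script said "accept"
    return return_code == 1    # only exit code 1 means "block"


# Exit code 0 accepts, 1 blocks, any other code is not treated as a block.
for code in (0, 1, 2, 127):
    print(code, "deny" if acl_denies(code) else "continue")
```

One consequence worth keeping in mind: an uncaught Python exception also ends the process with exit status 1, so a crash in the script (for example when the database is unreachable) looks like a block to this condition.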

And here is our script:

#!/usr/bin/python

import datetime
import MySQLdb
import sys
import os

MYSQL_USER=''
MYSQL_HOST=''
MYSQL_PASS=''
MYSQL_DB=''

MODERATED_MAILISTS = ['help', 'needrescue']

REQUESTS = ['SELECT COUNT(*) FROM users WHERE mail = %s',
          ]

db = MySQLdb.connect(host=MYSQL_HOST, user=MYSQL_USER,
                     passwd=MYSQL_PASS, db=MYSQL_DB
                     )
cursor = db.cursor()

sender = sys.argv[1].lower()
rcpt = sys.argv[2].lower()

myfile = open("/tmp/blocked_mails.txt", "a")
myfile.write(str(datetime.datetime.now())+" ")

if not sender:
    # we block mails without a proper sender
    myfile.write("BLOCKED EMPTY SENDER: from %s to %s\n" % (sender, rcpt))
    print("yes")
    sys.exit(1)

if rcpt not in MODERATED_MAILISTS:
    # the mail isn't being sent to a moderated mailing list, so we let it through
    myfile.write("NOT MODERATED: from %s to %s\n" % (sender, rcpt))
    print("no")
    sys.exit(0)

# rcpt is one of MODERATED_MAILISTS at this point, so it is safe to put in the command
mailist_members = os.popen("/usr/bin/list_members %s" % rcpt)
mailist_members = mailist_members.read().strip().split('\n')
if sender in mailist_members:
    # sender is a member of the mailing list
    myfile.write("MEMBER OF MAILIST: from %s to %s\n" % (sender, rcpt))
    print("no")
    sys.exit(0)

for request in REQUESTS:
    # query parameters have to be passed as a sequence, hence the one-element tuple
    cursor.execute(request, (sender,))
    if cursor.fetchone()[0]:
        # sender is in the database
        myfile.write("FOUND IN DB: from %s to %s\n" % (sender, rcpt))
        print("no")
        sys.exit(0)

# nobody recognized the sender, so we block the mail
myfile.write("BLOCKED: from %s to %s\n" % (sender, rcpt))
print("yes")
sys.exit(1)
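
The order of the checks is easier to see (and to test) when it is pulled out of the I/O. The following is only a sketch of that idea, not the script we run: `decide`, `exit_code` and the use of exit code 2 for internal errors are assumptions of this example, and the list members and known senders are passed in as plain sets instead of coming from Mailman and MySQL.

```python
MODERATED_LISTS = {"help", "needrescue"}


def decide(sender, rcpt, list_members, known_senders):
    """Return (block, reason), checking in the same order as the script."""
    sender = sender.lower()
    rcpt = rcpt.lower()
    if not sender:
        return True, "BLOCKED EMPTY SENDER"
    if rcpt not in MODERATED_LISTS:
        return False, "NOT MODERATED"
    if sender in list_members:
        return False, "MEMBER OF MAILIST"
    if sender in known_senders:
        return False, "FOUND IN DB"
    return True, "BLOCKED"


def exit_code(sender, rcpt, list_members, known_senders):
    """Map the decision to an exit code: 0 accept, 1 block, 2 internal error.

    Code 2 is an assumption of this sketch. The ACL only denies on 1,
    so an internal error lets the mail through instead of rejecting it.
    """
    try:
        block, _reason = decide(sender, rcpt, list_members, known_senders)
    except Exception:
        return 2
    return 1 if block else 0


members = {"mod@example.org"}
players = {"player@example.org"}
for sender in ("mod@example.org", "player@example.org", "bot@example.net", ""):
    print(repr(sender), decide(sender, "help", members, players))
```

Note that, as in the script, an empty sender is rejected before the list check, so bounce messages (which arrive with an empty envelope sender) are refused for every recipient this ACL statement sees, not only for the moderated lists.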

Where is the catch?

Is this solution ideal? Certainly not. The first problem is that a rejected user gets a standard bounce message with our text somewhere in the middle, and many users probably won't find it. We could instead redirect such messages to another account and autoreply with a more user-friendly message, but then we would lose the ability to tell the sending server that the message was rejected.

The second problem is that this solution isn't the fastest. For every message we have to start a Python script, import all the libraries, connect to the db, run the Mailman scripts and probably run a few db queries. But if you run it on a server that only has to handle a few hundred mails per day, it's definitely good enough.
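
If you want a rough feeling for the fixed cost per message on your own machine, a sketch like the one below times how long it takes just to start an interpreter and import a few standard modules. The helper name `average_startup_seconds` is made up for this example, and the measurement leaves out the database connection and the Mailman call, so treat the result as a lower bound.

```python
import subprocess
import sys
import time


def average_startup_seconds(runs=5):
    """Average wall-clock time to start Python and import a few modules."""
    command = [sys.executable, "-c", "import datetime, os, sys"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.check_call(command)
    return (time.perf_counter() - start) / runs


print("%.3f s per start" % average_startup_seconds())
```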