The fascination with having software automate your mundane tasks never ceases. One such tedious activity is email parsing. Email accounts can be a useful tool for letting people make a request or issue a command that does not require human intervention. One example is the now all too common "unsubscribe" request sent to harmless lists and nefarious spammers. The automated program that handles these requests is called a mailbot. Though they were adopted early, and have since been retired in many cases in favor of web interfaces, mailbots still represent a good technique for performing a number of useful tasks. Nearly anything can be done with enough creativity, good code, and patience.
First, let's talk about the environment I have set up:
Linux (Red Hat 7.3)
CGI version of PHP 4.2.1
The setup is very standard. I haven't done anything fancy to the system or the programs. Everything you need to know I'll explain here.
Let me give you an understanding of the process we are about to go through. When mail comes in it is received by Sendmail and diverted to the appropriate user's inbox (or offsite email address). Sendmail will check the user account to look for a .forward file. If one is found, Sendmail will deliver the mail to each recipient in the file (usually separated by newlines, but commas should work as well). This is where the mailbot gets called. The mail is "delivered" to the mailbot, which reads it in from stdin, parses the header of the mail, and then determines a course of action.
Sounds simple? It actually is. There are only a few catches that I'll guide you through. So let's get to it.
Step 1: Building the bot
A couple of things to note about building a bot:
Your bot is going to get called on EVERY email message that is sent to the account.
Your bot MUST avoid "Returned Mail" loops.
The first thing we need to do is set up your file to execute from the system as a script. You do this just as you would a Perl program. Place the shebang on the first line:
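As a minimal sketch, the top of the script might look like the following. The interpreter path /usr/bin/php is an assumption (check where your PHP binary actually lives), and the -q switch asks the CGI build to stay quiet about its HTTP headers.

```php
#!/usr/bin/php -q
<?php
// Assumed path: change /usr/bin/php to wherever your PHP binary is installed.
// -q (quiet) stops the CGI build from printing HTTP headers before your output.
```

Remember to make the file executable (chmod +x), or the shebang will never be consulted.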
Next you need to open stdin. (The CLI version of PHP that arrives with 4.3 predefines a STDIN constant, so there you do not need to fopen. However, I'm running 4.2.1, so I do.)
$stdin = fopen('php://stdin', 'r');
Set up your initial variables. You do not strictly need all of these, but my needs for the mailbot might change, and having them available means I don't have to rewrite anything should I later require body parsing, other header information, etc.
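A minimal sketch of such a set of starting variables follows. The names are hypothetical, chosen only for illustration; use whatever suits your bot.

```php
<?php
// Hypothetical variable names; adjust to taste.
$headers   = array(); // header name => value, filled in by the parsing loop
$body      = '';      // raw message body, kept in case body parsing is needed later
$in_header = true;    // flips to false once the first blank line is seen
```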
Now we get into the simple loop that will feed the header data into our array. Mail headers are in a very simple format:
HEADERNAME: DATA
Here HEADERNAME is something like "Subject" or "Reply-To", and DATA is the information. Note that the two are separated by a colon and some white space. Unfortunately, you cannot simply split each line on every colon, because the DATA may contain colons too. (In fact it always will for certain headers, such as those relating to time.) My solution is a simple one that hasn't failed, but it could probably be tightened a bit.
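One way to sketch such a loop is shown below. This is not necessarily the original listing: parse_message() is a hypothetical helper that cuts each header line at its first colon only, treats the first blank line as the end of the headers, and folds continuation lines (those starting with white space) into the previous header.

```php
<?php
// Hypothetical helper: splits a raw message stream into headers and body.
function parse_message($stream)
{
    $headers   = array();
    $body      = '';
    $in_header = true;
    $last      = null; // name of the most recent header, for folded lines

    while (($line = fgets($stream)) !== false) {
        if (!$in_header) {
            $body .= $line;
            continue;
        }
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            // The first blank line separates the headers from the body.
            $in_header = false;
            continue;
        }
        if ($last !== null && ($line[0] === ' ' || $line[0] === "\t")) {
            // A line starting with white space continues the previous header.
            $headers[$last] .= ' ' . trim($line);
            continue;
        }
        // Cut at the FIRST colon only, so colons inside DATA survive.
        // Lines without a proper name (such as a leading "From " line) are skipped.
        if (preg_match('/^([^\s:]+):(.*)$/', $line, $m)) {
            $last = $m[1];
            $headers[$last] = trim($m[2]); // a repeated header overwrites the earlier one
        }
    }
    return array($headers, $body);
}
```

In the bot itself you would call it as list($headers, $body) = parse_message($stdin); right after opening stdin.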
At this point you have the complete email message contained in simple data structures. You have parsed the headers and placed them into a hash table (associative array), and are ready to do the final parsing to determine your course of action. One little cleanup step you might want to take before we search the "Subject" header is to format the "Reply-To" into a usable email address. The "Reply-To" is in the form: <email@example.com>
You simply need to remove the brackets. A simple way to do this is as follows:
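A minimal sketch of the bracket removal, using a hypothetical helper named strip_brackets(). As an assumption beyond the form shown above, it also copes with a display name in front of the brackets.

```php
<?php
// Hypothetical helper: returns the bare address from a "Reply-To" value.
function strip_brackets($raw)
{
    // If the value contains <...>, keep only what is inside the brackets.
    if (preg_match('/<([^<>]*)>/', $raw, $m)) {
        return trim($m[1]);
    }
    // No brackets at all: just tidy up the surrounding white space.
    return trim($raw);
}
```

For example, strip_brackets($headers['Reply-To']) turns <email@example.com> into email@example.com.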
You may also want to do additional regular expression checks to ensure the "Reply-To" address is a valid email address. A simple way to do that (no doubt great improvements could be made) is to do the following:
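Here is one deliberately loose sketch of such a check. The helper name looks_like_email() and the pattern are my own assumptions; the pattern only confirms the rough shape of an address and does not attempt full standards compliance.

```php
<?php
// Hypothetical helper: a loose sanity check, not a full address validator.
function looks_like_email($address)
{
    // Something without spaces or @, then an @, then a host name with at least one dot.
    return preg_match('/^[^\s@]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+\z/', $address) === 1;
}
```

If the check fails, the safest move is to exit without replying at all.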
Now we do our decision-making "if" statement. Basically the technique I use to avoid the message loops (which, by the way, can crash a server in no time) is to require the word "unsubscribe" in the subject header. If, for some reason, the message the mailbot sends is returned, the subject will be changed (it would now say something like "Returned Mail"), and the mailbot will not reply. Thus a loop is avoided.
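The decision can be sketched as a small function. The name should_reply() is hypothetical, and the function only decides whether to answer; actually sending the reply (for instance with PHP's mail() function) is left to the caller.

```php
<?php
// Hypothetical helper: decides whether the bot should answer this message.
function should_reply($headers)
{
    if (!isset($headers['Subject']) || !isset($headers['Reply-To'])) {
        return false; // nothing to act on, or nowhere to send the answer
    }
    // Case-insensitive search for the keyword. A bounced message arrives with
    // a different subject, so it fails this test and the loop is broken.
    return strpos(strtolower($headers['Subject']), 'unsubscribe') !== false;
}
```

For this to work, the subject of the bot's own outgoing message should not contain the word "unsubscribe", otherwise an automatic reply quoting that subject could pass the test.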
Finally, we should close the pointer to the $stdin file. (You don't need to do this in 4.3.)
fclose($stdin);
So now we have the mailbot software ready to go. You may consider having it dump information into a logfile for error-checking purposes, since you will not be running this from the command line.
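A logging routine can be as small as the sketch below. The function name bot_log() and the timestamp format are assumptions; the log path is whatever you pass in, and it must be writable by the user the bot runs as.

```php
<?php
// Hypothetical helper: appends one timestamped line to a logfile.
function bot_log($logfile, $message)
{
    $fp = @fopen($logfile, 'a'); // 'a' appends, creating the file if needed
    if ($fp === false) {
        return false; // cannot log; carry on rather than fail the delivery
    }
    fwrite($fp, date('Y-m-d H:i:s') . ' ' . $message . "\n");
    fclose($fp);
    return true;
}
```

Logging the subject and the cleaned "Reply-To" of every message the bot sees makes it much easier to work out later why it did or did not answer.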
Step 2: Installing the bot
Installing your mailbot is a platform-dependent issue, so you may need to review the process your mail delivery software goes through. In our setup, which is pretty standard for most *NIX platforms, we start with a .forward file.
Create a file called .forward (if one does not already exist) in the home directory of the user account you want the mailbot to reside in. You may want to consider making a new user account (e.g. mailbot), as spam or other mail will get filtered through your mailbot if the account is also used for regular mail purposes. In the .forward file place the line:
"|/path/to/bot/mailbot.php"
You want to include the quotes, but change /path/to/bot/ to be the absolute path (do not use ~/) to your mailbot program.
Next you may need to indicate to the Sendmail Restricted Shell (smrsh) that the mailbot program is a valid program to send mail to. An easy way to do this is to add a link in the smrsh directory (/etc/smrsh on Red Hat) to your mailbot program. So, from inside that directory:
ln -s /path/to/bot/mailbot.php
This will create the necessary link to allow your mailbot to be usable in the .forward file. You will know this step is needed if mail you send to your mailbot comes back with a 5.0.0 error.
At this point, your mailbot is up and running! Things to consider are error reporting and performance. It is generally a good idea to have many points where your script will terminate if it runs into problems or encounters an email that it does not need to answer. A general rule of thumb is: do NOT try to answer every email that the bot reads. This could easily lead to loops, especially on invalid return email addresses. Good luck, and enjoy!
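As one example of such an early exit, the sketch below flags messages that are probably automatic. The helper name is_probably_automatic() is hypothetical, and the two checks (a MAILER-DAEMON sender and a "Precedence" header of bulk, junk, or list) are common conventions I am assuming here rather than anything the setup above guarantees.

```php
<?php
// Hypothetical helper: true if the message looks like a bounce or list mail.
function is_probably_automatic($headers)
{
    // Bounces are conventionally sent from a MAILER-DAEMON address.
    if (isset($headers['From']) &&
        strpos(strtolower($headers['From']), 'mailer-daemon') !== false) {
        return true;
    }
    // Bulk and list mail is often marked with a Precedence header.
    if (isset($headers['Precedence'])) {
        $p = strtolower(trim($headers['Precedence']));
        return $p === 'bulk' || $p === 'junk' || $p === 'list';
    }
    return false;
}
```

Calling exit as soon as this returns true keeps the bot from ever composing a reply to another machine.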