Probably the most-used mail client by novice computer users is Outlook Express (called OE on this page and commonly elsewhere), because it is provided for free by Microsoft with all Windows systems. OE is part of the Internet Explorer package, so generally you'll have the same version of OE as you have of Internet Explorer (Microsoft's web browser).
SAProxy, the award-winning open source antispam software, works well with OE. However, there are a few potential needs that run off the beaten track. Since most users of OE aren't computer experts, I've listed here some of the needs you may have, and how to resolve them.
For basic installation of SAProxy with OE, see the manual automatically installed when you install SAProxy—right-click on the SAProxy icon in your system tray (a picture of a piece of paper with a magnifying glass, next to the system time in the bottom right corner of your screen), and choose View Manual. Click on Configuring Mail Clients, then click on Outlook Express. Don't click on the link for Outlook; the setup procedures are a bit different for that program, which is sold by Microsoft as part of its Office software suite.
Note: These instructions were written using OE v6.0 and v5.5. Other versions of OE may have slightly different titles for menu options; if you run into this, please let me know and I'll add them to this tutorial.
Special situations dealt with here:
Using SAProxy with Norton Antivirus
Using SA-Learn to learn from misidentified spam & ham
If you have Norton Antivirus (NAV) 2002 or later (including Norton SystemWorks 2002 or later), the SAProxy setup is standard; follow the steps in the manual (see above to view it). However, earlier versions of NAV set themselves up the same way that SAProxy does, listening on port 110 and intercepting email messages before passing them on to OE. To get NAV 2001 or earlier working together with SAProxy, take the steps below instead of the configuration steps listed in SAProxy's manual (changing the Incoming Mail Server and Account Name settings). If you've already followed the manual's steps, reverse them and make sure you can send and receive messages normally using OE and NAV before continuing.
1. In OE, choose the Tools menu, then Accounts. Choose the Mail tab. Your email account will be listed here; if more than one is listed, choose whichever one you want to protect with SAProxy. Click on Properties.
2. Choose the Servers tab. Incoming Mail Server should read "pop3.norton.antivirus". If it doesn't, and instead has the name of your email provider, then you're using the more recent edition of NAV that operates transparently; use the standard configuration instructions for SAProxy.
3. If you're still with me, choose the Advanced tab. The second box of numbers, titled "Incoming mail (POP3):", should read "110". Change that to "109" (no quotation marks). Click OK. Click Close.
4. Open SAProxy's configuration pages: Right-click on the SAProxy icon in your system tray, then choose Configure from the menu. Choose the Host Map tab.
5. Add a line anywhere in the box as follows:
109 = 127.0.0.1:110
6. Click OK.
That's it! Don't do any of the other configuration steps normally used for SAProxy. You do, however, need to follow the steps titled "Filtering in Outlook Express" in SAProxy's setup page for OE, so that the messages flagged as spam go into a separate folder from your Inbox.
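To make the Host Map entry less mysterious, here is a small Python sketch of how I read it. This is only my interpretation of the line, not SAProxy's own code: the number on the left is the port OE now connects to, and the right side is where SAProxy passes the connection on (NAV, listening on port 110 of your own machine). The function name parse_host_map_line is made up for this illustration.

```python
def parse_host_map_line(line):
    """Split 'LOCAL_PORT = HOST:PORT' into (local_port, host, remote_port).

    Assumption: this is simply how the entry in the steps above appears
    to be laid out; SAProxy's real parser may accept other forms.
    """
    left, right = line.split("=", 1)
    host, port = right.strip().rsplit(":", 1)
    return int(left.strip()), host.strip(), int(port)


local_port, host, remote_port = parse_host_map_line("109 = 127.0.0.1:110")
print(f"OE connects to port {local_port}; "
      f"SAProxy forwards to {host}:{remote_port}")
```

So the chain becomes: OE talks to SAProxy on port 109, SAProxy talks to NAV on port 110, and NAV talks to your mail provider.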
Once that's done, click on the "Send/Recv" button to verify that you're getting messages properly (if you can't see such a button at the top of Outlook Express, choose the Tools menu, Send and Receive, Send and Receive All). If you receive a message, check its headers to see that SAProxy is working: right-click on the message, choose Properties, choose the Details tab, then look at the bottom of the text box for a line that starts "X-Spam-Checker-Version: SpamAssassin" with a version number following. If you don't get a message but do see "No new messages" in the bottom right corner of OE, the connection to your mail server is working. That usually means everything is set up correctly, but it doesn't prove SAProxy is actually in the chain, so check the headers of the next message that arrives. (Even if SAProxy turns out not to be connected, at least you'll receive messages!)
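If you save a message to a file and would rather not hunt through the headers by eye, the same check can be sketched in a few lines of Python. This is just an illustration; the function name and the sample message (including its version number) are invented for the example.

```python
from email import message_from_string


def spamassassin_version(raw_message):
    """Return the X-Spam-Checker-Version value, or None if it is absent."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Spam-Checker-Version")
    if value and value.startswith("SpamAssassin"):
        return value
    return None


# A made-up message; the version number is only a placeholder.
sample = (
    "From: someone@example.com\n"
    "Subject: test\n"
    "X-Spam-Checker-Version: SpamAssassin 2.60\n"
    "\n"
    "Body text.\n"
)

if spamassassin_version(sample):
    print("SAProxy processed this message")
else:
    print("No SpamAssassin header found")
```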
No generalized program will catch all of your spam. This is because what’s spam to you may not be spam to someone else, and because spammers are constantly changing their messages (particularly by using misspelled words to fool systems that search for particular words). SpamAssassin’s method of dealing with this problem is to offer a Bayesian classifier. The basic concept is that SpamAssassin (SA) looks at all the words in the message and checks how often those words are used in messages that are known to be either legitimate (“ham”) or spam. For instance, messages that talk about your sister Sylvia are highly likely to be ham; messages with the hundred misspelled variations of Viagra (Vaigra, V-ia-gra, Viagara, etc.) are virtually guaranteed to be spam. The classifier combines the evidence from all the words and assigns a probability (expressed as a decimal between 0.001 and 0.9999) that the message is spam. The higher the probability, the more points are assigned to the message.
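To give a feel for the idea, here is a toy word-counting classifier in Python. It is a plain naive Bayes sketch with made-up training messages, not SpamAssassin's actual algorithm (SA tokenizes messages and combines word probabilities in its own way); the function names are invented for the example.

```python
import math
from collections import Counter


def train(messages):
    """Count, for each word, how many messages contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-word evidence in log space (naive Bayes, equal priors)."""
    log_odds = 0.0
    for word in set(text.lower().split()):
        # Add-one smoothing so unseen words never give a zero probability.
        p_word_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_word_ham = (ham_counts[word] + 1) / (n_ham + 2)
        log_odds += math.log(p_word_spam / p_word_ham)
    return 1 / (1 + math.exp(-log_odds))


# Tiny invented training sets; real training needs far more mail.
spam = ["cheap vaigra now", "buy v-ia-gra cheap", "viagara cheap deal"]
ham = ["lunch with sylvia tomorrow", "sylvia sent photos", "call sylvia tonight"]
spam_counts, ham_counts = train(spam), train(ham)

for text in ("cheap vaigra", "photos from sylvia"):
    p = spam_probability(text, spam_counts, ham_counts, len(spam), len(ham))
    print(f"{text!r}: probability of spam = {p:.3f}")
```

Words seen mostly in spam push the probability up, words seen mostly in ham push it down, and words seen in neither leave it unchanged.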
If you check the option “Automatically learn from past spam
to recognize new spam,” in the Settings tab of SAProxy’s
Configuration, SAProxy will assume that any message with a point value over 12
is spam, and that words in such a message can be considered spam identifiers. A
message with a point value below .1 is considered ham for the same purposes.
(These default limits can be changed by adding the settings bayes_auto_learn_threshold_spam n.nn and bayes_auto_learn_threshold_nonspam n.nn, where “n.nn” is the threshold point value, in the Rules tab of SAProxy’s Configuration.)
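The auto-learning rule described above boils down to two cutoffs. As a rough Python sketch (the function name is invented, and the defaults are simply the 12 and .1 quoted above, not values read from SAProxy itself):

```python
def auto_learn_label(score, spam_threshold=12.0, ham_threshold=0.1):
    """Decide whether a scored message would be auto-learned.

    Returns "spam", "ham", or None when the score falls between the
    two thresholds and the message is left out of training.
    """
    if score > spam_threshold:
        return "spam"
    if score < ham_threshold:
        return "ham"
    return None


for score in (15.2, 4.0, -1.3):
    print(score, "->", auto_learn_label(score))
```

Most borderline mail lands in the middle band and teaches the classifier nothing, which is why sorting your own mail by hand gives better results.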
However, you’ll get better results if you go to the extra effort of
separating your own mail into spam and ham. Keep every piece of mail in one
folder or the other for a few weeks; the suggestion is to have at least a
thousand messages of each type to give the classifier a good base to work from.
For mail you want to keep, it’s probably best to make a copy of it in your Ham
folder; for spam, just move it to the Spam folder and don’t delete it until you’ve
run SA-learn, which we’ll describe next.
SA-learn is the program which processes your messages. Unfortunately,
SA-learn requires messages to be in the “mbox” format
which is used by Eudora and certain other email programs, but not by Outlook
Express (or Outlook, for that matter). Therefore, you’ll have to convert your
folder. To do that, there is a very nice free program called Dbxconv which will convert your OE folders (which have a “.dbx” file extension, hence the name) into the mbox format. You can get it from http://people.freenet.de/ukrebs/dbxconv.html (and probably
elsewhere). Note that the program comes zipped; you’ll need an unzipping
program like WinZip (shareware) to get a usable program. I recommend you
put Dbxconv in your C:\Program Files\SpamAssassin
POP3 Proxy folder (or wherever you’ve installed SAProxy), for ease of
reference.
Dbxconv is a DOS program, as is
SA-learn, so to use them you have to use old DOS-style command-line statements.
The most tedious part of this is specifying the path name for your mail folder.
In Windows 95, 98, and ME, Outlook Express normally stores its files in a
folder whose path starts with C:\WINDOWS\Application Data\Identities. Then
there will be a 32-character hexadecimal code (intended to ensure the name is
unique), then Microsoft\Outlook Express. So, for instance, my primary identity’s
full path is C:\WINDOWS\Application
Data\Identities\{9F847560-5CD3-11D4-9608-EB0799AD573F}\Microsoft\Outlook
Express. (Whew!) It may make things easier or harder to know that in DOS mode,
it’s generally better to use the 8-character form of all folder and file names,
as the spaces in the middle of names can sometimes cause DOS programs to choke.
Therefore, I refer to this folder as \windows\applic~1\identi~1\{9f847~1\micros~1\outloo~1\
(note the ~1, which might be a ~2, ~3, etc., in your own case, as characters 7
& 8, following the first six alphanumeric characters in the folder name).
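If the short names look like magic, this Python sketch shows roughly how they are formed. It is a simplification under stated assumptions: it ignores file extensions and illegal characters, and it always guesses ~1, whereas Windows picks the number itself when two names collide. The function names are made up; always check the real short names with the DIR command.

```python
def short_name(name, index=1):
    """Approximate the 8-character short form of one folder name."""
    compact = name.replace(" ", "")
    # Names that already fit in 8 characters with no spaces are kept.
    if compact == name and len(name) <= 8:
        return name.upper()
    # Otherwise: first six characters, then ~ and a number.
    return f"{compact[:6].upper()}~{index}"


def short_path(path):
    """Apply short_name to every backslash-separated part of a path."""
    return "\\".join(short_name(part) if part else part
                     for part in path.split("\\"))


long_path = (r"C:\WINDOWS\Application Data\Identities"
             r"\{9F847560-5CD3-11D4-9608-EB0799AD573F}"
             r"\Microsoft\Outlook Express")
print(short_path(long_path))
```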
In Windows XP, the OE files are buried in a hidden folder inside your My Documents
folder. In many cases, the easiest way to discover the right location will be
to do a file search for “*.dbx” (click Start, then Find,
then Files or Folders, or the equivalent in your version of Windows).
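For the curious, the same "*.dbx" search can be sketched in Python. This is only an illustration of what the file search does (the function name is invented); the Windows Find dialog is all you actually need.

```python
import os


def find_dbx_files(root):
    """Return every file under root whose name ends in .dbx (any case)."""
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".dbx"):
                matches.append(os.path.join(folder, name))
    return sorted(matches)


if __name__ == "__main__":
    # Searches the current folder; substitute C:\ to search a whole drive.
    for path in find_dbx_files("."):
        print(path)
```

The folder that holds files such as Inbox.dbx is the one whose path you need to type in the steps that follow.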
Given the potential for mistyping the folder name, I highly recommend
either having DOSKey enabled (so you can recall a
mistyped line and edit it) or using a batch file to do the conversion and
processing of the folder. (A batch file is a text file, created using the EDIT
command, with a .BAT extension, which has on each line the DOS command you want
to execute.) Almost all the steps are the same for both ham and spam. As an
example, I’ll list my batch file for spam:
set home=c:\progra~1\spamas~1
del spam.dbx
copy \windows\applic~1\identi~1\{9f847~1\micros~1\outloo~1\spam.dbx
dbxconv spam.dbx
sa-learn --spam --showdots --mbox spam.mbx
The first line sets an environment variable that SA-learn will whine about if not present. (Don’t worry about it; just set it to the main SAProxy folder.) Next, I delete any copy of the mail folder that may be present here in the SAProxy folder, so my copy command (coming next) doesn’t give an error. Then I copy from the Outlook Express folder to the current folder. I use Dbxconv to convert from spam.dbx to spam.mbx (the latter file name is generated automatically by Dbxconv). With all that done, I’m finally ready to run SA-learn, telling it to print a dot on the screen each time it processes a message, and letting it know that this mail folder is pure spam. If it were a folder with good messages, I’d give the command:
sa-learn --ham --showdots --mbox ham.mbx
SA-learn will take a while to run, but the results are worth it. Running this on a few thousand messages of both good and bad mail will probably increase your accuracy rate to 95-99%. It’s worthwhile to continue to run this every month or so, since spam continues to evolve.
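Since the suggestion is to train on at least a thousand messages of each type, it can be handy to count what's in a converted file before feeding it to SA-learn. Here is a small Python sketch using the standard mailbox module; it assumes the file Dbxconv produces is an ordinary mbox file, and the function name is made up for the example.

```python
import mailbox
import sys


def count_messages(mbox_path):
    """Return the number of messages in an mbox-format file."""
    # create=False: raise an error rather than create a missing file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return len(box)
    finally:
        box.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the converted file on the command line, e.g. spam.mbx
    print(count_messages(sys.argv[1]), "messages")
```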
For additional information on using SAProxy, please see my Rules Customization page, which includes a number of other links at the bottom.
Kelvin Smith
196 Danbury Road
Wilton, CT 06897