Email Classification (Incl. Spam Classification) With POPFile On Ubuntu Feisty Fawn


This article shows how to install and use POPFile to classify incoming emails on an Ubuntu Feisty Fawn desktop. POPFile is a POP3 proxy that fetches your mail from your mail server, classifies it, and passes it on to your email client. POPFile must be trained before it can classify emails properly. From the POPFile web site: “POPFile is an automatic mail classification tool. Once properly set up and trained, it will scan all email as it arrives and classify it based on your training. You can give it a simple job, like separating out junk e-mail, or a complicated one - like filing mail into a dozen folders. Think of it as a personal assistant for your inbox.”

This document comes without warranty of any kind!

 

1 Preliminary Note

To use POPFile with your email client, a few settings must be changed in the email client. I’ll demonstrate this with the Evolution email client, but you can use any other email client (e.g. Thunderbird) as well; the modifications are very similar.

 

2 Installing POPFile

Open the Synaptic Package Manager (System > Administration > Synaptic Package Manager) and type in your password when prompted.

Click on the Search button and search for popfile.

The search should find a package called popfile. Click on it and select Mark for Installation from the menu.

POPFile needs a few other packages. Accept them by clicking on Mark.

Then click on Apply, and confirm your selection by clicking on Apply again.

The packages are now downloaded and installed. When the installation has finished, click on Close and leave the Synaptic Package Manager.
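If you prefer to script the installation instead of clicking through Synaptic, the same popfile package can be installed with apt-get. The following is only a minimal sketch: the helper names build_install_command and install_package are made up for this example, and it assumes that sudo and apt-get are available on your system.

```python
import subprocess


def build_install_command(package="popfile"):
    """Return the apt-get command line that installs the given package."""
    # -y answers "yes" to apt-get's confirmation prompt.
    return ["sudo", "apt-get", "install", "-y", package]


def install_package(package="popfile"):
    """Run apt-get for the package (hypothetical helper, needs sudo rights)."""
    command = build_install_command(package)
    print("Running:", " ".join(command))
    # check=True raises CalledProcessError if apt-get exits with an error.
    subprocess.run(command, check=True)
```

Calling install_package() should pull in the additional packages that POPFile depends on, just like the Mark dialogue in Synaptic does.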

3 Configuring POPFile

POPFile comes with a web interface that runs on port 7070, so open a browser and go to http://127.0.0.1:7070. You’ll first see the History page, which doesn’t hold anything interesting yet.
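If the page doesn’t load, it can help to check whether POPFile is listening at all. Here is a small sketch that tests the two ports used in this article (7070 for the web interface and 7071 for the POP3 proxy). The function names are my own, and the port numbers are the ones from this setup; yours may differ if you have changed the configuration.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def popfile_status(host="127.0.0.1", ui_port=7070, pop3_port=7071):
    """Report which of the two POPFile ports accept connections."""
    return {
        "web interface": is_port_open(host, ui_port),
        "POP3 proxy": is_port_open(host, pop3_port),
    }
```

If popfile_status() reports False for both entries, POPFile is most likely not running.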

Go to the Buckets page. Scroll down to the Maintenance section at the bottom of that page and create your email categories (“buckets” in POPFile terminology), e.g. personal, work, and spam.

Afterwards, you should see your new categories at the top of the Buckets page. It’s a good idea to select a different colour for each category so that you can easily tell them apart later on in the statistics.

That’s all we need to configure. Of course, you can browse the other tabs to become familiar with the POPFile interface. On the Magnets page, you can define rules (e.g. based on From or To email addresses) that cause matching emails to always be put into a certain category.
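To illustrate the idea behind magnets, here is a conceptual sketch of such a rule list. This is not POPFile’s code: the function apply_magnets, the tuple layout, and the case-insensitive substring matching are all assumptions I made for the example.

```python
def apply_magnets(headers, magnets):
    """Return the bucket of the first matching magnet, or None.

    headers: dict mapping header names (e.g. "From") to their values.
    magnets: list of (header_name, text, bucket) tuples.
    """
    for header_name, text, bucket in magnets:
        value = headers.get(header_name, "")
        # Assumed behaviour: a magnet matches if its text occurs in the header.
        if text.lower() in value.lower():
            return bucket
    return None


# Example rule: everything from the boss goes to the work bucket.
EXAMPLE_MAGNETS = [("From", "boss@example.com", "work")]
```

A mail whose From header contains boss@example.com would end up in the work bucket without consulting the trained classifier; all other mails fall through and are classified as usual.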

On the Configuration, Security, and Advanced tabs, you can find further POPFile configuration options. You should change them only if you know what you’re doing.

4 Configuring The Email Client

I’m using Evolution as my email client in this example. The configuration shouldn’t be very different for other email clients (e.g. Thunderbird).

Now we have to tell our email client to fetch emails from the POPFile proxy instead of directly from the mail server. To do so, we must modify the settings of the email account for which we want to use POPFile. Go to Edit > Preferences.

Select the appropriate email account and click on Edit.

Go to the Receiving Email tab and change the Configuration section. Let’s assume you have pop.example.com in the Server field and someuser@example.com in the Username field. POPFile is running on 127.0.0.1 (localhost) on port 7071, so we put 127.0.0.1:7071 into the Server field and change the Username field to servername:username, i.e. pop.example.com:someuser@example.com in this example.
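The same translation of settings can be written down in a few lines of code, which is also a handy way to test the proxy without an email client. The sketch below uses Python’s standard poplib module; the function names are my own, and it assumes an unencrypted POP3 connection to POPFile on 127.0.0.1:7071 as in this article.

```python
import poplib


def proxied_settings(server, username, proxy_host="127.0.0.1", proxy_port=7071):
    """Translate direct POP3 settings into the ones to use with POPFile."""
    return {
        "server": f"{proxy_host}:{proxy_port}",
        # POPFile learns the real mail server from the username prefix.
        "username": f"{server}:{username}",
    }


def count_messages(server, username, password,
                   proxy_host="127.0.0.1", proxy_port=7071):
    """Log in through the POPFile proxy and return the number of waiting mails."""
    connection = poplib.POP3(proxy_host, proxy_port, timeout=30)
    try:
        connection.user(f"{server}:{username}")
        connection.pass_(password)
        message_count, _mailbox_size = connection.stat()
        return message_count
    finally:
        connection.quit()
```

For the account from the example, proxied_settings("pop.example.com", "someuser@example.com") yields exactly the two values that we typed into the Server and Username fields.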

Afterwards, click on the Send/Receive button. If all goes well, Evolution should now fetch emails from POPFile (which in turn fetches them from the original POP3 server, pop.example.com in this example).

When you use POPFile for the first time, it isn’t trained yet and doesn’t know how to classify emails, so it puts the string [unclassified] into the subject line of each mail. You can change how POPFile marks emails on the Buckets page in the POPFile web interface; for example, if you don’t want POPFile to modify the original subject, you can make it modify the email’s headers instead.
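To make the subject marking more concrete, here is a short sketch that splits a marked subject line into the category and the original subject. It assumes that POPFile puts the tag at the very beginning of the subject, as with [unclassified] above; the function name split_subject is hypothetical.

```python
import re

# Matches a leading "[bucket]" tag such as "[spam] Cheap watches".
TAG_PATTERN = re.compile(r"^\s*\[([^\]]+)\]\s*(.*)$", re.DOTALL)


def split_subject(subject):
    """Return (bucket, original_subject); bucket is None if there is no tag."""
    match = TAG_PATTERN.match(subject)
    if match is None:
        return None, subject
    # Lower-case the bucket name so "[Spam]" and "[spam]" are treated alike.
    return match.group(1).lower(), match.group(2)
```

The message filters in your email client perform essentially the same check when they look for a tag in the subject.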

Now let’s create subfolders for our email categories (personal, work) in the Inbox folder. Right-click on the Inbox folder and select New Folder… from the menu.

Then create two new folders called Personal and Work in the Inbox folder.

We don’t need to create a Spam folder in Evolution because Evolution comes with a Junk folder by default. In other email clients, you might have to create a Spam folder as well.

Now let’s create message filters that move emails tagged with [spam] into the Junk folder, emails tagged with [personal] into the Personal folder, and emails tagged with [work] into the Work folder. Go to Edit > Message Filters.

Select Show filters for mail: Incoming and click on Add.

Create a rule called Spam like this:

Subject contains [spam] => Set Status Junk

Then create a rule called Personal like this:

Subject contains [personal] => Move to Folder Inbox/Personal

Then create a similar rule called Work:

Subject contains [work] => Move to Folder Inbox/Work

(Please note: If you configure POPFile to modify an email header instead of the subject line, you must adjust the message filters accordingly.)
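The logic of these filters can be sketched in code as follows. The mapping mirrors the Spam, Personal, and Work rules from this article. For header-based marking I assume the header is called X-Text-Classification, which is the name POPFile uses as far as I know; please check the raw source of one of your mails before relying on it. The function name target_folder is made up for this example.

```python
from email import message_from_string

# Mirrors the three Evolution rules: bucket name -> target folder.
FOLDER_FOR_BUCKET = {
    "spam": "Junk",
    "personal": "Inbox/Personal",
    "work": "Inbox/Work",
}


def target_folder(raw_message, default="Inbox"):
    """Pick a folder from the POPFile header or, failing that, the subject tag."""
    message = message_from_string(raw_message)
    # Header-based marking (the header name is an assumption, see above).
    bucket = (message.get("X-Text-Classification") or "").strip().lower()
    if bucket in FOLDER_FOR_BUCKET:
        return FOLDER_FOR_BUCKET[bucket]
    # Subject-based marking: the same "Subject contains [bucket]" test as the rules.
    subject = (message.get("Subject") or "").lower()
    for name, folder in FOLDER_FOR_BUCKET.items():
        if f"[{name}]" in subject:
            return folder
    # Unclassified or unknown mails stay where they are.
    return default
```

In Evolution itself, the header variant corresponds to a filter condition on a specific header instead of on the subject.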

5 Training POPFile

To train POPFile, go to the History page in the POPFile web interface (http://127.0.0.1:7070). You should find all mails there that were processed by POPFile. Each unclassified message has a drop-down menu on the right where you can select the category you want to put that mail in. Select the appropriate category for each mail and click on the Reclassify button. Make sure that you classify not only spam but also messages for your other categories, so that POPFile learns what each category looks like.
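Why it matters to train every category becomes clearer with a toy example. As far as I know, POPFile’s classifier is based on naive Bayes; the class below is a heavily simplified sketch of that idea and not POPFile’s actual implementation. The class name TinyBayes and the sample mails are made up.

```python
import math
from collections import Counter, defaultdict


class TinyBayes:
    """Toy word-count classifier; not POPFile's actual implementation."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> trained messages

    def train(self, bucket, text):
        """Add one message to a bucket, like clicking Reclassify."""
        self.word_counts[bucket].update(text.lower().split())
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the most probable bucket, or "unclassified" if untrained."""
        if not self.message_counts:
            return "unclassified"
        vocabulary = {w for counts in self.word_counts.values() for w in counts}
        total_messages = sum(self.message_counts.values())
        best_bucket, best_score = None, -math.inf
        for bucket, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # log P(bucket) plus the sum of log P(word | bucket).
            score = math.log(self.message_counts[bucket] / total_messages)
            for word in text.lower().split():
                # Add-one smoothing so unseen words don't zero out a bucket.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocabulary) + 1)
                )
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket
```

A bucket that has never been trained has no word counts at all, so the classifier can never choose it. That is why you should reclassify a few mails for personal and work as well, not just for spam.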

Afterwards, your previously unclassified mails are sorted into your categories. If you’ve put a mail into the wrong category, you can click on the Undo button.

On the Buckets page, you can now find statistics about your emails and categories. If you’ve assigned a colour to each category, each one should be shown in its own colour.

Training takes some effort in the beginning, but it starts to pay off after a while, so it’s worth it.

 

  • POPFile: http://popfile.sourceforge.net
  • Ubuntu: http://www.ubuntu.com
