This article describes Bayesian filtering. Our e-mail system integrates SpamAssassin, which uses Bayesian filtering in conjunction with its regular battery of tests. Note: If your e-mail is not yet configured for spam filtering, please read Using the Spam Filter first.
By default, SpamAssassin automatically trains the Bayes filter based on the results of its battery of tests. To disable this behavior, set bayes_auto_learn 0 (option name, a space, then the value) in ~/.spamassassin/user_prefs.
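As a rough sketch, the setting can be applied from the shell with a small helper. SpamAssassin preference files use one `option value` pair per line; the function name `set_bayes_auto_learn` is hypothetical and not part of SpamAssassin, and the default path is simply the one mentioned above.

```shell
# Hypothetical helper (not part of SpamAssassin): set bayes_auto_learn
# in a preferences file, replacing any earlier setting of that option.
# Usage: set_bayes_auto_learn VALUE [PREFS_FILE]
set_bayes_auto_learn() {
    value=$1
    prefs=${2:-$HOME/.spamassassin/user_prefs}

    mkdir -p "$(dirname "$prefs")"
    touch "$prefs"

    tmp=$(mktemp) || return 1
    # Keep every line except an existing "bayes_auto_learn <value>" line.
    # Related options such as bayes_auto_learn_threshold_spam are untouched.
    grep -v '^[[:space:]]*bayes_auto_learn[[:space:]]' "$prefs" > "$tmp" || true
    printf 'bayes_auto_learn %s\n' "$value" >> "$tmp"

    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$prefs"
    rm -f "$tmp"
}
```

Calling `set_bayes_auto_learn 0` would disable automatic training, and `set_bayes_auto_learn 1` would turn it back on.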
Whenever a mail user moves a message to the Spam/ folder, the server automatically learns the message as spam. Moving a message from the Spam/ folder to the Inbox has the reverse effect: the message is learned as not spam. The precise action taken by the server when messages are moved between folders is described by the following matrix:
Destination / Source | Spam/ | Trash/ | Quarantine/ | Inbox / other |
---|---|---|---|---|
Spam/ | Mark as spam | Mark as spam | ||
Trash/ | (Forbidden) | |||
Quarantine/ | (Forbidden) | (Forbidden) | (Forbidden) | |
Inbox / other | Mark as not spam | Mark as not spam |
Some IMAP client applications use non-standard names for the Spam/ and Trash/ folders. The server tries to deal with this by also recognizing common patterns such as Junk/, Deleted Items/, case variations and common translations such as Courrier Indésirable/ and Éléments supprimés/.
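The kind of name matching described above can be illustrated with a short shell function. This is only a sketch: the server's actual pattern list is not published here, so `classify_folder` and its patterns are assumptions built from the examples in the paragraph above.

```shell
# Illustrative only: map a folder name to a role using case-insensitive
# patterns. The pattern list is an assumption based on the examples above,
# not the server's real configuration.
classify_folder() {
    # Lowercase the name; depending on the tr implementation, accented
    # letters may be left as-is, so the patterns below skip over them.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    name=${name%/}    # drop a trailing slash, if any

    case $name in
        spam|junk|"courrier ind"*"sirable") echo spam ;;
        trash|"deleted items"|*"ments supprim"*) echo trash ;;
        quarantine) echo quarantine ;;
        *) echo other ;;
    esac
}
```

For example, `classify_folder 'Junk/'` and `classify_folder 'SPAM'` both report `spam`, while an unrecognized name falls through to `other`.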
You can also train the Bayes filter manually with the sa-learn command on the server.
To run these commands, make sure you are logged in to the correct mail account on the active mail server. All of your mail accounts are listed in the Control Panel (under Mail / Mailbox Accounts). If the account name is yourname and your mail server is mail123.csoft.net, you would log in with:
$ ssh yourname@mail123.csoft.net
The sa-learn utility can read individual message files or entire folders. Use the --spam argument to indicate that the input is confirmed spam:
$ sa-learn --spam ~/Mail/Maildir/.Spam
Use the --ham argument to indicate that the input is confirmed non-spam. Assuming your entire Inbox is free of spam, you can use:
$ sa-learn --ham ~/Mail/Maildir/cur
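The two training commands can be combined into one pass. The sketch below assumes the folder layout used in the examples above and uses sa-learn's `--no-sync` and `--sync` options (the current names for the older `--no-rebuild` and `--rebuild`) so that the database is synchronized once at the end. The function name `train_bayes` and the `SA_LEARN` override variable are hypothetical conveniences, not part of SpamAssassin.

```shell
# Sketch: learn spam and ham in one pass, then sync the database once.
# Usage: train_bayes [MAILDIR]
# Set SA_LEARN=echo to print the commands instead of running them.
train_bayes() {
    maildir=${1:-$HOME/Mail/Maildir}
    learn=${SA_LEARN:-sa-learn}

    # Defer the journal sync until both folders have been processed.
    $learn --no-sync --spam "$maildir/.Spam" || return 1
    $learn --no-sync --ham "$maildir/cur" || return 1
    $learn --sync
}
```

A dry run with `SA_LEARN=echo train_bayes` shows the three sa-learn invocations without touching the Bayes database, which is a cheap way to confirm the paths before training.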
To display information about your active Bayes database, use the --dump magic argument:
$ sa-learn --dump magic
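The dump is easier to read once the message counts are pulled out. The following sketch assumes the column layout printed by SpamAssassin 3.x, where each line ends in a label such as `non-token data: nspam` and the value sits in the third column; `bayes_counts` is a hypothetical helper name.

```shell
# Sketch: read "sa-learn --dump magic" output on stdin and print how many
# spam and ham messages the Bayes database has learned so far.
# Assumes the value is in column 3 and the label is the last field.
bayes_counts() {
    awk '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham  = $3 }
        END { printf "nspam=%d nham=%d\n", spam, ham }
    '
}
```

It would typically be used as `sa-learn --dump magic | bayes_counts`. With stock SpamAssassin settings, Bayes scoring only takes effect once at least 200 spam and 200 ham messages have been learned, so these two counts are a quick way to see whether more training is needed.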
If you are using the mutt mail user agent, you can add the following to your muttrc file so that specific keys can be used to trigger the learning of the selected message, either as spam or as ham (legitimate e-mail).
    set wait_key=no

    # H: Register message as non-spam
    macro index H "|sa-learn --ham --no-rebuild --single"
    macro pager H "|sa-learn --ham --no-rebuild --single"

    # S: Register the message as spam
    macro index S "|sa-learn --spam --no-rebuild --single"
    macro pager S "|sa-learn --spam --no-rebuild --single"

    # R: Rebuild the Bayes database (call last)
    macro index R "|sa-learn --rebuild"
    macro pager R "|sa-learn --rebuild"