mirror of
https://git.proxmox.com/git/proxmox-spamassassin
synced 2025-04-28 12:19:37 +00:00
342 lines
13 KiB
Plaintext
342 lines
13 KiB
Plaintext
Welcome to Apache SpamAssassin!
|
|
-------------------------------
|
|
|
|
What is Apache SpamAssassin
|
|
---------------------------
|
|
|
|
Apache SpamAssassin is the #1 Open Source anti-spam platform giving
|
|
system administrators a filter to classify email and block "spam"
|
|
(unsolicited bulk email). It uses a robust scoring framework and plug-ins
|
|
to integrate a wide range of advanced heuristic and statistical analysis
|
|
tests on email headers and body text including text analysis, Bayesian
|
|
filtering, DNS blocklists, and collaborative filtering databases.
|
|
|
|
Apache SpamAssassin is a project of the Apache Software Foundation (ASF).
|
|
|
|
|
|
What Apache SpamAssassin Is Not
|
|
-------------------------------
|
|
|
|
Apache SpamAssassin is not a program to delete spam, route spam and ham to
|
|
separate mailboxes or folders, or send bounces when you receive spam.
|
|
Those are mail routing functions, and Apache SpamAssassin is not a mail
|
|
router. Apache SpamAssassin is a mail filter or classifier. It will examine
|
|
each message presented to it, and assign a score indicating the
|
|
likelihood that the mail is spam. An external program must then
|
|
examine this score and do any routing the user wants done. There are
|
|
many programs that will easily perform these functions after examining
|
|
the score assigned by Apache SpamAssassin.
|
|
|
|
|
|
How Apache SpamAssassin Works
|
|
-----------------------------
|
|
|
|
Apache SpamAssassin uses a wide range of heuristic tests on mail headers and
|
|
body text to identify "spam", also known as unsolicited commercial
|
|
email.
|
|
|
|
Once identified, the mail can then be optionally tagged as spam for
|
|
later filtering using the user's own mail user-agent application.
|
|
|
|
Apache SpamAssassin typically differentiates successfully between spam and
|
|
non-spam in between 95% and 100% of cases, depending on what kind of mail
|
|
you get and your training of its Bayesian filter. Specifically,
|
|
Apache SpamAssassin has been shown to produce around 1.5% false negatives (spam
|
|
that was missed) and around 0.06% false positives (ham incorrectly marked
|
|
as spam). See the rules/STATISTICS*.txt files for more information.
|
|
|
|
Apache SpamAssassin also includes plugins to support reporting spam messages
|
|
automatically or manually to collaborative filtering databases such as
|
|
Pyzor, DCC, and Vipul's Razor.
|
|
|
|
The distribution provides "spamassassin", a command line tool to
|
|
perform filtering, along with the "Mail::SpamAssassin" module set
|
|
which allows Apache SpamAssassin to be used in spam-protection proxy SMTP or
|
|
POP/IMAP server, or a variety of different spam-blocking scenarios.
|
|
|
|
In addition, "spamd", a daemonized version of Apache SpamAssassin which
|
|
runs persistently, is available. Using its counterpart, "spamc",
|
|
a lightweight client written in C, an MTA can process large volumes of
|
|
mail through Apache SpamAssassin without having to fork/exec a perl interpreter
|
|
for each message.
|
|
|
|
|
|
Questions? Need Help?
|
|
---------------------
|
|
|
|
If you have questions about Apache SpamAssassin, please check the Wiki[1] to
|
|
see if someone has already posted an answer to your question. (The
|
|
Wiki doubles as a FAQ.) Failing that, post a message to the
|
|
spamassassin-users mailing list[2]. If you've found a bug (and you're
|
|
sure it's a bug after checking the Wiki), please file a report in our
|
|
Bugzilla[3].
|
|
|
|
[1]: https://wiki.apache.org/spamassassin/
|
|
[2]: https://wiki.apache.org/spamassassin/MailingLists
|
|
[3]: https://issues.apache.org/SpamAssassin/
|
|
|
|
Please also be sure to read the man pages.
|
|
|
|
|
|
Upgrading Apache SpamAssassin
|
|
-----------------------------
|
|
|
|
IMPORTANT: If you are upgrading from a previous major version of Apache
|
|
SpamAssassin, please be sure to read the notes in UPGRADE to find out
|
|
what has changed in a non- backward compatible way.
|
|
|
|
|
|
Installing Apache SpamAssassin
|
|
------------------------------
|
|
|
|
See the INSTALL file.
|
|
|
|
|
|
Customizing Apache SpamAssassin
|
|
-------------------------------
|
|
|
|
These are the configuration files installed by Apache SpamAssassin. The commands
|
|
that can be used therein are listed in the POD documentation for the
|
|
Mail::SpamAssassin::Conf class (run the following command to read it:
|
|
"perldoc Mail::SpamAssassin::Conf"). Note: The following directories are
|
|
the standard defaults that people use. There is an explanation of all the
|
|
default locations that Apache SpamAssassin will look at the end.
|
|
|
|
- /usr/share/spamassassin/*.cf:
|
|
|
|
Distributed configuration files, with all defaults. Do not modify
|
|
these, as they are overwritten when you upgrade.
|
|
|
|
- /var/lib/spamassassin/*/*.cf:
|
|
|
|
Local state directory; updated rulesets, overriding the
|
|
distributed configuration files, downloaded using "sa-update". Do
|
|
not modify these, as they are overwritten when you run
|
|
"sa-update".
|
|
|
|
- /etc/mail/spamassassin/*.cf:
|
|
|
|
Site config files, for system admins to create, modify, and
|
|
add local rules and scores to. Modifications here will be
|
|
appended to the config loaded from the above directory.
|
|
|
|
- /etc/mail/spamassassin/*.pre:
|
|
|
|
Plugin control files, installed from the distribution. These are
|
|
used to control what plugins are loaded. Modifications here will
|
|
be loaded before any configuration loaded from the above
|
|
directories.
|
|
|
|
You want to modify these files if you want to load additional
|
|
plugins, or inhibit loading a plugin that is enabled by default.
|
|
If the files exist in /etc/mail/spamassassin, they will not
|
|
be overwritten during future installs.
|
|
|
|
- /usr/share/spamassassin/user_prefs.template:
|
|
|
|
Distributed default user preferences. Do not modify this, as it is
|
|
overwritten when you upgrade.
|
|
|
|
- /etc/mail/spamassassin/user_prefs.template:
|
|
|
|
Default user preferences, for system admins to create, modify, and
|
|
set defaults for users' preferences files. Takes precedence over
|
|
the above prefs file, if it exists.
|
|
|
|
Do not put system-wide settings in here; put them in a file in the
|
|
"/etc/mail/spamassassin" directory ending in ".cf". This file is
|
|
just a template, which will be copied to a user's home directory
|
|
for them to change.
|
|
|
|
- $USER_HOME/.spamassassin:
|
|
|
|
User state directory. Used to hold spamassassin state, such
|
|
as a per-user automatic welcomelist, and the user's preferences
|
|
file.
|
|
|
|
- $USER_HOME/.spamassassin/user_prefs:
|
|
|
|
User preferences file. If it does not exist, one of the
|
|
default prefs file from above will be copied here for the
|
|
user to edit later, if they wish.
|
|
|
|
Unless you're using spamd, there is no difference in
|
|
interpretation between the rules file and the preferences file, so
|
|
users can add new rules for their own use in the
|
|
"~/.spamassassin/user_prefs" file, if they like. (spamd disables
|
|
this for security and increased speed.)
|
|
|
|
- $USER_HOME/.spamassassin/bayes*
|
|
|
|
Statistics databases used for Bayesian filtering. If they do
|
|
not exist, they will be created by Apache SpamAssassin.
|
|
|
|
Spamd users may wish to create a shared set of bayes databases;
|
|
the "bayes_path" and "bayes_file_mode" configuration settings
|
|
can be used to do this.
|
|
|
|
See "perldoc sa-learn" for more documentation on how
|
|
to train this.
|
|
|
|
File Locations:
|
|
|
|
Apache SpamAssassin will look in a number of areas to find the default
|
|
configuration files that are used. The "__*__" text are variables
|
|
whose value you can see by looking at the first several lines of the
|
|
"spamassassin" or "spamd" scripts.
|
|
|
|
They are set on install time and can be overridden with the Makefile.PL
|
|
command line options DATADIR (for __def_rules_dir__) and CONFDIR (for
|
|
__local_rules_dir__). If none of these options were given, FHS-compliant
|
|
locations based on the PREFIX (which becomes __prefix__) are chosen.
|
|
These are:
|
|
|
|
__prefix__ __def_rules_dir__ __local_rules_dir__
|
|
-------------------------------------------------------------------------
|
|
/usr /usr/share/spamassassin /etc/mail/spamassassin
|
|
/usr/local /usr/local/share/spamassassin /etc/mail/spamassassin
|
|
/opt/$DIR /opt/$DIR/share/spamassassin /etc/opt/mail/spamassassin
|
|
$DIR $DIR/share/spamassassin $DIR/etc/mail/spamassassin
|
|
|
|
The files themselves are then looked for in these paths:
|
|
|
|
- Distributed Configuration Files
|
|
'__def_rules_dir__'
|
|
'__prefix__/share/spamassassin'
|
|
'/usr/local/share/spamassassin'
|
|
'/usr/share/spamassassin'
|
|
|
|
- Site Configuration Files
|
|
'__local_rules_dir__'
|
|
'__prefix__/etc/mail/spamassassin'
|
|
'__prefix__/etc/spamassassin'
|
|
'/usr/local/etc/spamassassin'
|
|
'/usr/pkg/etc/spamassassin'
|
|
'/usr/etc/spamassassin'
|
|
'/etc/mail/spamassassin'
|
|
'/etc/spamassassin'
|
|
|
|
- Default User Preferences File
|
|
'__local_rules_dir__/user_prefs.template'
|
|
'__prefix__/etc/mail/spamassassin/user_prefs.template'
|
|
'__prefix__/share/spamassassin/user_prefs.template'
|
|
'/etc/spamassassin/user_prefs.template'
|
|
'/etc/mail/spamassassin/user_prefs.template'
|
|
'/usr/local/share/spamassassin/user_prefs.template'
|
|
'/usr/share/spamassassin/user_prefs.template'
|
|
|
|
|
|
In addition, the "Distributed Configuration Files" location is overridden
|
|
by a "Local State Directory", used to store an updated copy of the
|
|
ruleset:
|
|
|
|
__prefix__ __local_state_dir__
|
|
-------------------------------------------------------------------------
|
|
/usr /var/lib/spamassassin/__version__
|
|
/usr/local /var/lib/spamassassin/__version__
|
|
/opt/$DIR /var/opt/spamassassin/__version__
|
|
$DIR $DIR/var/spamassassin/__version__
|
|
|
|
This is normally written to by the "sa-update" script. "__version__" is
|
|
replaced by a representation of the version number, so that multiple
|
|
versions of Apache SpamAssassin will not interfere with each other's rulesets.
|
|
|
|
|
|
After installation, try "perldoc Mail::SpamAssassin::Conf" to see what
|
|
can be set. Common first-time tweaks include:
|
|
|
|
- required_score
|
|
|
|
Set this higher to make Apache SpamAssassin less sensitive.
|
|
If you are installing Apache SpamAssassin system-wide, this is
|
|
**strongly** recommended!
|
|
|
|
Statistics on how many false positives to expect at various
|
|
different thresholds are available in the "STATISTICS.txt" file in
|
|
the "rules" directory.
|
|
|
|
- rewrite_header, add_header
|
|
|
|
These options affect the way messages are tagged as spam or
|
|
non-spam. This makes it easy to identify incoming mail.
|
|
|
|
- ok_locales
|
|
|
|
If you expect to receive mail in non-ISO-8859 character sets (ie.
|
|
Chinese, Cyrillic, Japanese, Korean, or Thai) then set this.
|
|
|
|
|
|
Learning
|
|
--------
|
|
|
|
Apache SpamAssassin includes a Bayesian learning filter, so it is worthwhile
|
|
training Apache SpamAssassin with your collection of non-spam and spam,
|
|
if possible. This will make it more accurate for your incoming mail.
|
|
Do this using the "sa-learn" tools, like so:
|
|
|
|
sa-learn --spam ~/Mail/saved-spam-folder
|
|
sa-learn --ham ~/Mail/inbox
|
|
sa-learn --ham ~/Mail/other-nonspam-folder
|
|
|
|
|
|
If these are mail folders in mbox format, use the --mbox switch, for
|
|
Maildirs use a trailing slash, like Maildir/cur/.
|
|
|
|
Use as many mailboxes as you like. Note that Apache SpamAssassin will remember
|
|
what mails it has learnt from, so you can re-run this as often as you like.
|
|
|
|
|
|
Localization
|
|
------------
|
|
|
|
All text displayed to users is taken from the configuration files. This
|
|
means that you can translate messages, test descriptions, and templates
|
|
into other languages.
|
|
|
|
If you do so, we would *really* appreciate it if you could contribute
|
|
these translations, so that they can be added to the
|
|
distribution. Please file a bug in our Bugzilla[4], and attach your
|
|
translations. You will, of course, be credited for this work!
|
|
|
|
[4]: https://issues.apache.org/SpamAssassin/
|
|
|
|
|
|
Disabled code
|
|
-------------
|
|
|
|
There are some tests and code in Apache SpamAssassin that are turned off by
|
|
default: experimental code, slow code, or code that depends on
|
|
non-open-source software or services that are not always free. These
|
|
disabled tests include:
|
|
|
|
- DCC: depends on non-open-source software (disabled in init.pre)
|
|
- MAPS: commercial service (disabled in 50_scores.cf)
|
|
- TextCat: slow (disabled in init.pre)
|
|
- various optional plugins, disabled for speed (disabled in *.pre)
|
|
|
|
To turn on tests disabled in 50_scores.cf, simply assign them a non-zero
|
|
score, e.g. by adding score lines to your ~/.spamassassin/user_prefs file.
|
|
|
|
To turn on tests disabled by commenting out the required plugin in
|
|
init.pre, you need to uncomment the loadplugin line and make sure the
|
|
prerequisites for proper operation of the plugin are present.
|
|
|
|
|
|
Automatic Reputation System
|
|
--------------------------
|
|
|
|
Apache SpamAssassin includes an automatic reputation system. The way it works is
|
|
by tracking for each sender address a rolling average score of messages
|
|
so far seen from there. Then, it combines this long-term average score
|
|
for the sender with the score for the particular message being evaluated,
|
|
after all other rules have been applied.
|
|
|
|
This functionality can be enabled or disabled with the
|
|
"use_txrep" option.
|
|
|
|
For more information, read sql/README.txrep
|
|
|
|
(end of README)
|
|
|
|
// vim:tw=74:
|