mirror of
https://git.proxmox.com/git/proxmox-spamassassin
synced 2025-04-28 21:37:19 +00:00
buildsys: drop upstream tarball and add extracted sources
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
This commit is contained in:
parent
054f24dbbb
commit
37ef577538
Binary file not shown.
403
upstream/CREDITS
Normal file
403
upstream/CREDITS
Normal file
@ -0,0 +1,403 @@
|
||||
Copyright (C) 2019 The Apache Software Foundation
|
||||
|
||||
Project Management Committee (PMC):
|
||||
|
||||
This list contains PMC members in alphabetical order (and their Amazon
|
||||
wishlists). The PMC can be reached at <private /at/ spamassassin.apache.org>
|
||||
(for private correspondence) or <dev /at/ spamassassin.apache.org> (for
|
||||
public contact); we request that the dev list should be used for all
|
||||
non-confidential correspondence.
|
||||
|
||||
- Giovanni Bechis
|
||||
- Karsten Bräckelmann
|
||||
- Alex Broens http://www.msf.org/en/donate
|
||||
- Bill Cole
|
||||
- John Hardin
|
||||
- Dave Jones
|
||||
- Adam Katz
|
||||
- Henrik Krohns
|
||||
- Sidney Markowitz
|
||||
- Mark Martinec
|
||||
- Kevin A. McGrail
|
||||
- Michael Parker http://www.amazon.com/o/registry/10BBAR2M03T6F
|
||||
- Joe Quinn
|
||||
|
||||
Committers:
|
||||
|
||||
This list contains committers in alphabetical order
|
||||
|
||||
- Paul Stead
|
||||
- Merijn van den Kroonenberg
|
||||
- Bryan Vest
|
||||
|
||||
PMC Emeritus & inactive committers:
|
||||
|
||||
This list contains all PMC Emeritus & inactive committers in alphabetical order.
|
||||
|
||||
We wish them well and hope to see them return someday.
|
||||
|
||||
- Matthew Cline
|
||||
- Duncan Findlay - PMC Emeritus
|
||||
- Tony Finch
|
||||
- Steve Freegard
|
||||
- Craig Hughes
|
||||
- Matt Kettler - PMC Emeritus
|
||||
- Justin Mason - Creator & PMC Emeritus
|
||||
- Robert Menschel
|
||||
- Daryl C. W. O'Shea - PMC Emeritus
|
||||
- John Gardiner Myers
|
||||
- Daniel Quinlan - PMC Emeritus
|
||||
- Dale 'Doc' Schneider - PMC Emeritus
|
||||
- Matt Sergeant
|
||||
- Henry Stern
|
||||
- Malte S. Stretz - PMC Emeritus
|
||||
- Warren Togami
|
||||
- Theo Van Dinter - PMC Emeritus
|
||||
- Matt Yackley
|
||||
- Daniel Lemke
|
||||
|
||||
Spamc credits:
|
||||
|
||||
- Author: Craig Hughes
|
||||
- Conversion to a thread-safe shared library by Liam Widdowson
|
||||
- Portions by Brad Jorsch
|
||||
- Windows adaption by Sidney Markowitz
|
||||
- autoconf wrapper (for Unix)/alternative (for Windows): Malte S. Stretz,
|
||||
based on work done by Sidney Markowitz
|
||||
- spamc/qmail-spamc.c by John Peacock
|
||||
|
||||
Spamd:
|
||||
|
||||
- Author: Craig Hughes
|
||||
- Parts by Malte S. Stretz
|
||||
|
||||
|
||||
Algorithms:
|
||||
|
||||
The Bayesian-style probabilistic classifier used by SpamAssassin's BAYES
|
||||
rules is based on an approach outlined by Gary Robinson. Thanks, Gary!
|
||||
|
||||
http://radio.weblogs.com/0101454/stories/2002/09/16/spamDetection.html
|
||||
|
||||
|
||||
Major contributions:
|
||||
|
||||
- Michael Bell, <mikebell90(at)yahoo.com>: Bayes documentation.
|
||||
|
||||
- Kelsey Cummings, <kgc(at)sonic.net>: client-ip-address security in spamd;
|
||||
sql-configs-with-setuid switch added to spamd.
|
||||
|
||||
- Dallas L. Engelken <dallase(at)nmgi.com>: ImageInfo plugin.
|
||||
|
||||
- Justin England, <jengland(at)enetis.net>: SQL support.
|
||||
|
||||
- Steve Friedl, <steve(at)unixwiz.net>: UNIX domain socket support in
|
||||
spamd/spamc.
|
||||
|
||||
- Ryan Finnie, <ryan(at)finnie.org>: message encapsulation as
|
||||
message/rfc822 attachment.
|
||||
|
||||
- Matt Kettler, <mkettler_sa(at)comcast.net>: most of the anti-drug rules in
|
||||
rules/20_drugs.cf; bug fix for list of reserved IP addresses; others.
|
||||
|
||||
- Brad "anomie" Jorsch, <anomie(at)users.sourceforge.net>: fix
|
||||
to avoid losing mail from spamc; BSMTP and -e support; tracking of
|
||||
number of spamd processes; several other mods.
|
||||
|
||||
- Kristian Köhntopp, <kris(at)koehntopp.de>: LDAP support.
|
||||
|
||||
- Matthias Leisi, <matthias(at)leisi.net>: Mail::SpamAssassin::Plugin::ASN
|
||||
plugin.
|
||||
|
||||
- Daniel Lemke, <lemke(at)jam-software.com>: many Windows support fixes
|
||||
|
||||
- John Madden, <maddenj+spamassassin at skynet.ie>: spamc -F configuration
|
||||
file support.
|
||||
|
||||
- Sidney Markowitz, <sidney(at)sidney.com>: fix to DNS tests;
|
||||
message-size sanity-checking in spamc; language identification;
|
||||
Win32 build support.
|
||||
|
||||
- Marc Merlin, <marc_soft(at)merlins.org>: RBL ordering/timeouts;
|
||||
time-logging for debug speed testing.
|
||||
|
||||
- Bob Menschel: 'longwords' rules, some documentation.
|
||||
|
||||
- Eugene/Yevgeniy Miretsky, <eugene(at)invision.net>: bug fix for
|
||||
spamc -c; Spamc timeout support; support for REPORT, REPORT_IFSPAM
|
||||
and SYMBOLS methods in spamc.
|
||||
|
||||
- Gertjan van Noord, <vannoord(at)let.rug.nl>: TextCat language classifier
|
||||
|
||||
- Michael Parker, <parkerm(at)pobox.com>: Bayes-in-SQL and AWL-in-SQL;
|
||||
writing Bayes regression tests.
|
||||
|
||||
- John Peacock, <jpeacock(at)rowman.com>: qmail-spamc in contrib;
|
||||
patch to README file
|
||||
|
||||
- Marc Perkel, <marc(at)perkel.com>: 30-or-so rules; about 20 of which are
|
||||
still in codebase.
|
||||
|
||||
- Nico Prenzel, <nico.prenzel(at)pn-systeme.de>: remote learning and
|
||||
spam-reporting support in spamc/spamd.
|
||||
|
||||
- Dustin Sallings, <dustin+spamassassin(at)spy.net>: support for
|
||||
virtual users in spamd.
|
||||
|
||||
- Ed Allen Smith, <easmith(at)beatrice.rutgers.edu>: GA improvements; 6
|
||||
rules.
|
||||
|
||||
- Henry Stern: perceptron score optimizer (replacing the GA).
|
||||
|
||||
- Ivo Truxa: TxRep reputation database plugin.
|
||||
|
||||
- Liam Widdowson, <liam(at)inodes.org>: shared-library use of spamc.
|
||||
|
||||
- Radoslaw Zielinski, <radek(at)pld-linux.org>:
|
||||
Mail::SpamAssassin::Spamd::Apache2, a mod_perl2 module implementing spamd,
|
||||
contributed as a Google Summer of Code project.
|
||||
|
||||
|
||||
Translators:
|
||||
|
||||
- Michel Bouissou, <michel(at)bouissou.net>: French translation.
|
||||
|
||||
- Olivier Nicole, <on(at)cs.ait.ac.th>: prior version of French translation.
|
||||
|
||||
- Jesse Houwing, <j.houwing(at)student.utwente.nl>: Dutch translation.
|
||||
|
||||
- Alexander Litvinov, <lan(at)ac-sw.com>: Russian translation.
|
||||
|
||||
- Peter Mann, <peter.mann(at)tuke.sk>: Slovak translation.
|
||||
|
||||
- Klaus Heinz, <klaus.heinz(at)onlinehome.de>: German translation.
|
||||
|
||||
|
||||
Patch submitters:
|
||||
|
||||
- Michael Brown, <michaelb(at)opentext.com>: support to build libspamc.so
|
||||
|
||||
- Nick "Nix" Alcock, <nix(at)esperi.org.uk>: DCC fix
|
||||
|
||||
- Bob Apthorpe, <apthorpe+sa(at)cynistar.net>: tools/sa-stats.pl
|
||||
|
||||
- Alan Barrett, <apb(at)cequrux.com>: base64 decoding code
|
||||
|
||||
- Rod Begbie, <rod(at)begbie.com> <rOD-spamassassin(at)arsecandle.org>: DCC
|
||||
bugfixes.
|
||||
|
||||
- Robert Bihlmeyer, <robbe(at)orcus.priv.at>: CHARSET_FARAWAY_HEADERS test
|
||||
|
||||
- Richard Birkett, <richard(at)birkett.com>: patch to a build script.
|
||||
|
||||
- Cassandra Brockett, <cass(at)ophiuchi.net>: brought SQL checks and
|
||||
documentation into line
|
||||
|
||||
- Adrian Bunk, <bunk(at)fs.tum.de>: URI_IS_POUND
|
||||
|
||||
- Ken Causey, <ken(at)kencausey.com>: patch to remove reserved IPs;
|
||||
improve performance of network lookups.
|
||||
|
||||
- Jean Charles Delepine, <delepine(at)u-picardie.fr>: report_charset
|
||||
support.
|
||||
|
||||
- Anirvan Chatterjee, <anirvan(at)chatterjee.net>:
|
||||
<anirvan(at)chatterjee.net>; suggestions for TO_HAS_SPACES rule
|
||||
and TO_ADDRESS_EQ_REAL;
|
||||
|
||||
- Andrey Chernov, <ache(at)nagual.pp.ru>: miscellaneous bug fixes.
|
||||
|
||||
- Christopher Davis, <ckd-spamassassin(at)ckdhr.com>: patch to
|
||||
contrib/spamcheck.py.
|
||||
|
||||
- Vince Delvecchio, <vince.delvecchio(at)analog.com>: negative look-behinds
|
||||
PORN_4
|
||||
|
||||
- Chris Eykamp, <chris(at)eykamp.com>: 4 weight loss rules.
|
||||
|
||||
- David B Funk <dbfunk(at)engineering.uiowa.edu>: Fix for newline collapse
|
||||
causing excessive whitespace rules to not match
|
||||
|
||||
- Alan Ford, <alan(at)whirlnet.co.uk>: Getting a list of symbols of tests hit
|
||||
with spamd; patch for spamd response headers.
|
||||
|
||||
- Scott Griffith, <skod(at)ises-llc.com>: 12 tests; 7 still in codebase.
|
||||
|
||||
- Ask Bjoern Hansen, <ask(at)develooper.com> <ask(at)apache.org>:
|
||||
REPORT_IFSPAM in spamd.
|
||||
|
||||
- Sean Harding, <sharding(at)dogcow.org>: patch for
|
||||
X-Spam-Checker-Version.
|
||||
|
||||
- Klaus Heinz, <klaus.heinz(at)onlinehome.de>: changes to rules;
|
||||
packaging fixes for UNIX package; German translation.
|
||||
|
||||
- Ed Henderson, <ed.henderson(at)certainty.net>: fix for vpopmail support in
|
||||
spamd.
|
||||
|
||||
- David Hull, <hull(at)paracel.com> <hull(at)davidhull.org>:
|
||||
rewrite_subject and report_header; rules
|
||||
|
||||
- Morbus Iff, <morbus(at)disobey.com>: don't create prefs patch.
|
||||
|
||||
- Steve Keay, <steve-spamassassin-bugzilla(at)keay.com>: spamd -A network
|
||||
ranges support.
|
||||
|
||||
- Vivek Khera, <khera(at)kcilink.com>: contributed to Razor2 patch.
|
||||
|
||||
- Alexander Kourakos, <awk(at)bnt.com>: bug fixes.
|
||||
|
||||
- Juergen Kreileder, <kreilede(at)issan.informatik.uni-dortmund.de>:
|
||||
misc fixes; Bayes ignore Gnus annotation
|
||||
|
||||
- Henrik Krohns, <hege@hege.li>, 'uridnssub' keyword for URIDNSBL plugin.
|
||||
|
||||
- David M. Koppelman, <koppel(at)ece.lsu.edu>: bayes_score report
|
||||
fix; bayes expiry time bugfix.
|
||||
|
||||
- Gregor Lawatscheck, <gpel(at)mpex.net>: rule suggestions
|
||||
|
||||
- Michael Lemke, <lemkemch(at)t-online.de>: Win32Locker fix
|
||||
|
||||
- John Levine, <johnl(at)iecc.com>: added --syslog=stderr support.
|
||||
|
||||
- Tom Lipkis, <tal(at)pss.com>: SunOS 4.1.4 support.
|
||||
|
||||
- Dave Lugo, <dlugo(at)etherboy.com>: documentation regarding use of
|
||||
DCC in INSTALL file.
|
||||
|
||||
- Mark Martinec, <Mark.Martinec(at)ijs.si>: contributed to Razor2 patch
|
||||
|
||||
- Morgan Massena, <mmassena(at)ipowerplant.com>: patch to SQL support to
|
||||
allow installer to specify more table details.
|
||||
|
||||
- Kevin McGrail, <kmcgrail(at)pccc.com>: portability fix for
|
||||
Red Hat 5 support.
|
||||
|
||||
- Rob McMillin, <rlm(at)pricegrabber.com> <rfm(at)scareduck.com>:
|
||||
rule fixes.
|
||||
|
||||
- Michael Moncur, <mgm(at)starlingtech.com>: many rules; SQL fix.
|
||||
|
||||
- John Morrissey, <jwm(at)horde.net>: fixed "check_rbl() doesn't check $#ips
|
||||
properly".
|
||||
|
||||
- Dirk Mueller, <mueller(at)kde.org>: spamc low-memory bugfix.
|
||||
|
||||
- Nate Mueller, <nate(at)cs.wisc.edu>: SSL support in spamd/spamc;
|
||||
|
||||
- Rob Nagler, <nagler(at)bivio.com>: speed-up bug fix.
|
||||
|
||||
- Nathan Neulinger, <nneul(at)umr.edu>: security patch; code
|
||||
cleanup; dccifd support.
|
||||
|
||||
- John Newman, <jnewman(at)scms.waikato.ac.nz>: UW .mbx file-format support.
|
||||
|
||||
- Mike Nolan, <nolan(at)naic.edu>: SunOS build directions
|
||||
|
||||
- Martin Östlund, <mo(at)microsaft.nu>: Slackware 9.0 rc-script for spamd.
|
||||
|
||||
- Tomasz Ostrowski, <tometzky(at)batory.org.pl>: perl 5.005 support.
|
||||
|
||||
- Henning P. Schmiedehausen, <hps(at)intermeta.de> <henning(at)apache.org>:
|
||||
adding ? to shell globs.
|
||||
|
||||
- Francesco Potortì, <pot(at)gnu.org>: documentation improvements
|
||||
|
||||
- Alan Premselaar, <alien(at)12inch.com>: rule suggestions.
|
||||
|
||||
- Martin Radford, <martin-sabz(at)zamenhof.demon.co.uk>: rules and
|
||||
rule descriptions.
|
||||
|
||||
- Daniel Rall, <dlr(at)finemaltcoding.com>: documentation fix.
|
||||
|
||||
- Brad Rathbun, <brad(at)computechnv.com>: tools/sa-stats.pl.
|
||||
|
||||
- Xavier Renaut, <node3667(at)users.sourceforge.net>: contrib/spamproxyd
|
||||
bugfix.
|
||||
|
||||
- Bobby Rose, <brose(at)med.wayne.edu>: Pyzor support; dcc_path.
|
||||
|
||||
- Klaus Johannes Rusch, <KlausRusch(at)atmedia.net>: fix for
|
||||
find_all_addrs_in_line().
|
||||
|
||||
- Bart Schaefer, <schaefer(at)zanshin.com>: bug fix for DCC; Razor
|
||||
support; rules.
|
||||
|
||||
- Dianne Skoll, <dfs(at)roaringpenguin.com>: one line change to
|
||||
__OUTLOOK_MUA.
|
||||
|
||||
- Al Smith, <al.smith(at)aeschi.ch.eu.org>: fix to SSL spamd bug.
|
||||
|
||||
- Sander Steffann, <sander(at)steffann.nl>: patch to contrib/spamcheck.py.
|
||||
|
||||
- Michael Stenner, <mstenner(at)phy.duke.edu>: ident authentication support
|
||||
in spamd.
|
||||
|
||||
- Brett A. Thomas, <bthomas(at)vindicia.com>: improved sa-learn
|
||||
command-line API.
|
||||
|
||||
- Reini Urban, <rurban(at)x-ray.at>: fix to Makefile for cygwin
|
||||
|
||||
- Tobias von Koch, <tvk(at)weltcharts.de>: DCC support fixes; rules.
|
||||
|
||||
- Vince Vielhaber, <vev(at)michvhf.com>: spamc -c bugfix patch.
|
||||
|
||||
- Charlie Watts, <cewatts(at)frontier.net>: patch to deal with perl
|
||||
bug on BSD platforms; DNS timeouts.
|
||||
|
||||
- Andrew Wilson, <andrew(at)rivendale.net>: support for MIME::Entity
|
||||
contributed.
|
||||
|
||||
- Jeremy Zawodny, <jeremy(at)zawodny.com>: patch to SQL support.
|
||||
|
||||
- The Little Rubber Ninja Homepage <http://www.unsuave.com/ninja/>, owned
|
||||
by Mike Quinn: source of the original SpamAssassin logo.
|
||||
|
||||
If your name is not here, and you've submitted a patch that was included,
|
||||
it's just an oversight. Please mail me at <jm /at/ jmason.org> and I'll add
|
||||
you to the list.
|
||||
|
||||
ASF Sponsorship:
|
||||
|
||||
SpamAssassin is an Apache Software Foundation project. The Apache Software
|
||||
Foundation could not exist without the continued generous support from the
|
||||
community.
|
||||
|
||||
Please take a moment to view the complete list of sponsors by visiting:
|
||||
|
||||
http://www.apache.org/foundation/thanks.html
|
||||
|
||||
If you are interested in sponsoring the ASF, please read the sponsorship page
|
||||
by visiting:
|
||||
|
||||
http://www.apache.org/foundation/sponsorship.html for more information.
|
||||
|
||||
Resources:
|
||||
|
||||
Thanks to our previous mirrors: Peregrine Computer Consultants Corporation
|
||||
(previously Peregrine Hardware, Inc.) and Kevin A. McGrail, Jeremy Zawodny,
|
||||
Mark Reynolds, RedIRIS, Hagen Herrschaft, and PlanetMirror.
|
||||
|
||||
Thanks to Mark Reynolds of Reynolds Technology (http://www.reynolds.net.au/)
|
||||
for the registration of spamassassin.org.
|
||||
|
||||
Thanks to Kelsey Cummings and Sonic.net (http://www.sonic.net/) for
|
||||
significant contributions with network services, and on the back-end; our
|
||||
score-optimization systems would be significantly weedier without their help.
|
||||
|
||||
http://www.pccc.com/
|
||||
http://www.zawodny.com/
|
||||
http://www.reynolds.net.au/
|
||||
http://www.rediris.es/
|
||||
http://www.hrxnet.de/
|
||||
http://www.planetmirror.com/
|
||||
http://sourceforge.net/
|
||||
http://www.sonic.net/
|
||||
|
||||
Finally:
|
||||
|
||||
Thanks to James Thompson at cPanel Inc who designed our new logo in 2014 and
|
||||
to Christian Rauh, winner of the SpamAssassin logo contest, who created,
|
||||
designed, and illustrated our first Apache SpamAssassin logo.
|
||||
|
3514
upstream/Changes
Normal file
3514
upstream/Changes
Normal file
File diff suppressed because it is too large
Load Diff
481
upstream/INSTALL
Normal file
481
upstream/INSTALL
Normal file
@ -0,0 +1,481 @@
|
||||
Upgrading SpamAssassin?
|
||||
-----------------------
|
||||
|
||||
Please be sure to read the UPGRADE file for important changes that
|
||||
have been made since previous versions. In particular, 3.3.0 no
|
||||
longer includes a default ruleset.
|
||||
|
||||
|
||||
Installing or Upgrading SpamAssassin
|
||||
------------------------------------
|
||||
|
||||
Using CPAN via CPAN.pm:
|
||||
|
||||
perl -MCPAN -e shell [as root]
|
||||
o conf prerequisites_policy ask
|
||||
install Mail::SpamAssassin
|
||||
quit
|
||||
|
||||
Using Linux:
|
||||
|
||||
Debian unstable: apt-get install spamassassin
|
||||
Gentoo: emerge mail-filter/spamassassin
|
||||
Fedora: yum install spamassassin
|
||||
|
||||
Alternatively download the tarfile, zipfile, and/or build your own RPM
|
||||
from https://spamassassin.apache.org/. Building from tar/zip file is
|
||||
usually as simple as:
|
||||
|
||||
[unzip/untar the archive]
|
||||
cd Mail-SpamAssassin-*
|
||||
perl Makefile.PL
|
||||
[option: add -DSPAMC_SSL to $CFLAGS to build an SSL-enabled spamc]
|
||||
make
|
||||
make install [as root]
|
||||
|
||||
After installing SpamAssassin, you need to download and install the
|
||||
SpamAssassin ruleset using "sa-update". See the "Installing Rules"
|
||||
section below.
|
||||
|
||||
Please make sure to read this whole document before installing, especially
|
||||
the prerequisite information further down.
|
||||
|
||||
To install as non-root, see the directions below.
|
||||
|
||||
If you are running AFS, you may also need to specify INSTALLSITELIB and
|
||||
SITELIBEXP.
|
||||
|
||||
Note that you can upgrade SpamAssassin using these instructions, as long
|
||||
as you take care to read the caveats in the file UPGRADE. Upgrading
|
||||
will not delete your learnt Bayes data or local rule modifications.
|
||||
|
||||
If you're using SunOS 4.1.x, see
|
||||
http://wiki.spamassassin.org/w/BuildingOnSunOS4 for build tips.
|
||||
|
||||
|
||||
Installing SpamAssassin for Personal Use (Not System-Wide)
|
||||
----------------------------------------------------------
|
||||
|
||||
These steps assume the following, so substitute as necessary:
|
||||
- Your UNIX login is "user"
|
||||
- Your home directory is /home/user
|
||||
- The location of the procmail executable is /usr/bin/procmail
|
||||
|
||||
Many more details of this process are at
|
||||
http://wiki.apache.org/spamassassin/SingleUserUnixInstall
|
||||
|
||||
1. Uncompress and extract the SpamAssassin archive, using "unzip" or
|
||||
"tar xvfz", in a temporary directory.
|
||||
|
||||
2. change directory into it:
|
||||
|
||||
cd Mail-SpamAssassin-*
|
||||
|
||||
3. Make SpamAssassin as normal, but using your home directory as the
|
||||
target:
|
||||
|
||||
perl Makefile.PL PREFIX=$HOME
|
||||
make
|
||||
make install
|
||||
|
||||
Please see the file PACKAGING, sections "Changing paths in the Makefile"
|
||||
and "Setting further options on the command line" for more information
|
||||
on available command line variables.
|
||||
|
||||
4. Install the SpamAssassin ruleset using "sa-update":
|
||||
|
||||
$HOME/bin/sa-update
|
||||
|
||||
See the "Installing Rules" section below if you do not wish to download
|
||||
the rules directly from the internet.
|
||||
|
||||
NOTE: Because LWP does not support IPv6, sa-update as of 3.4.0 will use
|
||||
the binaries curl, wget or fetch to download rule updates with LWP used
|
||||
as a fallback if none of the binaries exist.
|
||||
|
||||
5. If you already use procmail, skip to step 7. If not, ensure procmail
|
||||
is installed using "which procmail" or install it from www.procmail.org.
|
||||
|
||||
6. Create a .forward file in your home directory containing the below
|
||||
lines:
|
||||
|
||||
"|IFS=' ' && exec /usr/bin/procmail -f- || exit 75 #user"
|
||||
|
||||
7. Edit or create a .procmailrc file in your home directory containing the
|
||||
below lines. If you already have a .procmailrc file, add the lines to
|
||||
the top of your .procmailrc file:
|
||||
|
||||
:0fw: spamassassin.lock
|
||||
| /home/user/bin/spamassassin
|
||||
|
||||
The above line filters all incoming mail through SpamAssassin and tags
|
||||
probable spam with a unique header. If you would prefer to have spam
|
||||
blocked and saved to a file called "caughtspam" in your home directory,
|
||||
instead of passed through and tagged, append this directly below the above
|
||||
lines:
|
||||
|
||||
:0:
|
||||
* ^X-Spam-Status: Yes
|
||||
caughtspam
|
||||
|
||||
Also, see the file procmailrc.example and
|
||||
http://wiki.apache.org/spamassassin/UsedViaProcmail
|
||||
|
||||
8. Now, you should be ready to send some test emails and ensure everything
|
||||
works as expected. First, send yourself a test email that doesn't
|
||||
contain anything suspicious. You should receive it normally, but there
|
||||
will be a header containing "X-Spam-Status: No". If you are only
|
||||
tagging your spam, send yourself a copy of the GTUBE test string to
|
||||
check to be sure it is marked as spam. GTUBE is located in the
|
||||
sample-spam.txt message distributed with SpamAssassin and also at:
|
||||
|
||||
https://spamassassin.apache.org/gtube/
|
||||
|
||||
If your test emails don't get through to you, immediately rename your
|
||||
.forward file until you figure out cause of the the problem, so you
|
||||
don't lose incoming email.
|
||||
|
||||
Note: one possible cause for this is the use of smrsh on the MTA system;
|
||||
see http://wiki.spamassassin.org/w/ProcmailVsSmrsh for details.
|
||||
|
||||
9. You can now customize SpamAssassin. See README for more information.
|
||||
|
||||
|
||||
Installing Rules
|
||||
----------------
|
||||
|
||||
Rules are normally installed by running a sa-update command.
|
||||
The version of sa-update program should match the version of SpamAssassin
|
||||
modules, so invoking sa-update should be performed only after installing
|
||||
or upgrading SpamAssassin code, not before.
|
||||
|
||||
Installing rules from network is done with a single command:
|
||||
|
||||
sa-update
|
||||
|
||||
This is normally run as root.
|
||||
|
||||
If you wish to install rules from downloaded files, rather than "live" from
|
||||
the latest online ruleset, here is how to do it.
|
||||
|
||||
Obtain all the following files:
|
||||
|
||||
Mail-SpamAssassin-rules-xxx.tgz
|
||||
Mail-SpamAssassin-rules-xxx.tgz.asc
|
||||
Mail-SpamAssassin-rules-xxx.tgz.md5
|
||||
Mail-SpamAssassin-rules-xxx.tgz.sha1
|
||||
(where xxx may look something like '3.3.0-rc1.r893295')
|
||||
|
||||
Save them all to the current directory.
|
||||
Obtain a rules-signing public key:
|
||||
|
||||
curl -O https://spamassassin.apache.org/updates/GPG.KEY
|
||||
|
||||
Import the signing key to the SpamAssassin gpg keyring, so that the rules
|
||||
files can be verified safely:
|
||||
|
||||
sa-update --import GPG.KEY
|
||||
|
||||
Install rules from a compressed tar archive:
|
||||
|
||||
sa-update --install Mail-SpamAssassin-rules-xxx.tgz
|
||||
|
||||
Note that the ".tgz.asc", ".tgz.md5" and ".tgz.sha1" files all need to
|
||||
be in the same directory, otherwise sa-update will fail.
|
||||
|
||||
|
||||
If the intended rules destination directory differs from a default location
|
||||
as assumed by sa-update and SpamAssassin, such as when running a content
|
||||
filter within a Unix jail or on an unusual installation, please supply the
|
||||
rules destination directory to sa-update through its option --updatedir,
|
||||
such as:
|
||||
|
||||
sa-update --updatedir /var/jail/var/db/spamassassin/3.003000
|
||||
|
||||
|
||||
CPAN
|
||||
----
|
||||
|
||||
Most of the modules listed below are available via the Comprehensive Perl
|
||||
Archive Network (CPAN, see http://www.cpan.org/ for more information).
|
||||
While each module is different, most can be installed via a few simple
|
||||
commands such as:
|
||||
|
||||
$ perl -MCPAN -e shell
|
||||
cpan> o conf prerequisites_policy ask
|
||||
cpan> install Module::Name
|
||||
cpan> quit
|
||||
|
||||
If there are problems or questions regarding the installation any of the
|
||||
modules, please see the CPAN and relevant module's documentation for more
|
||||
information. We can't provide documentation or installation support for
|
||||
third party modules.
|
||||
|
||||
Additional information about the CPAN module is also available via
|
||||
"perldoc CPAN".
|
||||
|
||||
Most Linux distributions also offer the CPAN modules in their own native
|
||||
formats (RPMs, Debian packages, etc.), so you should be able to find these
|
||||
through those mechanisms, too, if you prefer.
|
||||
|
||||
|
||||
Required Perl Interpreter
|
||||
-------------------------
|
||||
|
||||
Perl 5.8.1 or a later version is required.
|
||||
Preferred versions are 5.8.8, or 5.10.1 or later.
|
||||
|
||||
Most of the functions might still work with Perl 5.6.1 or 5.6.2,
|
||||
but 5.6.* is no longer a supported version.
|
||||
|
||||
|
||||
Required Perl Modules
|
||||
---------------------
|
||||
|
||||
In addition to the modules associated with Perl, some additional modules
|
||||
need to be installed or upgraded depending on the version of Perl that you
|
||||
are running.
|
||||
|
||||
You can get an immediate report on which of these modules you may need (or
|
||||
want) to upgrade, by running "perl build/check_dependencies" from the
|
||||
SpamAssassin build directory.
|
||||
|
||||
The list of required modules that do not ship with Perl and must be
|
||||
installed:
|
||||
|
||||
- Digest::SHA1 (from CPAN),
|
||||
or the newer Digest::SHA which is a perl base module since Perl 5.10.0
|
||||
|
||||
The Digest::SHA1 module is used as a cryptographic hash for some
|
||||
tests and the Bayes subsystem if Digest::SHA module is not available.
|
||||
|
||||
An external perl module razor-agents-2.84 as used by a Razor2 plugin
|
||||
seems to be the only remaining component depending on Digest::SHA1
|
||||
(note that a packager may ship a patched version of razor-agents which
|
||||
can use Digest::SHA instead)
|
||||
|
||||
Debian: apt-get install libdigest-sha1-perl
|
||||
Gentoo: emerge dev-perl/Digest-SHA1
|
||||
Fedora: yum install perl-Digest-SHA1
|
||||
|
||||
- HTML::Parser >= 3.43 (from CPAN)
|
||||
|
||||
HTML is used for an ever-increasing amount of email so this dependency
|
||||
is unavoidable. Run "perldoc -q html" for additional information.
|
||||
|
||||
Debian: apt-get install libhtml-parser-perl
|
||||
Gentoo: emerge dev-perl/HTML-Parser
|
||||
Fedora: yum install perl-HTML-Parser
|
||||
|
||||
- Net::DNS (from CPAN)
|
||||
|
||||
Used for all DNS-based tests (SBL, XBL, SpamCop, DSBL, etc.),
|
||||
perform MX checks, used when manually reporting spam to SpamCop,
|
||||
and used by sa-update to gather version information.
|
||||
|
||||
You need to make sure the Net::DNS version is sufficiently up-to-date:
|
||||
|
||||
- version 0.34 or higher on Unix systems
|
||||
- version 0.46 or higher on Windows systems
|
||||
|
||||
Debian/Ubuntu: apt-get install libnet-dns-perl
|
||||
Fedora: yum install perl-Net-DNS
|
||||
|
||||
- NetAddr::IP (from CPAN)
|
||||
|
||||
Used to parse IP addresses and IP address ranges for
|
||||
"trusted_networks".
|
||||
|
||||
Debian/Ubuntu: apt-get install libnetaddr-ip-perl
|
||||
Fedora: yum install perl-NetAddr-IP
|
||||
|
||||
- Time::HiRes (from CPAN)
|
||||
|
||||
Used by asynchronous DNS lookups to operate timeouts with subsecond
|
||||
precision and to report processing times accurately.
|
||||
|
||||
- LWP (aka libwww-perl) (from CPAN)
|
||||
|
||||
This set of modules will include both the LWP::UserAgent and
|
||||
HTTP::Date modules, used by sa-update to retrieve update archives.
|
||||
|
||||
Fedora: yum install perl-libwww-perl
|
||||
|
||||
- HTTP::Date (from CPAN)
|
||||
|
||||
Used by sa-update to deal with certain Date requests.
|
||||
|
||||
- IO::Zlib (from CPAN)
|
||||
|
||||
Used by sa-update to uncompress update archives.
|
||||
Version 1.04 or later is required.
|
||||
|
||||
Fedora: yum install perl-IO-Zlib
|
||||
|
||||
- Archive::Tar (from CPAN)
|
||||
|
||||
Used by sa-update to expand update archives.
|
||||
Version 1.23 or later is required.
|
||||
|
||||
Fedora: yum install perl-Archive-Tar
|
||||
|
||||
|
||||
Optional Modules
|
||||
----------------
|
||||
|
||||
In addition, the following modules will be used for some checks, if
|
||||
available and the version is high enough. If they are not available or if
|
||||
their version is too low, SpamAssassin will still work, just not as
|
||||
effectively because some of the spam-detection tests will have to be
|
||||
skipped.
|
||||
|
||||
Note: SpamAssassin will not warn you if these are installed, but the
|
||||
version is too low for them to be used.
|
||||
|
||||
- MIME::Base64
|
||||
|
||||
This module is highly recommended to increase the speed with which
|
||||
Base64 encoded messages/mail parts are decoded.
|
||||
|
||||
|
||||
- DB_File (from CPAN, included in many distributions)
|
||||
|
||||
Used to store data on-disk, for the Bayes-style logic, TxRep, and
|
||||
auto-whitelist. *Much* more efficient than the other standard Perl
|
||||
database packages. Strongly recommended.
|
||||
|
||||
There seems to be a bug in libdb 4.1.25, which is
|
||||
distributed by default on some versions of Linux. See
|
||||
http://wiki.apache.org/spamassassin/DbFileSleepBug for details.
|
||||
|
||||
|
||||
- Net::SMTP (from CPAN)
|
||||
|
||||
Used when manually reporting spam to SpamCop.
|
||||
|
||||
|
||||
- Mail::SPF (from CPAN)
|
||||
|
||||
Used to check DNS Sender Policy Framework (SPF) records to fight email
|
||||
address forgery and make it easier to identify spams. This module
|
||||
makes Mail::SPF::Query obsolete.
|
||||
|
||||
Net::DNS version 0.58 or higher is required.
|
||||
|
||||
Note that NetAddr::IP (required by Mail::SPF) versions up to and
|
||||
including version 4.006 include a bug that will slow down the entire
|
||||
perl interpreter. NetAddr::IP version 4.007 or later fixes this.
|
||||
|
||||
|
||||
- Geo::IP (from CPAN)
|
||||
|
||||
Used by the RelayCountry plugin (not enabled by default) to determine
|
||||
the domain country codes of each relay in the path of an email.
|
||||
|
||||
IP::Country::Fast is used as alternative if Geo::IP is not installed.
|
||||
This is not recommended as it's obsolete.
|
||||
|
||||
|
||||
- Net::Ident (from CPAN)
|
||||
|
||||
If you plan to use the --auth-ident option to spamd, you will need
|
||||
to install this module.
|
||||
|
||||
|
||||
- IO::Socket::INET6 (from CPAN)
|
||||
|
||||
This is required if the first nameserver listed in your IP
|
||||
configuration or /etc/resolv.conf file is available only via an IPv6
|
||||
address.
|
||||
|
||||
Fedora: yum install perl-IO-Socket-INET6
|
||||
|
||||
|
||||
- IO::Socket::SSL (from CPAN)
|
||||
|
||||
If you wish to use SSL encryption to communicate between spamc and
|
||||
spamd (the --ssl option to spamd), you need to install this
|
||||
module. (You will need the OpenSSL libraries and use the
|
||||
ENABLE_SSL="yes" argument to Makefile.PL to build and run an SSL
|
||||
compatible spamc.)
|
||||
|
||||
Fedora: yum install perl-IO-Socket-SSL
|
||||
|
||||
|
||||
- Compress::Zlib (from CPAN)
|
||||
|
||||
If you wish to use the optional zlib compression for communication
|
||||
between spamc and spamd (the -z option to spamc), useful for
|
||||
long-distance use of spamc over the internet, you need to install
|
||||
this module.
|
||||
|
||||
Fedora: yum install perl-Compress-Zlib
|
||||
|
||||
|
||||
- Mail::DKIM (from CPAN)
|
||||
|
||||
If this module is installed, and you enable the DKIM plugin,
|
||||
SpamAssassin will perform DKIM lookups when a DKIM-Signature header is
|
||||
present in the message headers. Current versions of Mail::DKIM (0.20
|
||||
or later) also perform Domain Key lookups on DomainKey-Signature headers,
|
||||
without requiring the Mail::DomainKeys module, which is now obsolete.
|
||||
Version 0.37 or later is preferred, the absolute minimal version is 0.31.
|
||||
|
||||
Note that the Mail::DKIM module in turn requires the Digest::SHA module
|
||||
and OpenSSL libraries.
|
||||
|
||||
|
||||
- DBI *and* DBD driver/modules for your database (from CPAN)
|
||||
|
||||
If you intend to use SpamAssassin with an SQL database backend for
|
||||
user configuration data, Bayes storage, or other storage, you will need
|
||||
to have these installed; both the basic DBI module and the driver for
|
||||
your database.
|
||||
|
||||
|
||||
- Encode::Detect (from CPAN)
|
||||
|
||||
If you plan to use the normalize_charset config setting to detect
|
||||
charsets and convert them into Unicode, you will need to install
|
||||
this module.
|
||||
|
||||
|
||||
- Apache::Test (from CPAN)
|
||||
|
||||
If you plan to run the Apache2 version of spamd in the
|
||||
"spamd-apache2" directory, you will need to install this
|
||||
module.
|
||||
|
||||
|
||||
- Apache 2 and mod_perl
|
||||
|
||||
If you plan to run the Apache2 version of spamd in the "spamd-apache2"
|
||||
directory, you will need to ensure these are installed.
|
||||
|
||||
Ubuntu: sudo apt-get install apache2 libapache2-mod-perl2
|
||||
|
||||
|
||||
- Razor2
|
||||
|
||||
If you plan to use Vipul's Razor, note that versions up to and
|
||||
including version 2.82 include a bug that will slow down the entire
|
||||
perl interpreter. Version 2.83 or later fixes this.
|
||||
|
||||
If you do not plan to use this plugin, be sure to comment out
|
||||
its loadplugin line in "/etc/mail/spamassassin/v310.pre".
|
||||
|
||||
|
||||
What Next?
|
||||
----------
|
||||
|
||||
Take a look at the USAGE document for more information on how to use
|
||||
SpamAssassin.
|
||||
|
||||
The SpamAssassin Wiki <http://wiki.spamassassin.org/> contains
|
||||
information on custom plugins, extensions, and other optional modules
|
||||
included with SpamAssassin.
|
||||
|
||||
|
||||
(end of INSTALL)
|
||||
|
||||
// vim:tw=74:
|
35
upstream/INSTALL.VMS
Normal file
35
upstream/INSTALL.VMS
Normal file
@ -0,0 +1,35 @@
|
||||
|
||||
Notes on building SpamAssassin on VMS
|
||||
-------------------------------------
|
||||
|
||||
- Firstly, do not do a default build; this will build a set of C components
|
||||
which currently use Autoconf to determine their dependencies, to build
|
||||
"spamc".
|
||||
|
||||
Instead, run
|
||||
|
||||
perl Makefile.PL
|
||||
make spamassassin
|
||||
|
||||
If you want "spamc" I suggest you provide a replacement header and
|
||||
binaries.mk we can use, then we can special-case VMS in Makefile.PL to use
|
||||
those instead of running autoconf.
|
||||
|
||||
- The generated Makefile.PL no longer relies on UNIX input/output
|
||||
redirection; /bin/sh "for" loops; @foo "silent" directives; backslashed
|
||||
continuations; or [ -f foo ] || somecommand conditionals. At least, not in
|
||||
our code any more ;)
|
||||
|
||||
- There are still some UNIX paths inside the generated Makefile:
|
||||
"rules/local.cf" for example. Suggestions on how to work around this
|
||||
neatly inside the Makefile syntax while gaining VMS compatibility would be
|
||||
welcome!
|
||||
|
||||
- "make test" will probably rely heavily on UNIX-style input and output
|
||||
redirection, so will probably still fail.
|
||||
|
||||
- bug 1099 in the SA Bugzilla is being used to track progress.
|
||||
|
||||
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=1099
|
||||
|
||||
|
202
upstream/LICENSE
Normal file
202
upstream/LICENSE
Normal file
@ -0,0 +1,202 @@
|
||||
|
||||
Apache License
|
||||
Version 2.0, January 2004
|
||||
http://www.apache.org/licenses/
|
||||
|
||||
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
||||
|
||||
1. Definitions.
|
||||
|
||||
"License" shall mean the terms and conditions for use, reproduction,
|
||||
and distribution as defined by Sections 1 through 9 of this document.
|
||||
|
||||
"Licensor" shall mean the copyright owner or entity authorized by
|
||||
the copyright owner that is granting the License.
|
||||
|
||||
"Legal Entity" shall mean the union of the acting entity and all
|
||||
other entities that control, are controlled by, or are under common
|
||||
control with that entity. For the purposes of this definition,
|
||||
"control" means (i) the power, direct or indirect, to cause the
|
||||
direction or management of such entity, whether by contract or
|
||||
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
||||
outstanding shares, or (iii) beneficial ownership of such entity.
|
||||
|
||||
"You" (or "Your") shall mean an individual or Legal Entity
|
||||
exercising permissions granted by this License.
|
||||
|
||||
"Source" form shall mean the preferred form for making modifications,
|
||||
including but not limited to software source code, documentation
|
||||
source, and configuration files.
|
||||
|
||||
"Object" form shall mean any form resulting from mechanical
|
||||
transformation or translation of a Source form, including but
|
||||
not limited to compiled object code, generated documentation,
|
||||
and conversions to other media types.
|
||||
|
||||
"Work" shall mean the work of authorship, whether in Source or
|
||||
Object form, made available under the License, as indicated by a
|
||||
copyright notice that is included in or attached to the work
|
||||
(an example is provided in the Appendix below).
|
||||
|
||||
"Derivative Works" shall mean any work, whether in Source or Object
|
||||
form, that is based on (or derived from) the Work and for which the
|
||||
editorial revisions, annotations, elaborations, or other modifications
|
||||
represent, as a whole, an original work of authorship. For the purposes
|
||||
of this License, Derivative Works shall not include works that remain
|
||||
separable from, or merely link (or bind by name) to the interfaces of,
|
||||
the Work and Derivative Works thereof.
|
||||
|
||||
"Contribution" shall mean any work of authorship, including
|
||||
the original version of the Work and any modifications or additions
|
||||
to that Work or Derivative Works thereof, that is intentionally
|
||||
submitted to Licensor for inclusion in the Work by the copyright owner
|
||||
or by an individual or Legal Entity authorized to submit on behalf of
|
||||
the copyright owner. For the purposes of this definition, "submitted"
|
||||
means any form of electronic, verbal, or written communication sent
|
||||
to the Licensor or its representatives, including but not limited to
|
||||
communication on electronic mailing lists, source code control systems,
|
||||
and issue tracking systems that are managed by, or on behalf of, the
|
||||
Licensor for the purpose of discussing and improving the Work, but
|
||||
excluding communication that is conspicuously marked or otherwise
|
||||
designated in writing by the copyright owner as "Not a Contribution."
|
||||
|
||||
"Contributor" shall mean Licensor and any individual or Legal Entity
|
||||
on behalf of whom a Contribution has been received by Licensor and
|
||||
subsequently incorporated within the Work.
|
||||
|
||||
2. Grant of Copyright License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
copyright license to reproduce, prepare Derivative Works of,
|
||||
publicly display, publicly perform, sublicense, and distribute the
|
||||
Work and such Derivative Works in Source or Object form.
|
||||
|
||||
3. Grant of Patent License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
(except as stated in this section) patent license to make, have made,
|
||||
use, offer to sell, sell, import, and otherwise transfer the Work,
|
||||
where such license applies only to those patent claims licensable
|
||||
by such Contributor that are necessarily infringed by their
|
||||
Contribution(s) alone or by combination of their Contribution(s)
|
||||
with the Work to which such Contribution(s) was submitted. If You
|
||||
institute patent litigation against any entity (including a
|
||||
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
||||
or a Contribution incorporated within the Work constitutes direct
|
||||
or contributory patent infringement, then any patent licenses
|
||||
granted to You under this License for that Work shall terminate
|
||||
as of the date such litigation is filed.
|
||||
|
||||
4. Redistribution. You may reproduce and distribute copies of the
|
||||
Work or Derivative Works thereof in any medium, with or without
|
||||
modifications, and in Source or Object form, provided that You
|
||||
meet the following conditions:
|
||||
|
||||
(a) You must give any other recipients of the Work or
|
||||
Derivative Works a copy of this License; and
|
||||
|
||||
(b) You must cause any modified files to carry prominent notices
|
||||
stating that You changed the files; and
|
||||
|
||||
(c) You must retain, in the Source form of any Derivative Works
|
||||
that You distribute, all copyright, patent, trademark, and
|
||||
attribution notices from the Source form of the Work,
|
||||
excluding those notices that do not pertain to any part of
|
||||
the Derivative Works; and
|
||||
|
||||
(d) If the Work includes a "NOTICE" text file as part of its
|
||||
distribution, then any Derivative Works that You distribute must
|
||||
include a readable copy of the attribution notices contained
|
||||
within such NOTICE file, excluding those notices that do not
|
||||
pertain to any part of the Derivative Works, in at least one
|
||||
of the following places: within a NOTICE text file distributed
|
||||
as part of the Derivative Works; within the Source form or
|
||||
documentation, if provided along with the Derivative Works; or,
|
||||
within a display generated by the Derivative Works, if and
|
||||
wherever such third-party notices normally appear. The contents
|
||||
of the NOTICE file are for informational purposes only and
|
||||
do not modify the License. You may add Your own attribution
|
||||
notices within Derivative Works that You distribute, alongside
|
||||
or as an addendum to the NOTICE text from the Work, provided
|
||||
that such additional attribution notices cannot be construed
|
||||
as modifying the License.
|
||||
|
||||
You may add Your own copyright statement to Your modifications and
|
||||
may provide additional or different license terms and conditions
|
||||
for use, reproduction, or distribution of Your modifications, or
|
||||
for any such Derivative Works as a whole, provided Your use,
|
||||
reproduction, and distribution of the Work otherwise complies with
|
||||
the conditions stated in this License.
|
||||
|
||||
5. Submission of Contributions. Unless You explicitly state otherwise,
|
||||
any Contribution intentionally submitted for inclusion in the Work
|
||||
by You to the Licensor shall be under the terms and conditions of
|
||||
this License, without any additional terms or conditions.
|
||||
Notwithstanding the above, nothing herein shall supersede or modify
|
||||
the terms of any separate license agreement you may have executed
|
||||
with Licensor regarding such Contributions.
|
||||
|
||||
6. Trademarks. This License does not grant permission to use the trade
|
||||
names, trademarks, service marks, or product names of the Licensor,
|
||||
except as required for reasonable and customary use in describing the
|
||||
origin of the Work and reproducing the content of the NOTICE file.
|
||||
|
||||
7. Disclaimer of Warranty. Unless required by applicable law or
|
||||
agreed to in writing, Licensor provides the Work (and each
|
||||
Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
implied, including, without limitation, any warranties or conditions
|
||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||
appropriateness of using or redistributing the Work and assume any
|
||||
risks associated with Your exercise of permissions under this License.
|
||||
|
||||
8. Limitation of Liability. In no event and under no legal theory,
|
||||
whether in tort (including negligence), contract, or otherwise,
|
||||
unless required by applicable law (such as deliberate and grossly
|
||||
negligent acts) or agreed to in writing, shall any Contributor be
|
||||
liable to You for damages, including any direct, indirect, special,
|
||||
incidental, or consequential damages of any character arising as a
|
||||
result of this License or out of the use or inability to use the
|
||||
Work (including but not limited to damages for loss of goodwill,
|
||||
work stoppage, computer failure or malfunction, or any and all
|
||||
other commercial damages or losses), even if such Contributor
|
||||
has been advised of the possibility of such damages.
|
||||
|
||||
9. Accepting Warranty or Additional Liability. While redistributing
|
||||
the Work or Derivative Works thereof, You may choose to offer,
|
||||
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||
or other liability obligations and/or rights consistent with this
|
||||
License. However, in accepting such obligations, You may act only
|
||||
on Your own behalf and on Your sole responsibility, not on behalf
|
||||
of any other Contributor, and only if You agree to indemnify,
|
||||
defend, and hold each Contributor harmless for any liability
|
||||
incurred by, or claims asserted against, such Contributor by reason
|
||||
of your accepting any such warranty or additional liability.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
APPENDIX: How to apply the Apache License to your work.
|
||||
|
||||
To apply the Apache License to your work, attach the following
|
||||
boilerplate notice, with the fields enclosed by brackets "[]"
|
||||
replaced with your own identifying information. (Don't include
|
||||
the brackets!) The text should be enclosed in the appropriate
|
||||
comment syntax for the file format. We also recommend that a
|
||||
file or class name and description of purpose be included on the
|
||||
same "printed page" as the copyright notice for easier
|
||||
identification within third-party archives.
|
||||
|
||||
Copyright [yyyy] [name of copyright owner]
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
585
upstream/MANIFEST
Normal file
585
upstream/MANIFEST
Normal file
@ -0,0 +1,585 @@
|
||||
CREDITS
|
||||
Changes
|
||||
INSTALL
|
||||
INSTALL.VMS
|
||||
LICENSE
|
||||
MANIFEST
|
||||
MANIFEST.SKIP
|
||||
Makefile.PL
|
||||
NOTICE
|
||||
PACKAGING
|
||||
README
|
||||
TRADEMARK
|
||||
UPGRADE
|
||||
USAGE
|
||||
build/check_dependencies
|
||||
build/convert_pods_to_doc
|
||||
build/get_version
|
||||
build/mkrules
|
||||
build/parse-rules-for-masses
|
||||
build/preprocessor
|
||||
build/sha256sum.pl
|
||||
build/sha512sum.pl
|
||||
ldap/README
|
||||
ldap/README.testing
|
||||
ldap/sa_test.ldif
|
||||
lib/Mail/SpamAssassin.pm
|
||||
lib/Mail/SpamAssassin/AICache.pm
|
||||
lib/Mail/SpamAssassin/ArchiveIterator.pm
|
||||
lib/Mail/SpamAssassin/AsyncLoop.pm
|
||||
lib/Mail/SpamAssassin/AutoWhitelist.pm
|
||||
lib/Mail/SpamAssassin/Bayes.pm
|
||||
lib/Mail/SpamAssassin/Bayes/CombineChi.pm
|
||||
lib/Mail/SpamAssassin/Bayes/CombineNaiveBayes.pm
|
||||
lib/Mail/SpamAssassin/BayesStore.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/BDB.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/DBM.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/MySQL.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/PgSQL.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/Redis.pm
|
||||
lib/Mail/SpamAssassin/Util/TinyRedis.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/SDBM.pm
|
||||
lib/Mail/SpamAssassin/BayesStore/SQL.pm
|
||||
lib/Mail/SpamAssassin/Client.pm
|
||||
lib/Mail/SpamAssassin/Conf.pm
|
||||
lib/Mail/SpamAssassin/Conf/LDAP.pm
|
||||
lib/Mail/SpamAssassin/Conf/Parser.pm
|
||||
lib/Mail/SpamAssassin/Conf/SQL.pm
|
||||
lib/Mail/SpamAssassin/Constants.pm
|
||||
lib/Mail/SpamAssassin/DBBasedAddrList.pm
|
||||
lib/Mail/SpamAssassin/Dns.pm
|
||||
lib/Mail/SpamAssassin/DnsResolver.pm
|
||||
lib/Mail/SpamAssassin/HTML.pm
|
||||
lib/Mail/SpamAssassin/Locales.pm
|
||||
lib/Mail/SpamAssassin/Locker.pm
|
||||
lib/Mail/SpamAssassin/Locker/Flock.pm
|
||||
lib/Mail/SpamAssassin/Locker/UnixNFSSafe.pm
|
||||
lib/Mail/SpamAssassin/Locker/Win32.pm
|
||||
lib/Mail/SpamAssassin/Logger.pm
|
||||
lib/Mail/SpamAssassin/Logger/File.pm
|
||||
lib/Mail/SpamAssassin/Logger/Stderr.pm
|
||||
lib/Mail/SpamAssassin/Logger/Syslog.pm
|
||||
lib/Mail/SpamAssassin/MailingList.pm
|
||||
lib/Mail/SpamAssassin/Message.pm
|
||||
lib/Mail/SpamAssassin/Message/Metadata.pm
|
||||
lib/Mail/SpamAssassin/Message/Metadata/Received.pm
|
||||
lib/Mail/SpamAssassin/Message/Node.pm
|
||||
lib/Mail/SpamAssassin/NetSet.pm
|
||||
lib/Mail/SpamAssassin/PerMsgLearner.pm
|
||||
lib/Mail/SpamAssassin/PerMsgStatus.pm
|
||||
lib/Mail/SpamAssassin/PersistentAddrList.pm
|
||||
lib/Mail/SpamAssassin/Plugin.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AskDNS.pm
|
||||
lib/Mail/SpamAssassin/Plugin/ASN.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AWL.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AccessDB.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AntiVirus.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AutoLearnThreshold.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Bayes.pm
|
||||
lib/Mail/SpamAssassin/Plugin/BodyEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/BodyRuleBaseExtractor.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Check.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DCC.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DKIM.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DNSEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/FreeMail.pm
|
||||
lib/Mail/SpamAssassin/Plugin/FromNameSpoof.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HTMLEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HTTPSMismatch.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Hashcash.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HashBL.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HeaderEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/ImageInfo.pm
|
||||
lib/Mail/SpamAssassin/Plugin/MIMEEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/MIMEHeader.pm
|
||||
lib/Mail/SpamAssassin/Plugin/OLEVBMacro.pm
|
||||
lib/Mail/SpamAssassin/Plugin/OneLineBodyRuleType.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Phishing.pm
|
||||
lib/Mail/SpamAssassin/Plugin/PhishTag.pm
|
||||
lib/Mail/SpamAssassin/Plugin/PDFInfo.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Pyzor.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Razor2.pm
|
||||
lib/Mail/SpamAssassin/Plugin/RelayCountry.pm
|
||||
lib/Mail/SpamAssassin/Plugin/RelayEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/ResourceLimits.pm
|
||||
lib/Mail/SpamAssassin/Plugin/ReplaceTags.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Reuse.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Rule2XSBody.pm
|
||||
lib/Mail/SpamAssassin/Plugin/SPF.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Shortcircuit.pm
|
||||
lib/Mail/SpamAssassin/Plugin/SpamCop.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Test.pm
|
||||
lib/Mail/SpamAssassin/Plugin/TextCat.pm
|
||||
lib/Mail/SpamAssassin/Plugin/TxRep.pm
|
||||
lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm
|
||||
lib/Mail/SpamAssassin/Plugin/URIDetail.pm
|
||||
lib/Mail/SpamAssassin/Plugin/URIEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/VBounce.pm
|
||||
lib/Mail/SpamAssassin/Plugin/WLBLEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/WhiteListSubject.pm
|
||||
lib/Mail/SpamAssassin/PluginHandler.pm
|
||||
lib/Mail/SpamAssassin/Plugin/URILocalBL.pm
|
||||
lib/Mail/SpamAssassin/RegistryBoundaries.pm
|
||||
lib/Mail/SpamAssassin/Reporter.pm
|
||||
lib/Mail/SpamAssassin/SQLBasedAddrList.pm
|
||||
lib/Mail/SpamAssassin/SpamdForkScaling.pm
|
||||
lib/Mail/SpamAssassin/SubProcBackChannel.pm
|
||||
lib/Mail/SpamAssassin/Timeout.pm
|
||||
lib/Mail/SpamAssassin/Util.pm
|
||||
lib/Mail/SpamAssassin/Util/DependencyInfo.pm
|
||||
lib/Mail/SpamAssassin/Util/Progress.pm
|
||||
lib/Mail/SpamAssassin/Util/ScopedTimer.pm
|
||||
lib/Mail/SpamAssassin/Util/TieOneStringHash.pm
|
||||
lib/spamassassin-run.pod
|
||||
procmailrc.example
|
||||
rules.README
|
||||
rules/active.list
|
||||
rules/init.pre
|
||||
rules/languages
|
||||
rules/local.cf
|
||||
rules/regression_tests.cf
|
||||
rules/sa-update-pubkey.txt
|
||||
rules/user_prefs.template
|
||||
rules/v310.pre
|
||||
rules/v312.pre
|
||||
rules/v320.pre
|
||||
rules/v330.pre
|
||||
rules/v340.pre
|
||||
rules/v341.pre
|
||||
rules/v342.pre
|
||||
rules/v343.pre
|
||||
rules/20_aux_tlds.cf
|
||||
sa-awl.raw
|
||||
sa-check_spamd.raw
|
||||
sa-compile.raw
|
||||
sa-learn.raw
|
||||
sa-update.raw
|
||||
sample-nonspam.txt
|
||||
sample-spam.txt
|
||||
spamassassin.raw
|
||||
spamc/Makefile.in
|
||||
spamc/Makefile.win
|
||||
spamc/README.qmail
|
||||
spamc/README.win
|
||||
spamc/acconfig.h
|
||||
spamc/config.h.in
|
||||
spamc/config.h.win
|
||||
spamc/configure
|
||||
spamc/configure.in
|
||||
spamc/configure.pl
|
||||
spamc/getopt.c
|
||||
spamc/getopt.h
|
||||
spamc/libspamc.c
|
||||
spamc/libspamc.h
|
||||
spamc/qmail-spamc.c
|
||||
spamc/spamc.c
|
||||
spamc/spamc.h.in
|
||||
spamc/spamc.h.win
|
||||
spamc/spamc.pod
|
||||
spamc/utils.c
|
||||
spamc/utils.h
|
||||
spamc/version.h.in
|
||||
spamc/version.h.pl
|
||||
spamd-apache2/MANIFEST
|
||||
spamd-apache2/MANIFEST.SKIP
|
||||
spamd-apache2/META.yml
|
||||
spamd-apache2/Makefile.PL
|
||||
spamd-apache2/README.apache
|
||||
spamd-apache2/bin/Bench-spamd.pl
|
||||
spamd-apache2/bin/Spamd
|
||||
spamd-apache2/bin/apache-spamd.pl
|
||||
spamd-apache2/lib/Mail/SpamAssassin/Spamd.pm
|
||||
spamd-apache2/lib/Mail/SpamAssassin/Spamd/Apache2.pm
|
||||
spamd-apache2/lib/Mail/SpamAssassin/Spamd/Apache2/AclIP.pm
|
||||
spamd-apache2/lib/Mail/SpamAssassin/Spamd/Apache2/AclRFC1413.pm
|
||||
spamd-apache2/lib/Mail/SpamAssassin/Spamd/Apache2/Config.pm
|
||||
spamd-apache2/lib/Mail/SpamAssassin/Spamd/Config.pm
|
||||
spamd-apache2/t/30run.t
|
||||
spamd-apache2/t/TEST.PL
|
||||
spamd-apache2/t/certs/Makefile
|
||||
spamd-apache2/t/certs/server.crt
|
||||
spamd-apache2/t/certs/server.csr
|
||||
spamd-apache2/t/certs/server.key
|
||||
spamd-apache2/t/conf/extra.last.conf.in
|
||||
spamd/OSXStartup/README
|
||||
spamd/OSXStartup/SAspamd
|
||||
spamd/OSXStartup/StartupParameters.plist
|
||||
spamd/PROTOCOL
|
||||
spamd/README
|
||||
spamd/README.SuSE
|
||||
spamd/README.vpopmail
|
||||
spamd/netbsd-rc-script.sh
|
||||
spamd/redhat-rc-script.sh
|
||||
spamd/slackware-rc-script.sh
|
||||
spamd/solaris-rc-script.sh
|
||||
spamd/spamd.raw
|
||||
spamd/suse-ancient-rc-script.sh
|
||||
sql/README
|
||||
sql/README.awl
|
||||
sql/README.bayes
|
||||
sql/README.txrep
|
||||
sql/awl_mysql.sql
|
||||
sql/awl_pg.sql
|
||||
sql/bayes_mysql.sql
|
||||
sql/bayes_pg.sql
|
||||
sql/userpref_mysql.sql
|
||||
sql/userpref_pg.sql
|
||||
sql/txrep_mysql.sql
|
||||
sql/txrep_pg.sql
|
||||
sql/txrep_sqlite.sql
|
||||
t/README
|
||||
t/SATest.pl
|
||||
t/SATest.pm
|
||||
t/all_modules.t
|
||||
t/autolearn.t
|
||||
t/autolearn_force.t
|
||||
t/autolearn_force_fail.t
|
||||
t/basic_lint.t
|
||||
t/basic_lint_without_sandbox.t
|
||||
t/basic_meta.t
|
||||
t/basic_meta2.t
|
||||
t/basic_obj_api.t
|
||||
t/bayesbdb.t
|
||||
t/bayesdbm.t
|
||||
t/bayesdbm_flock.t
|
||||
t/bayessdbm.t
|
||||
t/bayessdbm_seen_delete.t
|
||||
t/bayessql.t
|
||||
t/blacklist_autolearn.t
|
||||
t/body_mod.t
|
||||
t/body_str.t
|
||||
t/check_implemented.t
|
||||
t/cidrs.t
|
||||
t/config.dist
|
||||
t/config_errs.t
|
||||
t/config_text.t
|
||||
t/config_tree_recurse.t
|
||||
t/cpp_comments_in_spamc.t
|
||||
t/cross_user_config_leak.t
|
||||
t/olevbmacro.t
|
||||
t/data/01_test_rules.cf
|
||||
t/data/01_test_rules.pre
|
||||
t/data/Dumpheaders.pm
|
||||
t/data/dkim/test-adsp-11.msg
|
||||
t/data/dkim/test-adsp-12.msg
|
||||
t/data/dkim/test-adsp-13.msg
|
||||
t/data/dkim/test-adsp-14.msg
|
||||
t/data/dkim/test-adsp-15.msg
|
||||
t/data/dkim/test-adsp-16.msg
|
||||
t/data/dkim/test-adsp-17.msg
|
||||
t/data/dkim/test-adsp-18.msg
|
||||
t/data/dkim/test-adsp-19.msg
|
||||
t/data/dkim/test-adsp-20.msg
|
||||
t/data/dkim/test-adsp-21.msg
|
||||
t/data/dkim/test-adsp-22.msg
|
||||
t/data/dkim/test-fail-01.msg
|
||||
t/data/dkim/test-fail-02.msg
|
||||
t/data/dkim/test-fail-03.msg
|
||||
t/data/dkim/test-fail-04.msg
|
||||
t/data/dkim/test-fail-05.msg
|
||||
t/data/dkim/test-fail-06.msg
|
||||
t/data/dkim/test-fail-07.msg
|
||||
t/data/dkim/test-fail-08.msg
|
||||
t/data/dkim/test-fail-09.msg
|
||||
t/data/dkim/test-pass-01.msg
|
||||
t/data/dkim/test-pass-02.msg
|
||||
t/data/dkim/test-pass-03.msg
|
||||
t/data/dkim/test-pass-04.msg
|
||||
t/data/dkim/test-pass-05.msg
|
||||
t/data/dkim/test-pass-06.msg
|
||||
t/data/dkim/test-pass-07.msg
|
||||
t/data/dkim/test-pass-08.msg
|
||||
t/data/dkim/test-pass-09.msg
|
||||
t/data/dkim/test-pass-10.msg
|
||||
t/data/dkim/test-pass-11.msg
|
||||
t/data/dkim/test-pass-12.msg
|
||||
t/data/dkim/test-pass-13.msg
|
||||
t/data/dkim/test-pass-14.msg
|
||||
t/data/dkim/test-pass-15.msg
|
||||
t/data/dkim/test-pass-16.msg
|
||||
t/data/dkim/test-pass-17.msg
|
||||
t/data/dkim/test-pass-18.msg
|
||||
t/data/dkim/test-pass-19.msg
|
||||
t/data/etc/hello.txt
|
||||
t/data/etc/testhost.cert
|
||||
t/data/etc/testhost.key
|
||||
t/data/mime-subject.txt
|
||||
t/data/nice/001
|
||||
t/data/nice/002
|
||||
t/data/nice/003
|
||||
t/data/nice/004
|
||||
t/data/nice/005
|
||||
t/data/nice/006
|
||||
t/data/nice/007
|
||||
t/data/nice/008
|
||||
t/data/nice/009
|
||||
t/data/nice/010
|
||||
t/data/nice/011
|
||||
t/data/nice/012
|
||||
t/data/nice/013
|
||||
t/data/nice/014
|
||||
t/data/nice/015
|
||||
t/data/nice/016
|
||||
t/data/nice/base64.txt
|
||||
t/data/nice/crlf-endings
|
||||
t/data/nice/dkim/AddedVtag_07
|
||||
t/data/nice/dkim/BasicTest_01
|
||||
t/data/nice/dkim/MIMEnowsp_05
|
||||
t/data/nice/dkim/MIMEsimple_04
|
||||
t/data/nice/dkim/MultipleAuthRes_10
|
||||
t/data/nice/dkim/MultipleReceived_08
|
||||
t/data/nice/dkim/MultipleSig_06
|
||||
t/data/nice/dkim/NonExistingHeader_09
|
||||
t/data/nice/dkim/Nowsp_03
|
||||
t/data/nice/dkim/Simple_02
|
||||
t/data/nice/mailman_message.txt
|
||||
t/data/nice/mime1
|
||||
t/data/nice/mime2
|
||||
t/data/nice/mime3
|
||||
t/data/nice/mime4
|
||||
t/data/nice/mime5
|
||||
t/data/nice/mime6
|
||||
t/data/nice/mime7
|
||||
t/data/nice/mime8
|
||||
t/data/nice/mime9
|
||||
t/data/nice/no_body
|
||||
t/data/nice/not_gtube.eml
|
||||
t/data/nice/orig_ip_hdr.eml
|
||||
t/data/nice/spf1
|
||||
t/data/nice/spf2
|
||||
t/data/nice/spf3
|
||||
t/data/nice/spf3-received-spf
|
||||
t/data/reporterplugin.pm
|
||||
t/data/spam/001
|
||||
t/data/spam/002
|
||||
t/data/spam/003
|
||||
t/data/spam/004
|
||||
t/data/spam/005
|
||||
t/data/spam/006
|
||||
t/data/spam/007
|
||||
t/data/spam/008
|
||||
t/data/spam/009
|
||||
t/data/spam/010
|
||||
t/data/spam/011
|
||||
t/data/spam/012
|
||||
t/data/spam/013
|
||||
t/data/spam/014
|
||||
t/data/spam/015
|
||||
t/data/spam/016
|
||||
t/data/spam/017
|
||||
t/data/spam/018
|
||||
t/data/spam/badctype1
|
||||
t/data/spam/badctype2
|
||||
t/data/spam/badmime.txt
|
||||
t/data/spam/badmime2.txt
|
||||
t/data/spam/badmime3.txt
|
||||
t/data/spam/base64.txt
|
||||
t/data/spam/bsmtp
|
||||
t/data/spam/bsmtpnull
|
||||
t/data/spam/dnsbl.eml
|
||||
t/data/spam/gtube.eml
|
||||
t/data/spam/gtubedcc.eml
|
||||
t/data/spam/olevbmacro/encrypted.eml
|
||||
t/data/spam/olevbmacro/goodcsv.eml
|
||||
t/data/spam/olevbmacro/macro.eml
|
||||
t/data/spam/olevbmacro/malicemacro.eml
|
||||
t/data/spam/olevbmacro/nomacro.eml
|
||||
t/data/spam/olevbmacro/renamedmacro.eml
|
||||
t/data/spam/olevbmacro/zippwmacro.eml
|
||||
t/data/spam/razor2
|
||||
t/data/spam/relayUS.eml
|
||||
t/data/spam/spf1
|
||||
t/data/spam/spf2
|
||||
t/data/spam/spf3
|
||||
t/data/spamc_blank.cf
|
||||
t/data/taintcheckplugin.pm
|
||||
t/data/testplugin.pm
|
||||
t/data/testplugin2.pm
|
||||
t/data/validuserplugin.pm
|
||||
t/data/whitelists/action.eff.org
|
||||
t/data/whitelists/amazon_co_uk_ship
|
||||
t/data/whitelists/amazon_com_ship
|
||||
t/data/whitelists/cert.org
|
||||
t/data/whitelists/debian_bts_reassign
|
||||
t/data/whitelists/ibm_enews_de
|
||||
t/data/whitelists/infoworld
|
||||
t/data/whitelists/linuxplanet
|
||||
t/data/whitelists/lp.org
|
||||
t/data/whitelists/media_unspun
|
||||
t/data/whitelists/mlist_mailman_message
|
||||
t/data/whitelists/mlist_yahoo_groups_message
|
||||
t/data/whitelists/mypoints
|
||||
t/data/whitelists/neat_net_tricks
|
||||
t/data/whitelists/netcenter-direct_de
|
||||
t/data/whitelists/netsol_renewal
|
||||
t/data/whitelists/networkworld
|
||||
t/data/whitelists/oracle_net_techblast
|
||||
t/data/whitelists/orbitz.com
|
||||
t/data/whitelists/paypal.com
|
||||
t/data/whitelists/register.com_password
|
||||
t/data/whitelists/ryanairmail.com
|
||||
t/data/whitelists/sf.net
|
||||
t/data/whitelists/winxpnews.com
|
||||
t/data/whitelists/yahoo-inc.com
|
||||
t/data/phishing/openphish-feed.txt
|
||||
t/data/phishing/phishtank-feed.csv
|
||||
t/data/spam/phishing_openphish.eml
|
||||
t/data/spam/phishing_phishtank.eml
|
||||
t/phishing.t
|
||||
t/date.t
|
||||
t/db_awl_path.t
|
||||
t/db_awl_perms.t
|
||||
t/db_based_whitelist.t
|
||||
t/db_based_whitelist_ips.t
|
||||
t/dcc.t
|
||||
t/debug.t
|
||||
t/desc_wrap.t
|
||||
t/dkim.t
|
||||
t/dnsbl.t
|
||||
t/dnsbl_sc_meta.t
|
||||
t/duplicates.t
|
||||
t/freemail.t
|
||||
t/get_all_headers.t
|
||||
t/get_headers.t
|
||||
t/gtube.t
|
||||
t/hashcash.t
|
||||
t/html_colors.t
|
||||
t/html_obfu.t
|
||||
t/html_utf8.t
|
||||
t/idn_dots.t
|
||||
t/if_can.t
|
||||
t/ifversion.t
|
||||
t/ip_addrs.t
|
||||
t/lang_lint.t
|
||||
t/lang_pl_tests.t
|
||||
t/line_endings.t
|
||||
t/lint_nocreate_prefs.t
|
||||
t/memory_cycles.t
|
||||
t/metadata.t
|
||||
t/mimeheader.t
|
||||
t/mimeparse.t
|
||||
t/missing_hb_separator.t
|
||||
t/nonspam.t
|
||||
t/originating_ip_hdr.t
|
||||
t/plugin.t
|
||||
t/plugin_file.t
|
||||
t/plugin_priorities.t
|
||||
t/prefs_include.t
|
||||
t/priorities.t
|
||||
t/razor2.t
|
||||
t/rcvd_parser.t
|
||||
t/re_base_extraction.t
|
||||
t/recips.t
|
||||
t/recreate.t
|
||||
t/recursion.t
|
||||
t/regexp_valid.t
|
||||
t/relaycountry_fast.t
|
||||
t/relaycountry_geoip.t
|
||||
t/relaycountry_geoip2.t
|
||||
t/relative_scores.t
|
||||
t/report_safe.t
|
||||
t/reportheader.t
|
||||
t/reportheader_8bit.t
|
||||
t/reuse.t
|
||||
t/root_spamd.t
|
||||
t/root_spamd_tell.t
|
||||
t/root_spamd_tell_paranoid.t
|
||||
t/root_spamd_tell_x.t
|
||||
t/root_spamd_tell_x_paranoid.t
|
||||
t/root_spamd_u.t
|
||||
t/root_spamd_u_dcc.t
|
||||
t/root_spamd_virtual.t
|
||||
t/root_spamd_x.t
|
||||
t/root_spamd_x_paranoid.t
|
||||
t/root_spamd_x_u.t
|
||||
t/rule_multiple.t
|
||||
t/rule_names.t
|
||||
t/rule_types.t
|
||||
t/sa_awl.t
|
||||
t/sa_check_spamd.t
|
||||
t/sa_compile.t
|
||||
t/sha1.t
|
||||
t/shortcircuit.t
|
||||
t/spam.t
|
||||
t/spamc.t
|
||||
t/spamc_B.t
|
||||
t/spamc_E.t
|
||||
t/spamc_c.t
|
||||
t/spamc_c_stdout_closed.t
|
||||
t/spamc_cf.t
|
||||
t/spamc_headers.t
|
||||
t/spamc_l.t
|
||||
t/spamc_optC.t
|
||||
t/spamc_optL.t
|
||||
t/spamc_x_E_R.t
|
||||
t/spamc_x_e.t
|
||||
t/spamc_y.t
|
||||
t/spamc_z.t
|
||||
t/spamd.t
|
||||
t/spamd_allow_user_rules.t
|
||||
t/spamd_client.t
|
||||
t/spamd_hup.t
|
||||
t/spamd_kill_restart.t
|
||||
t/spamd_kill_restart_rr.t
|
||||
t/spamd_ldap.t
|
||||
t/spamd_maxchildren.t
|
||||
t/spamd_maxsize.t
|
||||
t/spamd_parallel.t
|
||||
t/spamd_plugin.t
|
||||
t/spamd_port.t
|
||||
t/spamd_prefork_stress.t
|
||||
t/spamd_prefork_stress_2.t
|
||||
t/spamd_prefork_stress_3.t
|
||||
t/spamd_prefork_stress_4.t
|
||||
t/spamd_protocol_10.t
|
||||
t/spamd_report.t
|
||||
t/spamd_report_ifspam.t
|
||||
t/spamd_sql_prefs.t
|
||||
t/spamd_ssl.t
|
||||
t/spamd_ssl_accept_fail.t
|
||||
t/spamd_stop.t
|
||||
t/spamd_symbols.t
|
||||
t/spamd_syslog.t
|
||||
t/spamd_unix.t
|
||||
t/spamd_unix_and_tcp.t
|
||||
t/spamd_user_rules_leak.t
|
||||
t/spamd_utf8.t
|
||||
t/spamd_whitelist_leak.t
|
||||
t/spf.t
|
||||
t/sql_based_whitelist.t
|
||||
t/stop_always_matching_regexps.t
|
||||
t/strip2.t
|
||||
t/strip_no_subject.t
|
||||
t/stripmarkup.t
|
||||
t/tainted_msg.t
|
||||
t/test_dir
|
||||
t/text_bad_ctype.t
|
||||
t/timeout.t
|
||||
t/trust_path.t
|
||||
t/uri.t
|
||||
t/uri_html.t
|
||||
t/uri_text.t
|
||||
t/uribl.t
|
||||
t/urilocalbl_geoip.t
|
||||
t/utf8.t
|
||||
t/util_wrap.t
|
||||
t/whitelist_addrs.t
|
||||
t/whitelist_from.t
|
||||
t/whitelist_subject.t
|
||||
t/whitelist_to.t
|
||||
t/zz_cleanup.t
|
||||
t/spamc_bug6176.t
|
||||
t/data/spam/dnsbl_domsonly.eml
|
||||
t/uribl_domains_only.t
|
||||
t/data/spam/dnsbl_ipsonly.eml
|
||||
t/uribl_all_types.t
|
||||
t/uribl_ips_only.t
|
||||
t/uri_list.t
|
||||
t/dnsbl_subtests.t
|
||||
powered_by/128-powered-by-spamassassin.png
|
||||
powered_by/256-powered-by-spamassassin.png
|
||||
powered_by/512-powered-by-spamassassin.png
|
||||
powered_by/LOGO_USAGE.TXT
|
||||
powered_by/powered_by_spamassassin.psd
|
||||
META.yml Module YAML meta-data (added by MakeMaker)
|
||||
META.json Module JSON meta-data (added by MakeMaker)
|
124
upstream/MANIFEST.SKIP
Normal file
124
upstream/MANIFEST.SKIP
Normal file
@ -0,0 +1,124 @@
|
||||
\.a$
|
||||
\.bak$
|
||||
\.dll$
|
||||
\.exe$
|
||||
\.exists$
|
||||
\.log$
|
||||
\.o$
|
||||
\.obj$
|
||||
\.old$
|
||||
\.pid$
|
||||
\.so$
|
||||
\.svn/
|
||||
\.gitignore$
|
||||
\.swp$
|
||||
\.tmp$
|
||||
\.tst$
|
||||
\bautom4te\.cache/
|
||||
\bcore(\.\d+)?$
|
||||
^Makefile$
|
||||
^spamc/Makefile$
|
||||
\bold\.cf$
|
||||
\btmon\.out$
|
||||
\b[Oo][Ll][Dd]$
|
||||
\b[Oo][Uu][Tt]$
|
||||
^blib/
|
||||
^blibdirs$
|
||||
^build/3\.\d\.\d_change_summary$
|
||||
^build/cf_to_html$
|
||||
^build/buildbot_ready$
|
||||
^build/find_meta_missing_dependencies$
|
||||
^build/README$
|
||||
^build/replace_license_blocks$
|
||||
^build/tag_.*$
|
||||
^build/update_.*$
|
||||
^build/pga.*$
|
||||
^build/old/.*$
|
||||
^debian/
|
||||
^doc/
|
||||
^lib/Mail/SpamAssassin/Plugin/NetCache\.pm$
|
||||
^lib/Mail/SpamAssassin/Util/MemoryDump\.pm$
|
||||
^lm/
|
||||
^made-doc-stamp$
|
||||
^Mail-SpamAssassin-.*$
|
||||
^pm_to_blib$
|
||||
^pod2html?-?.*$
|
||||
^presentation$
|
||||
^rules/.*\.cf$
|
||||
^rules/.*\.pm$
|
||||
^rules/STATISTICS-set\d.txt$
|
||||
^sa-filter$
|
||||
^sa-learn$
|
||||
^sa_with_mail_audit$
|
||||
^spamassassin$
|
||||
^spamc/config\.(h|status|log)$
|
||||
^spamc/qmail-spamc$
|
||||
^spamc/spamc$
|
||||
^spamc/spamc\.h$
|
||||
^spamc/spamc\.conf$
|
||||
^spamc/sslspamc$
|
||||
^spamc/version\.h$
|
||||
^spamd/spamd$
|
||||
^stuff/
|
||||
^t/bayessql\.cf$
|
||||
^t/config$
|
||||
^t/data/nice/cjk/
|
||||
^t/data/whitelists/
|
||||
^t/do_net$
|
||||
^t/log/
|
||||
^t/rule_tests\.t$
|
||||
^t/sql_based_whitelist\.cf$
|
||||
^tasks/
|
||||
^testmails.*$
|
||||
^todo$
|
||||
^wordfreqs/
|
||||
~$
|
||||
^masses/
|
||||
^contrib/
|
||||
^tools/
|
||||
^pod2ref
|
||||
^sa-update$
|
||||
^sa-compile$
|
||||
^build/describe-to-po-file$
|
||||
^rules/70_sandbox\.cf$
|
||||
^build/automc/
|
||||
^rulesrc/.*$
|
||||
^rules/active\.list$
|
||||
^build/mkupdates/
|
||||
^build/buildbot/
|
||||
^hack/
|
||||
^patches/
|
||||
(^|/)\.
|
||||
~$
|
||||
/logs/
|
||||
/t/logs/
|
||||
\.conf$
|
||||
\.pl$
|
||||
\bCVS\b
|
||||
/t/TEST$
|
||||
^lib/Mail/SpamAssassin/Plugin/P595Body\.pm
|
||||
^lib/Mail/SpamAssassin/Plugin/RabinKarpBody\.pm
|
||||
^rules/svn_only\.pre$
|
||||
^masses/tenpass/randomise$
|
||||
^masses/rule-dev/seek-phrases-in-log$
|
||||
^masses/rule-dev/seek-phrases-in-corpus$
|
||||
^build/announcements/.*\.txt$
|
||||
^t/mass_check\.t$
|
||||
^build/backup
|
||||
^build/hudson
|
||||
^build/jenkins
|
||||
^rules/.*\.pm$
|
||||
^artifacts/
|
||||
^t\.rules/
|
||||
^t/make_install\.t$
|
||||
^backend/
|
||||
^sa-awl$
|
||||
^sa-check_spamd$
|
||||
^xt/
|
||||
^build/rebuild_xt$
|
||||
^build/repackage_latest_update_rules$
|
||||
^MYMETA.(json|yml)$
|
||||
^trunk-only.*
|
||||
^t/mkrules\.t$
|
||||
^t/mkrules_else\.t$
|
||||
^t/spamc_H\.t$
|
61
upstream/META.json
Normal file
61
upstream/META.json
Normal file
@ -0,0 +1,61 @@
|
||||
{
|
||||
"abstract" : "Apache SpamAssassin is an extensible email filter which is used to identify spam",
|
||||
"author" : [
|
||||
"The Apache SpamAssassin Project <dev@spamassassin.apache.org>"
|
||||
],
|
||||
"dynamic_config" : 1,
|
||||
"generated_by" : "ExtUtils::MakeMaker version 6.68, CPAN::Meta::Converter version 2.120921",
|
||||
"license" : [
|
||||
"apache_2_0"
|
||||
],
|
||||
"meta-spec" : {
|
||||
"url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec",
|
||||
"version" : "2"
|
||||
},
|
||||
"name" : "Mail-SpamAssassin",
|
||||
"no_index" : {
|
||||
"directory" : [
|
||||
"t",
|
||||
"inc"
|
||||
]
|
||||
},
|
||||
"prereqs" : {
|
||||
"build" : {
|
||||
"requires" : {
|
||||
"ExtUtils::MakeMaker" : "0"
|
||||
}
|
||||
},
|
||||
"runtime" : {
|
||||
"requires" : {
|
||||
"Archive::Tar" : "1.23",
|
||||
"Digest::SHA1" : "0",
|
||||
"Errno" : "0",
|
||||
"File::Copy" : "2.02",
|
||||
"File::Spec" : "0.8",
|
||||
"HTML::Parser" : "3.43",
|
||||
"IO::Zlib" : "1.04",
|
||||
"Mail::DKIM" : "0.31",
|
||||
"Net::DNS" : "0.34",
|
||||
"NetAddr::IP" : "4.01",
|
||||
"Pod::Usage" : "1.1",
|
||||
"Sys::Hostname" : "0",
|
||||
"Test::More" : "0",
|
||||
"Time::HiRes" : "0",
|
||||
"Time::Local" : "0"
|
||||
}
|
||||
}
|
||||
},
|
||||
"release_status" : "stable",
|
||||
"resources" : {
|
||||
"homepage" : "https://spamassassin.apache.org/",
|
||||
"license" : [
|
||||
"http://www.apache.org/licenses/LICENSE-2.0.html"
|
||||
],
|
||||
"repository" : {
|
||||
"type" : "svn",
|
||||
"url" : "http://svn.apache.org/repos/asf/spamassassin/"
|
||||
},
|
||||
"x_MailingList" : "http://wiki.apache.org/spamassassin/MailingLists"
|
||||
},
|
||||
"version" : "3.004004"
|
||||
}
|
39
upstream/META.yml
Normal file
39
upstream/META.yml
Normal file
@ -0,0 +1,39 @@
|
||||
---
|
||||
abstract: 'Apache SpamAssassin is an extensible email filter which is used to identify spam'
|
||||
author:
|
||||
- 'The Apache SpamAssassin Project <dev@spamassassin.apache.org>'
|
||||
build_requires:
|
||||
ExtUtils::MakeMaker: 0
|
||||
dynamic_config: 1
|
||||
generated_by: 'ExtUtils::MakeMaker version 6.68, CPAN::Meta::Converter version 2.120921'
|
||||
license: apache
|
||||
meta-spec:
|
||||
url: http://module-build.sourceforge.net/META-spec-v1.4.html
|
||||
version: 1.4
|
||||
name: Mail-SpamAssassin
|
||||
no_index:
|
||||
directory:
|
||||
- t
|
||||
- inc
|
||||
requires:
|
||||
Archive::Tar: 1.23
|
||||
Digest::SHA1: 0
|
||||
Errno: 0
|
||||
File::Copy: 2.02
|
||||
File::Spec: 0.8
|
||||
HTML::Parser: 3.43
|
||||
IO::Zlib: 1.04
|
||||
Mail::DKIM: 0.31
|
||||
Net::DNS: 0.34
|
||||
NetAddr::IP: 4.01
|
||||
Pod::Usage: 1.1
|
||||
Sys::Hostname: 0
|
||||
Test::More: 0
|
||||
Time::HiRes: 0
|
||||
Time::Local: 0
|
||||
resources:
|
||||
homepage: https://spamassassin.apache.org/
|
||||
license: http://www.apache.org/licenses/LICENSE-2.0.html
|
||||
repository: http://svn.apache.org/repos/asf/spamassassin/
|
||||
x_MailingList: http://wiki.apache.org/spamassassin/MailingLists
|
||||
version: 3.004004
|
1170
upstream/Makefile.PL
Normal file
1170
upstream/Makefile.PL
Normal file
File diff suppressed because it is too large
Load Diff
35
upstream/NOTICE
Normal file
35
upstream/NOTICE
Normal file
@ -0,0 +1,35 @@
|
||||
This product includes software developed by the Apache Software
|
||||
Foundation (http://www.apache.org/).
|
||||
|
||||
SpamAssassin is a trademark of the Apache Software Foundation.
|
||||
|
||||
This distribution includes cryptographic software. The country in
|
||||
which you currently reside may have restrictions on the import,
|
||||
possession, use, and/or re-export to another country, of
|
||||
encryption software. BEFORE using any encryption software, please
|
||||
check your country's laws, regulations and policies concerning the
|
||||
import, possession, or use, and re-export of encryption software, to
|
||||
see if this is permitted. See <http://www.wassenaar.org/> for more
|
||||
information.
|
||||
|
||||
The U.S. Government Department of Commerce, Bureau of Industry and
|
||||
Security (BIS), has classified this software as Export Commodity
|
||||
Control Number (ECCN) 5D002.C.1, which includes information security
|
||||
software using or performing cryptographic functions with asymmetric
|
||||
algorithms. The form and manner of this Apache Software Foundation
|
||||
distribution makes it eligible for export under the License Exception
|
||||
ENC Technology Software Unrestricted (TSU) exception (see the BIS
|
||||
Export Administration Regulations, Section 740.13) for both object
|
||||
code and source code.
|
||||
|
||||
The following provides more details on the included cryptographic
|
||||
software:
|
||||
|
||||
The OpenSSL Project - http://www.openssl.org/source/
|
||||
|
||||
spamc and libspamc use OpenSSL to perform SSL encryption.
|
||||
|
||||
Steffen Ullrich - http://search.cpan.org/%7esullr/
|
||||
|
||||
spamd uses IO::Socket::SSL to perform SSL encryption.
|
||||
|
315
upstream/PACKAGING
Normal file
315
upstream/PACKAGING
Normal file
@ -0,0 +1,315 @@
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
The Makefile which is used to build SpamAssassin is created by calling
|
||||
perl Makefile.PL
|
||||
|
||||
This is the standard Perl way of building packages. It involves the
|
||||
Perl module ExtUtils::MakeMaker which creates a Makefile.
|
||||
|
||||
ExtUtils::MakeMaker recognizes several variables which can be set at
|
||||
the command line to give the user the possibility to influence the
|
||||
contents of the generated Makefile. All macros written to the Makefile
|
||||
can be changed on the command line like this:
|
||||
perl Makefile.PL FOO="bar"
|
||||
This would give the (exemplary) macro 'FOO' the value 'bar'.
|
||||
|
||||
Now has the internal structure of ExtUtils::MakeMaker and that of the
|
||||
generated Makefiles changed over the years. For a description of the
|
||||
features your version supports, please read
|
||||
perldoc ExtUtils::MakeMaker
|
||||
|
||||
One important thing to know when you're building packages is that Perl
|
||||
uses three different "repositories" for installed modules and their
|
||||
corresponding files: 'perl', 'site' and 'vendor' (the latter was
|
||||
introduced with Perl 5.6.0). These have the following meanings:
|
||||
|
||||
perl: This should be used only by essential modules shipped with
|
||||
Perl or modules required by one of these. And maybe for some
|
||||
other important modules chosen by some obscure selection
|
||||
process. Only one thing is sure about this set of directories:
|
||||
SpamAssassin doesn't belong there.
|
||||
|
||||
site: This is the default. The libs (.pm files) of the modules are
|
||||
installed into the site_perl subdir in the Perl lib dir.
|
||||
Everything installed via the CPAN shell or directly from
|
||||
sources should go there.
|
||||
|
||||
vendor: This repository was officially introduced some time after
|
||||
Perl 5.005_03 (maybe with 5.6.0). It's intended to be the
|
||||
target for all modules installed from distribution specific
|
||||
packages; that means RPMs, debs, ebuilds, etc. The rationale
|
||||
behind this is that this prevents modules installed by the
|
||||
user from being overwritten by packaged ones.
|
||||
|
||||
The wanted repository can be chosen by setting the variable INSTALLDIRS.
|
||||
So according to the description above should packages probably use
|
||||
perl Makefile.PL INSTALLDIRS=vendor
|
||||
That's definitely the correct way to go for Debian, according to their
|
||||
Perl Policy [DEBPERL]. But I've heard that the vendor stuff is either
|
||||
broken or not set on many other systems, especially Red Hat ones. Google
|
||||
might help to find out more on this topic.
|
||||
|
||||
The following resources might help understanding this stuff:
|
||||
[MANEUMM616], [MM00779], [P5P94113].
|
||||
|
||||
|
||||
Changing paths in the Makefile
|
||||
------------------------------
|
||||
|
||||
Internally the Makefile defined quite some paths for the different settings
|
||||
of INSTALLDIRS. One can change them directly but to be independent of the
|
||||
version of ExtUtils::MakeMaker the following variables should be used:
|
||||
|
||||
PREFIX:
|
||||
Sets the prefix below which SpamAssassin is installed. Please note the
|
||||
exceptions for SYSCONFDIR.
|
||||
|
||||
Default is the prefix Perl was built with (call
|
||||
perl -V:prefix
|
||||
to see the value). Normally something like /usr or /usr/local.
|
||||
|
||||
Samples:
|
||||
This will install the spamassassin apps in /foo/bin, the libs in
|
||||
/foo/lib/perl5, the shared stuff in /foo/share/spamassassin and make
|
||||
SpamAssassin look for config files in /foo/etc/mail/spamassassin:
|
||||
perl Makefile.PL PREFIX=/foo
|
||||
|
||||
LIB:
|
||||
This will change the directory where the SpamAssassin libraries (.pm files)
|
||||
are installed. The module's architecture-independent files will be put into
|
||||
the given directory, the architecture-dependent files into a subdirectory
|
||||
with the name of the current architecture.
|
||||
|
||||
The default is something like PREFIX/lib/perl5/site_perl/PERL_VERSION (for
|
||||
INSTALLDIRS=site).
|
||||
|
||||
Samples:
|
||||
Under i686-Linux, put the architecture-independent files below ~/.libs
|
||||
and the architecture-dependent ones below ~/.libs/i686-linux:
|
||||
perl Makefile.PL LIB=~/.libs
|
||||
|
||||
DATADIR (DEFRULESDIR):
|
||||
SpamAssassin's real logic lies in its shipped rule definitions and the
|
||||
corresponding scores. The files with these settings have to be saved
|
||||
somewhere, normally below PREFIX/share/spamassassin. The full path to
|
||||
that directory can be changed with this variable (DEFRULESDIR is a
|
||||
synonym).
|
||||
|
||||
ATTENTION: All files within this directory are removed when SpamAssassin
|
||||
is installed!
|
||||
|
||||
Samples:
|
||||
Install everything into the default locations but put the rules in
|
||||
/tmp/sa-rules (for whatever reason):
|
||||
perl Makefile.PL DATADIR=/tmp/sa-rules
|
||||
|
||||
|
||||
SYSCONFDIR:
|
||||
Sets the base dir for the config files. See also CONFDIR.
|
||||
|
||||
The default depends on the PREFIX and is compliant to the FHS:
|
||||
- if PREFIX is either /usr or /usr/local:
|
||||
/etc
|
||||
- if PREFIX starts with /opt:
|
||||
/etc/opt
|
||||
- else:
|
||||
PREFIX/etc
|
||||
|
||||
Samples:
|
||||
This will (on Windows) install below 'C:\Program Files\SpamAssassin' but
|
||||
look for the config files in 'C:\Program Files\Shared Files\SpamAssassin':
|
||||
perl Makefile.PL PREFIX="C:/Program Files/SpamAssassin"
|
||||
SYSCONFDIR="C:/Program Files/Shared Files/SpamAssassin"
|
||||
|
||||
To put the apps and libs below ~/.sa-bin but the config below ~/.sa-etc
|
||||
try the following:
|
||||
perl Makefile.PL PREFIX=$HOME/.sa-bin SYSCONFDIR=$HOME/.sa-etc
|
||||
|
||||
And the following installs SpamAssassin in /usr/local and forces the
|
||||
config files to be below /usr/local, too:
|
||||
perl Makefile.PL PREFIX=/usr/local SYSCONFDIR=/usr/local/etc
|
||||
|
||||
CONFDIR (LOCALRULESDIR):
|
||||
SpamAssassin looks for its config files in SYSCONFDIR/mail/spamassassin.
|
||||
(There is also a sample local.cf created if such a file doesn't exist yet.)
|
||||
Some people didn't like this path for various reasons so the full path to
|
||||
the config files can be changed here (this more or less makes SYSCONFDIR
|
||||
obsolete). A synonym for this variable is LOCALRULESDIR.
|
||||
|
||||
Samples:
|
||||
If you'd like to have the config files directly in /etc/spamassassin
|
||||
try this:
|
||||
perl Makefile.PL CONFDIR=/etc/spamassassin
|
||||
|
||||
LOCALSTATEDIR:
|
||||
"sa-update" will download rule updates into LOCALSTATEDIR/spamassassin.
|
||||
|
||||
The default depends on the PREFIX and is compliant to the FHS:
|
||||
- if PREFIX is either /usr or /usr/local:
|
||||
/var/lib
|
||||
- if PREFIX starts with /opt:
|
||||
/var/opt
|
||||
- else:
|
||||
PREFIX/var
|
||||
|
||||
Samples:
|
||||
If you'd like to have the downloaded rules files in /var/spamassassin
|
||||
try this:
|
||||
perl Makefile.PL LOCALSTATEDIR=/var
|
||||
|
||||
|
||||
Installing to a directory different from the final destination
|
||||
--------------------------------------------------------------
|
||||
|
||||
When you're building packages, it's often needed to install the stuff to
|
||||
some temporary directory and then build the package from there. The problem
|
||||
with this approach is that the build system of SpamAssassin needs to write
|
||||
some final paths to the libs and the applications.
|
||||
|
||||
Previous versions offered some complicated variables to achieve this. Those
|
||||
hacks weren't compatible to current versions of ExtUtils::MakeMaker. But
|
||||
ExtUtils::MakeMaker 6.06 introduced a feature which is well known from the GNU
|
||||
build tools [GNUMAKECMD]: The variable DESTDIR.
|
||||
|
||||
The value of DESTDIR is simply prepended to all other paths on make install.
|
||||
So if you wanted to create a SpamAssassin package for a system which will
|
||||
have it installed in /usr but you want to create that package from some temp
|
||||
dir, you would do something like this:
|
||||
perl Makefile.PL Makefile.PL PREFIX=/usr DESTDIR=/tmp/sa-build
|
||||
make
|
||||
make disttest
|
||||
make install
|
||||
cd /tmp/sa-build
|
||||
build_some_package
|
||||
|
||||
|
||||
Setting further options on the command line
|
||||
-------------------------------------------
|
||||
|
||||
Besides the directories, the build process of SpamAssassin supports several
|
||||
other settings to set or enable some features. For some of these settings
|
||||
the user is asked before the Makefile is created. To avoid these questions
|
||||
(and accept the defaults, whatever they are) it is possible to redirect
|
||||
STDIN from the null device like this:
|
||||
perl Makefile.PL < /dev/null
|
||||
Or, under Windows:
|
||||
perl Makefile.PL < nul
|
||||
|
||||
The following variables are supported:
|
||||
|
||||
ENABLE_SSL:
|
||||
Can be set to either "yes" or "no" (default). Makes it possible to use SSL
|
||||
encryption on the (TCP) connection between spamc and spamd.
|
||||
|
||||
Sample:
|
||||
Build spamc with SSL, use defaults for all other questions:
|
||||
perl Makefile.PL ENABLE_SSL=yes < /dev/null
|
||||
|
||||
CONTACT_ADDRESS:
|
||||
Each reported spam contains an address under which the confused user/client
|
||||
can request more information about the tagging of his mail. That address can
|
||||
be set here. The default is to query the buildung user, falling back to the
|
||||
string "the administrator of that system".
|
||||
|
||||
Sample:
|
||||
The user can find some information on the page http://example.com/tag/:
|
||||
perl Makefile.PL CONTACT_ADDRESS="http://example.com/tag/"
|
||||
|
||||
RUN_NET_TESTS:
|
||||
Vipul's Razor and Net::DNS are optional modules. If one of those modules is
|
||||
found to be installed, some special tests can be performed when 'make test'
|
||||
is run. The builder is asked if he wants to do so. Default is "no" (because
|
||||
those tests can fail if there are problems with the network connection or
|
||||
the servers).
|
||||
|
||||
Sample:
|
||||
Run only the Razor tests:
|
||||
perl Makefile.PL RUN_NET_TESTS=yes < /dev/null
|
||||
make test TEST_FILES="t/razor*.t"
|
||||
|
||||
|
||||
Twisting Perl details
|
||||
---------------------
|
||||
|
||||
The build process of SpamAssassin has to know several details of the Perl
|
||||
calling it later. This is used to work around some Perl bugs and make it
|
||||
all actually work :o) The following additional variables are supported to
|
||||
modify these settings:
|
||||
|
||||
PERL_BIN:
|
||||
The path to the perl application which will be used to call the scripts
|
||||
(like spamassassin and spamd). It makes sense to set this if you build
|
||||
SpamAssassin on some weird build host which happen to have Perl in
|
||||
/some/weird/location which is definitely not the location on the end
|
||||
user's box. The default is the value of the macro FULLPERL which should
|
||||
be the path to the perl processing Makefile.PL.
|
||||
|
||||
Sample:
|
||||
Building with some weird perl:
|
||||
/local/buildsys/perl-5.6.1/bin/perl Makefile.PL PERL_BIN=/usr/bin/perl
|
||||
|
||||
|
||||
Obsolete Variables
|
||||
------------------
|
||||
|
||||
The following list shows variables recognized by the old build system and
|
||||
their new counterparts (no, the ones in the end aren't in the wrong order,
|
||||
it actually was that complicated):
|
||||
|
||||
old: PREFIX=/bar/foo INST_PREFIX=/foo
|
||||
new: PREFIX=/foo DESTDIR=/bar
|
||||
|
||||
old: INST_SITELIB=/foo
|
||||
new: LIB=/foo
|
||||
|
||||
old: SYSCONFDIR=/bar/foo INST_SYSCONFDIR=/foo
|
||||
new: SYSCONFDIR=/foo DESTDIR=/bar
|
||||
|
||||
old: LOCAL_RULES_DIR=/foo PKG_LOCAL_RULES_DIR=/bar/foo
|
||||
new: LOCALRULESDIR=/foo DESTDIR=/bar
|
||||
|
||||
old: DEF_RULES_DIR=/foo PKG_DEF_RULES_DIR=/bar/foo
|
||||
new: DEFRULESDIR=/foo DESTDIR=/bar
|
||||
|
||||
Using one of the following variables will make the Makefile generation
|
||||
process die:
|
||||
INST_PREFIX
|
||||
INST_SITELIB
|
||||
INST_SYSCONFDIR
|
||||
LOCAL_RULES_DIR
|
||||
DEF_RULES_DIR
|
||||
|
||||
If you think you need to use one of those nevertheless, you can set the
|
||||
variable IGNORE_CRUFT to "yes".
|
||||
|
||||
|
||||
Resources
|
||||
---------
|
||||
|
||||
[BUGZILLA] SpamAssassin bug database:
|
||||
<http://issues.apache.org/SpamAssassin/>
|
||||
|
||||
[DEBPERL] Debian Perl Policy, Chapter 3: Packaged Modules:
|
||||
<http://www.debian.org/doc/packaging-manuals/perl-policy/ch-module_packages.html>
|
||||
|
||||
[GNUMAKECMD] GNU make manual: Make Conventions: Variables for Specifying
|
||||
Commands
|
||||
<http://www.gnu.org/manual/make-3.79.1/html_chapter/make_14.html#SEC119>
|
||||
|
||||
[MANEUMM616] The man page for ExtUtils::MakeMaker 6.16:
|
||||
<http://search.cpan.org/author/MSCHWERN/ExtUtils-MakeMaker-6.16/lib/ExtUtils/MakeMaker.pm#Default_Makefile_Behaviour>
|
||||
|
||||
[MM00779] makemaker-at-perl-dot-org: Michael G Schwern: "Re: MakeMaker
|
||||
problems with relocation" (PREFIX was broken):
|
||||
<http://www.mail-archive.com/makemaker@perl.org/msg00779.html>
|
||||
|
||||
[P5P94113] perl5-porters: Michael G Schwern: "Re: OS X's vendorlib default
|
||||
seems wrong" (description of different repositoreis):
|
||||
<http://archive.develooper.com/perl5-porters@perl.org/msg94113.html>
|
||||
|
||||
[RHBUG78053] Red Hat bug 78053: "incompatible changes in behavior of
|
||||
MakeMaker; affects rpm build process" (introduction of DESTDIR):
|
||||
<https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=78053>
|
||||
|
341
upstream/README
Normal file
341
upstream/README
Normal file
@ -0,0 +1,341 @@
|
||||
Welcome to Apache SpamAssassin!
|
||||
-------------------------------
|
||||
|
||||
What is Apache SpamAssassin
|
||||
---------------------------
|
||||
|
||||
Apache SpamAssassin is the #1 Open Source anti-spam platform giving
|
||||
system administrators a filter to classify email and block "spam"
|
||||
(unsolicited bulk email). It uses a robust scoring framework and plug-ins
|
||||
to integrate a wide range of advanced heuristic and statistical analysis
|
||||
tests on email headers and body text including text analysis, Bayesian
|
||||
filtering, DNS blocklists, and collaborative filtering databases.
|
||||
|
||||
Apache SpamAssassin is a project of the Apache Software Foundation (ASF).
|
||||
|
||||
|
||||
What Apache SpamAssassin Is Not
|
||||
-------------------------------
|
||||
|
||||
Apache SpamAssassin is not a program to delete spam, route spam and ham to
|
||||
separate mailboxes or folders, or send bounces when you receive spam.
|
||||
Those are mail routing functions, and Apache SpamAssassin is not a mail
|
||||
router. Apache SpamAssassin is a mail filter or classifier. It will examine
|
||||
each message presented to it, and assign a score indicating the
|
||||
likelihood that the mail is spam. An external program must then
|
||||
examine this score and do any routing the user wants done. There are
|
||||
many programs that will easily perform these functions after examining
|
||||
the score assigned by Apache SpamAssassin.
|
||||
|
||||
|
||||
How Apache SpamAssassin Works
|
||||
-----------------------------
|
||||
|
||||
Apache SpamAssassin uses a wide range of heuristic tests on mail headers and
|
||||
body text to identify "spam", also known as unsolicited commercial
|
||||
email.
|
||||
|
||||
Once identified, the mail can then be optionally tagged as spam for
|
||||
later filtering using the user's own mail user-agent application.
|
||||
|
||||
Apache SpamAssassin typically differentiates successfully between spam and
|
||||
non-spam in between 95% and 100% of cases, depending on what kind of mail
|
||||
you get and your training of its Bayesian filter. Specifically,
|
||||
Apache SpamAssassin has been shown to produce around 1.5% false negatives (spam
|
||||
that was missed) and around 0.06% false positives (ham incorrectly marked
|
||||
as spam). See the rules/STATISTICS*.txt files for more information.
|
||||
|
||||
Apache SpamAssassin also includes plugins to support reporting spam messages
|
||||
automatically or manually to collaborative filtering databases such as
|
||||
Pyzor, DCC, and Vipul's Razor.
|
||||
|
||||
The distribution provides "spamassassin", a command line tool to
|
||||
perform filtering, along with the "Mail::SpamAssassin" module set
|
||||
which allows Apache SpamAssassin to be used in spam-protection proxy SMTP or
|
||||
POP/IMAP server, or a variety of different spam-blocking scenarios.
|
||||
|
||||
In addition, "spamd", a daemonized version of Apache SpamAssassin which
|
||||
runs persistently, is available. Using its counterpart, "spamc",
|
||||
a lightweight client written in C, an MTA can process large volumes of
|
||||
mail through Apache SpamAssassin without having to fork/exec a perl interpreter
|
||||
for each message.
|
||||
|
||||
|
||||
Questions? Need Help?
|
||||
---------------------
|
||||
|
||||
If you have questions about Apache SpamAssassin, please check the Wiki[1] to
|
||||
see if someone has already posted an answer to your question. (The
|
||||
Wiki doubles as a FAQ.) Failing that, post a message to the
|
||||
spamassassin-users mailing list[2]. If you've found a bug (and you're
|
||||
sure it's a bug after checking the Wiki), please file a report in our
|
||||
Bugzilla[3].
|
||||
|
||||
[1]: http://wiki.apache.org/spamassassin/
|
||||
[2]: http://wiki.apache.org/spamassassin/MailingLists
|
||||
[3]: http://issues.apache.org/SpamAssassin/
|
||||
|
||||
Please also be sure to read the man pages.
|
||||
|
||||
|
||||
Upgrading Apache SpamAssassin
|
||||
-----------------------------
|
||||
|
||||
IMPORTANT: If you are upgrading from a previous major version of Apache
|
||||
SpamAssassin, please be sure to read the notes in UPGRADE to find out
|
||||
what has changed in a non- backward compatible way.
|
||||
|
||||
|
||||
Installing Apache SpamAssassin
|
||||
------------------------------
|
||||
|
||||
See the INSTALL file.
|
||||
|
||||
|
||||
Customizing Apache SpamAssassin
|
||||
-------------------------------
|
||||
|
||||
These are the configuration files installed by Apache SpamAssassin. The commands
|
||||
that can be used therein are listed in the POD documentation for the
|
||||
Mail::SpamAssassin::Conf class (run the following command to read it:
|
||||
"perldoc Mail::SpamAssassin::Conf"). Note: The following directories are
|
||||
the standard defaults that people use. There is an explanation of all the
|
||||
default locations that Apache SpamAssassin will look at the end.
|
||||
|
||||
- /usr/share/spamassassin/*.cf:
|
||||
|
||||
Distributed configuration files, with all defaults. Do not modify
|
||||
these, as they are overwritten when you upgrade.
|
||||
|
||||
- /var/lib/spamassassin/*/*.cf:
|
||||
|
||||
Local state directory; updated rulesets, overriding the
|
||||
distributed configuration files, downloaded using "sa-update". Do
|
||||
not modify these, as they are overwritten when you run
|
||||
"sa-update".
|
||||
|
||||
- /etc/mail/spamassassin/*.cf:
|
||||
|
||||
Site config files, for system admins to create, modify, and
|
||||
add local rules and scores to. Modifications here will be
|
||||
appended to the config loaded from the above directory.
|
||||
|
||||
- /etc/mail/spamassassin/*.pre:
|
||||
|
||||
Plugin control files, installed from the distribution. These are
|
||||
used to control what plugins are loaded. Modifications here will
|
||||
be loaded before any configuration loaded from the above
|
||||
directories.
|
||||
|
||||
You want to modify these files if you want to load additional
|
||||
plugins, or inhibit loading a plugin that is enabled by default.
|
||||
If the files exist in /etc/mail/spamassassin, they will not
|
||||
be overwritten during future installs.
|
||||
|
||||
- /usr/share/spamassassin/user_prefs.template:
|
||||
|
||||
Distributed default user preferences. Do not modify this, as it is
|
||||
overwritten when you upgrade.
|
||||
|
||||
- /etc/mail/spamassassin/user_prefs.template:
|
||||
|
||||
Default user preferences, for system admins to create, modify, and
|
||||
set defaults for users' preferences files. Takes precedence over
|
||||
the above prefs file, if it exists.
|
||||
|
||||
Do not put system-wide settings in here; put them in a file in the
|
||||
"/etc/mail/spamassassin" directory ending in ".cf". This file is
|
||||
just a template, which will be copied to a user's home directory
|
||||
for them to change.
|
||||
|
||||
- $USER_HOME/.spamassassin:
|
||||
|
||||
User state directory. Used to hold spamassassin state, such
|
||||
as a per-user automatic whitelist, and the user's preferences
|
||||
file.
|
||||
|
||||
- $USER_HOME/.spamassassin/user_prefs:
|
||||
|
||||
User preferences file. If it does not exist, one of the
|
||||
default prefs file from above will be copied here for the
|
||||
user to edit later, if they wish.
|
||||
|
||||
Unless you're using spamd, there is no difference in
|
||||
interpretation between the rules file and the preferences file, so
|
||||
users can add new rules for their own use in the
|
||||
"~/.spamassassin/user_prefs" file, if they like. (spamd disables
|
||||
this for security and increased speed.)
|
||||
|
||||
- $USER_HOME/.spamassassin/bayes*
|
||||
|
||||
Statistics databases used for Bayesian filtering. If they do
|
||||
not exist, they will be created by Apache SpamAssassin.
|
||||
|
||||
Spamd users may wish to create a shared set of bayes databases;
|
||||
the "bayes_path" and "bayes_file_mode" configuration settings
|
||||
can be used to do this.
|
||||
|
||||
See "perldoc sa-learn" for more documentation on how
|
||||
to train this.
|
||||
|
||||
File Locations:
|
||||
|
||||
Apache SpamAssassin will look in a number of areas to find the default
|
||||
configuration files that are used. The "__*__" text are variables
|
||||
whose value you can see by looking at the first several lines of the
|
||||
"spamassassin" or "spamd" scripts.
|
||||
|
||||
They are set on install time and can be overridden with the Makefile.PL
|
||||
command line options DATADIR (for __def_rules_dir__) and CONFDIR (for
|
||||
__local_rules_dir__). If none of these options were given, FHS-compliant
|
||||
locations based on the PREFIX (which becomes __prefix__) are chosen.
|
||||
These are:
|
||||
|
||||
__prefix__ __def_rules_dir__ __local_rules_dir__
|
||||
-------------------------------------------------------------------------
|
||||
/usr /usr/share/spamassassin /etc/mail/spamassassin
|
||||
/usr/local /usr/local/share/spamassassin /etc/mail/spamassassin
|
||||
/opt/$DIR /opt/$DIR/share/spamassassin /etc/opt/mail/spamassassin
|
||||
$DIR $DIR/share/spamassassin $DIR/etc/mail/spamassassin
|
||||
|
||||
The files themselves are then looked for in these paths:
|
||||
|
||||
- Distributed Configuration Files
|
||||
'__def_rules_dir__'
|
||||
'__prefix__/share/spamassassin'
|
||||
'/usr/local/share/spamassassin'
|
||||
'/usr/share/spamassassin'
|
||||
|
||||
- Site Configuration Files
|
||||
'__local_rules_dir__'
|
||||
'__prefix__/etc/mail/spamassassin'
|
||||
'__prefix__/etc/spamassassin'
|
||||
'/usr/local/etc/spamassassin'
|
||||
'/usr/pkg/etc/spamassassin'
|
||||
'/usr/etc/spamassassin'
|
||||
'/etc/mail/spamassassin'
|
||||
'/etc/spamassassin'
|
||||
|
||||
- Default User Preferences File
|
||||
'__local_rules_dir__/user_prefs.template'
|
||||
'__prefix__/etc/mail/spamassassin/user_prefs.template'
|
||||
'__prefix__/share/spamassassin/user_prefs.template'
|
||||
'/etc/spamassassin/user_prefs.template'
|
||||
'/etc/mail/spamassassin/user_prefs.template'
|
||||
'/usr/local/share/spamassassin/user_prefs.template'
|
||||
'/usr/share/spamassassin/user_prefs.template'
|
||||
|
||||
|
||||
In addition, the "Distributed Configuration Files" location is overridden
|
||||
by a "Local State Directory", used to store an updated copy of the
|
||||
ruleset:
|
||||
|
||||
__prefix__ __local_state_dir__
|
||||
-------------------------------------------------------------------------
|
||||
/usr /var/lib/spamassassin/__version__
|
||||
/usr/local /var/lib/spamassassin/__version__
|
||||
/opt/$DIR /var/opt/spamassassin/__version__
|
||||
$DIR $DIR/var/spamassassin/__version__
|
||||
|
||||
This is normally written to by the "sa-update" script. "__version__" is
|
||||
replaced by a representation of the version number, so that multiple
|
||||
versions of Apache SpamAssassin will not interfere with each other's rulesets.
|
||||
|
||||
|
||||
After installation, try "perldoc Mail::SpamAssassin::Conf" to see what
|
||||
can be set. Common first-time tweaks include:
|
||||
|
||||
- required_score
|
||||
|
||||
Set this higher to make Apache SpamAssassin less sensitive.
|
||||
If you are installing Apache SpamAssassin system-wide, this is
|
||||
**strongly** recommended!
|
||||
|
||||
Statistics on how many false positives to expect at various
|
||||
different thresholds are available in the "STATISTICS.txt" file in
|
||||
the "rules" directory.
|
||||
|
||||
- rewrite_header, add_header
|
||||
|
||||
These options affect the way messages are tagged as spam or
|
||||
non-spam. This makes it easy to identify incoming mail.
|
||||
|
||||
- ok_locales
|
||||
|
||||
If you expect to receive mail in non-ISO-8859 character sets (ie.
|
||||
Chinese, Cyrillic, Japanese, Korean, or Thai) then set this.
|
||||
|
||||
|
||||
Learning
|
||||
--------
|
||||
|
||||
Apache SpamAssassin includes a Bayesian learning filter, so it is worthwhile
|
||||
training Apache SpamAssassin with your collection of non-spam and spam,
|
||||
if possible. This will make it more accurate for your incoming mail.
|
||||
Do this using the "sa-learn" tools, like so:
|
||||
|
||||
sa-learn --spam ~/Mail/saved-spam-folder
|
||||
sa-learn --ham ~/Mail/inbox
|
||||
sa-learn --ham ~/Mail/other-nonspam-folder
|
||||
|
||||
|
||||
If these are mail folders in mbox format, use the --mbox switch, for
|
||||
Maildirs use a trailing slash, like Maildir/cur/.
|
||||
|
||||
Use as many mailboxes as you like. Note that Apache SpamAssassin will remember
|
||||
what mails it has learnt from, so you can re-run this as often as you like.
|
||||
|
||||
|
||||
Localization
|
||||
------------
|
||||
|
||||
All text displayed to users is taken from the configuration files. This
|
||||
means that you can translate messages, test descriptions, and templates
|
||||
into other languages.
|
||||
|
||||
If you do so, we would *really* appreciate it if you could contribute
|
||||
these translations, so that they can be added to the
|
||||
distribution. Please file a bug in our Bugzilla[4], and attach your
|
||||
translations. You will, of course, be credited for this work!
|
||||
|
||||
[4]: http://issues.apache.org/SpamAssassin/
|
||||
|
||||
|
||||
Disabled code
|
||||
-------------
|
||||
|
||||
There are some tests and code in Apache SpamAssassin that are turned off by
|
||||
default: experimental code, slow code, or code that depends on
|
||||
non-open-source software or services that are not always free. These
|
||||
disabled tests include:
|
||||
|
||||
- DCC: depends on non-open-source software (disabled in init.pre)
|
||||
- MAPS: commercial service (disabled in 50_scores.cf)
|
||||
- TextCat: slow (disabled in init.pre)
|
||||
- various optional plugins, disabled for speed (disabled in *.pre)
|
||||
|
||||
To turn on tests disabled in 50_scores.cf, simply assign them a non-zero
|
||||
score, e.g. by adding score lines to your ~/.spamassassin/user_prefs file.
|
||||
|
||||
To turn on tests disabled by commenting out the required plugin in
|
||||
init.pre, you need to uncomment the loadplugin line and make sure the
|
||||
prerequisites for proper operation of the plugin are present.
|
||||
|
||||
|
||||
Automatic Reputation System
|
||||
--------------------------
|
||||
|
||||
Apache SpamAssassin includes an automatic reputation system. The way it works is
|
||||
by tracking for each sender address a rolling average score of messages
|
||||
so far seen from there. Then, it combines this long-term average score
|
||||
for the sender with the score for the particular message being evaluated,
|
||||
after all other rules have been applied.
|
||||
|
||||
This functionality can be enabled or disabled with the
|
||||
"use_txrep" option.
|
||||
|
||||
For more information, read sql/README.txrep
|
||||
|
||||
(end of README)
|
||||
|
||||
// vim:tw=74:
|
67
upstream/TRADEMARK
Normal file
67
upstream/TRADEMARK
Normal file
@ -0,0 +1,67 @@
|
||||
Guidelines for Use of the SpamAssassin Trademark
|
||||
|
||||
The goal of these guidelines is to minimize the likelihood that email
|
||||
users and other relevant readers will be confused as to the relationship
|
||||
between the licensees of the SpamAssassin(tm) technology, and the project
|
||||
itself. The Apache Software Foundation has acquired the trademark and
|
||||
will be taking the appropriate steps to protect the valuable goodwill that
|
||||
has developed under the SpamAssassin mark.
|
||||
|
||||
What do these guidelines cover?
|
||||
|
||||
These guidelines explain how you can use the trademark SpamAssassin to
|
||||
refer to the SpamAssassin software as required for reasonable and
|
||||
customary use in describing the origin of the software and reproducing
|
||||
the content of the NOTICE file. All such use is at your own risk.
|
||||
Without advance written permission from the Apache Software Foundation,
|
||||
you may not use the SpamAssassin trademark, logos, or artwork for any
|
||||
other purpose. Under no circumstances may you use SpamAssassin in or as
|
||||
part of a product, service, company, domain, or other name, or in any
|
||||
slogan, tag line, promotional campaign title, advertising hook, or meta
|
||||
tag. If you have any concerns about whether your plans might conflict
|
||||
with these guidelines, please contact the Apache Software Foundation at
|
||||
<human-response@apache.org>.
|
||||
|
||||
What are the rules?
|
||||
|
||||
Proper use of the SpamAssassin trademark is simple: (1) stylize the mark
|
||||
properly, (2) use it in a grammatically appropriate way, (3) use the (tm)
|
||||
marking and attribute ownership in the fine print, and (4) avoid any
|
||||
misleading usage. The following paragraphs provide additional detail on
|
||||
these four steps.
|
||||
|
||||
(1) Stylization of the SpamAssassin mark.
|
||||
|
||||
SpamAssassin should always be written in mixed case with the initial S and
|
||||
the first A in assassin in capital letters, and the other letters in
|
||||
lowercase type. There is no space between "Spam" and "Assassin".
|
||||
|
||||
(2) Grammatically Appropriate Use of the SpamAssassin mark.
|
||||
|
||||
A trademark is used correctly when the word "brand" would comfortably fit
|
||||
between the mark and the following word. "SpamAssassin [brand] software"
|
||||
and "SpamAssassin [brand] technology" are correct uses; "SpamAssassin
|
||||
[brand] will intercept" is not. A trademark is an adjective that should
|
||||
be followed by an appropriate generic term. It may be cumbersome to
|
||||
always make technically correct use of the SpamAssassin trademark, but you
|
||||
should do so the first time you refer to the mark and as often as possible
|
||||
thereafter.
|
||||
|
||||
Because a trademark is an adjective, it should never be used in a
|
||||
possessive form ("SpamAssassin's") or made plural ("SpamAssassins") or
|
||||
used as a verb. You also should not create new forms of the trademark
|
||||
("SpamAssassinate"). Consistency in repetition helps make a mark more
|
||||
memorable, and will promote the success of the SpamAssassin project.
|
||||
|
||||
(3) Markings and Fine Print.
|
||||
|
||||
The appropriate marking to use with SpamAssassin is the (tm) symbol
|
||||
(™). At the bottom of the page in which SpamAssassin is referenced,
|
||||
you should add the statement "SpamAssassin is a trademark of the Apache
|
||||
Software Foundation".
|
||||
|
||||
(4) Avoiding Misleading Use.
|
||||
|
||||
The SpamAssassin mark must never be used to imply that the Apache Software
|
||||
Foundation or the SpamAssassin project wrote, tested, endorses, or
|
||||
approves any particular third party product.
|
589
upstream/UPGRADE
Normal file
589
upstream/UPGRADE
Normal file
@ -0,0 +1,589 @@
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.4
|
||||
----------------------------------------------
|
||||
|
||||
- FromNameSpoof: fns_extrachars parameter default value has been increased to 50
|
||||
|
||||
- nosubject and maxhits tflags now work correctly with sa-compile
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.3
|
||||
----------------------------------------------
|
||||
|
||||
- New subjprefix keyword added, this can be used to add a prefix to
|
||||
email Subject if the original email matches a particular rule
|
||||
|
||||
- New Util::is_fqdn_valid() function to validate hostname (DNS name) format
|
||||
(Bug 7736). To check if a name contains valid TLD, it's still needed to
|
||||
additionally use RegistryBoundaries::is_domain_valid()
|
||||
|
||||
- New OLEVBMacro plugin to detect OLE Macro inside documents attached to emails,
|
||||
this plugin requires Archive::Zip and IO::String Perl modules to work.
|
||||
|
||||
- Due to the dangerous nature of sa-update --allowplugins option, it
|
||||
now prints a warning that --reallyallowplugins is required to use it.
|
||||
This is to make sure all the legacy installations and wiki guides etc
|
||||
still using it needlessly get fixed.
|
||||
|
||||
- TxRep and Awl plugins has been modified to be compatible
|
||||
with latest Postgresql versions.
|
||||
You should upgrade your sql database running the following command:
|
||||
MySQL:
|
||||
"ALTER TABLE `txrep` CHANGE `count` `msgcount` INT(11) NOT NULL DEFAULT '0';"
|
||||
"ALTER TABLE `awl` CHANGE `count` `msgcount` INT(11) NOT NULL DEFAULT '0';"
|
||||
"ALTER TABLE `awl` ADD last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;"
|
||||
PostgreSQL:
|
||||
"ALTER TABLE txrep RENAME COLUMN count TO msgcount;"
|
||||
"ALTER TABLE awl RENAME COLUMN count TO msgcount;"
|
||||
"ALTER TABLE awl ADD last_hit timestamp NOT NULL default CURRENT_TIMESTAMP;"
|
||||
|
||||
- body_part_scan_size 50000, rawbody_part_scan_size 500000 defaults added (Bug 6582)
|
||||
These enable safer and faster scanning of large emails.
|
||||
|
||||
- ALL pseudo-header now returns decoded headers, so it's usage is consistent
|
||||
with single header matching (:raw returns undecoded and folded like before).
|
||||
|
||||
- RegistryBoundaries did not load 20_aux_tlds.cf properly in older versions.
|
||||
Old hardcoded list is now removed and RB will print "no tlds defined, need
|
||||
to run sa-update" unless it can find list from config files.
|
||||
|
||||
- Deprecated functions: Parser::is_delimited_regexp_valid(),
|
||||
Parser::is_regexp_valid(), Util::regexp_remove_delimiters(),
|
||||
Util::make_qr(). These all are combined into new Util::compile_regexp().
|
||||
|
||||
- DNSEval: add check_rbl_headers to check specific headers in rbl
|
||||
|
||||
- DNSEval: add check_rbl_ns_from to check against an rbl for dns servers
|
||||
|
||||
- HashBL: Add check_hashbl_bodyre, check_hashbl_emails, check_hashbl_uris,
|
||||
hashbl_ignore
|
||||
|
||||
- ASN: Support IPv6 with asn_lookup_ipv6 (Bug 7211)
|
||||
|
||||
- sa-update: New option --httputil to force used download utility
|
||||
|
||||
- Add rules_matching() expression to meta rules
|
||||
|
||||
- Add tflags domains_only/ips_only to DNSEval.pm functions
|
||||
|
||||
- RelayCountry: Added new metadata: X-Spam-Countries-External (_RELAYCOUNTRYEXT_),
|
||||
X-Spam-Countries-Auth (_RELAYCOUNTRYAUTH_), X-Spam-Countries-All (_RELAYCOUNTRYALL_)
|
||||
|
||||
- New tflag "nosubject" for 'body' rules, to stop matching the Subject
|
||||
header which is part of the body text.
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.2
|
||||
----------------------------------------------
|
||||
|
||||
- We now support SHA-512 and SHA-256 signatures for our rules updates.
|
||||
|
||||
- We may stop producing SHA-1 signatures in the near future so upgrading
|
||||
to 3.4.2 is important. sa-update no longer uses these signatures.
|
||||
|
||||
See https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7614
|
||||
|
||||
- freemail_import_whitelist_auth, freemail_import_def_whitelist_auth added (Bug 6451)
|
||||
|
||||
|
||||
New plugins
|
||||
-----------
|
||||
|
||||
There are four new plugins added with this release:
|
||||
|
||||
Mail::SpamAssassin::Plugin::HashBL
|
||||
|
||||
The HashBL plugin is the interface to The Email Blocklist (EBL).
|
||||
The EBL is intended to filter spam that is sent from IP addresses
|
||||
and domains that cannot be blocked without causing significant
|
||||
numbers of false positives.
|
||||
|
||||
Mail::SpamAssassin::Plugin::ResourceLimits
|
||||
|
||||
This plugin leverages BSD::Resource to assure your spamd child processes
|
||||
do not exceed specified CPU or memory limit. If this happens, the child
|
||||
process will die. See the BSD::Resource for more details.
|
||||
|
||||
Mail::SpamAssassin::Plugin::FromNameSpoof
|
||||
|
||||
This plugin allows for detection of the From:name field being used to mislead
|
||||
recipients into thinking an email is from another address. The man page
|
||||
includes examples and we expect to put test rules for this plugin into
|
||||
rulesrc soon!
|
||||
|
||||
Mail::SpamAssassin::Plugin::Phishing
|
||||
|
||||
This plugin finds uris used in phishing campaigns detected by
|
||||
OpenPhish (https://openphish.com) or PhishTank (https://phishtank.com) feeds.
|
||||
|
||||
These plugins are disabled by default. To enable, uncomment
|
||||
the loadplugin configuration options in file v342.pre, or add it to
|
||||
some local .pre file such as local.pre .
|
||||
|
||||
Notable changes
|
||||
---------------
|
||||
|
||||
For security reasons SSLv3 support has been removed from spamc(1).
|
||||
|
||||
GeoIP2 support has been added to RelayCountry and URILocalBL plugins due
|
||||
to GeoIP legacy api deprecations.
|
||||
|
||||
New configuration options
|
||||
-------------------------
|
||||
|
||||
A new template tag _DKIMSELECTOR_ that maps to the DKIM selector (the 's' tag)
|
||||
from valid signatures has been added.
|
||||
|
||||
A 'uri_block_cont' option to URILocalBL plugin to score uris per continent has been added.
|
||||
Possible continent codes are:
|
||||
af, as, eu, na, oc, sa for Africa, Asia, Europe, North America,
|
||||
Oceania and South America.
|
||||
|
||||
The 'country_db_type' and 'country_db_path' options has been added to be able
|
||||
to choose in RelayCountry plugin between GeoIP legacy
|
||||
(discontinued from 04/01/2018), GeoIP2, IP::Country::Fast
|
||||
and IP::Country::DB_File.
|
||||
GeoIP legacy is still the default option but it will be deprecated
|
||||
in future releases.
|
||||
|
||||
A config option 'uri_country_db_path' has been added to be able to choose
|
||||
in URILocalBL plugin between GeoIP legacy and new GeoIP2 api.
|
||||
|
||||
A config option 'resource_limit_cpu' (default: 0 or no limit) has been added
|
||||
to configure how many cpu cycles are allowed on a child process before it dies.
|
||||
|
||||
A config option 'resource_limit_mem' (default: 0 or no limit) has been added
|
||||
to configure the maximum number of bytes of memory allowed both for
|
||||
(virtual) address space bytes and resident set size.
|
||||
|
||||
A new config option 'report_wrap_width' (default: 70) has been added
|
||||
to set the wrap width for description lines in the X-Spam-Report header.
|
||||
|
||||
Notable Internal changes
|
||||
------------------------
|
||||
|
||||
SpamAssassin can cope with new Net::DNS module versions.
|
||||
|
||||
The "bytes" pragma has been remove from both core modules and plugins for
|
||||
better utf-8 compatibility, there has been also some other utf-8 related fixes.
|
||||
|
||||
The spamc(1) client can now be build against OpenSSL 1.1.0.
|
||||
|
||||
The test framework has been switched to Test::More module.
|
||||
|
||||
Other updates
|
||||
-------------
|
||||
|
||||
A list of top-level domains in registrar boundaries was updated.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.1
|
||||
----------------------------------------------
|
||||
|
||||
- The TxRep plugin is now included and disabled by default for new installs.
|
||||
To replace an existing AWL configuration with TxRep, follow the steps below:
|
||||
- Disable AWL
|
||||
- Enable TxRep
|
||||
- Set txrep_factor equal to your previous AWL factor
|
||||
- Set use_txrep to 1
|
||||
|
||||
For more detailed information and more configuration options, consult the
|
||||
documentation in Mail::SpamAssassin::Plugin::TxRep.
|
||||
|
||||
- The $VALID_TLDS_RE global in registrar boundaries is deprecated but kept for
|
||||
third-party plugin compatibility. It will become increasingly out of date
|
||||
and may be removed in a future release.
|
||||
|
||||
See lib/Mail/SpamAssassin/Plugin/FreeMail.pm for an example of the new way
|
||||
to obtain a valid list of TLDs, i.e.
|
||||
|
||||
$self->{main}->{registryboundaries}->{valid_tlds_re}
|
||||
|
||||
- Mail::SpamAssassin::Util::RegistrarBoundaries is being replaced by
|
||||
Mail::SpamAssassin::RegistryBoundaries so that new TLDs can be updated via
|
||||
20_aux_tlds.cf delivered via sa-update.
|
||||
|
||||
***3rd Party Plugin Authors, Please Note***
|
||||
|
||||
The following functions will be removed in the next release after 3.4.1
|
||||
excepting any emergency break/fix releases immediately after 3.4.1:
|
||||
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::is_domain_valid
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::trim_domain
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::split_domain
|
||||
Mail::SpamAssassin::Util::uri_to_domain
|
||||
|
||||
And the following variables will also be removed:
|
||||
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::US_STATES
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::THREE_LEVEL_DOMAINS
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::TWO_LEVEL_DOMAINS
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::VALID_TLDS_RE
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::VALID_TLDS
|
||||
|
||||
|
||||
This change should only affect 3rd party plugin authors who will need to
|
||||
update their code to utilize Mail::SpamAssassin::RegistryBoundaries
|
||||
instead of the functions and variables in
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries and the function
|
||||
Mail::SpamAssassin::Util::uri_to_domain which are deprecated and will be
|
||||
removed.
|
||||
|
||||
For example, the $VALID_TLDS_RE global in registrar boundaries is
|
||||
deprecated but kept for third-party plugin compatibility. It will become
|
||||
increasingly out of date and may be removed in a future release.
|
||||
|
||||
See lib/Mail/SpamAssassin/Plugin/FreeMail.pm for an example of the new way
|
||||
to obtain a valid list of TLDs, i.e.
|
||||
|
||||
$self->{main}->{registryboundaries}->{valid_tlds_re}
|
||||
|
||||
- It is now recommended that users uncomment "normalize_charset 1" in
|
||||
local.cf. It will break rules that depend on messages being in non-UTF8
|
||||
encodings, but going forward this will enable more robust unicode rules and
|
||||
will eventually become the default.
|
||||
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.0
|
||||
----------------------------------------------
|
||||
|
||||
- When Bayes classification is in use and messages are 'learned' as spam
|
||||
or ham and stored in a database, the Bayes plugin generates internal
|
||||
message IDs of learned messages and stores them in a 'seen' database to
|
||||
avoid re-learning duplicates and accidental un-learning messages that
|
||||
were not previously learned. With changes in bug 5185, the calculation
|
||||
of message IDs in a bayes 'seen' database has changed, so new code can
|
||||
no longer associate new messages with those learned before the change.
|
||||
|
||||
- Note that this change does not affect recognition of old tokens and the
|
||||
classification algorithm, only duplicate detection and unlearning of old
|
||||
messages is affected.
|
||||
|
||||
- Because of this change, if you use Bayes and you are upgrading from a
|
||||
version prior to 3.4.0, you may consider wiping your Bayes database
|
||||
and starting fresh.
|
||||
|
||||
- There is a new optional dependency on Net::Patricia to speed up lookups
|
||||
on internal_networks, trusted_networks or msa_networks when these lists
|
||||
contain a larger number of entries. Consider installing this module to
|
||||
speed up classification.
|
||||
|
||||
- The minimal required version of NetAddr::IP was bumped to 4.010
|
||||
|
||||
- In addition to existing backends, the 3.4.0 introduces support for keeping
|
||||
a Bayes database on a Redis server, either running locally, or accessed
|
||||
over network. Similar to SQL backends, the database may be concurrently
|
||||
used by several hosts running SpamAssassin.
|
||||
|
||||
- For more detail on these and other changes, please see the Announcement
|
||||
file at:
|
||||
http://svn.apache.org/repos/asf/spamassassin/branches/3.4/build/announcements/3.4.0.txt
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.3.0
|
||||
-----------------------------------------------
|
||||
|
||||
- Rules are no longer included with SpamAssassin "out of the box". You will
|
||||
need to immediately run "sa-update", or download the additional rules .tgz
|
||||
package and run "sa-update --install" with it, to get a ruleset.
|
||||
|
||||
- The BETA label has been taken off of the SpamAssassin SQL support. Please
|
||||
be aware that occasional changes may still be made to this area of the
|
||||
code. You should be sure to read this upgrade document each time you
|
||||
install a new version to determine if any SQL updates need to be made to
|
||||
your local installation.
|
||||
|
||||
- The DKIM plugin is now enabled by default for new installs, if the perl
|
||||
module Mail::DKIM is installed. However, installation of SpamAssassin
|
||||
will not overwrite existing .pre configuration files, so to use DKIM when
|
||||
upgrading from a previous release that did not use DKIM, a directive:
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::DKIM
|
||||
|
||||
will need to be uncommented in file "v312.pre", or added to some
|
||||
other .pre file, such as local.pre.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.2.0
|
||||
-----------------------------------------------
|
||||
|
||||
- The "127/8" network, including 127.0.0.1, is now always implicitly part of
|
||||
"trusted_networks" and "internal_networks". It's impossible to remove these
|
||||
from the trusted/internal sets, since if you cannot trust the host where
|
||||
SpamAssassin is running, you cannot trust SpamAssassin itself. If you
|
||||
previously had "trusted_networks" and "internal_networks" lines listing those
|
||||
hosts, you should now remove them, otherwise a minor (non-lint-error) warning
|
||||
will be output.
|
||||
|
||||
- For ISPs -- version 3.2.0 now includes a new way to specify Mail Submission
|
||||
Agents, relay hosts which accept mail from your own users and authenticates
|
||||
them appropriately. See the Mail::SpamAssassin::Conf manual page for the
|
||||
"msa_networks" setting.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.1.0
|
||||
-----------------------------------------------
|
||||
|
||||
- A significant amount of core functionality has been moved into
|
||||
plugins. These include, AWL (auto-whitelist), DCC, Pyzor, Razor2,
|
||||
SpamCop reporting and TextCat. For information on configuring these
|
||||
plugins please refer to their individual documentation:
|
||||
perldoc Mail::SpamAssassin::Plugin::* (ie AWL, DCC, etc)
|
||||
|
||||
- There are now multiple files read to enable plugins in the
|
||||
/etc/mail/spamassassin directory; previously only one, "init.pre" was
|
||||
read. Now both "init.pre", "v310.pre", and any other files ending
|
||||
in ".pre" will be read. As future releases are made, new plugins
|
||||
will be added to new files named according to the release they're
|
||||
added in.
|
||||
|
||||
- Due to license restrictions the DCC and Razor2 plugins are disabled
|
||||
by default. We encourage you to read the appropriate license
|
||||
yourself and decide if you are able to re-enable the plugins for
|
||||
your site.
|
||||
|
||||
- The use_auto_whitelist config option has been moved to a user config
|
||||
option, this allows individual users to turn on or off whitelisting
|
||||
regardless of what is set in the system config. If you would like to
|
||||
disable AWL (auto-whitelist) on a site-wide basis, then you can comment
|
||||
out the plugin in "v310.pre".
|
||||
|
||||
- The bayes_auto_learn_threshold_* config options for bayes have moved
|
||||
to a plugin. In general the default should work just fine however
|
||||
if you are interested in changing these values you should see:
|
||||
perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold
|
||||
|
||||
- The AWL support for NDBM_File databases has been dropped, due to a
|
||||
bug in that package which renders it useless (bug 4353). It appears
|
||||
that SDBM_File, the package which will be used instead, is fully
|
||||
compatible with NDBM however, so this should be unnoticeable.
|
||||
|
||||
- The prefork algorithm for spamd has been changed. In this version spamd
|
||||
will attempt to keep a small number of "hot" child processes as busy as
|
||||
possible, and keep any others as idle as possible, using something
|
||||
similar to the Apache httpd server scaling algorithm. This reduces
|
||||
memory usage and swapping. You can use the --round-robin switch for
|
||||
spamd to disable this scaling algorithm, and the behaviour seen in the
|
||||
3.0.x versions will be used instead, where all processes receive an
|
||||
equal load and no scaling takes place.
|
||||
|
||||
- As of 3.1.0, in addition to the generic BayesSQL support (via
|
||||
Mail::SpamAssassin::BayesStore::SQL) usable by multiple database
|
||||
drivers there is now specific support for MySQL 4.1+ and
|
||||
PostgreSQL. This support is based on non-standard features present
|
||||
in both database servers that allow for various performance boosts.
|
||||
|
||||
If you were using the previous BayesSQL support with MySQL, and
|
||||
already have MySQL 4.1+ installed you can begin using the new module
|
||||
immediately by replacing the bayes_store_module line in your
|
||||
configuration with: Mail::SpamAssassin::BayesStore::MySQL
|
||||
|
||||
We do however recommend that you switch your MySQL tables over to
|
||||
InnoDB for better data integrity and multi user support. You can
|
||||
most often do this via a simple ALTER TABLE command, refer to the
|
||||
MySQL documentation for more information.
|
||||
|
||||
If you were previously using PostgreSQL for your bayes database then
|
||||
we STRONGLY recommend switching to the PostgreSQL specific backend:
|
||||
Mail::SpamAssassin::BayesStore::PgSQL
|
||||
To switch to this backend you should first run sa-learn --backup to
|
||||
make a backup of your existing data and then drop and recreate the
|
||||
database following the instructions in sql/README.bayes. Then you
|
||||
can restore the database with sa-learn --restore. If you have
|
||||
multiple users then you will have to run --backup and --restore for
|
||||
each user to fully restore the database.
|
||||
|
||||
- See http://wiki.apache.org/spamassassin/UpgradeTo310 for a
|
||||
supplementary list of upgrade notes. It will be updated with any
|
||||
upgrade notes not captured in this document.
|
||||
|
||||
- Finally, this document is likely not complete. Other configuration
|
||||
options/arguments may have changed from older versions, etc. It would
|
||||
be good to double-check any custom configuration options to make sure
|
||||
they're still valid. This could be as simple as running "spamassassin
|
||||
--lint", or more complex, as required by the environment.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.0.x
|
||||
----------------------------------------------
|
||||
|
||||
- The SpamAssassin 2.6x release series was the last set of releases to
|
||||
officially support perl versions earlier than perl 5.6.1. If you are
|
||||
using an earlier version of perl, you will need to upgrade before you
|
||||
can use the 3.0.0 version of SpamAssassin. You will also want to make
|
||||
sure that you have the appropriate versions of required and optional
|
||||
modules as they may have changed from old versions. The INSTALL
|
||||
document has the modules and version requirements listed.
|
||||
|
||||
- See http://wiki.apache.org/spamassassin/UpgradeTo300 for a
|
||||
supplementary list of upgrade notes. It will be updated with any
|
||||
upgrade notes not captured in this document.
|
||||
|
||||
- SpamAssassin 3.0.0 has a significantly different API (Application Program
|
||||
Interface) from the 2.x series of code. This means that if you use
|
||||
SpamAssassin through a third-party utility (milter, etc,) you need to make
|
||||
sure you have an updated version which supports 3.0.0. See
|
||||
http://wiki.apache.org/spamassassin/UpgradeTo300 for information about
|
||||
third-party software.
|
||||
|
||||
- The --auto-whitelist, --whitelist and -a options for "spamd" and
|
||||
"spamassassin" to turn on the auto-whitelist have been removed and
|
||||
replaced by the "use_auto_whitelist" configuration option which is
|
||||
also now turned on by default.
|
||||
|
||||
- The --virtual-config switch for spamd had to be dropped, due to licensing
|
||||
issues. It is replaced by the --virtual-config-dir switch.
|
||||
|
||||
- The "rewrite_subject" and "subject_tag" configuration options were
|
||||
deprecated and are now removed. Instead, using "rewrite_header Subject
|
||||
[your desired setting]". e.g.
|
||||
|
||||
rewrite_subject 1
|
||||
subject_tag ****SPAM(_SCORE_)****
|
||||
|
||||
becomes
|
||||
|
||||
rewrite_header Subject ****SPAM(_SCORE_)****
|
||||
|
||||
- The "sa-learn --rebuild" command has been deprecated; please use
|
||||
"sa-learn --sync" instead. The --rebuild option will remain temporarily
|
||||
for backward compatibility.
|
||||
|
||||
- The Bayesian storage modules have been completely re-written and now
|
||||
include Berkeley DB (DBM) storage as well as SQL based storage (see
|
||||
sql/README.bayes for more information). In addition, a new format
|
||||
has been introduced for the bayes database that stores tokens in fixed
|
||||
length hashes (Bayes v3). All DBM databases should be automatically
|
||||
converted to this new format the first time they are opened for write.
|
||||
You can manually perform the upgrade by running "sa-learn --sync"
|
||||
from the command line.
|
||||
|
||||
Due to the database format change, you will want to do something like
|
||||
this when upgrading:
|
||||
|
||||
- stop running spamassassin/spamd (ie: you don't want it to be running
|
||||
during the upgrade)
|
||||
- run "sa-learn --rebuild", this will sync your journal. if you skip
|
||||
this step, any data from the journal will be lost when the DB is
|
||||
upgraded.
|
||||
- upgrade SA to 3.0.0
|
||||
- run "sa-learn --sync", which will cause the db format to be upgraded.
|
||||
if you want to see what is going on, you can add the "-D" option.
|
||||
- test the new database by running some sample mails through
|
||||
SpamAssassin, and/or at least running "sa-learn --dump" to make sure
|
||||
the data looks valid.
|
||||
- start running spamassassin/spamd again
|
||||
|
||||
If, instead of uprading your Bayes database, you want to wipe it and
|
||||
start fresh, you can run "sa-learn --clear" to safely remove your
|
||||
Bayes database files. If the --clear command issues an error then
|
||||
you can simply delete the Bayes database files ("bayes_*") while SA
|
||||
is not running; SpamAssassin will recreate them in the current
|
||||
format when it runs.
|
||||
|
||||
- "spamd" now has a default max-children setting of 5; no more than 5
|
||||
child scanner processes will be run in parallel. Previously, there was
|
||||
no default limit unless you specified the "-m" switch when starting
|
||||
spamd.
|
||||
|
||||
- If you are using a UNIX machine with all database files on local disks,
|
||||
and no sharing of those databases across NFS filesystems, you can use a
|
||||
more efficient, but non-NFS-safe, locking mechanism. Do this by adding
|
||||
the line "lock_method flock" to the /etc/mail/spamassassin/local.cf
|
||||
file. This is strongly recommended if you're not using NFS, as it is
|
||||
much faster than the NFS-safe locker.
|
||||
|
||||
- Please note that the use of the following commandline parameters for
|
||||
spamassassin and spamd have been deprecated and may be removed in
|
||||
upcoming versions of SpamAssassin. Please discontinue usage of these
|
||||
options:
|
||||
|
||||
in the 2.6x series: --add-from, --pipe, -F, --stop-at-threshold,
|
||||
-S, -P (spamassassin only)
|
||||
in the 3.0.x series: --auto-whitelist, -a, --whitelist-factory, -M,
|
||||
--warning-from, -w, --log-to-mbox, -l
|
||||
|
||||
- user_scores_sql_table is no longer supported. If you need to use a table
|
||||
name, other than the default, create a custom query using the
|
||||
user_scores_sql_custom_query config option.
|
||||
|
||||
- SpamAssassin runs in "taint mode" by default for improved security.
|
||||
Certain third-party modules may be incompatible with taint mode.
|
||||
|
||||
- 2.6x deprecated the use of the "check_bayes_db" script, and it
|
||||
has been removed in 3.0.0. Please see the sa-learn man/pod
|
||||
documentation for more info.
|
||||
|
||||
- Finally, this document is likely not complete. Other configuration
|
||||
options/arguments may have changed from older versions, etc. It would
|
||||
be good to double-check any custom configuration options to make sure
|
||||
they're still valid. This could be as simple as running "spamassassin
|
||||
--lint", or more complex, as required by the environment.
|
||||
|
||||
An example: "require_version <version>" hasn't changed itself, but the
|
||||
internal version representation is now "x.yyyzzz" instead of "x.yz"
|
||||
which could cause issues if "require_version 3.00" is expected to work
|
||||
(it won't, it needs to be "require_version 3.000000").
|
||||
|
||||
|
||||
Note for Users Upgrading from SpamAssassin 2.5x
|
||||
-----------------------------------------------
|
||||
|
||||
- Due to major reliability shortcomings in the database support libraries
|
||||
other than DB_File, we now require that the DB_File module be installed
|
||||
to use SpamAssassin's Bayes rules.
|
||||
|
||||
SpamAssassin will still work without DB_File installed, but the Bayes
|
||||
support will be disabled.
|
||||
|
||||
If you install DB_File and wish to import old Bayes database data, each
|
||||
user with a Bayes db should run "sa-learn --import" to copy old entries
|
||||
from the other formats into a new DB_File file.
|
||||
|
||||
Due to the database library change, and the change to the database
|
||||
format itself, you will want to do something like this when upgrading:
|
||||
|
||||
- stop running spamassassin/spamd (ie: you don't want it to be running
|
||||
during the upgrade)
|
||||
- run "sa-learn --rebuild", this will sync your journal. if you skip
|
||||
this step, any data from the journal will be lost when the DB is
|
||||
upgraded.
|
||||
- install DB_File module if necessary
|
||||
- upgrade SA to 3.0.0
|
||||
- if you were using another database module previously, run "sa-learn
|
||||
--import" to migrate the data into new DB_File files
|
||||
- run "sa-learn --sync", which will cause the db format to be upgraded.
|
||||
if you want to see what is going on, you can add the "-D" option.
|
||||
- test the new database by running some sample mails through
|
||||
SpamAssassin, and/or at least running "sa-learn --dump" to make sure
|
||||
the data looks valid.
|
||||
- start running spamassassin/spamd again
|
||||
|
||||
Obviously the steps will be different depending on your environment, but
|
||||
you get the idea. :)
|
||||
|
||||
|
||||
Note For Users Upgrading From SpamAssassin 2.3x or 2.4x
|
||||
-------------------------------------------------------
|
||||
|
||||
- SpamAssassin no longer includes code to handle local mail delivery, as
|
||||
it was not reliable enough, compared to procmail. So now, if you relied
|
||||
on spamassassin to write the mail into your mail folder, you'll have to
|
||||
change your setup to use procmail as detailed below. If you used
|
||||
spamassassin to filter your mail and then something else wrote it into a
|
||||
folder for you, then you should be fine.
|
||||
|
||||
- Support for versions of the optional Mail::Audit module is no longer
|
||||
included.
|
||||
|
||||
- The default mode of tagging (which used to be ***SPAM*** in the subject
|
||||
line) no longer takes place. Instead the message is rewritten. If an
|
||||
incoming message is tagged as spam, instead of modifying the original
|
||||
message, SpamAssassin will create a new report message and attach the
|
||||
original message as a message/rfc822 MIME part (ensuring the original
|
||||
message is completely preserved and easier to recover). If you do not
|
||||
want to modify the body of incoming spam, use the "report_safe" option.
|
||||
The "report_header" and "defang_mime" options have been removed as a
|
||||
result.
|
||||
|
||||
(end of UPGRADE)
|
||||
|
||||
//vim:tw=74:
|
250
upstream/USAGE
Normal file
250
upstream/USAGE
Normal file
@ -0,0 +1,250 @@
|
||||
|
||||
Important Note For Users Upgrading From Earlier Versions
|
||||
--------------------------------------------------------
|
||||
|
||||
SpamAssassin no longer includes code to handle local mail delivery, as it
|
||||
was not reliable enough, compared to procmail. So now, if you relied on
|
||||
spamassassin to write the mail into your mail folder, you'll have to
|
||||
change your setup to use procmail as detailed below.
|
||||
|
||||
If you used spamassassin to filter your mail and then something else wrote
|
||||
it into a folder for you, then you should be fine.
|
||||
|
||||
Steps to take for every installation:
|
||||
|
||||
- Install Mail::SpamAssassin on your mail server, as per the INSTALL
|
||||
document.
|
||||
|
||||
- Test it:
|
||||
|
||||
spamassassin -t < sample-nonspam.txt > nonspam.out
|
||||
spamassassin -t < sample-spam.txt > spam.out
|
||||
|
||||
Verify (using a text viewer, ie. "less" or "notepad") that nonspam.out
|
||||
has not been tagged as spam, and that spam.out has. The files should
|
||||
contain the full text and headers of the messages, the "spam.out"
|
||||
message should contain the header "X-Spam-Flag: YES" and be annotated
|
||||
with a report from SpamAssassin, and there should be no errors when you
|
||||
run the commands.
|
||||
|
||||
Even though sample-nonspam.txt is not spam, nonspam.out will
|
||||
contain a SpamAssassin report anyway. This is a side-effect of
|
||||
the "-t" (test) switch. However, there should be less than 5
|
||||
points accumulated; when the "-t" switch is not in use, the report
|
||||
text would not be added. For more verbose (debugging) output, add
|
||||
the "-D" switch.
|
||||
|
||||
If the commands do not work, DO NOT PROCEED TO THE NEXT STEP, as you
|
||||
will lose mail!
|
||||
|
||||
|
||||
|
||||
If you use KMail:
|
||||
|
||||
- http://kmail.kde.org/tools.html mentions:
|
||||
|
||||
The filter setup is the work of five minutes (if that!) if you have a
|
||||
working spamassassin set up.
|
||||
|
||||
The filter in question is "<any header><matches regexp> ."
|
||||
|
||||
The action is "<pipe through> spamassassin"
|
||||
|
||||
Then, in the advanced options, uncheck the "If this filter matches,
|
||||
stop processing here" box. If you keep this filter at the top, it will
|
||||
analyze any incoming mail, decide whether it's spam or not, and flag
|
||||
it accordingly.
|
||||
|
||||
[Then add] a second filter behind it, which searches for the added
|
||||
spam-flags and diverts them into a specific spam folder. [...]
|
||||
|
||||
|
||||
|
||||
If you use procmail, or haven't decided on any of the above examples:
|
||||
|
||||
- Make a backup of your .procmailrc (if you already have one).
|
||||
|
||||
cp ~/.procmailrc ~/.procmailrc.bak
|
||||
|
||||
- add the line from procmailrc.example to ~/.procmailrc, at the top of
|
||||
the file before any existing recipes.
|
||||
|
||||
That'll process all mail through SA, and refile spam messages to
|
||||
a folder called "caughtspam" in your home directory.
|
||||
|
||||
- Send yourself a mail message, and ensure it gets to you. If it does
|
||||
not, copy your old backed-up .procmailrc file back into place and ask
|
||||
your sysadmin for help! Here's commands to do that:
|
||||
|
||||
cp ~/.procmailrc.bak ~/.procmailrc
|
||||
echo "Help!" | mail root
|
||||
|
||||
|
||||
|
||||
If you want to use SpamAssassin site-wide:
|
||||
|
||||
- take a look at the notes on the Wiki website, currently at
|
||||
<http://wiki.apache.org/spamassassin/UsingSiteWide>. You will probably
|
||||
want to use 'spamd' (see below). You may want to investigate the
|
||||
new Apache mod_perl module, in the 'spamd-apache2' directory, too.
|
||||
|
||||
- *PLEASE* let your users know you've installed it, and how to turn it
|
||||
off! This is our #1 tech support query, and the users are usually
|
||||
pretty frustrated once it reaches that stage.
|
||||
|
||||
- *PLEASE* consider setting it up as "off by default" for most accounts,
|
||||
and let users opt-in to using it. Quite a few folks prefer not to
|
||||
have their mail filtered, presumably because they don't use their
|
||||
email address publicly and do not get much spam.
|
||||
|
||||
- Note that procmail users adding spamc to /etc/procmailrc should
|
||||
add the line 'DROPPRIVS=yes' at the top of the file.
|
||||
|
||||
|
||||
The Auto-Whitelist
|
||||
------------------
|
||||
|
||||
The auto-whitelist is enabled using the 'use_auto_whitelist' option.
|
||||
(See http://wiki.apache.org/spamassassin/AutoWhitelist for details on
|
||||
how it works, if you're curious.)
|
||||
|
||||
|
||||
Other Installation Notes
|
||||
------------------------
|
||||
|
||||
|
||||
- Hashcash is a useful system; it requires that senders exercise a
|
||||
CPU-intensive task before they can send mail to you, so we give that
|
||||
some bonus points. However, it requires that you list what addresses
|
||||
you expect to receive mail for, by adding 'hashcash_accept' lines to
|
||||
your ~/.spamassassin/user_prefs or /etc/mail/spamassassin/local.cf
|
||||
files. See the Mail::SpamAssassin::Plugin::Hashcash manual page for
|
||||
details on how to specify these.
|
||||
|
||||
|
||||
- SpamAssassin now uses a temporary file in /tmp (or $TMPDIR, if that's
|
||||
set in the environment) for Pyzor and DCC checks. Make sure that this
|
||||
directory is either (a) not writable by other users, or (b) not shared
|
||||
over NFS, for security.
|
||||
|
||||
|
||||
- You can create your own system-wide rules files in
|
||||
/etc/mail/spamassassin; their filenames should end in ".cf". Multiple
|
||||
files will be read, and SpamAssassin will not overwrite these files
|
||||
when installing a new version.
|
||||
|
||||
|
||||
- You should not modify the files in /usr/share/spamassassin; these
|
||||
will be overwritten when you upgrade. Any changes you make in
|
||||
files in the /etc/mail/spamassassin directory, however, will
|
||||
override these files.
|
||||
|
||||
|
||||
- Rules can be turned off by setting their scores to 0 in a
|
||||
configuration or user-preference file.
|
||||
|
||||
|
||||
- Speakers of Chinese, Japanese, Korean or Arabic may find it useful to
|
||||
turn off the rules listed at the end of the "user_prefs.template"
|
||||
file; we've found out that these rules are still triggering on
|
||||
non-spam CJK mails.
|
||||
|
||||
|
||||
- If you have an unusual network configuration, you should probably
|
||||
set 'trusted_networks'. This allows SpamAssassin to determine where
|
||||
your internal network ends and the internet begins, and allows DNS
|
||||
checks to be more accurate. If your mail host is NATed, you will
|
||||
almost certainly need to set 'trusted_networks' to get correct
|
||||
results.
|
||||
|
||||
|
||||
- A very handy new feature is SPF support, which allows you to check
|
||||
that the message sender is permitted by their domain to send from the
|
||||
IP address used. This has the potential to greatly cut down on mail
|
||||
forgery. (see http://spf.pobox.com/ for more details.)
|
||||
|
||||
|
||||
- MDaemon users should add this line to their "local.cf" file:
|
||||
|
||||
report_safe_copy_headers X-MDRcpt-To X-MDRemoteIP X-MDaemon-Deliver-To
|
||||
|
||||
Otherwise, MDaemon's internal delivery will fail when SpamAssassin
|
||||
rewrites a message as spam.
|
||||
|
||||
|
||||
- The distribution includes 'spamd', a daemonized version of
|
||||
SpamAssassin which runs persistently. Using its counterpart,
|
||||
'spamc', a lightweight client written in C, an MTA can process
|
||||
large volumes of mail through SpamAssassin without having to
|
||||
fork/exec a perl interpreter for each message. Take a look in the
|
||||
'spamd' and 'spamc' directories for more details.
|
||||
|
||||
|
||||
- The distribution also includes 'spamd-apache2', a mod_perl module
|
||||
allowing the Apache HTTP server to be used as a platform for a
|
||||
daemonized SpamAssassin, in an upwardly-compatible fashion from
|
||||
'spamd'. If you don't require some of the spamd features it does not
|
||||
implement (such as switching UIDs to read per-user configuration from
|
||||
user home directories), this may be much faster than spamd. Take a
|
||||
look at the 'spamd-apache2' directory for details.
|
||||
|
||||
|
||||
- spamc can now be built as a shared library for use with milters or
|
||||
to link into other existing programs; simply run "make libspamc.so"
|
||||
to build this.
|
||||
|
||||
|
||||
- If you get spammed, it is helpful to everyone else if you re-run
|
||||
spamassassin with the "-r" option to report the message in question as
|
||||
"verified spam". This will add it to Vipul's Razor, DCC and Pyzor,
|
||||
assuming you've set these up appropriately.
|
||||
|
||||
spamassassin -r < spam-message
|
||||
|
||||
If you use mutt as your mail reader, this macro will bind the X key to
|
||||
report a spam message.
|
||||
|
||||
macro index X "| spamassassin -r"
|
||||
|
||||
This is, of course, optional -- but you'll get lots of good-netizen
|
||||
karma. ;)
|
||||
|
||||
|
||||
- Quite often, if you've been on the internet for a while, you'll have
|
||||
accumulated a few old email accounts that nowadays get nothing but
|
||||
spam. You can set these up as spam traps using SpamAssassin; see the
|
||||
''SPAM TRAPPING'' section of the spamassassin manual page for details.
|
||||
|
||||
If you don't want to go to the bother of setting up a system yourself
|
||||
to do this, take a look here [1] for a simple forwarding-based
|
||||
alternative.
|
||||
|
||||
[1]: http://wiki.apache.org/spamassassin/SpamTrapping
|
||||
|
||||
|
||||
- Scores and other user preferences can now be loaded from, and Bayes
|
||||
and automatic reputation data can be stored in, an SQL database; see
|
||||
the 'sql' subdirectory for more details.
|
||||
|
||||
If you are setting up a large 'spamd' system-wide installation, with
|
||||
Bayes and/or automatic reputation, we strongly recommend using SQL as
|
||||
storage. It has proven more reliable than the default DB_File storage
|
||||
backend at several large sites. For Bayes, you should always use the
|
||||
new Redis backend (see Mail::SpamAssassin::BayesStore::Redis).
|
||||
|
||||
|
||||
- If you are running SpamAssassin under a disk quota, or are setting up
|
||||
'spamd' with users with disk quotas, be warned that the DB_File
|
||||
database module used by SpamAssassin for Bayes, TxRep, and AWL storage
|
||||
seems to be unreliable in the face of quotas (bug 3796). In this
|
||||
situation, we recommend using SQL storage for those databases, instead
|
||||
of DB_File.
|
||||
|
||||
|
||||
- Lots more ways to integrate SpamAssassin can be read at
|
||||
http://wiki.SpamAssassin.org/ .
|
||||
|
||||
|
||||
(end of USAGE)
|
||||
|
||||
// vim:tw=74:
|
6
upstream/build/check_dependencies
Executable file
6
upstream/build/check_dependencies
Executable file
@ -0,0 +1,6 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
use strict;
|
||||
use lib 'lib';
|
||||
use Mail::SpamAssassin::Util::DependencyInfo;
|
||||
exit Mail::SpamAssassin::Util::DependencyInfo::long_diagnostics();
|
17
upstream/build/convert_pods_to_doc
Executable file
17
upstream/build/convert_pods_to_doc
Executable file
@ -0,0 +1,17 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
use Pod::Html;
|
||||
use Pod::Text ();
|
||||
|
||||
foreach (@ARGV) {
|
||||
$in = $_;
|
||||
s,^(lib|spamd|spamc)/|\.(pod|pm)$,,g;
|
||||
tr,/,_,;
|
||||
|
||||
# convert to HTML: doc/foo.html
|
||||
pod2html ("--infile=$in", "--outfile=doc/$_.html");
|
||||
|
||||
# and to text: doc/foo.txt
|
||||
my $parser = Pod::Text->new ();
|
||||
$parser->parse_from_file ($in, "doc/$_.txt");
|
||||
}
|
15
upstream/build/get_version
Executable file
15
upstream/build/get_version
Executable file
@ -0,0 +1,15 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
open (IN, "< lib/Mail/SpamAssassin.pm")
|
||||
or die "cannot open lib/Mail/SpamAssassin.pm\n";
|
||||
while (<IN>) {
|
||||
($VERSION) = /^\s*\$VERSION\s*=\s*\"(\S+)\"\;/ unless ( $VERSION );
|
||||
($EXTRA) = /^\s*\@EXTRA_VERSION\s*=\s*q\w\((\S+?)\)/ unless ( $EXTRA );
|
||||
($DEVEL) = /^\s*\$IS_DEVEL_BUILD\s*=\s*(1);/ unless ( $DEVEL );
|
||||
}
|
||||
close IN;
|
||||
|
||||
print $VERSION;
|
||||
print "-$EXTRA" if ( $EXTRA );
|
||||
print "-svn" if ( $DEVEL );
|
||||
print "\n";
|
1276
upstream/build/mkrules
Executable file
1276
upstream/build/mkrules
Executable file
File diff suppressed because it is too large
Load Diff
219
upstream/build/parse-rules-for-masses
Executable file
219
upstream/build/parse-rules-for-masses
Executable file
@ -0,0 +1,219 @@
|
||||
#!/usr/bin/perl -w
|
||||
#
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
use strict;
|
||||
|
||||
sub usage {
|
||||
die "
|
||||
parse-rules-for-masses: parse the SpamAssassin rules files for mass-checks,
|
||||
evolving, and frequency analysis
|
||||
|
||||
usage: ./parse-rules-for-masses [-d rulesdir] [-o outputfile] [-s scoreset] [-x]
|
||||
|
||||
rulesdir defaults to ../rules
|
||||
outputfile defaults to ./tmp/rules.pl
|
||||
scoreset default to 0
|
||||
-x do not include test rules files (ie 70_*)
|
||||
";
|
||||
}
|
||||
|
||||
use Getopt::Long;
|
||||
use Data::Dumper;
|
||||
|
||||
our (@rulesdirs, $outputfile, $scoreset, $skip_test_rules);
|
||||
GetOptions (
|
||||
"d=s" => \@rulesdirs,
|
||||
"o=s" => \$outputfile,
|
||||
"s=i" => \$scoreset,
|
||||
"x" => \$skip_test_rules,
|
||||
"help|h|?" => sub { usage(); } );
|
||||
|
||||
if ($#rulesdirs < 0) {
|
||||
@rulesdirs = ("../rules");
|
||||
}
|
||||
|
||||
if (!defined $outputfile) {
|
||||
$outputfile = "./tmp/rules.pl";
|
||||
mkdir ("tmp", 0755);
|
||||
}
|
||||
|
||||
$scoreset = 0 if ( !defined $scoreset );
|
||||
|
||||
my $rules = { };
|
||||
$rules->{_scoreset} = $scoreset;
|
||||
readrules(@rulesdirs);
|
||||
|
||||
my $scores = { };
|
||||
foreach my $key (keys %{$rules}) {
|
||||
next if $key eq '_scoreset';
|
||||
$scores->{$key} = $rules->{$key}->{score};
|
||||
}
|
||||
|
||||
writerules($outputfile);
|
||||
exit;
|
||||
|
||||
sub readrules {
|
||||
foreach my $indir (@_) {
|
||||
my @files = <$indir/*.cf>;
|
||||
|
||||
my $file;
|
||||
my $scores_mutable = 1;
|
||||
my %rulesfound = ();
|
||||
my %langs = ();
|
||||
foreach $file (sort @files) {
|
||||
$scores_mutable = 1; # always start off mutable in each file
|
||||
if ($skip_test_rules) {
|
||||
next if ($file =~ /70_/);
|
||||
}
|
||||
open (IN, "<$file");
|
||||
while (<IN>)
|
||||
{
|
||||
# these appear in comments, so deal with them before comment stripping
|
||||
# takes place
|
||||
if (/<\/gen:mutable>/i) {
|
||||
$scores_mutable = 0;
|
||||
}
|
||||
elsif (/<gen:mutable>/i) {
|
||||
$scores_mutable = 1;
|
||||
}
|
||||
|
||||
s/#.*$//g; s/^\s+//; s/\s+$//; next if /^$/;
|
||||
|
||||
# TODO: this could be overwriting stuff
|
||||
my $lang = '';
|
||||
if (s/^lang\s+(\S+)\s+//) {
|
||||
$lang = $1;
|
||||
}
|
||||
|
||||
if (/^(header|rawbody|body|full|uri|askdns|meta|mimeheader|reuse)\s+(\S+)\s+(.*)$/) {
|
||||
my $type = $1;
|
||||
my $name = $2;
|
||||
my $val = $3;
|
||||
|
||||
if (exists $rules->{$name}->{type} && $type eq 'reuse') {
|
||||
# "reuse" should be skipped if we already have a rule
|
||||
next;
|
||||
}
|
||||
|
||||
$rules->{$name} ||= { };
|
||||
$rules->{$name}->{type} = $type;
|
||||
$rules->{$name}->{lang} = $lang;
|
||||
$rules->{$name}->{issubrule} = ($name =~ /^__/) ? '1' : '0';
|
||||
$rules->{$name}->{tflags} = '';
|
||||
$rules->{$name}->{eval} = ($val =~ /\beval:(\w+)/) ? $1 : '0';
|
||||
if ($type eq "meta") {
|
||||
my @depends = grep { !/^\d+$/ } ($val =~ m/(\w+)/g);
|
||||
push(@{ $rules->{$name}->{depends} }, @depends);
|
||||
}
|
||||
$rules->{$name}->{code} = $val;
|
||||
|
||||
} elsif (/^describe\s+(\S+)\s+(.+)$/) {
|
||||
$rules->{$1} ||= { };
|
||||
if ($lang) {
|
||||
$rules->{$1}->{describe} ||= $2;
|
||||
}
|
||||
else {
|
||||
$rules->{$1}->{describe} = $2;
|
||||
}
|
||||
} elsif (/^tflags\s+(\S+)\s+(.+)$/) {
|
||||
$rules->{$1} ||= { };
|
||||
$rules->{$1}->{tflags} = $2;
|
||||
|
||||
} elsif (/^score\s+(\S+)\s+(.+)$/) {
|
||||
my($name,$score) = ($1,$2);
|
||||
$rules->{$name} ||= { };
|
||||
if ( $score =~ /\s/ ) { # there are multiple scores
|
||||
($score) = (split(/\s+/,$score))[$scoreset];
|
||||
}
|
||||
$rules->{$name}->{score} = $score;
|
||||
$rules->{$name}->{mutable} = $scores_mutable;
|
||||
|
||||
}
|
||||
}
|
||||
close IN;
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $rule (keys %{$rules}) {
|
||||
next if ($rule eq '_scoreset');
|
||||
if (!defined $rules->{$rule}->{type}) {
|
||||
delete $rules->{$rule}; # no rule definition -> no rule
|
||||
next;
|
||||
}
|
||||
|
||||
my $tflags = $rules->{$rule}->{tflags};
|
||||
if (!defined $rules->{$rule}->{score}) {
|
||||
my $def = 1.0;
|
||||
if ($rule =~ /^T_/) {
|
||||
$def = 0.01;
|
||||
}
|
||||
|
||||
if ($tflags =~ /\bnice\b/) {
|
||||
$rules->{$rule}->{score} = -$def;
|
||||
} else {
|
||||
$rules->{$rule}->{score} = $def;
|
||||
}
|
||||
$rules->{$rule}->{no_score_found} = 1;
|
||||
}
|
||||
|
||||
# ignore net rules in set 0 or set 2
|
||||
if ($tflags =~ /\bnet\b/ && ($scoreset & 1) == 0) {
|
||||
$rules->{$rule}->{mutable} = 0;
|
||||
$rules->{$rule}->{score} = 0;
|
||||
}
|
||||
|
||||
# ignore bayes rules in set 0 or set 2
|
||||
if ($tflags =~ /\blearn\b/ && ($scoreset & 2) == 0) {
|
||||
$rules->{$rule}->{mutable} = 0;
|
||||
$rules->{$rule}->{score} = 0;
|
||||
}
|
||||
|
||||
# if a rule didn't have a score specified, assume it's
|
||||
# mutable
|
||||
if (!defined $rules->{$rule}->{mutable}) {
|
||||
$rules->{$rule}->{mutable} = 1;
|
||||
}
|
||||
|
||||
# although T_ test rules are clamped to 0.01. this works well
|
||||
# for release mass-checks, at least
|
||||
if ($rule =~ /^T_/) {
|
||||
$rules->{$rule}->{mutable} = 0;
|
||||
} elsif ($rule eq 'AWL') { # ignore entirely
|
||||
$rules->{$rule}->{mutable} = 0;
|
||||
$rules->{$rule}->{score} = 0;
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
sub writerules {
|
||||
my $outfile = shift;
|
||||
# quick hack to create the tmp directory
|
||||
system ("mkdir -p $outfile 2>/dev/null ; rmdir $outfile 2>/dev/null");
|
||||
|
||||
open (OUT, ">$outfile") or die "cannot write to $outfile";
|
||||
print OUT "# dumped at ".`date`."\n";
|
||||
|
||||
$Data::Dumper::Purity = 1;
|
||||
print OUT Data::Dumper->Dump ([$rules, $scores], ['*rules', '*scores']);
|
||||
|
||||
print OUT "1;";
|
||||
close OUT;
|
||||
}
|
||||
|
228
upstream/build/preprocessor
Executable file
228
upstream/build/preprocessor
Executable file
@ -0,0 +1,228 @@
|
||||
#!/usr/bin/perl
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
#
|
||||
# This script isn't supposed to be run by hand, it's used by `make` as a pre-
|
||||
# processor. It currently accepts these options on the command line:
|
||||
#
|
||||
# -M<module> Enables <module>
|
||||
# -D<variable>=<value> Defines the <variable> to be <value>; if the value
|
||||
# doesn't contain an equal sign, it is interpreted
|
||||
# as a file and all of its lines containing equal
|
||||
# signs are taken as <variable>=<value> pairs
|
||||
#
|
||||
# and some more to help with non-UNIX platforms, where command-line input
|
||||
# and output redirection are not always available:
|
||||
#
|
||||
# -i<file> Read from input file <file>
|
||||
# -o<file> Write to output file <file>
|
||||
#
|
||||
# -I<dir> Read from input directory <dir>
|
||||
# -O<dir> Write to output directory <dir>
|
||||
# <filename> ... Process named files from -I<dir> to -O<dir>
|
||||
#
|
||||
# -m<perm> Use chmod permissions <perm> for files
|
||||
#
|
||||
# Those modules are currently implemented:
|
||||
# conditional Comments out every line containing the string
|
||||
# REMOVEFORINST
|
||||
# vars Replaces variables: upper case strings surrounded
|
||||
# by double at-signs, eg. @@VERSION@@. The values are
|
||||
# taken from the environment and can be overwritten with
|
||||
# the -D switch. Empty/undefined variables are removed.
|
||||
# sharpbang Does some sharpbang (#!) replacement. Uses PERL_BIN and
|
||||
# PERL_WARN.
|
||||
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
|
||||
my %modules = ();
|
||||
my %defines = ();
|
||||
|
||||
my @infiles = ();
|
||||
my $infile;
|
||||
my $outfile;
|
||||
my $indir;
|
||||
my $outdir;
|
||||
my $mode;
|
||||
|
||||
|
||||
# Each environment variable counts as an own defined var for us.
|
||||
foreach (keys %ENV) {
|
||||
$defines{$_} = $ENV{$_};
|
||||
}
|
||||
|
||||
foreach (@ARGV) {
|
||||
if (/^-M([a-z]+)$/) { $modules{$1} = 1; }
|
||||
elsif (/^-D([A-Z0-9_]+)=(.*)$/) { $defines{$1} = $2; }
|
||||
elsif (/^-D([^=]+)$/) { read_defs($1); }
|
||||
elsif (/^-i(.+)$/) { $infile = $1; }
|
||||
elsif (/^-o(.+)$/) { $outfile = $1; }
|
||||
elsif (/^-I(.+)$/) { $indir = $1; }
|
||||
elsif (/^-m(.*)$/) { $mode = '0'.$1; }
|
||||
elsif (/^-O(.+)$/) { $outdir = $1; }
|
||||
elsif (/^(.+)$/) { push (@infiles, $1); }
|
||||
}
|
||||
|
||||
# On Windows, we get -m without an arg. avoid problems with that
|
||||
# by just ignoring that switch.
|
||||
$mode = undef unless $mode;
|
||||
|
||||
if (defined ($indir) && defined ($outdir) && scalar @infiles > 0) {
|
||||
require File::Spec;
|
||||
my $fname;
|
||||
while ($fname = shift @infiles) {
|
||||
my $in = File::Spec->catfile ($indir, $fname);
|
||||
my $out = File::Spec->catfile ($outdir, $fname);
|
||||
do_file ($in, $out);
|
||||
}
|
||||
}
|
||||
elsif (defined ($infile) && defined($outfile)) {
|
||||
do_file ($infile, $outfile);
|
||||
}
|
||||
else {
|
||||
# just do STDIN/STDOUT . Not recommended for portability as
|
||||
# it requires "<" and ">" for Makefile to do its work.
|
||||
#
|
||||
do_stdin();
|
||||
}
|
||||
|
||||
|
||||
sub read_defs {
|
||||
my ($in) = @_;
|
||||
open (DEFS, "<$in") or die "Cannot open $in: $!";
|
||||
foreach (<DEFS>) {
|
||||
$_ =~ s/^\s+|\s+$//g;
|
||||
next if /^#/;
|
||||
next unless /=/;
|
||||
my ($var, $val) = split(/\s*=\s*/, $_, 2);
|
||||
$var =~ tr/A-Z_//cd;
|
||||
$defines{$var} = $val;
|
||||
}
|
||||
close (DEFS);
|
||||
}
|
||||
|
||||
|
||||
sub do_file {
|
||||
my ($in, $out) = @_;
|
||||
|
||||
open (FOOIN, "<$in") or die "Cannot open $in: $!";
|
||||
open (FOOOUT, ">$out") or die "Cannot open $out: $!";
|
||||
|
||||
do_it();
|
||||
|
||||
close (FOOIN);
|
||||
close (FOOOUT);
|
||||
|
||||
if (defined $mode) {
|
||||
chmod (oct $mode, $out) or die "Cannot chmod $mode $out: $!";
|
||||
}
|
||||
}
|
||||
|
||||
sub do_stdin {
|
||||
open (FOOIN, "<&STDIN") or die "Cannot dup stdin: $!";
|
||||
open (FOOOUT, ">&STDOUT") or die "Cannot dup stdout: $!";
|
||||
|
||||
do_it();
|
||||
|
||||
close (FOOIN);
|
||||
close (FOOOUT);
|
||||
}
|
||||
|
||||
|
||||
sub do_it {
|
||||
# The perlpath can be overwritten via -DPERL_BIN=<perlpath>
|
||||
my $perl;
|
||||
if($defines{'PERL_BIN'} && ($defines{PERL_BIN} ne 'this')) {
|
||||
$perl = $defines{'PERL_BIN'};
|
||||
unless(-x $perl) {
|
||||
warn("No such PERL_BIN: $perl");
|
||||
}
|
||||
}
|
||||
else {
|
||||
# use eval so the module is not loaded unless needed; it's slow
|
||||
eval 'use Config; $perl = $Config{"perlpath"};';
|
||||
}
|
||||
|
||||
# Warnings are enabled per default
|
||||
my $perl_warn = ' -w';
|
||||
# The warnings can be overwritten via -DPERL_WARN=<yes|no>
|
||||
if ($defines{'PERL_WARN'} and $defines{'PERL_WARN'} eq 'no') {
|
||||
$perl_warn = '';
|
||||
}
|
||||
|
||||
# Taint mode is enabled per default except on 5.005
|
||||
my $perl_taint = ' -T';
|
||||
# The taint mode can be disabled with -DPERL_TAINT=<yes|no>
|
||||
if ($defines{'PERL_TAINT'} and $defines{'PERL_TAINT'} eq 'no') {
|
||||
$perl_taint = '';
|
||||
}
|
||||
|
||||
$defines{PERL_MAJOR_VER} = sub {
|
||||
$] =~ /^(\d\.\d\d\d)/ or die "bad perl ver $]";
|
||||
return $1;
|
||||
};
|
||||
|
||||
$defines{PLUGIN_POD} = sub {
|
||||
|
||||
# Grab active plugin list
|
||||
my @plugin_pod = ();
|
||||
foreach my $pre (<rules/*.pre>) {
|
||||
if (open(INIT, $pre)) {
|
||||
while (<INIT>) {
|
||||
if (/^loadplugin\s+(.*?)\s*$/) { push(@plugin_pod, " $1\n"); }
|
||||
}
|
||||
close(INIT);
|
||||
}
|
||||
}
|
||||
return join('', sort @plugin_pod);
|
||||
};
|
||||
|
||||
my $towrite = '';
|
||||
while (<FOOIN>) {
|
||||
$_ = pack("C0A*", $_); # turn off UTF8-ness
|
||||
|
||||
# Conditional compiling
|
||||
if ($modules{'conditional'}) {
|
||||
# DELETE lines carrying the REMOVE_ON_BUILD or (deprecated) REMOVEFORINST tag
|
||||
if(/\bREMOVE(?:FORINST|_ON_BUILD)\b/) {
|
||||
next;
|
||||
}
|
||||
}
|
||||
|
||||
$towrite .= $_;
|
||||
}
|
||||
|
||||
# Sharpbang (#!) replacement (see also ExtUtils::MY->fixin)
|
||||
if ($modules{'sharpbang'}) {
|
||||
$towrite =~ s/^#![^\n]*perl[^\n]*\n/#!${perl}${perl_taint}${perl_warn}\n/;
|
||||
}
|
||||
|
||||
# Variable replacement (do in one invocation)
|
||||
if ($modules{'vars'}) {
|
||||
# Replace all @@VARS@@
|
||||
while ($towrite =~ /\@\@([A-Z][A-Z0-9_]*)\@\@/) {
|
||||
my $what = $1;
|
||||
my $d = $defines{$what} || '';
|
||||
if (ref($d) =~ /^CODE/) { $d = $d->(); }
|
||||
$towrite =~ s/\@\@${what}\@\@/$d/g;
|
||||
}
|
||||
}
|
||||
|
||||
print FOOOUT $towrite;
|
||||
}
|
11
upstream/build/sha256sum.pl
Executable file
11
upstream/build/sha256sum.pl
Executable file
@ -0,0 +1,11 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
BEGIN {
|
||||
require Digest::SHA; import Digest::SHA qw(sha256_hex sha512_hex);
|
||||
}
|
||||
|
||||
$/=undef;
|
||||
|
||||
while(<>) {
|
||||
print sha256_hex($_)," $ARGV\n";
|
||||
}
|
11
upstream/build/sha512sum.pl
Executable file
11
upstream/build/sha512sum.pl
Executable file
@ -0,0 +1,11 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
BEGIN {
|
||||
require Digest::SHA; import Digest::SHA qw(sha256_hex sha512_hex);
|
||||
}
|
||||
|
||||
$/=undef;
|
||||
|
||||
while(<>) {
|
||||
print sha512_hex($_)," $ARGV\n";
|
||||
}
|
116
upstream/ldap/README
Normal file
116
upstream/ldap/README
Normal file
@ -0,0 +1,116 @@
|
||||
|
||||
Using SpamAssassin With An LDAP Server
|
||||
--------------------------------------
|
||||
|
||||
SpamAssassin can now load users' score files from an LDAP server. The concept
|
||||
here is to have a web application (PHP/perl/ASP/etc.) that will allow users to
|
||||
be able to update their local preferences on how SpamAssassin will filter their
|
||||
e-mail. The most common use for a system like this would be for users to be
|
||||
able to update the white list of addresses (whitelist_from) without the need
|
||||
for them to update their $HOME/.spamassassin/user_prefs file. It is also quite
|
||||
common for users listed in /etc/passwd to not have a home directory,
|
||||
therefore, the only way to have their own local settings would be through a
|
||||
database or LDAP server.
|
||||
|
||||
SpamAssassin will check the global configuration file (ie. any file matching
|
||||
/etc/mail/spamassassin/*.cf) for the following settings:
|
||||
|
||||
user_scores_dsn ldap://host:port/dc=basedn,dc=de?attr?scope?uid=__USERNAME__
|
||||
user_scores_ldap_username bind dn
|
||||
user_scores_ldap_password password
|
||||
|
||||
The first option, user_scores_dsn, describes the data source name that will be
|
||||
used to create the connection to your LDAP server. You have to write the DSN as
|
||||
an LDAP URL, the components being the host and port to connect to, the base DN
|
||||
for the search, the scope of the search (base, one or sub), the single
|
||||
attribute being the multivalued attribute used to hold the configuration data
|
||||
(space separated pairs of key and value, just as in a file) and finally the
|
||||
filter being the expression used to filter out the wanted username. Note that
|
||||
the filter expression uses the literal text __USERNAME__ as a placeholder for
|
||||
the username (SpamAssassin will use a s///g statement to replace it with the
|
||||
actual username).
|
||||
|
||||
Examples:
|
||||
|
||||
ldap://localhost:389/dc=koehntopp,dc=de?spamassassin?sub?uid=__USERNAME__
|
||||
ldap://localhost:389/o=stooges?spamassassin?sub?uid=__USERNAME__
|
||||
|
||||
|
||||
If the user_scores_dsn option does not exist, SpamAssassin will not attempt
|
||||
to use an LDAP server for retrieving users' preferences. Note that this will
|
||||
NOT look for test rules, only local scores, whitelist_from(s), and
|
||||
required_score.
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
In order for SpamAssassin to work with your LDAP database, you must have
|
||||
the perl Net::LDAP module installed. You'll also need the URI module.
|
||||
|
||||
In order for spamd to use the LDAP driver, you will have to start spamd
|
||||
with the additional parameters '--ldap-config -x'.
|
||||
|
||||
Each user that wants to utilise the SpamAssassin LDAP driver must add
|
||||
the 'spamassassin' attribute in their object (either manually or via the
|
||||
web interface of your making/choice) like this (see the file sa_test.ldif
|
||||
in this directory for a full database example):
|
||||
|
||||
spamassassin: add_header all Foo LDAP read
|
||||
|
||||
Database Schema
|
||||
---------------
|
||||
|
||||
You can use any schema extension to your user entries with SpamAssassin,
|
||||
as long as the attribute is multivalued and correctly named in your LDAP url.
|
||||
We are currently using a <customername>spamassassin field that is part of
|
||||
our inetOrgPerson subclass.
|
||||
|
||||
Here's an example for openldap's /etc/openldap/schema/inetorgperson.schema :
|
||||
|
||||
# SpamAssassin
|
||||
# see http://SpamAssassin.org/ .
|
||||
attributetype ( 2.16.840.1.113730.3.1.217
|
||||
NAME 'spamassassin'
|
||||
DESC 'SpamAssassin user preferences settings'
|
||||
EQUALITY caseExactMatch
|
||||
SYNTAX 1.3.6.1.4.1.1466.115.121.1.15 )
|
||||
|
||||
(don't forget to add "$ spamassassin" to the objectclass MAY clause.)
|
||||
|
||||
|
||||
Testing SpamAssassin/LDAP
|
||||
-------------------------
|
||||
|
||||
To test your LDAP setup, and debug any possible problems, you should start
|
||||
spamd with the -D option, which will keep spamd in the foreground, and will
|
||||
output debug message to the terminal. You should then test spamd with a
|
||||
message by calling spamc. You can use the sample-spam.txt file with the
|
||||
following command:
|
||||
|
||||
cat sample-spam.txt | spamc
|
||||
|
||||
Watch the debug output from spamd and look for the following debug line:
|
||||
|
||||
retrieving LDAP prefs for <username>: <value>
|
||||
|
||||
If you do not see the above text, then the LDAP query was not successful, and
|
||||
you should see any error messages reported. <username> should be the user
|
||||
that was passed to spamd and is usually the user executing spamc.
|
||||
|
||||
If you need to set up LDAP, a good guide is here:
|
||||
http://yolinux.com/TUTORIALS/LinuxTutorialLDAP.html
|
||||
|
||||
To test LDAP support using the SpamAssassin test suite, you need to
|
||||
perform a little bit of manual configuration first. See the file
|
||||
"ldap/README.testing" for details.
|
||||
|
||||
|
||||
******
|
||||
NB: This should be considered BETA, and the interface or overall
|
||||
operation of LDAP support may change at any time with future releases of SA.
|
||||
******
|
||||
|
||||
Please send any comments to <kris at koehntopp.de> and file bugs via
|
||||
<http://issues.apache.org/SpamAssassin/>.
|
||||
|
||||
Kristian Köhntopp
|
106
upstream/ldap/README.testing
Normal file
106
upstream/ldap/README.testing
Normal file
@ -0,0 +1,106 @@
|
||||
|
||||
How to enable the SpamAssassin LDAP self-test
|
||||
---------------------------------------------
|
||||
|
||||
- install openldap server, using apt-get etc. On Debian (unstable),
|
||||
that's done as follows:
|
||||
|
||||
sudo apt-get install slapd ldap-utils
|
||||
|
||||
Then enter an admin password.
|
||||
|
||||
|
||||
- Patch the 'inetorgperson.schema' file, found in the following locations:
|
||||
|
||||
Fedora Core 1: /etc/openldap/schema/inetorgperson.schema
|
||||
Debian: /etc/ldap/schema/inetorgperson.schema
|
||||
|
||||
as follows:
|
||||
|
||||
--- /etc/openldap/schema/inetorgperson.schema.default 2003-10-23 07:26:01.000000000 -0700
|
||||
+++ /etc/openldap/schema/inetorgperson.schema 2004-02-05 22:07:01.000000000 -0800
|
||||
@@ -121,6 +121,13 @@
|
||||
DESC 'RFC2798: personal identity information, a PKCS #12 PFX'
|
||||
SYNTAX 1.3.6.1.4.1.1466.115.121.1.5 )
|
||||
|
||||
+# spamassassin
|
||||
+# see http://SpamAssassin.org/ .
|
||||
+attributetype ( 2.16.840.1.113730.3.1.220
|
||||
+ NAME 'spamassassin'
|
||||
+ DESC 'SpamAssassin user preferences settings'
|
||||
+ EQUALITY caseExactMatch
|
||||
+ SYNTAX 1.3.6.1.4.1.1466.115.121.1.15 )
|
||||
|
||||
# inetOrgPerson
|
||||
# The inetOrgPerson represents people who are associated with an
|
||||
@@ -138,5 +145,5 @@
|
||||
labeledURI $ mail $ manager $ mobile $ o $ pager $
|
||||
photo $ roomNumber $ secretary $ uid $ userCertificate $
|
||||
x500uniqueIdentifier $ preferredLanguage $
|
||||
- userSMIMECertificate $ userPKCS12 )
|
||||
+ userSMIMECertificate $ userPKCS12 $ spamassassin )
|
||||
)
|
||||
|
||||
|
||||
- set up according to
|
||||
http://yolinux.com/TUTORIALS/LinuxTutorialLDAP.html#TUTORIAL . To go into
|
||||
some detail, here's what you need to do...
|
||||
|
||||
|
||||
- Edit the slapd.conf file, and add the "o=stooges" suffix, so that the LDIF
|
||||
file we'll be using will be valid.
|
||||
|
||||
On Debian, this means changing the existing /etc/ldap/slapd.conf file
|
||||
according to these substitutions:
|
||||
|
||||
s/"dc=jmason,dc=org"/"o=stooges"/g
|
||||
s/"cn=admin,dc=jmason,dc=org"/"cn=StoogeAdmin,o=stooges"/g
|
||||
|
||||
Also, add these two lines to set the "root" password so the LDIF file
|
||||
can be loaded (ONLY FOR TESTING! DO NOT DO THIS ON A LIVE SERVER!):
|
||||
|
||||
rootdn "cn=StoogeAdmin,o=stooges"
|
||||
rootpw secret1
|
||||
|
||||
On Fedora Core 1, you can use the file from
|
||||
http://yolinux.com/TUTORIALS/OpenLDAP2.0-stooges-slapd.conf-sample.txt
|
||||
pretty much as-is.
|
||||
|
||||
|
||||
- Next, create the storage directories:
|
||||
|
||||
sudo mkdir /var/lib/ldap/stooges /var/lib/ldap/fraternity
|
||||
sudo chown ldap.ldap /var/lib/ldap/stooges [Fedora Core only]
|
||||
sudo chown ldap.ldap /var/lib/ldap/fraternity [Fedora Core only]
|
||||
|
||||
|
||||
- And start the server:
|
||||
|
||||
sudo /etc/rc.d/init.d/ldap start [Fedora Core only]
|
||||
sudo /etc/init.d/slapd start [Debian only]
|
||||
|
||||
|
||||
- Now load the LDIF data for our testing: (this LDIF file adds a "spamassassin"
|
||||
line to the "curley" user.)
|
||||
|
||||
ldapadd -f ldap/sa_test.ldif -xv -D "cn=StoogeAdmin,o=stooges" \
|
||||
-h 127.0.0.1 -w secret1
|
||||
|
||||
|
||||
- install Net::LDAP using CPAN:
|
||||
|
||||
perl -MCPAN -e shell
|
||||
install Convert::ASN1
|
||||
install Net::LDAP
|
||||
quit
|
||||
|
||||
|
||||
- create the test flag file:
|
||||
|
||||
touch t/do_ldap
|
||||
|
||||
|
||||
- now "make test" will test against the LDAP server as well. You can
|
||||
also run "cd t; ./spamd_ldap.t" to run just that test directly.
|
||||
|
||||
|
99
upstream/ldap/sa_test.ldif
Normal file
99
upstream/ldap/sa_test.ldif
Normal file
@ -0,0 +1,99 @@
|
||||
dn: o=stooges
|
||||
objectClass: top
|
||||
objectClass: organization
|
||||
o: stooges
|
||||
description: The Three Stooges
|
||||
|
||||
dn: cn=StoogeAdmin,o=stooges
|
||||
objectClass: organizationalRole
|
||||
cn: StoogeAdmin
|
||||
description: LDAP Directory Administrator
|
||||
|
||||
dn: ou=MemberGroupA,o=stooges
|
||||
ou: MemberGroupA
|
||||
objectClass: top
|
||||
objectClass: organizationalUnit
|
||||
description: Members of MemberGroupA
|
||||
|
||||
dn: ou=MemberGroupB,o=stooges
|
||||
ou: MemberGroupB
|
||||
objectClass: top
|
||||
objectClass: organizationalUnit
|
||||
description: Members of MemberGroupB
|
||||
|
||||
dn: cn=Larry Anderson,ou=MemberGroupA,o=stooges
|
||||
ou: MemberGroupA
|
||||
o: stooges
|
||||
cn: Larry Anderson
|
||||
objectClass: top
|
||||
objectClass: person
|
||||
objectClass: organizationalPerson
|
||||
objectClass: inetOrgPerson
|
||||
mail: LAnderson@isp.com
|
||||
givenname: Larry
|
||||
sn: Anderson
|
||||
uid: larry
|
||||
homePostalAddress: 15 Cherry Ln.$Plano TX 78888
|
||||
postalAddress: 15 Fitzhugh Ave.
|
||||
l: Dallas
|
||||
st: TX
|
||||
postalcode: 76888
|
||||
telephoneNumber: (800)555-1212
|
||||
homePhone: 800-555-1313
|
||||
facsimileTelephoneNumber: 800-555-1414
|
||||
userPassword: larrysecret
|
||||
title: Account Executive
|
||||
|
||||
dn: cn=Moe Anderson,ou=MemberGroupA,o=stooges
|
||||
ou: MemberGroupA
|
||||
o: stooges
|
||||
cn: Moe Anderson
|
||||
objectClass: top
|
||||
objectClass: person
|
||||
objectClass: organizationalPerson
|
||||
objectClass: inetOrgPerson
|
||||
mail: MAnderson@isp.com
|
||||
givenname: Moe
|
||||
sn: Anderson
|
||||
uid: moe
|
||||
initials: Bob
|
||||
homePostalAddress: 16 Cherry Ln.$Plano TX 78888
|
||||
postalAddress: 15 Fitzhugh Ave.
|
||||
l: Dallas
|
||||
st: TX
|
||||
postalcode: 76888
|
||||
pager: 800-555-1319
|
||||
homePhone: 800-555-1313
|
||||
telephoneNumber: (800)555-1213
|
||||
mobile: 800-555-1318
|
||||
title: Manager of Product Development
|
||||
facsimileTelephoneNumber: 800-555-3318
|
||||
manager: cn=Larry Anderson,ou=MemberGroupA,o=stooges
|
||||
userPassword: moesecret
|
||||
|
||||
dn: cn=Curley Anderson,ou=MemberGroupB,o=stooges
|
||||
ou: MemberGroupB
|
||||
o: stooges
|
||||
cn: Curley Anderson
|
||||
objectClass: top
|
||||
objectClass: person
|
||||
objectClass: organizationalPerson
|
||||
objectClass: inetOrgPerson
|
||||
mail: CAnderson@isp.com
|
||||
givenname: Curley
|
||||
sn: Anderson
|
||||
uid: curley
|
||||
initials: Joe
|
||||
homePostalAddress: 14 Cherry Ln.$Plano TX 78888
|
||||
postalAddress: 15 Fitzhugh Ave.
|
||||
spamassassin: add_header all Foo LDAP read
|
||||
l: Dallas
|
||||
st: TX
|
||||
postalcode: 76888
|
||||
pager: 800-555-1319
|
||||
homePhone: 800-555-1313
|
||||
telephoneNumber: (800)555-1214
|
||||
mobile: 800-555-1318
|
||||
title: Developemnt Engineer
|
||||
facsimileTelephoneNumber: 800-555-3318
|
||||
userPassword: curleysecret
|
2320
upstream/lib/Mail/SpamAssassin.pm
Normal file
2320
upstream/lib/Mail/SpamAssassin.pm
Normal file
File diff suppressed because it is too large
Load Diff
214
upstream/lib/Mail/SpamAssassin/AICache.pm
Normal file
214
upstream/lib/Mail/SpamAssassin/AICache.pm
Normal file
@ -0,0 +1,214 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::AICache - provide access to cached information for
|
||||
ArchiveIterator
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This module allows ArchiveIterator to use cached atime information instead of
|
||||
having to read every message separately.
|
||||
|
||||
=head1 PUBLIC METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::AICache;
|
||||
|
||||
use File::Spec;
|
||||
use File::Path;
|
||||
use File::Basename;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
use Errno qw(EBADF);
|
||||
|
||||
=item new()
|
||||
|
||||
Generates a new cache object.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my $self = shift;
|
||||
if (!defined $self) { $self = {}; }
|
||||
|
||||
$self->{cache} = {};
|
||||
$self->{dirty} = 0;
|
||||
$self->{prefix} ||= '/';
|
||||
|
||||
my $use_cache = 1;
|
||||
|
||||
# be sure to use rel2abs() here, since otherwise relative paths
|
||||
# are broken by the prefix stuff
|
||||
if ($self->{type} eq 'dir') {
|
||||
$self->{cache_file} = File::Spec->catdir(
|
||||
$self->{prefix},
|
||||
File::Spec->rel2abs($self->{path}),
|
||||
'.spamassassin_cache');
|
||||
|
||||
my @stat = stat($self->{cache_file});
|
||||
@stat or dbg("AIcache: no access to %s: %s", $self->{cache_file}, $!);
|
||||
$self->{cache_mtime} = $stat[9] || 0;
|
||||
}
|
||||
else {
|
||||
my @split = File::Spec->splitpath($self->{path});
|
||||
$self->{cache_file} = File::Spec->catdir(
|
||||
$self->{prefix},
|
||||
File::Spec->rel2abs($split[1]),
|
||||
join('_', '.spamassassin_cache', $self->{type}, $split[2]));
|
||||
|
||||
my @stat = stat($self->{cache_file});
|
||||
@stat or dbg("AIcache: no access to %s: %s", $self->{cache_file}, $!);
|
||||
$self->{cache_mtime} = $stat[9] || 0;
|
||||
|
||||
# for mbox and mbx, verify whether mtime on cache file is >= mtime of
|
||||
# messages file. if it is, use it, otherwise don't.
|
||||
@stat = stat($self->{path});
|
||||
@stat or dbg("AIcache: no access to %s: %s", $self->{path}, $!);
|
||||
if ($stat[9] > $self->{cache_mtime}) {
|
||||
$use_cache = 0;
|
||||
}
|
||||
}
|
||||
$self->{cache_file} = File::Spec->canonpath($self->{cache_file});
|
||||
|
||||
# go ahead and read in the cache information
|
||||
local *CACHE;
|
||||
if (!$use_cache) {
|
||||
# not in use
|
||||
} elsif (!open(CACHE, $self->{cache_file})) {
|
||||
dbg("AIcache: cannot open AI cache file (%s): %s", $self->{cache_file},$!);
|
||||
} else {
|
||||
for ($!=0; defined($_=<CACHE>); $!=0) {
|
||||
my($k,$v) = split(/\t/, $_);
|
||||
next unless (defined $k && defined $v);
|
||||
$self->{cache}->{$k} = $v;
|
||||
}
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("AIcache: error reading from AI cache file: $!")
|
||||
: warn "error reading from AI cache file: $!";
|
||||
close CACHE
|
||||
or die "error closing AI cache file (".$self->{cache_file}."): $!";
|
||||
}
|
||||
|
||||
bless($self,$class);
|
||||
$self;
|
||||
}
|
||||
|
||||
sub count {
|
||||
my ($self) = @_;
|
||||
return keys %{$self->{cache}};
|
||||
}
|
||||
|
||||
sub check {
|
||||
my ($self, $name) = @_;
|
||||
|
||||
return $self->{cache} unless $name;
|
||||
|
||||
# for dir collections: just use the info on a file, if an entry
|
||||
# exists for that file. it's very unlikely that a file will be
|
||||
# changed to contain a different Date header, and it's slow to check.
|
||||
# return if ($self->{type} eq 'dir' && (stat($name))[9] > $self->{cache_mtime});
|
||||
|
||||
$name = $self->canon($name);
|
||||
return $self->{cache}->{$name};
|
||||
}
|
||||
|
||||
sub update {
|
||||
my ($self, $name, $date) = @_;
|
||||
|
||||
return unless $name;
|
||||
$name = $self->canon($name);
|
||||
|
||||
# if information is different than cached version, set dirty and update
|
||||
if (!exists $self->{cache}->{$name} || $self->{cache}->{$name} != $date) {
|
||||
$self->{cache}->{$name} = $date;
|
||||
$self->{dirty} = 1;
|
||||
}
|
||||
}
|
||||
|
||||
sub finish {
|
||||
my ($self) = @_;
|
||||
|
||||
return unless $self->{dirty};
|
||||
|
||||
# Cache is dirty, so write out new file
|
||||
|
||||
# create enclosing dir tree, if required
|
||||
eval {
|
||||
mkpath(dirname($self->{cache_file}));
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn "cannot mkpath for AI cache file ($self->{cache_file}): $eval_stat\n";
|
||||
};
|
||||
|
||||
my $towrite = '';
|
||||
while(my($k,$v) = each %{$self->{cache}}) {
|
||||
$towrite .= "$k\t$v\n";
|
||||
}
|
||||
|
||||
{
|
||||
# ignore signals while we're writing this file
|
||||
local $SIG{'INT'} = 'IGNORE';
|
||||
local $SIG{'TERM'} = 'IGNORE';
|
||||
|
||||
if (!open(CACHE, ">".$self->{cache_file}))
|
||||
{
|
||||
warn "creating AI cache file failed (".$self->{cache_file}."): $!";
|
||||
# TODO: should we delete it/clean it up?
|
||||
}
|
||||
else {
|
||||
print CACHE $towrite
|
||||
or warn "error writing to AI cache file: $!";
|
||||
close CACHE
|
||||
or warn "error closing AI cache file (".$self->{cache_file}."): $!";
|
||||
}
|
||||
}
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
sub canon {
|
||||
my ($self, $name) = @_;
|
||||
|
||||
if ($self->{type} eq 'dir') {
|
||||
# strip off dirs, just look at filename
|
||||
$name = (File::Spec->splitpath($name))[2];
|
||||
}
|
||||
else {
|
||||
# we may get in a "/path/mbox.offset", so trim to just offset as necessary
|
||||
$name =~ s/^.+\.(\d+)$/$1/;
|
||||
}
|
||||
return $name;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
1;
|
||||
__END__
|
1224
upstream/lib/Mail/SpamAssassin/ArchiveIterator.pm
Normal file
1224
upstream/lib/Mail/SpamAssassin/ArchiveIterator.pm
Normal file
File diff suppressed because it is too large
Load Diff
683
upstream/lib/Mail/SpamAssassin/AsyncLoop.pm
Normal file
683
upstream/lib/Mail/SpamAssassin/AsyncLoop.pm
Normal file
@ -0,0 +1,683 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::AsyncLoop - scanner asynchronous event loop
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
An asynchronous event loop used for long-running operations, performed "in the
|
||||
background" during the Mail::SpamAssassin::check() scan operation, such as DNS
|
||||
blocklist lookups.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::AsyncLoop;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Time::HiRes qw(time);
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
# obtain timer resolution if possible
|
||||
our $timer_resolution;
|
||||
BEGIN {
|
||||
eval {
|
||||
$timer_resolution = Time::HiRes->can('clock_getres')
|
||||
? Time::HiRes::clock_getres(Time::HiRes::CLOCK_REALTIME())
|
||||
: 0.001; # wild guess, assume resolution is better than 1s
|
||||
1;
|
||||
} or do {
|
||||
$timer_resolution = 1; # Perl's builtin timer ticks at one second
|
||||
};
|
||||
}
|
||||
|
||||
#############################################################################
|
||||
|
||||
sub new {
|
||||
# called from PerMsgStatus, a new AsyncLoop object is created
|
||||
# for each new message processing
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my ($main) = @_;
|
||||
my $self = {
|
||||
main => $main,
|
||||
queries_started => 0,
|
||||
queries_completed => 0,
|
||||
total_queries_started => 0,
|
||||
total_queries_completed => 0,
|
||||
pending_lookups => { },
|
||||
timing_by_query => { },
|
||||
all_lookups => { }, # keyed by "rr_type/domain"
|
||||
};
|
||||
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
# Given a domain name, produces a listref of successively stripped down
|
||||
# parent domains, e.g. a domain '2.10.Example.COM' would produce a list:
|
||||
# '2.10.example.com', '10.example.com', 'example.com', 'com', ''
|
||||
#
|
||||
sub domain_to_search_list {
|
||||
my ($domain) = @_;
|
||||
$domain =~ s/^\.+//; $domain =~ s/\.+\z//; # strip leading and trailing dots
|
||||
my @search_keys;
|
||||
if ($domain =~ /\[/) { # don't split address literals
|
||||
@search_keys = ( $domain, '' ); # presumably an address literal
|
||||
} else {
|
||||
local $1;
|
||||
$domain = lc $domain;
|
||||
for (;;) {
|
||||
push(@search_keys, $domain);
|
||||
last if $domain eq '';
|
||||
# strip one level
|
||||
$domain = ($domain =~ /^ (?: [^.]* ) \. (.*) \z/xs) ? $1 : '';
|
||||
}
|
||||
if (@search_keys > 20) { # enforce some sanity limit
|
||||
@search_keys = @search_keys[$#search_keys-19 .. $#search_keys];
|
||||
}
|
||||
}
|
||||
return \@search_keys;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->start_lookup($ent, $master_deadline)
|
||||
|
||||
Register the start of a long-running asynchronous lookup operation.
|
||||
C<$ent> is a hash reference containing the following items:
|
||||
|
||||
=over 4
|
||||
|
||||
=item key (required)
|
||||
|
||||
A key string, unique to this lookup. This is what is reported in
|
||||
debug messages, used as the key for C<get_lookup()>, etc.
|
||||
|
||||
=item id (required)
|
||||
|
||||
An ID string, also unique to this lookup. Typically, this is the DNS packet ID
|
||||
as returned by DnsResolver's C<bgsend> method. Sadly, the Net::DNS
|
||||
architecture forces us to keep a separate ID string for this task instead of
|
||||
reusing C<key> -- if you are not using DNS lookups through DnsResolver, it
|
||||
should be OK to just reuse C<key>.
|
||||
|
||||
=item type (required)
|
||||
|
||||
A string, typically one word, used to describe the type of lookup in log
|
||||
messages, such as C<DNSBL>, C<MX>, C<TXT>.
|
||||
|
||||
=item zone (optional)
|
||||
|
||||
A zone specification (typically a DNS zone name - e.g. host, domain, or RBL)
|
||||
which may be used as a key to look up per-zone settings. No semantics on this
|
||||
parameter is imposed by this module. Currently used to fetch by-zone timeouts.
|
||||
|
||||
=item timeout_initial (optional)
|
||||
|
||||
An initial value of elapsed time for which we are willing to wait for a
|
||||
response (time in seconds, floating point value is allowed). When elapsed
|
||||
time since a query started exceeds the timeout value and there are no other
|
||||
queries to wait for, the query is aborted. The actual timeout value ranges
|
||||
from timeout_initial and gradually approaches timeout_min (see next parameter)
|
||||
as the number of already completed queries approaches the number of all
|
||||
queries started.
|
||||
|
||||
If a caller does not explicitly provide this parameter or its value is
|
||||
undefined, a default initial timeout value is settable by a configuration
|
||||
variable rbl_timeout.
|
||||
|
||||
If a value of the timeout_initial parameter is below timeout_min, the initial
|
||||
timeout is set to timeout_min.
|
||||
|
||||
=item timeout_min (optional)
|
||||
|
||||
A lower bound (in seconds) to which the actual timeout approaches as the
|
||||
number of queries completed approaches the number of all queries started.
|
||||
Defaults to 0.2 * timeout_initial.
|
||||
|
||||
=back
|
||||
|
||||
C<$ent> is returned by this method, with its contents augmented by additional
|
||||
information.
|
||||
|
||||
=cut
|
||||
|
||||
sub start_lookup {
|
||||
my ($self, $ent, $master_deadline) = @_;
|
||||
|
||||
my $id = $ent->{id};
|
||||
my $key = $ent->{key};
|
||||
defined $id && $id ne '' or die "oops, no id";
|
||||
$key or die "oops, no key";
|
||||
$ent->{type} or die "oops, no type";
|
||||
|
||||
my $now = time;
|
||||
$ent->{start_time} = $now if !defined $ent->{start_time};
|
||||
|
||||
# are there any applicable per-zone settings?
|
||||
my $zone = $ent->{zone};
|
||||
my $settings; # a ref to a by-zone or to global settings
|
||||
my $conf_by_zone = $self->{main}->{conf}->{by_zone};
|
||||
if (defined $zone && $conf_by_zone) {
|
||||
# dbg("async: searching for by_zone settings for $zone");
|
||||
$zone =~ s/^\.//; $zone =~ s/\.\z//; # strip leading and trailing dot
|
||||
for (;;) { # 2.10.example.com, 10.example.com, example.com, com, ''
|
||||
if (exists $conf_by_zone->{$zone}) {
|
||||
$settings = $conf_by_zone->{$zone};
|
||||
last;
|
||||
} elsif ($zone eq '') {
|
||||
last;
|
||||
} else { # strip one level, careful with address literals
|
||||
$zone = ($zone =~ /^( (?: [^.] | \[ (?: \\. | [^\]\\] )* \] )* )
|
||||
\. (.*) \z/xs) ? $2 : '';
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
dbg("async: applying by_zone settings for %s", $zone) if $settings;
|
||||
|
||||
my $t_init = $ent->{timeout_initial}; # application-specified has precedence
|
||||
$t_init = $settings->{rbl_timeout} if $settings && !defined $t_init;
|
||||
$t_init = $self->{main}->{conf}->{rbl_timeout} if !defined $t_init;
|
||||
$t_init = 0 if !defined $t_init; # last-resort default, just in case
|
||||
|
||||
my $t_end = $ent->{timeout_min}; # application-specified has precedence
|
||||
$t_end = $settings->{rbl_timeout_min} if $settings && !defined $t_end;
|
||||
$t_end = $self->{main}->{conf}->{rbl_timeout_min} if !defined $t_end; # added for bug 7070
|
||||
$t_end = 0.2 * $t_init if !defined $t_end;
|
||||
$t_end = 0 if $t_end < 0; # just in case
|
||||
$t_init = $t_end if $t_init < $t_end;
|
||||
|
||||
my $clipped_by_master_deadline = 0;
|
||||
if (defined $master_deadline) {
|
||||
my $time_avail = $master_deadline - time;
|
||||
$time_avail = 0.5 if $time_avail < 0.5; # give some slack
|
||||
if ($t_init > $time_avail) {
|
||||
$t_init = $time_avail; $clipped_by_master_deadline = 1;
|
||||
$t_end = $time_avail if $t_end > $time_avail;
|
||||
}
|
||||
}
|
||||
$ent->{timeout_initial} = $t_init;
|
||||
$ent->{timeout_min} = $t_end;
|
||||
|
||||
$ent->{display_id} = # identifies entry in debug logging and similar
|
||||
join(", ", grep { defined }
|
||||
map { ref $ent->{$_} ? @{$ent->{$_}} : $ent->{$_} }
|
||||
qw(sets rules rulename type key) );
|
||||
|
||||
$self->{pending_lookups}->{$key} = $ent;
|
||||
|
||||
$self->{queries_started}++;
|
||||
$self->{total_queries_started}++;
|
||||
dbg("async: starting: %s (timeout %.1fs, min %.1fs)%s",
|
||||
$ent->{display_id}, $ent->{timeout_initial}, $ent->{timeout_min},
|
||||
!$clipped_by_master_deadline ? '' : ', capped by time limit');
|
||||
|
||||
$ent;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->bgsend_and_start_lookup($domain, $type, $class, $ent, $cb, %options)
|
||||
|
||||
A common idiom: calls C<bgsend>, followed by a call to C<start_lookup>,
|
||||
returning the argument $ent object as modified by C<start_lookup> and
|
||||
filled-in with a query ID.
|
||||
|
||||
=cut
|
||||
|
||||
sub bgsend_and_start_lookup {
|
||||
my($self, $domain, $type, $class, $ent, $cb, %options) = @_;
|
||||
$ent = {} if !$ent;
|
||||
$domain =~ s/\.+\z//s; # strip trailing dots, these sometimes still sneak in
|
||||
$ent->{id} = undef;
|
||||
$ent->{query_type} = $type;
|
||||
$ent->{query_domain} = $domain;
|
||||
$ent->{type} = $type if !exists $ent->{type};
|
||||
$cb = $ent->{completed_callback} if !$cb; # compatibility with SA < 3.4
|
||||
|
||||
my $key = $ent->{key} || '';
|
||||
|
||||
my $dnskey = uc($type) . '/' . lc($domain);
|
||||
my $dns_query_info = $self->{all_lookups}{$dnskey};
|
||||
|
||||
if ($dns_query_info) { # DNS query already underway or completed
|
||||
my $id = $ent->{id} = $dns_query_info->{id}; # re-use existing query
|
||||
return if !defined $id; # presumably blocked, or other fatal failure
|
||||
my $id_tail = $id; $id_tail =~ s{^\d+/IN/}{};
|
||||
lc($id_tail) eq lc($dnskey)
|
||||
or info("async: unmatched id %s, key=%s", $id, $dnskey);
|
||||
|
||||
my $pkt = $dns_query_info->{pkt};
|
||||
if (!$pkt) { # DNS query underway, still waiting for results
|
||||
# just add our query to the existing one
|
||||
push(@{$dns_query_info->{applicants}}, [$ent,$cb]);
|
||||
dbg("async: query %s already underway, adding no.%d %s",
|
||||
$id, scalar @{$dns_query_info->{applicants}},
|
||||
$ent->{rulename} || $key);
|
||||
|
||||
} else { # DNS query already completed, re-use results
|
||||
# answer already known, just do the callback and be done with it
|
||||
if (!$cb) {
|
||||
dbg("async: query %s already done, re-using for %s", $id, $key);
|
||||
} else {
|
||||
dbg("async: query %s already done, re-using for %s, callback",
|
||||
$id, $key);
|
||||
eval {
|
||||
$cb->($ent, $pkt); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (1) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("query %s completed, callback %s failed: %s\n",
|
||||
$id, $key, $@);
|
||||
};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
else { # no existing query, open a new DNS query
|
||||
$dns_query_info = $self->{all_lookups}{$dnskey} = {}; # new query needed
|
||||
my($id, $blocked);
|
||||
my $dns_query_blockages = $self->{main}->{conf}->{dns_query_blocked};
|
||||
if ($dns_query_blockages) {
|
||||
my $search_list = domain_to_search_list($domain);
|
||||
foreach my $parent_domain (@$search_list) {
|
||||
$blocked = $dns_query_blockages->{$parent_domain};
|
||||
last if defined $blocked; # stop at first defined, can be true or false
|
||||
}
|
||||
}
|
||||
if ($blocked) {
|
||||
dbg("async: blocked by dns_query_restriction: %s", $dnskey);
|
||||
} else {
|
||||
dbg("async: launching %s for %s", $dnskey, $key);
|
||||
$id = $self->{main}->{resolver}->bgsend($domain, $type, $class, sub {
|
||||
my($pkt, $pkt_id, $timestamp) = @_;
|
||||
# this callback sub is called from DnsResolver::poll_responses()
|
||||
# dbg("async: in a bgsend_and_start_lookup callback, id %s", $pkt_id);
|
||||
if ($pkt_id ne $id) {
|
||||
warn "async: mismatched dns id: got $pkt_id, expected $id\n";
|
||||
return;
|
||||
}
|
||||
$self->set_response_packet($pkt_id, $pkt, $ent->{key}, $timestamp);
|
||||
$dns_query_info->{pkt} = $pkt;
|
||||
my $cb_count = 0;
|
||||
foreach my $tuple (@{$dns_query_info->{applicants}}) {
|
||||
my($appl_ent, $appl_cb) = @$tuple;
|
||||
if ($appl_cb) {
|
||||
dbg("async: calling callback on key %s%s", $key,
|
||||
!defined $appl_ent->{rulename} ? ''
|
||||
: ", rule ".$appl_ent->{rulename});
|
||||
$cb_count++;
|
||||
eval {
|
||||
$appl_cb->($appl_ent, $pkt); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (2) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("query %s completed, callback %s failed: %s\n",
|
||||
$id, $appl_ent->{key}, $@);
|
||||
};
|
||||
}
|
||||
}
|
||||
delete $dns_query_info->{applicants};
|
||||
dbg("async: query $id completed, no callbacks run") if !$cb_count;
|
||||
});
|
||||
}
|
||||
return if !defined $id;
|
||||
$dns_query_info->{id} = $ent->{id} = $id;
|
||||
push(@{$dns_query_info->{applicants}}, [$ent,$cb]);
|
||||
$self->start_lookup($ent, $options{master_deadline});
|
||||
}
|
||||
return $ent;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->get_lookup($key)
|
||||
|
||||
Retrieve the pending-lookup object for the given key C<$key>.
|
||||
|
||||
If the lookup is complete, this will return C<undef>.
|
||||
|
||||
Note that a lookup is still considered "pending" until C<complete_lookups()> is
|
||||
called, even if it has been reported as complete via C<set_response_packet()>.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_lookup {
|
||||
my ($self, $key) = @_;
|
||||
return $self->{pending_lookups}->{$key};
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $async->log_lookups_timing()
|
||||
|
||||
Log sorted timing for all completed lookups.
|
||||
|
||||
=cut
|
||||
|
||||
sub log_lookups_timing {
|
||||
my ($self) = @_;
|
||||
my $timings = $self->{timing_by_query};
|
||||
for my $key (sort { $timings->{$a} <=> $timings->{$b} } keys %$timings) {
|
||||
dbg("async: timing: %.3f %s", $timings->{$key}, $key);
|
||||
}
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $alldone = $async->complete_lookups()
|
||||
|
||||
Perform a poll of the pending lookups, to see if any are completed.
|
||||
Callbacks on completed queries will be called from poll_responses().
|
||||
|
||||
If there are no lookups remaining, or if too much time has elapsed since
|
||||
any results were returned, C<1> is returned, otherwise C<0>.
|
||||
|
||||
=cut
|
||||
|
||||
sub complete_lookups {
|
||||
my ($self, $timeout, $allow_aborting_of_expired) = @_;
|
||||
my $alldone = 0;
|
||||
my $anydone = 0;
|
||||
my $allexpired = 1;
|
||||
my %typecount;
|
||||
|
||||
my $pending = $self->{pending_lookups};
|
||||
$self->{queries_started} = 0;
|
||||
$self->{queries_completed} = 0;
|
||||
|
||||
my $now = time;
|
||||
|
||||
if (defined $timeout && $timeout > 0 &&
|
||||
%$pending && $self->{total_queries_started} > 0)
|
||||
{
|
||||
# shrink a 'select' timeout if a caller specified unnecessarily long
|
||||
# value beyond the latest deadline of any outstanding request;
|
||||
# can save needless wait time (up to 1 second in harvest_dnsbl_queries)
|
||||
my $r = $self->{total_queries_completed} / $self->{total_queries_started};
|
||||
my $r2 = $r * $r; # 0..1
|
||||
my $max_deadline;
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
my $t_init = $ent->{timeout_initial};
|
||||
my $dt = $t_init - ($t_init - $ent->{timeout_min}) * $r2;
|
||||
my $deadline = $ent->{start_time} + $dt;
|
||||
$max_deadline = $deadline if !defined $max_deadline ||
|
||||
$deadline > $max_deadline;
|
||||
}
|
||||
if (defined $max_deadline) {
|
||||
# adjust to timer resolution, only deals with 1s and with fine resolution
|
||||
$max_deadline = 1 + int $max_deadline
|
||||
if $timer_resolution == 1 && $max_deadline > int $max_deadline;
|
||||
my $sufficient_timeout = $max_deadline - $now;
|
||||
$sufficient_timeout = 0 if $sufficient_timeout < 0;
|
||||
if ($timeout > $sufficient_timeout) {
|
||||
dbg("async: reducing select timeout from %.1f to %.1f s",
|
||||
$timeout, $sufficient_timeout);
|
||||
$timeout = $sufficient_timeout;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# trap this loop in an eval { } block, as Net::DNS could throw
|
||||
# die()s our way; in particular, process_dnsbl_results() has
|
||||
# thrown die()s before (bug 3794).
|
||||
eval {
|
||||
|
||||
if (%$pending) { # any outstanding requests still?
|
||||
$self->{last_poll_responses_time} = $now;
|
||||
my $nfound = $self->{main}->{resolver}->poll_responses($timeout);
|
||||
dbg("async: select found %s responses ready (t.o.=%.1f)",
|
||||
!$nfound ? 'no' : $nfound, $timeout);
|
||||
}
|
||||
$now = time; # capture new timestamp, after possible sleep in 'select'
|
||||
|
||||
# A callback routine may generate another DNS query, which may insert
|
||||
# an entry into the %$pending hash thus invalidating the each() context.
|
||||
# So, make sure that callbacks are not called while the each() context
|
||||
# is open. [Bug 6937]
|
||||
#
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
my $id = $ent->{id};
|
||||
if (exists $self->{finished}->{$id}) {
|
||||
delete $self->{finished}->{$id};
|
||||
$anydone = 1;
|
||||
$ent->{finish_time} = $now if !defined $ent->{finish_time};
|
||||
my $elapsed = $ent->{finish_time} - $ent->{start_time};
|
||||
dbg("async: completed in %.3f s: %s", $elapsed, $ent->{display_id});
|
||||
$self->{timing_by_query}->{". $key"} += $elapsed;
|
||||
$self->{queries_completed}++;
|
||||
$self->{total_queries_completed}++;
|
||||
delete $pending->{$key};
|
||||
}
|
||||
}
|
||||
|
||||
if (%$pending) { # still any requests outstanding? are they expired?
|
||||
my $r =
|
||||
!$allow_aborting_of_expired || !$self->{total_queries_started} ? 1.0
|
||||
: $self->{total_queries_completed} / $self->{total_queries_started};
|
||||
my $r2 = $r * $r; # 0..1
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
$typecount{$ent->{type}}++;
|
||||
my $t_init = $ent->{timeout_initial};
|
||||
my $dt = $t_init - ($t_init - $ent->{timeout_min}) * $r2;
|
||||
# adjust to timer resolution, only deals with 1s and fine resolution
|
||||
$dt = 1 + int $dt if $timer_resolution == 1 && $dt > int $dt;
|
||||
$allexpired = 0 if $now <= $ent->{start_time} + $dt;
|
||||
}
|
||||
dbg("async: queries completed: %d, started: %d",
|
||||
$self->{queries_completed}, $self->{queries_started});
|
||||
}
|
||||
|
||||
# ensure we don't get stuck if a request gets lost in the ether.
|
||||
if (! %$pending) {
|
||||
$alldone = 1;
|
||||
}
|
||||
elsif ($allexpired && $allow_aborting_of_expired) {
|
||||
# avoid looping forever if we haven't got all results.
|
||||
dbg("async: escaping: lost or timed out requests or responses");
|
||||
$self->abort_remaining_lookups();
|
||||
$alldone = 1;
|
||||
}
|
||||
else {
|
||||
dbg("async: queries active: %s%s at %s",
|
||||
join (' ', map { "$_=$typecount{$_}" } sort keys %typecount),
|
||||
$allexpired ? ', all expired' : '', scalar(localtime(time)));
|
||||
$alldone = 0;
|
||||
}
|
||||
1;
|
||||
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
# resignal if alarm went off
|
||||
die "async: (3) $eval_stat\n" if $eval_stat =~ /__alarm__ignore__\(.*\)/s;
|
||||
dbg("async: caught complete_lookups death, aborting: %s", $eval_stat);
|
||||
$alldone = 1; # abort remaining
|
||||
};
|
||||
|
||||
return wantarray ? ($alldone,$anydone) : $alldone;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $async->abort_remaining_lookups()
|
||||
|
||||
Abort any remaining lookups.
|
||||
|
||||
=cut
|
||||
|
||||
sub abort_remaining_lookups {
|
||||
my ($self) = @_;
|
||||
|
||||
my $pending = $self->{pending_lookups};
|
||||
my $foundcnt = 0;
|
||||
my $now = time;
|
||||
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
dbg("async: aborting after %.3f s, %s: %s",
|
||||
$now - $ent->{start_time},
|
||||
(defined $ent->{timeout_initial} &&
|
||||
$now > $ent->{start_time} + $ent->{timeout_initial}
|
||||
? 'past original deadline' : 'deadline shrunk'),
|
||||
$ent->{display_id} );
|
||||
$foundcnt++;
|
||||
$self->{timing_by_query}->{"X $key"} = $now - $ent->{start_time};
|
||||
$ent->{finish_time} = $now if !defined $ent->{finish_time};
|
||||
delete $pending->{$key};
|
||||
}
|
||||
|
||||
# call any remaining callbacks, indicating the query has been aborted
|
||||
#
|
||||
my $all_lookups_ref = $self->{all_lookups};
|
||||
foreach my $dnskey (keys %$all_lookups_ref) {
|
||||
my $dns_query_info = $all_lookups_ref->{$dnskey};
|
||||
my $cb_count = 0;
|
||||
foreach my $tuple (@{$dns_query_info->{applicants}}) {
|
||||
my($ent, $cb) = @$tuple;
|
||||
if ($cb) {
|
||||
dbg("async: calling callback/abort on key %s%s", $dnskey,
|
||||
!defined $ent->{rulename} ? '' : ", rule ".$ent->{rulename});
|
||||
$cb_count++;
|
||||
eval {
|
||||
$cb->($ent, undef); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (2) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("query %s aborted, callback %s failed: %s\n",
|
||||
$dnskey, $ent->{key}, $@);
|
||||
};
|
||||
}
|
||||
dbg("async: query $dnskey aborted, no callbacks run") if !$cb_count;
|
||||
}
|
||||
delete $dns_query_info->{applicants};
|
||||
}
|
||||
|
||||
dbg("async: aborted %d remaining lookups", $foundcnt) if $foundcnt > 0;
|
||||
delete $self->{last_poll_responses_time};
|
||||
$self->{main}->{resolver}->bgabort();
|
||||
1;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $async->set_response_packet($id, $pkt, $key, $timestamp)
|
||||
|
||||
Register a "response packet" for a given query. C<$id> is the ID for the
|
||||
query, and must match the C<id> supplied in C<start_lookup()>. C<$pkt> is the
|
||||
packet object for the response. A parameter C<$key> identifies an entry in a
|
||||
hash %{$self->{pending_lookups}} where the object which spawned this query can
|
||||
be found, and through which further information about the query is accessible.
|
||||
|
||||
C<$pkt> may be undef, indicating that no response packet is available, but a
|
||||
query has completed (e.g. was aborted or dismissed) and is no longer "pending".
|
||||
|
||||
The DNS resolver's response packet C<$pkt> will be made available to a callback
|
||||
subroutine through its argument as well as in C<$ent-<gt>{response_packet}>.
|
||||
|
||||
=cut
|
||||
|
||||
sub set_response_packet {
|
||||
my ($self, $id, $pkt, $key, $timestamp) = @_;
|
||||
$self->{finished}->{$id} = 1; # only key existence matters, any value
|
||||
$timestamp = time if !defined $timestamp;
|
||||
my $pending = $self->{pending_lookups};
|
||||
if (!defined $key) { # backward compatibility with 3.2.3 and older plugins
|
||||
# a third-party plugin did not provide $key in a call, search for it:
|
||||
if ($id eq $pending->{$id}->{id}) { # I feel lucky, key==id ?
|
||||
$key = $id;
|
||||
} else { # then again, maybe not, be more systematic
|
||||
for my $tkey (keys %$pending) {
|
||||
if ($id eq $pending->{$tkey}->{id}) { $key = $tkey; last }
|
||||
}
|
||||
}
|
||||
dbg("async: got response on id $id, search found key $key");
|
||||
}
|
||||
if (!defined $key) {
|
||||
info("async: no key, response packet not remembered, id $id");
|
||||
} else {
|
||||
my $ent = $pending->{$key};
|
||||
my $ent_id = $ent->{id};
|
||||
if (!defined $ent_id) {
|
||||
# should not happen, troubleshooting
|
||||
info("async: ignoring response, id %s, ent_id is undef: %s",
|
||||
$id, join(', ', %$ent));
|
||||
} elsif ($id ne $ent_id) {
|
||||
info("async: ignoring response, mismatched id $id, expected $ent_id");
|
||||
} else {
|
||||
$ent->{finish_time} = $timestamp;
|
||||
$ent->{response_packet} = $pkt;
|
||||
}
|
||||
}
|
||||
1;
|
||||
}
|
||||
|
||||
=item $async->report_id_complete($id,$key,$key,$timestamp)
|
||||
|
||||
Legacy. Equivalent to $self->set_response_packet($id,undef,$key,$timestamp),
|
||||
i.e. providing undef as a response packet. Register that a query has
|
||||
completed and is no longer "pending". C<$id> is the ID for the query,
|
||||
and must match the C<id> supplied in C<start_lookup()>.
|
||||
|
||||
One or the other of C<set_response_packet()> or C<report_id_complete()>
|
||||
should be called, but not both.
|
||||
|
||||
=cut
|
||||
|
||||
sub report_id_complete {
|
||||
my ($self, $id, $key, $timestamp) = @_;
|
||||
$self->set_response_packet($id, undef, $key, $timestamp);
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $time = $async->last_poll_responses_time()
|
||||
|
||||
Get the time of the last call to C<poll_responses()> (which is called
|
||||
from C<complete_lookups()>. If C<poll_responses()> was never called or
|
||||
C<abort_remaining_lookups()> has been called C<last_poll_responses_time()>
|
||||
will return undef.
|
||||
|
||||
=cut
|
||||
|
||||
sub last_poll_responses_time {
|
||||
my ($self) = @_;
|
||||
return $self->{last_poll_responses_time};
|
||||
}
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
354
upstream/lib/Mail/SpamAssassin/AutoWhitelist.pm
Normal file
354
upstream/lib/Mail/SpamAssassin/AutoWhitelist.pm
Normal file
@ -0,0 +1,354 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::AutoWhitelist - auto-whitelist handler for SpamAssassin
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
(see Mail::SpamAssassin)
|
||||
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using text analysis and
|
||||
several internet-based realtime blacklists.
|
||||
|
||||
This class is used internally by SpamAssassin to manage the automatic
|
||||
whitelisting functionality. Please refer to the C<Mail::SpamAssassin>
|
||||
documentation for public interfaces.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::AutoWhitelist;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use NetAddr::IP 4.000;
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my ($main, $msg) = @_;
|
||||
|
||||
my $conf = $main->{conf};
|
||||
my $self = {
|
||||
main => $main,
|
||||
factor => $conf->{auto_whitelist_factor},
|
||||
ipv4_mask_len => $conf->{auto_whitelist_ipv4_mask_len},
|
||||
ipv6_mask_len => $conf->{auto_whitelist_ipv6_mask_len},
|
||||
};
|
||||
|
||||
my $factory;
|
||||
if ($main->{pers_addr_list_factory}) {
|
||||
$factory = $main->{pers_addr_list_factory};
|
||||
}
|
||||
else {
|
||||
my $type = $conf->{auto_whitelist_factory};
|
||||
if ($type =~ /^([_A-Za-z0-9:]+)$/) {
|
||||
$type = untaint_var($type);
|
||||
eval '
|
||||
require '.$type.';
|
||||
$factory = '.$type.'->new();
|
||||
1;
|
||||
'
|
||||
or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn "auto-whitelist: $eval_stat\n";
|
||||
undef $factory;
|
||||
};
|
||||
$main->set_persistent_address_list_factory($factory) if $factory;
|
||||
}
|
||||
else {
|
||||
warn "auto-whitelist: illegal auto_whitelist_factory setting\n";
|
||||
}
|
||||
}
|
||||
|
||||
if (!defined $factory) {
|
||||
$self->{checker} = undef;
|
||||
} else {
|
||||
$self->{checker} = $factory->new_checker($self->{main});
|
||||
}
|
||||
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $meanscore = awl->check_address($addr, $originating_ip, $signedby);
|
||||
|
||||
This method will return the mean score of all messages associated with the
|
||||
given address, or undef if the address hasn't been seen before.
|
||||
|
||||
If B<$originating_ip> is supplied, it will be used in the lookup.
|
||||
|
||||
=cut
|
||||
|
||||
sub check_address {
|
||||
my ($self, $addr, $origip, $signedby) = @_;
|
||||
|
||||
if (!defined $self->{checker}) {
|
||||
return; # no factory defined; we can't check
|
||||
}
|
||||
|
||||
$self->{entry} = undef;
|
||||
|
||||
my $fulladdr = $self->pack_addr ($addr, $origip);
|
||||
my $entry = $self->{checker}->get_addr_entry ($fulladdr, $signedby);
|
||||
$self->{entry} = $entry;
|
||||
|
||||
if (!$entry->{msgcount}) {
|
||||
# no entry found
|
||||
if (defined $origip) {
|
||||
# try upgrading a default entry (probably from "add-addr-to-foo")
|
||||
my $noipaddr = $self->pack_addr ($addr, undef);
|
||||
my $noipent = $self->{checker}->get_addr_entry ($noipaddr, undef);
|
||||
|
||||
if (defined $noipent->{msgcount} && $noipent->{msgcount} > 0) {
|
||||
dbg("auto-whitelist: found entry w/o IP address for $addr: replacing with $origip");
|
||||
$self->{checker}->remove_entry($noipent);
|
||||
# Now assign proper entry the count and totscore values of the
|
||||
# no-IP entry instead of assigning the whole value to avoid
|
||||
# wiping out any information added to the previous entry.
|
||||
$entry->{msgcount} = $noipent->{msgcount};
|
||||
$entry->{totscore} = $noipent->{totscore};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if ($entry->{msgcount} < 0 ||
|
||||
$entry->{msgcount} != $entry->{msgcount} || # test for NaN
|
||||
$entry->{totscore} != $entry->{totscore})
|
||||
{
|
||||
warn "auto-whitelist: resetting bad data for ($addr, $origip), ".
|
||||
"count: $entry->{msgcount}, totscore: $entry->{totscore}\n";
|
||||
$entry->{msgcount} = $entry->{totscore} = 0;
|
||||
}
|
||||
|
||||
return !$entry->{msgcount} ? undef : $entry->{totscore} / $entry->{msgcount};
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item awl->count();
|
||||
|
||||
This method will return the count of messages used in determining the
|
||||
whitelist correction.
|
||||
|
||||
=cut
|
||||
|
||||
sub count {
|
||||
my $self = shift;
|
||||
return $self->{entry}->{msgcount};
|
||||
}
|
||||
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item awl->add_score($score);
|
||||
|
||||
This method will add half the score to the current entry. Half the
|
||||
score is used, so that repeated use of the same From and IP address
|
||||
combination will gradually reduce the score.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_score {
|
||||
my ($self,$score) = @_;
|
||||
|
||||
if (!defined $self->{checker}) {
|
||||
return; # no factory defined; we can't check
|
||||
}
|
||||
if ($score != $score) {
|
||||
warn "auto-whitelist: attempt to add a $score to AWL entry ignored\n";
|
||||
return; # don't try to add a NaN
|
||||
}
|
||||
|
||||
$self->{entry}->{msgcount} ||= 0;
|
||||
$self->{checker}->add_score($self->{entry}, $score);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item awl->add_known_good_address($addr);
|
||||
|
||||
This method will add a score of -100 to the given address -- effectively
|
||||
"bootstrapping" the address as being one that should be whitelisted.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_known_good_address {
|
||||
my ($self, $addr, $signedby) = @_;
|
||||
|
||||
return $self->modify_address($addr, -100, $signedby);
|
||||
}
|
||||
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item awl->add_known_bad_address($addr);
|
||||
|
||||
This method will add a score of 100 to the given address -- effectively
|
||||
"bootstrapping" the address as being one that should be blacklisted.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_known_bad_address {
|
||||
my ($self, $addr, $signedby) = @_;
|
||||
|
||||
return $self->modify_address($addr, 100, $signedby);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub remove_address {
|
||||
my ($self, $addr, $signedby) = @_;
|
||||
|
||||
return $self->modify_address($addr, undef, $signedby);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub modify_address {
|
||||
my ($self, $addr, $score, $signedby) = @_;
|
||||
|
||||
if (!defined $self->{checker}) {
|
||||
return; # no factory defined; we can't check
|
||||
}
|
||||
|
||||
my $fulladdr = $self->pack_addr ($addr, undef);
|
||||
my $entry = $self->{checker}->get_addr_entry ($fulladdr, $signedby);
|
||||
|
||||
# remove any old entries (will remove per-ip entries as well)
|
||||
# always call this regardless, as the current entry may have 0
|
||||
# scores, but the per-ip one may have more
|
||||
$self->{checker}->remove_entry($entry);
|
||||
|
||||
# remove address only, no new score to add
|
||||
if (!defined $score) { return 1; }
|
||||
if ($score != $score) { return 1; } # don't try to add a NaN
|
||||
|
||||
# else add score. get a new entry first
|
||||
$entry = $self->{checker}->get_addr_entry ($fulladdr, $signedby);
|
||||
$self->{checker}->add_score($entry, $score);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub finish {
|
||||
my $self = shift;
|
||||
|
||||
return if !defined $self->{checker};
|
||||
$self->{checker}->finish();
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub ip_to_awl_key {
|
||||
my ($self, $origip) = @_;
|
||||
|
||||
my $result;
|
||||
local $1;
|
||||
if (!defined $origip) {
|
||||
# could not find an IP address to use
|
||||
} elsif ($origip =~ /^ (\d{1,3} \. \d{1,3}) \. \d{1,3} \. \d{1,3} $/xs) {
|
||||
my $mask_len = $self->{ipv4_mask_len};
|
||||
$mask_len = 16 if !defined $mask_len;
|
||||
# handle the default and easy cases manually
|
||||
if ($mask_len == 32) {
|
||||
$result = $origip;
|
||||
} elsif ($mask_len == 16) {
|
||||
$result = $1;
|
||||
} else {
|
||||
my $origip_obj = NetAddr::IP->new($origip . '/' . $mask_len);
|
||||
if (!defined $origip_obj) { # invalid IPv4 address
|
||||
dbg("auto-whitelist: bad IPv4 address $origip");
|
||||
} else {
|
||||
$result = $origip_obj->network->addr;
|
||||
$result =~s/(\.0){1,3}\z//; # truncate zero tail
|
||||
}
|
||||
}
|
||||
} elsif ($origip =~ /:/ && # triage
|
||||
$origip =~
|
||||
/^ [0-9a-f]{0,4} (?: : [0-9a-f]{0,4} | \. [0-9]{1,3} ){2,9} $/xsi) {
|
||||
# looks like an IPv6 address
|
||||
my $mask_len = $self->{ipv6_mask_len};
|
||||
$mask_len = 48 if !defined $mask_len;
|
||||
my $origip_obj = NetAddr::IP->new6($origip . '/' . $mask_len);
|
||||
if (!defined $origip_obj) { # invalid IPv6 address
|
||||
dbg("auto-whitelist: bad IPv6 address $origip");
|
||||
} elsif (NetAddr::IP->can('full6')) { # since NetAddr::IP 4.010
|
||||
$result = $origip_obj->network->full6; # string in a canonical form
|
||||
$result =~ s/(:0000){1,7}\z/::/; # compress zero tail
|
||||
}
|
||||
} else {
|
||||
dbg("auto-whitelist: bad IP address $origip");
|
||||
}
|
||||
if (defined $result && length($result) > 39) { # just in case, keep under
|
||||
$result = substr($result,0,39); # the awl.ip field size
|
||||
}
|
||||
if (defined $result) {
|
||||
dbg("auto-whitelist: IP masking %s -> %s", $origip,$result);
|
||||
}
|
||||
return $result;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub pack_addr {
|
||||
my ($self, $addr, $origip) = @_;
|
||||
|
||||
$addr = lc $addr;
|
||||
$addr =~ s/[\000\;\'\"\!\|]/_/gs; # paranoia
|
||||
|
||||
if (defined $origip) {
|
||||
$origip = $self->ip_to_awl_key($origip);
|
||||
}
|
||||
if (!defined $origip) {
|
||||
# could not find an IP address to use, could be localhost mail
|
||||
# or from the user running "add-addr-to-*".
|
||||
$origip = 'none';
|
||||
}
|
||||
return $addr . "|ip=" . $origip;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
165
upstream/lib/Mail/SpamAssassin/Bayes.pm
Normal file
165
upstream/lib/Mail/SpamAssassin/Bayes.pm
Normal file
@ -0,0 +1,165 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Bayes - support for learning classifiers
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This is the general class used to train a learning classifier with new samples
|
||||
of spam and ham mail, and classify based on prior training.
|
||||
|
||||
Prior to version 3.3.0, the default Bayes implementation was here; if you're
|
||||
looking for information on that, it has moved to
|
||||
C<Mail::SpamAssassin::Plugin::Bayes>.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Bayes;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my ($main) = @_;
|
||||
my $self = {
|
||||
'main' => $main,
|
||||
'conf' => $main->{conf},
|
||||
'use_ignores' => 1,
|
||||
};
|
||||
bless ($self, $class);
|
||||
|
||||
$self->{main}->call_plugins("learner_new");
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub finish {
|
||||
my $self = shift;
|
||||
# we don't need to do the plugin; Mail::SpamAssassin::finish() does
|
||||
# that for us
|
||||
%{$self} = ();
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
# force the Bayes dbs to be closed, if they haven't already been; called
|
||||
# at the end of scan operation, or when switching between user IDs,
|
||||
# or when C<Mail::SpamAssassin::finish_learner()> is called.
|
||||
#
|
||||
sub force_close {
|
||||
my $self = shift;
|
||||
my $quiet = shift;
|
||||
$self->{main}->call_plugins("learner_close", { quiet => $quiet });
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub ignore_message {
|
||||
my ($self,$PMS) = @_;
|
||||
|
||||
return 0 unless $self->{use_ignores};
|
||||
|
||||
my $ig_from = $self->{main}->call_plugins ("check_wb_list",
|
||||
{ permsgstatus => $PMS, type => 'from', list => 'bayes_ignore_from' });
|
||||
my $ig_to = $self->{main}->call_plugins ("check_wb_list",
|
||||
{ permsgstatus => $PMS, type => 'to', list => 'bayes_ignore_to' });
|
||||
|
||||
my $ignore = $ig_from || $ig_to;
|
||||
dbg("bayes: not using bayes, bayes_ignore_from or _to rule") if $ignore;
|
||||
return $ignore;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub learn {
|
||||
my ($self, $isspam, $msg, $id) = @_;
|
||||
return unless $self->{conf}->{use_learner};
|
||||
return unless defined $msg;
|
||||
|
||||
if( $self->{use_ignores} ) # Remove test when PerMsgStatus available.
|
||||
{
|
||||
# DMK, koppel@ece.lsu.edu: Hoping that the ultimate fix to bug 2263 will
|
||||
# make it unnecessary to construct a PerMsgStatus here.
|
||||
my $PMS = new Mail::SpamAssassin::PerMsgStatus $self->{main}, $msg;
|
||||
my $ignore = $self->ignore_message($PMS);
|
||||
$PMS->finish();
|
||||
return 0 if $ignore;
|
||||
}
|
||||
|
||||
return $self->{main}->call_plugins("learn_message", { isspam => $isspam, msg => $msg, id => $id });
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub forget {
|
||||
my ($self, $msg, $id) = @_;
|
||||
return unless $self->{conf}->{use_learner};
|
||||
return unless defined $msg;
|
||||
return $self->{main}->call_plugins("forget_message", { msg => $msg, id => $id });
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub sync {
|
||||
my ($self, $sync, $expire, $opts) = @_;
|
||||
return 0 unless $self->{conf}->{use_learner};
|
||||
|
||||
if ($sync) {
|
||||
$self->{main}->call_plugins("learner_sync", $opts );
|
||||
}
|
||||
if ($expire) {
|
||||
$self->{main}->call_plugins("learner_expire_old_training", $opts );
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub is_scan_available {
|
||||
my $self = shift;
|
||||
return 0 unless $self->{conf}->{use_learner};
|
||||
return $self->{main}->call_plugins("learner_is_scan_available");
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub dump_bayes_db {
|
||||
my($self, $magic, $toks, $regex) = @_;
|
||||
return 0 unless $self->{conf}->{use_learner};
|
||||
return $self->{main}->call_plugins("learner_dump_database", {
|
||||
magic => $magic, toks => $toks, regex => $regex });
|
||||
}
|
||||
|
||||
1;
|
124
upstream/lib/Mail/SpamAssassin/Bayes/CombineChi.pm
Normal file
124
upstream/lib/Mail/SpamAssassin/Bayes/CombineChi.pm
Normal file
@ -0,0 +1,124 @@
|
||||
# Chi-square probability combining and related constants.
|
||||
#
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
use strict; # make Test::Perl::Critic happy
|
||||
|
||||
# this package is a no-op; the real impl code is in another pkg.
|
||||
package Mail::SpamAssassin::Bayes::CombineChi; 1;
|
||||
|
||||
# Force into another package, so our symbols will appear in that namespace with
|
||||
# no indirection, for speed. Other combiners must do the same, since Bayes.pm
|
||||
# uses this namespace directly. This means only one combiner can be loaded at
|
||||
# any time.
|
||||
package Mail::SpamAssassin::Bayes::Combine;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use POSIX qw(frexp);
|
||||
use constant LN2 => log(2);
|
||||
|
||||
# Value for 'x' in Gary Robinson's f(w) equation.
|
||||
# "Let x = the number used when n [hits] is 0."
|
||||
our $FW_X_CONSTANT = 0.538;
|
||||
|
||||
# Value for 's' in the f(w) equation. "We can see s as the "strength" (hence
|
||||
# the use of "s") of an original assumed expectation ... relative to how
|
||||
# strongly we want to consider our actual collected data." Low 's' means
|
||||
# trust collected data more strongly.
|
||||
our $FW_S_CONSTANT = 0.030;
|
||||
|
||||
# (s . x) for the f(w) equation.
|
||||
our $FW_S_DOT_X = ($FW_X_CONSTANT * $FW_S_CONSTANT);
|
||||
|
||||
# Should we ignore tokens with probs very close to the middle ground (.5)?
|
||||
# tokens need to be outside the [ .5-MPS, .5+MPS ] range to be used.
|
||||
our $MIN_PROB_STRENGTH = 0.346;
|
||||
|
||||
###########################################################################
|
||||
|
||||
# Chi-Squared method. Produces mostly boolean $result,
|
||||
# but with a grey area.
|
||||
sub combine {
|
||||
my ($ns, $nn, $sortedref) = @_;
|
||||
|
||||
# @$sortedref contains an array of the probabilities
|
||||
my $wc = scalar @$sortedref;
|
||||
return unless $wc;
|
||||
|
||||
my ($H, $S);
|
||||
my ($Hexp, $Sexp);
|
||||
$Hexp = $Sexp = 0;
|
||||
|
||||
# see bug 3118
|
||||
my $totmsgs = ($ns + $nn);
|
||||
if ($totmsgs == 0) { return; }
|
||||
$S = ($ns / $totmsgs);
|
||||
$H = ($nn / $totmsgs);
|
||||
|
||||
foreach my $prob (@$sortedref) {
|
||||
$S *= 1.0 - $prob;
|
||||
$H *= $prob;
|
||||
if ($S < 1e-200) {
|
||||
my $e;
|
||||
($S, $e) = frexp($S);
|
||||
$Sexp += $e;
|
||||
}
|
||||
if ($H < 1e-200) {
|
||||
my $e;
|
||||
($H, $e) = frexp($H);
|
||||
$Hexp += $e;
|
||||
}
|
||||
}
|
||||
|
||||
$S = log($S) + $Sexp * LN2;
|
||||
$H = log($H) + $Hexp * LN2;
|
||||
|
||||
# note: previous versions used (2 * $wc) as second arg ($v), but the chi2q()
|
||||
# fn then just used ($v/2) internally! changed to simply supply $wc as
|
||||
# ($halfv) directly instead to avoid redundant doubling and halving. The
|
||||
# side-effect is that chi2q() uses a different API now, but it's only used
|
||||
# here anyway.
|
||||
|
||||
$S = 1.0 - chi2q(-2.0 * $S, $wc);
|
||||
$H = 1.0 - chi2q(-2.0 * $H, $wc);
|
||||
return (($S - $H) + 1.0) / 2.0;
|
||||
}
|
||||
|
||||
# Chi-squared function (API changed; see comment above)
|
||||
sub chi2q {
|
||||
my ($x2, $halfv) = @_;
|
||||
|
||||
my $m = $x2 / 2.0;
|
||||
my ($sum, $term);
|
||||
$sum = $term = exp(0 - $m);
|
||||
|
||||
# replace 'for my $i (1 .. (($v/2)-1))' idiom, which creates a temp
|
||||
# array, with a plain C-style for loop
|
||||
my $i;
|
||||
for ($i = 1; $i < $halfv; $i++) {
|
||||
$term *= $m / $i;
|
||||
$sum += $term;
|
||||
}
|
||||
return $sum < 1.0 ? $sum : 1.0;
|
||||
}
|
||||
|
||||
1;
|
77
upstream/lib/Mail/SpamAssassin/Bayes/CombineNaiveBayes.pm
Normal file
77
upstream/lib/Mail/SpamAssassin/Bayes/CombineNaiveBayes.pm
Normal file
@ -0,0 +1,77 @@
|
||||
# Naive-Bayesian-style probability combining and related constants.
|
||||
#
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
use strict; # make Test::Perl::Critic happy
|
||||
|
||||
# this package is a no-op; the real impl code is in another pkg.
|
||||
package Mail::SpamAssassin::Bayes::CombineNaiveBayes; 1;
|
||||
|
||||
# Force into another package, so our symbols will appear in that namespace with
|
||||
# no indirection, for speed. Other combiners must do the same, since Bayes.pm
|
||||
# uses this namespace directly. This means only one combiner can be loaded at
|
||||
# any time.
|
||||
package Mail::SpamAssassin::Bayes::Combine;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
###########################################################################
|
||||
|
||||
# Value for 'x' in Gary Robinson's f(w) equation.
|
||||
# "Let x = the number used when n [hits] is 0."
|
||||
our $FW_X_CONSTANT = 0.600;
|
||||
|
||||
# Value for 's' in the f(w) equation. "We can see s as the "strength" (hence
|
||||
# the use of "s") of an original assumed expectation ... relative to how
|
||||
# strongly we want to consider our actual collected data." Low 's' means
|
||||
# trust collected data more strongly.
|
||||
our $FW_S_CONSTANT = 0.160;
|
||||
|
||||
# (s . x) for the f(w) equation.
|
||||
our $FW_S_DOT_X = ($FW_X_CONSTANT * $FW_S_CONSTANT);
|
||||
|
||||
# Should we ignore tokens with probs very close to the middle ground (.5)?
|
||||
# tokens need to be outside the [ .5-MPS, .5+MPS ] range to be used.
|
||||
our $MIN_PROB_STRENGTH = 0.430;
|
||||
|
||||
###########################################################################
|
||||
|
||||
# Combine probabilities using Gary Robinson's naive-Bayesian-style
|
||||
# combiner
|
||||
sub combine {
|
||||
my ($ns, $nn, $sortedref) = @_;
|
||||
|
||||
my $wc = scalar @$sortedref;
|
||||
return unless $wc;
|
||||
|
||||
my $P = 1;
|
||||
my $Q = 1;
|
||||
|
||||
foreach my $pw (@$sortedref) {
|
||||
$P *= (1-$pw);
|
||||
$Q *= $pw;
|
||||
}
|
||||
$P = 1 - ($P ** (1 / $wc));
|
||||
$Q = 1 - ($Q ** (1 / $wc));
|
||||
return (1 + ($P - $Q) / ($P + $Q)) / 2.0;
|
||||
}
|
||||
|
||||
1;
|
921
upstream/lib/Mail/SpamAssassin/BayesStore.pm
Normal file
921
upstream/lib/Mail/SpamAssassin/BayesStore.pm
Normal file
@ -0,0 +1,921 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::BayesStore - Storage Module for default Bayes classifier
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This is the public API for the Bayesian store methods. Any implementation of
|
||||
the storage module for the default Bayes classifier must implement these methods.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::BayesStore;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
# TODO: if we ever get tuits, it'd be good to make these POD
|
||||
# method docs more perlish... hardly a biggie.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=item new
|
||||
|
||||
public class (Mail::SpamAssassin::BayesStore) new (Mail::SpamAssassin::Plugin::Bayes $bayes)
|
||||
|
||||
Description:
|
||||
This method creates a new instance of the Mail::SpamAssassin::BayesStore
|
||||
object. You must pass in an instance of the Mail::SpamAssassin::Plugin::Bayes
|
||||
object, which is stashed for use throughout the module.
|
||||
|
||||
=cut
|
||||
|
||||
sub new {
|
||||
my ($class, $bayes) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my $self = {
|
||||
'bayes' => $bayes,
|
||||
'supported_db_version' => 0,
|
||||
'db_version' => undef,
|
||||
};
|
||||
|
||||
bless ($self, $class);
|
||||
|
||||
$self;
|
||||
}
|
||||
|
||||
=item DB_VERSION
|
||||
|
||||
public instance (Integer) DB_VERSION ()
|
||||
|
||||
Description:
|
||||
This method returns the currently supported database version for the
|
||||
implementation.
|
||||
|
||||
=cut
|
||||
|
||||
sub DB_VERSION {
|
||||
my ($self) = @_;
|
||||
return $self->{supported_db_version};
|
||||
}
|
||||
|
||||
=item read_db_configs
|
||||
|
||||
public instance () read_db_configs ()
|
||||
|
||||
Description:
|
||||
This method reads any needed config variables from the configuration object
|
||||
and then calls the Mail::SpamAssassin::Plugin::Bayes read_db_configs method.
|
||||
|
||||
=cut
|
||||
|
||||
sub read_db_configs {
|
||||
my ($self) = @_;
|
||||
|
||||
# TODO: at some stage, this may be useful to read config items which
|
||||
# control database bloat, like
|
||||
#
|
||||
# - use of hapaxes
|
||||
# - use of case-sensitivity
|
||||
# - more midrange-hapax-avoidance tactics when parsing headers (future)
|
||||
#
|
||||
# for now, we just set these settings statically.
|
||||
my $conf = $self->{bayes}->{main}->{conf};
|
||||
|
||||
# Minimum desired database size? Expiry will not shrink the
|
||||
# database below this number of entries. 100k entries is roughly
|
||||
# equivalent to a 5Mb database file.
|
||||
$self->{expiry_max_db_size} = $conf->{bayes_expiry_max_db_size};
|
||||
$self->{expiry_pct} = $conf->{bayes_expiry_pct};
|
||||
$self->{expiry_period} = $conf->{bayes_expiry_period};
|
||||
$self->{expiry_max_exponent} = $conf->{bayes_expiry_max_exponent};
|
||||
|
||||
$self->{bayes}->read_db_configs();
|
||||
}
|
||||
|
||||
=item prefork_init
|
||||
|
||||
public instance (Boolean) prefork_init ()
|
||||
|
||||
Description:
|
||||
This optional method is called in the parent process shortly before
|
||||
forking off child processes.
|
||||
|
||||
=cut
|
||||
|
||||
# sub prefork_init {
|
||||
# my ($self) = @_;
|
||||
# }
|
||||
|
||||
=item spamd_child_init
|
||||
|
||||
public instance (Boolean) spamd_child_init ()
|
||||
|
||||
Description:
|
||||
This optional method is called in a child process shortly after being spawned.
|
||||
|
||||
=cut
|
||||
|
||||
# sub spamd_child_init {
|
||||
# my ($self) = @_;
|
||||
# }
|
||||
|
||||
=item tie_db_readonly
|
||||
|
||||
public instance (Boolean) tie_db_readonly ()
|
||||
|
||||
Description:
|
||||
This method opens up the database in readonly mode.
|
||||
|
||||
=cut
|
||||
|
||||
sub tie_db_readonly {
|
||||
my ($self) = @_;
|
||||
die "bayes: tie_db_readonly: not implemented\n";
|
||||
}
|
||||
|
||||
=item tie_db_writable
|
||||
|
||||
public instance (Boolean) tie_db_writable ()
|
||||
|
||||
Description:
|
||||
This method opens up the database in writable mode.
|
||||
|
||||
Any callers of this methods should ensure that they call untie_db()
|
||||
afterwards.
|
||||
|
||||
=cut
|
||||
|
||||
sub tie_db_writable {
|
||||
my ($self) = @_;
|
||||
die "bayes: tie_db_writable: not implemented\n";
|
||||
}
|
||||
|
||||
=item untie_db
|
||||
|
||||
public instance () untie_db ()
|
||||
|
||||
Description:
|
||||
This method unties the database.
|
||||
|
||||
=cut
|
||||
|
||||
sub untie_db {
|
||||
my $self = shift;
|
||||
die "bayes: untie_db: not implemented\n";
|
||||
}
|
||||
|
||||
=item calculate_expire_delta
|
||||
|
||||
public instance (%) calculate_expire_delta (Integer $newest_atime,
|
||||
Integer $start,
|
||||
Integer $max_expire_mult)
|
||||
|
||||
Description:
|
||||
This method performs a calculation on the data to determine the optimum
|
||||
atime for token expiration.
|
||||
|
||||
=cut
|
||||
|
||||
sub calculate_expire_delta {
|
||||
my ($self, $newest_atime, $start, $max_expire_mult) = @_;
|
||||
die "bayes: calculate_expire_delta: not implemented\n";
|
||||
}
|
||||
|
||||
=item token_expiration
|
||||
|
||||
public instance (Integer, Integer,
|
||||
Integer, Integer) token_expiration(\% $opts,
|
||||
Integer $newest_atime,
|
||||
Integer $newdelta)
|
||||
|
||||
Description:
|
||||
This method performs the database specific expiration of tokens based on
|
||||
the passed in C<$newest_atime> and C<$newdelta>.
|
||||
|
||||
=cut
|
||||
|
||||
sub token_expiration {
|
||||
my ($self, $opts, $newest_atime, $newdelta) = @_;
|
||||
die "bayes: token_expiration: not implemented\n";
|
||||
}
|
||||
|
||||
=item expire_old_tokens
|
||||
|
||||
public instance (Boolean) expire_old_tokens (\% hashref)
|
||||
|
||||
Description:
|
||||
This method expires old tokens from the database.
|
||||
|
||||
=cut
|
||||
|
||||
sub expire_old_tokens {
|
||||
my ($self, $opts) = @_;
|
||||
my $ret;
|
||||
|
||||
my $eval_stat;
|
||||
eval {
|
||||
local $SIG{'__DIE__'}; # do not run user die() traps in here
|
||||
if ($self->tie_db_writable()) {
|
||||
$ret = $self->expire_old_tokens_trapped ($opts);
|
||||
}
|
||||
1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
};
|
||||
|
||||
if (!$self->{bayes}->{main}->{learn_caller_will_untie}) {
|
||||
$self->untie_db();
|
||||
}
|
||||
|
||||
if (defined $eval_stat) { # if we died, untie the dbs.
|
||||
warn "bayes: expire_old_tokens: $eval_stat\n";
|
||||
return 0;
|
||||
}
|
||||
$ret;
|
||||
}
|
||||
|
||||
=item expire_old_tokens_trapped
|
||||
|
||||
public instance (Boolean) expire_old_tokens_trapped (\% $opts)
|
||||
|
||||
Description:
|
||||
This methods does the actual token expiration.
|
||||
|
||||
XXX More docs here about the methodology and what not
|
||||
|
||||
=cut
|
||||
|
||||
sub expire_old_tokens_trapped {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
# Flag that we're doing work
|
||||
$self->set_running_expire_tok();
|
||||
|
||||
# We don't need to do an expire, so why were we called? Oh well.
|
||||
if (!$self->expiry_due()) {
|
||||
$self->remove_running_expire_tok();
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $started = time();
|
||||
my @vars = $self->get_storage_variables();
|
||||
|
||||
if ( $vars[10] > time ) {
|
||||
dbg("bayes: expiry found newest atime in the future, resetting to current time");
|
||||
$vars[10] = time;
|
||||
}
|
||||
|
||||
# How many tokens do we want to keep?
|
||||
my $goal_reduction = int($self->{expiry_max_db_size} * $self->{expiry_pct});
|
||||
dbg("bayes: expiry check keep size, ".$self->{expiry_pct}." * max: $goal_reduction");
|
||||
# Make sure we keep at least 100000 tokens in the DB
|
||||
if ( $goal_reduction < 100000 ) {
|
||||
$goal_reduction = 100000;
|
||||
dbg("bayes: expiry keep size too small, resetting to 100,000 tokens");
|
||||
}
|
||||
# Now turn goal_reduction into how many to expire.
|
||||
$goal_reduction = $vars[3] - $goal_reduction;
|
||||
dbg("bayes: token count: ".$vars[3].", final goal reduction size: $goal_reduction");
|
||||
|
||||
if ( $goal_reduction < 1000 ) { # too few tokens to expire, abort.
|
||||
dbg("bayes: reduction goal of $goal_reduction is under 1,000 tokens, skipping expire");
|
||||
$self->set_last_expire(time());
|
||||
$self->remove_running_expire_tok(); # this won't be cleaned up, so do it now.
|
||||
return 1; # we want to indicate things ran as expected
|
||||
}
|
||||
|
||||
# Estimate new atime delta based on the last atime delta
|
||||
my $newdelta = 0;
|
||||
if ( $vars[9] > 0 ) {
|
||||
# newdelta = olddelta * old / goal;
|
||||
# this may seem backwards, but since we're talking delta here,
|
||||
# not actual atime, we want smaller atimes to expire more tokens,
|
||||
# and visa versa.
|
||||
#
|
||||
$newdelta = int($vars[8] * $vars[9] / $goal_reduction);
|
||||
}
|
||||
|
||||
# Calculate size difference between last expiration token removal
|
||||
# count and the current goal removal count.
|
||||
my $ratio = ($vars[9] == 0 || $vars[9] > $goal_reduction) ? $vars[9]/$goal_reduction : $goal_reduction/$vars[9];
|
||||
|
||||
dbg("bayes: first pass? current: ".time().", Last: ".$vars[4].", atime: ".$vars[8].", count: ".$vars[9].", newdelta: $newdelta, ratio: $ratio, period: ".$self->{expiry_period});
|
||||
|
||||
## ESTIMATION PHASE
|
||||
#
|
||||
# Do this for the first expire or "odd" looking results cause a first pass to determine atime:
|
||||
#
|
||||
# - last expire was more than 30 days ago
|
||||
# assume mail flow stays roughly the same month to month, recompute if it's > 1 month
|
||||
# - last atime delta was under expiry period
|
||||
# if we're expiring often max_db_size should go up, but let's recompute just to check
|
||||
# - last reduction count was < 1000 tokens
|
||||
# ditto
|
||||
# - new estimated atime delta is under expiry period
|
||||
# ditto
|
||||
# - difference of last reduction to current goal reduction is > 50%
|
||||
# if the two values are out of balance, estimating atime is going to be funky, recompute
|
||||
#
|
||||
if ( (time() - $vars[4] > 86400*30) || ($vars[8] < $self->{expiry_period}) || ($vars[9] < 1000)
|
||||
|| ($newdelta < $self->{expiry_period}) || ($ratio > 1.5) ) {
|
||||
dbg("bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)");
|
||||
|
||||
my $start = $self->{expiry_period}; # exponential search starting at ...? 1/2 day, 1, 2, 4, 8, 16, ...
|
||||
my $max_expire_mult = 2**$self->{expiry_max_exponent}; # $max_expire_mult * $start = max expire time (256 days), power of 2.
|
||||
|
||||
dbg("bayes: expiry max exponent: ".$self->{expiry_max_exponent});
|
||||
|
||||
my %delta = $self->calculate_expire_delta($vars[10], $start, $max_expire_mult);
|
||||
|
||||
return 0 unless (%delta);
|
||||
|
||||
# This will skip the for loop if debugging isn't enabled ...
|
||||
if (would_log('dbg', 'bayes')) {
|
||||
dbg("bayes: atime\ttoken reduction");
|
||||
dbg("bayes: ========\t===============");
|
||||
for(my $i = 1; $i<=$max_expire_mult; $i <<= 1) {
|
||||
dbg("bayes: ".$start*$i."\t".(exists $delta{$i} ? $delta{$i} : 0));
|
||||
}
|
||||
}
|
||||
|
||||
# Now figure out which max_expire_mult value gives the closest results to goal_reduction, without
|
||||
# going over ... Go from the largest delta backwards so the reduction size increases
|
||||
# (tokens that expire at 4 also expire at 3, 2, and 1, so 1 will always be the largest expiry...)
|
||||
#
|
||||
for( ; $max_expire_mult > 0; $max_expire_mult>>=1 ) {
|
||||
next unless exists $delta{$max_expire_mult};
|
||||
if ($delta{$max_expire_mult} > $goal_reduction) {
|
||||
$max_expire_mult<<=1; # the max expire is actually the next power of 2 out
|
||||
last;
|
||||
}
|
||||
}
|
||||
|
||||
# if max_expire_mult gets to 0, either we can't expire anything, or 1 is <= $goal_reduction
|
||||
$max_expire_mult ||= 1;
|
||||
|
||||
# $max_expire_mult is now equal to the value we should use ...
|
||||
# Check to see if the atime value we found is really good.
|
||||
# It's not good if:
|
||||
# - $max_expire_mult would not expire any tokens. This means that the majority of
|
||||
# tokens are old or new, and more activity is required before an expiry can occur.
|
||||
# - reduction count < 1000, not enough tokens to be worth doing an expire.
|
||||
#
|
||||
if ( !exists $delta{$max_expire_mult} || $delta{$max_expire_mult} < 1000 ) {
|
||||
dbg("bayes: couldn't find a good delta atime, need more token difference, skipping expire");
|
||||
$self->set_last_expire(time());
|
||||
$self->remove_running_expire_tok(); # this won't be cleaned up, so do it now.
|
||||
return 1; # we want to indicate things ran as expected
|
||||
}
|
||||
|
||||
$newdelta = $start * $max_expire_mult;
|
||||
dbg("bayes: first pass decided on $newdelta for atime delta");
|
||||
}
|
||||
else { # use the estimation method
|
||||
dbg("bayes: can do estimation method for expiry, skipping first pass");
|
||||
}
|
||||
|
||||
my ($kept, $deleted, $num_hapaxes, $num_lowfreq) = $self->token_expiration($opts, $newdelta, @vars);
|
||||
|
||||
my $done = time();
|
||||
|
||||
my $msg = "expired old bayes database entries in ".($done - $started)." seconds";
|
||||
my $msg2 = "$kept entries kept, $deleted deleted";
|
||||
|
||||
if ($opts->{verbose}) {
|
||||
my $hapax_pc = ($num_hapaxes * 100) / $kept;
|
||||
my $lowfreq_pc = ($num_lowfreq * 100) / $kept;
|
||||
print "$msg\n$msg2\n" or die "Error writing: $!";
|
||||
printf "token frequency: 1-occurrence tokens: %3.2f%%\n", $hapax_pc
|
||||
or die "Error writing: $!";
|
||||
printf "token frequency: less than 8 occurrences: %3.2f%%\n", $lowfreq_pc
|
||||
or die "Error writing: $!";
|
||||
}
|
||||
else {
|
||||
dbg("bayes: $msg: $msg2");
|
||||
}
|
||||
|
||||
$self->remove_running_expire_tok();
|
||||
return 1;
|
||||
}
|
||||
|
||||
=item sync_due
|
||||
|
||||
public instance (Boolean) sync_due ()
|
||||
|
||||
Description:
|
||||
This methods determines if a sync is due.
|
||||
|
||||
=cut
|
||||
|
||||
sub sync_due {
|
||||
my ($self) = @_;
|
||||
die "bayes: sync_due: not implemented\n";
|
||||
}
|
||||
|
||||
=item expiry_due
|
||||
|
||||
public instance (Boolean) expiry_due ()
|
||||
|
||||
Description:
|
||||
This methods determines if an expire is due.
|
||||
|
||||
=cut
|
||||
|
||||
sub expiry_due {
|
||||
my ($self) = @_;
|
||||
|
||||
$self->read_db_configs(); # make sure this has happened here
|
||||
|
||||
# If force expire was called, do the expire no matter what.
|
||||
return 1 if ($self->{bayes}->{main}->{learn_force_expire});
|
||||
|
||||
# if config says not to auto expire then no need to continue
|
||||
return 0 if ($self->{bayes}->{main}->{conf}->{bayes_auto_expire} == 0);
|
||||
|
||||
# is the database too small for expiry? (Do *not* use "scalar keys",
|
||||
# as this will iterate through the entire db counting them!)
|
||||
my @vars = $self->get_storage_variables();
|
||||
my $ntoks = $vars[3];
|
||||
|
||||
my $last_expire = time() - $vars[4];
|
||||
if (!$self->{bayes}->{main}->{ignore_safety_expire_timeout}) {
|
||||
# if we're not ignoring the safety timeout, don't run an expire more
|
||||
# than once every 12 hours.
|
||||
return 0 if ($last_expire < 43200);
|
||||
}
|
||||
else {
|
||||
# if we are ignoring the safety timeout (e.g.: mass-check), still
|
||||
# limit the expiry to only one every 5 minutes.
|
||||
return 0 if ($last_expire < 300);
|
||||
}
|
||||
|
||||
dbg("bayes: DB expiry: tokens in DB: $ntoks, Expiry max size: ".$self->{expiry_max_db_size}.", Oldest atime: ".$vars[5].", Newest atime: ".$vars[10].", Last expire: ".$vars[4].", Current time: ".time());
|
||||
|
||||
my $conf = $self->{bayes}->{main}->{conf};
|
||||
if ($ntoks <= 100000 || # keep at least 100k tokens
|
||||
$self->{expiry_max_db_size} > $ntoks || # not enough tokens to cause an expire
|
||||
$vars[10]-$vars[5] < 43200 || # delta between oldest and newest < 12h
|
||||
$self->{db_version} < $self->DB_VERSION # ignore old db formats
|
||||
) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
=item seen_get
|
||||
|
||||
public instance (Char) seen_get (String $msgid)
|
||||
|
||||
Description:
|
||||
This method retrieves the stored value, if any, for C<$msgid>. The return
|
||||
value is the stored string ('s' for spam and 'h' for ham) or undef if
|
||||
C<$msgid> is not found.
|
||||
|
||||
=cut
|
||||
|
||||
sub seen_get {
|
||||
my ($self, $msgid) = @_;
|
||||
die "bayes: seen_get: not implemented\n";
|
||||
}
|
||||
|
||||
=item seen_put
|
||||
|
||||
public instance (Boolean) seen_put (String $msgid, Char $flag)
|
||||
|
||||
Description:
|
||||
This method records C<$msgid> as the type given by C<$flag>. C<$flag> is
|
||||
one of two values 's' for spam and 'h' for ham.
|
||||
|
||||
=cut
|
||||
|
||||
sub seen_put {
|
||||
my ($self, $msgid, $flag) = @_;
|
||||
die "bayes: seen_put: not implemented\n";
|
||||
}
|
||||
|
||||
=item seen_delete
|
||||
|
||||
public instance (Boolean) seen_delete (String $msgid)
|
||||
|
||||
Description:
|
||||
This method removes C<$msgid> from storage.
|
||||
|
||||
=cut
|
||||
|
||||
sub seen_delete {
|
||||
my ($self, $msgid) = @_;
|
||||
die "bayes: seen_delete: not implemented\n";
|
||||
}
|
||||
|
||||
=item get_storage_variables
|
||||
|
||||
public instance (@) get_storage_variables ()
|
||||
|
||||
Description:
|
||||
This method retrieves the various administrative variables used by
|
||||
the Bayes storage implementation.
|
||||
|
||||
The values returned in the array are in the following order:
|
||||
|
||||
0: scan count base
|
||||
|
||||
1: number of spam
|
||||
|
||||
2: number of ham
|
||||
|
||||
3: number of tokens in db
|
||||
|
||||
4: last expire atime
|
||||
|
||||
5: oldest token in db atime
|
||||
|
||||
6: db version value
|
||||
|
||||
7: last journal sync
|
||||
|
||||
8: last atime delta
|
||||
|
||||
9: last expire reduction count
|
||||
|
||||
10: newest token in db atime
|
||||
|
||||
=cut
|
||||
|
||||
sub get_storage_variables {
|
||||
my ($self) = @_;
|
||||
die "bayes: get_storage_variables: not implemented\n";
|
||||
}
|
||||
|
||||
=item dump_db_toks
|
||||
|
||||
public instance () dump_db_toks (String $template, String $regex, @ @vars)
|
||||
|
||||
Description:
|
||||
This method loops over all tokens, computing the probability for the token
|
||||
and then printing it out according to the passed in template.
|
||||
|
||||
=cut
|
||||
|
||||
sub dump_db_toks {
|
||||
my ($self, $template, $regex, @vars) = @_;
|
||||
die "bayes: dump_db_toks: not implemented\n";
|
||||
}
|
||||
|
||||
=item set_last_expire
|
||||
|
||||
public instance (Boolean) _set_last_expire (Integer $time)
|
||||
|
||||
Description:
|
||||
This method sets the last expire time.
|
||||
|
||||
=cut
|
||||
|
||||
sub set_last_expire {
|
||||
my ($self, $time) = @_;
|
||||
die "bayes: set_last_expire: not implemented\n";
|
||||
}
|
||||
|
||||
=item get_running_expire_tok
|
||||
|
||||
public instance (Time) get_running_expire_tok ()
|
||||
|
||||
Description:
|
||||
This method determines if an expire is currently running and returns the time
|
||||
the expire started.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_running_expire_tok {
|
||||
my ($self) = @_;
|
||||
die "bayes: get_running_expire_tok: not implemented\n";
|
||||
}
|
||||
|
||||
=item set_running_expire_tok
|
||||
|
||||
public instance (Time) set_running_expire_tok ()
|
||||
|
||||
Description:
|
||||
This method sets the running expire time to the current time.
|
||||
|
||||
=cut
|
||||
|
||||
sub set_running_expire_tok {
|
||||
my ($self) = @_;
|
||||
die "bayes: set_running_expire_tok: not implemented\n";
|
||||
}
|
||||
|
||||
=item remove_running_expire_tok
|
||||
|
||||
public instance (Boolean) remove_running_expire_tok ()
|
||||
|
||||
Description:
|
||||
This method removes a currently set running expire time.
|
||||
|
||||
=cut
|
||||
|
||||
sub remove_running_expire_tok {
|
||||
my ($self) = @_;
|
||||
die "bayes: remove_running_expire_tok: not implemented\n";
|
||||
}
|
||||
|
||||
=item tok_get
|
||||
|
||||
public instance (Integer, Integer, Time) tok_get (String $token)
|
||||
|
||||
Description:
|
||||
This method retrieves the specified token (C<$token>) from storage and returns
|
||||
it's spam count, ham count and last access time.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_get {
|
||||
my ($self, $token) = @_;
|
||||
die "bayes: tok_get: not implemented\n";
|
||||
}
|
||||
|
||||
=item tok_get_all
|
||||
|
||||
public instance (\@) tok_get_all (@ @tokens)
|
||||
|
||||
Description:
|
||||
This method retrieves the specified tokens (C<@tokens>) from storage and
|
||||
returns an array ref of arrays spam count, ham count and last access time.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_get_all {
|
||||
my ($self, $tokens) = @_;
|
||||
die "bayes: tok_get_all: not implemented\n";
|
||||
}
|
||||
|
||||
=item tok_count_change
|
||||
|
||||
public instance (Boolean) tok_count_change (Integer $spam_count,
|
||||
Integer $ham_count,
|
||||
String $token,
|
||||
Time $atime)
|
||||
|
||||
Description:
|
||||
This method takes a C<$spam_count> and C<$ham_count> and adds it to
|
||||
C<$token> along with updating C<$token>s atime with C<$atime>.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_count_change {
|
||||
my ($self, $spam_count, $ham_count, $token, $atime) = @_;
|
||||
die "bayes: tok_count_change: not implemented\n";
|
||||
}
|
||||
|
||||
=item multi_tok_count_change
|
||||
|
||||
public instance (Boolean) multi_tok_count_change (Integer $spam_count,
|
||||
Integer $ham_count,
|
||||
\% $tokens,
|
||||
String $atime)
|
||||
|
||||
Description:
|
||||
This method takes a C<$spam_count> and C<$ham_count> and adds it to all
|
||||
of the tokens in the C<$tokens> hash ref along with updating each tokens
|
||||
atime with C<$atime>.
|
||||
|
||||
=cut
|
||||
|
||||
sub multi_tok_count_change {
|
||||
my ($self, $spam_count, $ham_count, $tokens, $atime) = @_;
|
||||
die "bayes: multi_tok_count_change: not implemented\n";
|
||||
}
|
||||
|
||||
=item nspam_nham_get
|
||||
|
||||
public instance (Integer, Integer) nspam_nham_get ()
|
||||
|
||||
Description:
|
||||
This method retrieves the total number of spam and the total number of ham
|
||||
currently under storage.
|
||||
|
||||
=cut
|
||||
|
||||
sub nspam_nham_get {
|
||||
my ($self) = @_;
|
||||
die "bayes: nspam_nham_get: not implemented\n";
|
||||
}
|
||||
|
||||
=item nspam_nham_change
|
||||
|
||||
public instance (Boolean) nspam_nham_change (Integer $num_spam,
|
||||
Integer $num_ham)
|
||||
|
||||
Description:
|
||||
This method updates the number of spam and the number of ham in the database.
|
||||
|
||||
=cut
|
||||
|
||||
sub nspam_nham_change {
|
||||
my ($self, $num_spam, $num_ham) = @_;
|
||||
die "bayes: nspam_nham_change: not implemented\n";
|
||||
}
|
||||
|
||||
=item tok_touch
|
||||
|
||||
public instance (Boolean) tok_touch (String $token,
|
||||
Time $atime)
|
||||
|
||||
Description:
|
||||
This method updates the given tokens (C<$token>) access time.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_touch {
|
||||
my ($self, $token, $atime) = @_;
|
||||
die "bayes: tok_touch: not implemented\n";
|
||||
}
|
||||
|
||||
=item tok_touch_all
|
||||
|
||||
public instance (Boolean) tok_touch_all (\@ $tokens,
|
||||
Time $atime)
|
||||
|
||||
Description:
|
||||
This method does a mass update of the given list of tokens C<$tokens>, if the existing token
|
||||
atime is < C<$atime>.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_touch_all {
|
||||
my ($self, $tokens, $atime) = @_;
|
||||
die "bayes: tok_touch_all: not implemented\n";
|
||||
}
|
||||
|
||||
=item cleanup
|
||||
|
||||
public instance (Boolean) cleanup ()
|
||||
|
||||
Description:
|
||||
This method performs any cleanup necessary before moving onto the next
|
||||
operation.
|
||||
|
||||
=cut
|
||||
|
||||
sub cleanup {
|
||||
my ($self) = @_;
|
||||
die "bayes: cleanup: not implemented\n";
|
||||
}
|
||||
|
||||
=item get_magic_re
|
||||
|
||||
public instance get_magic_re (String)
|
||||
|
||||
Description:
|
||||
This method returns a regexp which indicates a magic token.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_magic_re {
|
||||
my ($self) = @_;
|
||||
die "bayes: get_magic_re: not implemented\n";
|
||||
}
|
||||
|
||||
=item sync
|
||||
|
||||
public instance (Boolean) sync (\% $opts)
|
||||
|
||||
Description:
|
||||
This method performs a sync of the database.
|
||||
|
||||
=cut
|
||||
|
||||
sub sync {
|
||||
my ($self, $opts) = @_;
|
||||
die "bayes: sync: not implemented\n";
|
||||
}
|
||||
|
||||
=item perform_upgrade
|
||||
|
||||
public instance (Boolean) perform_upgrade (\% $opts)
|
||||
|
||||
Description:
|
||||
This method is a utility method that performs any necessary upgrades
|
||||
between versions. It should know how to handle previous versions and
|
||||
what needs to happen to upgrade them.
|
||||
|
||||
A true return value indicates success.
|
||||
|
||||
=cut
|
||||
|
||||
sub perform_upgrade {
|
||||
my ($self, $opts) = @_;
|
||||
die "bayes: perform_upgrade: not implemented\n";
|
||||
}
|
||||
|
||||
=item clear_database
|
||||
|
||||
public instance (Boolean) clear_database ()
|
||||
|
||||
Description:
|
||||
This method deletes all records for a particular user.
|
||||
|
||||
Callers should be aware that any errors returned by this method
|
||||
could causes the database to be inconsistent for the given user.
|
||||
|
||||
=cut
|
||||
|
||||
sub clear_database {
|
||||
my ($self) = @_;
|
||||
die "bayes: clear_database: not implemented\n";
|
||||
}
|
||||
|
||||
=item backup_database
|
||||
|
||||
public instance (Boolean) backup_database ()
|
||||
|
||||
Description:
|
||||
This method will dump the users database in a machine readable format.
|
||||
|
||||
=cut
|
||||
|
||||
sub backup_database {
|
||||
my ($self) = @_;
|
||||
die "bayes: backup_database: not implemented\n";
|
||||
}
|
||||
|
||||
=item restore_database
|
||||
|
||||
public instance (Boolean) restore_database (String $filename, Boolean $showdots)
|
||||
|
||||
Description:
|
||||
This method restores a database from the given filename, C<$filename>.
|
||||
|
||||
Callers should be aware that any errors returned by this method
|
||||
could causes the database to be inconsistent for the given user.
|
||||
|
||||
=cut
|
||||
|
||||
sub restore_database {
|
||||
my ($self, $filename, $showdots) = @_;
|
||||
die "bayes: restore_database: not implemented\n";
|
||||
}
|
||||
|
||||
=item db_readable
|
||||
|
||||
public instance (Boolean) db_readable ()
|
||||
|
||||
Description:
|
||||
This method returns whether or not the Bayes DB is available in a
|
||||
readable state.
|
||||
|
||||
=cut
|
||||
|
||||
sub db_readable {
|
||||
my ($self) = @_;
|
||||
die "bayes: db_readable: not implemented\n";
|
||||
}
|
||||
|
||||
=item db_writable
|
||||
|
||||
public instance (Boolean) db_writable ()
|
||||
|
||||
Description:
|
||||
This method returns whether or not the Bayes DB is available in a
|
||||
writable state.
|
||||
|
||||
=cut
|
||||
|
||||
sub db_writable {
|
||||
my ($self) = @_;
|
||||
die "bayes: db_writable: not implemented\n";
|
||||
}
|
||||
|
||||
|
||||
sub sa_die { Mail::SpamAssassin::sa_die(@_); }
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
1572
upstream/lib/Mail/SpamAssassin/BayesStore/BDB.pm
Normal file
1572
upstream/lib/Mail/SpamAssassin/BayesStore/BDB.pm
Normal file
File diff suppressed because it is too large
Load Diff
1974
upstream/lib/Mail/SpamAssassin/BayesStore/DBM.pm
Normal file
1974
upstream/lib/Mail/SpamAssassin/BayesStore/DBM.pm
Normal file
File diff suppressed because it is too large
Load Diff
1075
upstream/lib/Mail/SpamAssassin/BayesStore/MySQL.pm
Normal file
1075
upstream/lib/Mail/SpamAssassin/BayesStore/MySQL.pm
Normal file
File diff suppressed because it is too large
Load Diff
1104
upstream/lib/Mail/SpamAssassin/BayesStore/PgSQL.pm
Normal file
1104
upstream/lib/Mail/SpamAssassin/BayesStore/PgSQL.pm
Normal file
File diff suppressed because it is too large
Load Diff
1420
upstream/lib/Mail/SpamAssassin/BayesStore/Redis.pm
Normal file
1420
upstream/lib/Mail/SpamAssassin/BayesStore/Redis.pm
Normal file
File diff suppressed because it is too large
Load Diff
75
upstream/lib/Mail/SpamAssassin/BayesStore/SDBM.pm
Normal file
75
upstream/lib/Mail/SpamAssassin/BayesStore/SDBM.pm
Normal file
@ -0,0 +1,75 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::BayesStore::SDBM;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Fcntl;
|
||||
|
||||
use Mail::SpamAssassin::BayesStore::DBM;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
# our @DBNAMES; # <---- unused!
|
||||
our @ISA = qw( Mail::SpamAssassin::BayesStore::DBM );
|
||||
|
||||
sub HAS_DBM_MODULE {
|
||||
my ($self) = @_;
|
||||
if (exists($self->{has_dbm_module})) {
|
||||
return $self->{has_dbm_module};
|
||||
}
|
||||
$self->{has_dbm_module} = eval { require SDBM_File; };
|
||||
}
|
||||
|
||||
sub DBM_MODULE {
|
||||
return "SDBM_File";
|
||||
}
|
||||
|
||||
# Possible file extensions used by the kinds of database files SDBM_File
|
||||
# might create. We need these so we can create a new file and rename
|
||||
# it into place.
|
||||
sub DB_EXTENSIONS {
|
||||
return ('.pag', '.dir');
|
||||
}
|
||||
|
||||
sub _unlink_file {
|
||||
my ($self, $filename) = @_;
|
||||
|
||||
for my $ext ($self->DB_EXTENSIONS) {
|
||||
unlink $filename . $ext;
|
||||
}
|
||||
}
|
||||
|
||||
sub _rename_file {
|
||||
my ($self, $sourcefilename, $targetfilename) = @_;
|
||||
|
||||
for my $ext ($self->DB_EXTENSIONS) {
|
||||
return 0 unless (rename($sourcefilename . $ext, $targetfilename . $ext));
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
# this is called directly from sa-learn(1).
|
||||
sub perform_upgrade {
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
||||
1;
|
2361
upstream/lib/Mail/SpamAssassin/BayesStore/SQL.pm
Normal file
2361
upstream/lib/Mail/SpamAssassin/BayesStore/SQL.pm
Normal file
File diff suppressed because it is too large
Load Diff
606
upstream/lib/Mail/SpamAssassin/Client.pm
Normal file
606
upstream/lib/Mail/SpamAssassin/Client.pm
Normal file
@ -0,0 +1,606 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Client - Client for spamd Protocol
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
my $client = Mail::SpamAssassin::Client->new({
|
||||
port => 783,
|
||||
host => 'localhost',
|
||||
username => 'someuser'});
|
||||
or
|
||||
|
||||
my $client = Mail::SpamAssassin::Client->new({
|
||||
socketpath => '/path/to/socket',
|
||||
username => 'someuser'});
|
||||
|
||||
Optionally takes timeout, which is applied to IO::Socket for the
|
||||
initial connection. If not supplied, it defaults to 30 seconds.
|
||||
|
||||
if ($client->ping()) {
|
||||
print "Ping is ok\n";
|
||||
}
|
||||
|
||||
my $result = $client->process($testmsg);
|
||||
|
||||
if ($result->{isspam} eq 'True') {
|
||||
do something with spam message here
|
||||
}
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin::Client is a module which provides a perl implementation of
|
||||
the spamd protocol.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Client;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use IO::Socket;
|
||||
use Errno qw(EBADF);
|
||||
|
||||
our($io_socket_module_name);
|
||||
BEGIN {
|
||||
if (eval { require IO::Socket::IP }) {
|
||||
$io_socket_module_name = 'IO::Socket::IP';
|
||||
} elsif (eval { require IO::Socket::INET6 }) {
|
||||
$io_socket_module_name = 'IO::Socket::INET6';
|
||||
} elsif (eval { require IO::Socket::INET }) {
|
||||
$io_socket_module_name = 'IO::Socket::INET';
|
||||
}
|
||||
}
|
||||
|
||||
my $EOL = "\015\012";
|
||||
my $BLANK = $EOL x 2;
|
||||
my $PROTOVERSION = 'SPAMC/1.5';
|
||||
|
||||
=head1 PUBLIC METHODS
|
||||
|
||||
=head2 new
|
||||
|
||||
public class (Mail::SpamAssassin::Client) new (\% $args)
|
||||
|
||||
Description:
|
||||
This method creates a new Mail::SpamAssassin::Client object.
|
||||
|
||||
=cut
|
||||
|
||||
sub new {
|
||||
my ($class, $args) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my $self = {};
|
||||
|
||||
# with a sockets_path set then it makes no sense to set host and port
|
||||
if ($args->{socketpath}) {
|
||||
$self->{socketpath} = $args->{socketpath};
|
||||
}
|
||||
else {
|
||||
$self->{port} = $args->{port};
|
||||
$self->{host} = $args->{host};
|
||||
}
|
||||
|
||||
if (defined $args->{username}) {
|
||||
$self->{username} = $args->{username};
|
||||
}
|
||||
|
||||
if ($args->{timeout}) {
|
||||
$self->{timeout} = $args->{timeout} || 30;
|
||||
}
|
||||
|
||||
bless($self, $class);
|
||||
|
||||
$self;
|
||||
}
|
||||
|
||||
=head2 process
|
||||
|
||||
public instance (\%) process (String $msg)
|
||||
|
||||
Description:
|
||||
This method calls the spamd server with the PROCESS command.
|
||||
|
||||
The return value is a hash reference containing several pieces of information,
|
||||
if available:
|
||||
|
||||
content_length
|
||||
|
||||
isspam
|
||||
|
||||
score
|
||||
|
||||
threshold
|
||||
|
||||
message
|
||||
|
||||
=cut
|
||||
|
||||
sub process {
|
||||
my ($self, $msg, $is_check_p) = @_;
|
||||
|
||||
my $command = 'PROCESS';
|
||||
|
||||
if ($is_check_p) {
|
||||
warn "Passing in \$is_check_p is deprecated, just call the check method instead.\n";
|
||||
$command = 'CHECK';
|
||||
}
|
||||
|
||||
return $self->_filter($msg, $command);
|
||||
}
|
||||
|
||||
=head2 check
|
||||
|
||||
public instance (\%) check (String $msg)
|
||||
|
||||
Description:
|
||||
The method implements the check call.
|
||||
|
||||
See the process method for the return value.
|
||||
|
||||
=cut
|
||||
|
||||
sub check {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
return $self->_filter($msg, 'CHECK');
|
||||
}
|
||||
|
||||
=head2 headers
|
||||
|
||||
public instance (\%) headers (String $msg)
|
||||
|
||||
Description:
|
||||
This method implements the headers call.
|
||||
|
||||
See the process method for the return value.
|
||||
|
||||
=cut
|
||||
|
||||
sub headers {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
return $self->_filter($msg, 'HEADERS');
|
||||
}
|
||||
|
||||
=head2 learn
|
||||
|
||||
public instance (Boolean) learn (String $msg, Integer $learntype)
|
||||
|
||||
Description:
|
||||
This method implements the learn call. C<$learntype> should be
|
||||
an integer, 0 for spam, 1 for ham and 2 for forget. The return
|
||||
value is a boolean indicating if the message was learned or not.
|
||||
|
||||
An undef return value indicates that there was an error and you
|
||||
should check the resp_code/resp_msg values to determine what
|
||||
the error was.
|
||||
|
||||
=cut
|
||||
|
||||
sub learn {
|
||||
my ($self, $msg, $learntype) = @_;
|
||||
|
||||
$self->_clear_errors();
|
||||
|
||||
my $remote = $self->_create_connection();
|
||||
|
||||
return unless $remote;
|
||||
|
||||
my $msgsize = length($msg.$EOL);
|
||||
|
||||
print $remote "TELL $PROTOVERSION$EOL";
|
||||
print $remote "Content-length: $msgsize$EOL";
|
||||
print $remote "User: $self->{username}$EOL" if defined $self->{username};
|
||||
|
||||
if ($learntype == 0) {
|
||||
print $remote "Message-class: spam$EOL";
|
||||
print $remote "Set: local$EOL";
|
||||
}
|
||||
elsif ($learntype == 1) {
|
||||
print $remote "Message-class: ham$EOL";
|
||||
print $remote "Set: local$EOL";
|
||||
}
|
||||
elsif ($learntype == 2) {
|
||||
print $remote "Remove: local$EOL";
|
||||
}
|
||||
else { # bad learntype
|
||||
$self->{resp_code} = 00;
|
||||
$self->{resp_msg} = 'do not know';
|
||||
return;
|
||||
}
|
||||
|
||||
print $remote "$EOL";
|
||||
print $remote $msg;
|
||||
print $remote "$EOL";
|
||||
|
||||
$! = 0; my $line = <$remote>;
|
||||
# deal gracefully with a Perl I/O bug which may return status EBADF at eof
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (1): $!")
|
||||
: die "error reading from spamd (1): $!";
|
||||
return unless defined $line;
|
||||
|
||||
my ($version, $resp_code, $resp_msg) = $self->_parse_response_line($line);
|
||||
|
||||
$self->{resp_code} = $resp_code;
|
||||
$self->{resp_msg} = $resp_msg;
|
||||
|
||||
return unless $resp_code == 0;
|
||||
|
||||
my $did_set = '';
|
||||
my $did_remove = '';
|
||||
|
||||
for ($!=0; defined($line=<$remote>); $!=0) {
|
||||
local $1;
|
||||
if ($line =~ /DidSet: (.*)/i) {
|
||||
$did_set = $1;
|
||||
}
|
||||
elsif ($line =~ /DidRemove: (.*)/i) {
|
||||
$did_remove = $1;
|
||||
}
|
||||
elsif ($line =~ /^${EOL}$/) {
|
||||
last;
|
||||
}
|
||||
}
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (2): $!")
|
||||
: die "error reading from spamd (2): $!";
|
||||
close $remote or die "error closing socket: $!";
|
||||
|
||||
if ($learntype == 0 || $learntype == 1) {
|
||||
return $did_set =~ /local/;
|
||||
}
|
||||
else { #safe since we've already checked the $learntype values
|
||||
return $did_remove =~ /local/;
|
||||
}
|
||||
}
|
||||
|
||||
=head2 report
|
||||
|
||||
public instance (Boolean) report (String $msg)
|
||||
|
||||
Description:
|
||||
This method provides the report interface to spamd.
|
||||
|
||||
=cut
|
||||
|
||||
sub report {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
$self->_clear_errors();
|
||||
|
||||
my $remote = $self->_create_connection();
|
||||
|
||||
return unless $remote;
|
||||
|
||||
my $msgsize = length($msg.$EOL);
|
||||
|
||||
print $remote "TELL $PROTOVERSION$EOL";
|
||||
print $remote "Content-length: $msgsize$EOL";
|
||||
print $remote "User: $self->{username}$EOL" if defined $self->{username};
|
||||
print $remote "Message-class: spam$EOL";
|
||||
print $remote "Set: local,remote$EOL";
|
||||
print $remote "$EOL";
|
||||
print $remote $msg;
|
||||
print $remote "$EOL";
|
||||
|
||||
$! = 0; my $line = <$remote>;
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (3): $!")
|
||||
: die "error reading from spamd (3): $!";
|
||||
return unless defined $line;
|
||||
|
||||
my ($version, $resp_code, $resp_msg) = $self->_parse_response_line($line);
|
||||
|
||||
$self->{resp_code} = $resp_code;
|
||||
$self->{resp_msg} = $resp_msg;
|
||||
|
||||
return unless $resp_code == 0;
|
||||
|
||||
my $reported_p = 0;
|
||||
|
||||
for ($!=0; defined($line=<$remote>); $!=0) {
|
||||
if ($line =~ /DidSet:\s+.*remote/i) {
|
||||
$reported_p = 1;
|
||||
last;
|
||||
}
|
||||
elsif ($line =~ /^${EOL}$/) {
|
||||
last;
|
||||
}
|
||||
}
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (4): $!")
|
||||
: die "error reading from spamd (4): $!";
|
||||
close $remote or die "error closing socket: $!";
|
||||
|
||||
return $reported_p;
|
||||
}
|
||||
|
||||
=head2 revoke
|
||||
|
||||
public instance (Boolean) revoke (String $msg)
|
||||
|
||||
Description:
|
||||
This method provides the revoke interface to spamd.
|
||||
|
||||
=cut
|
||||
|
||||
sub revoke {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
$self->_clear_errors();
|
||||
|
||||
my $remote = $self->_create_connection();
|
||||
|
||||
return unless $remote;
|
||||
|
||||
my $msgsize = length($msg.$EOL);
|
||||
|
||||
print $remote "TELL $PROTOVERSION$EOL";
|
||||
print $remote "Content-length: $msgsize$EOL";
|
||||
print $remote "User: $self->{username}$EOL" if defined $self->{username};
|
||||
print $remote "Message-class: ham$EOL";
|
||||
print $remote "Set: local$EOL";
|
||||
print $remote "Remove: remote$EOL";
|
||||
print $remote "$EOL";
|
||||
print $remote $msg;
|
||||
print $remote "$EOL";
|
||||
|
||||
$! = 0; my $line = <$remote>;
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (5): $!")
|
||||
: die "error reading from spamd (5): $!";
|
||||
return unless defined $line;
|
||||
|
||||
my ($version, $resp_code, $resp_msg) = $self->_parse_response_line($line);
|
||||
|
||||
$self->{resp_code} = $resp_code;
|
||||
$self->{resp_msg} = $resp_msg;
|
||||
|
||||
return unless $resp_code == 0;
|
||||
|
||||
my $revoked_p = 0;
|
||||
|
||||
for ($!=0; defined($line=<$remote>); $!=0) {
|
||||
if ($line =~ /DidRemove:\s+remote/i) {
|
||||
$revoked_p = 1;
|
||||
last;
|
||||
}
|
||||
elsif ($line =~ /^${EOL}$/) {
|
||||
last;
|
||||
}
|
||||
}
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (6): $!")
|
||||
: die "error reading from spamd (6): $!";
|
||||
close $remote or die "error closing socket: $!";
|
||||
|
||||
return $revoked_p;
|
||||
}
|
||||
|
||||
|
||||
=head2 ping
|
||||
|
||||
public instance (Boolean) ping ()
|
||||
|
||||
Description:
|
||||
This method performs a server ping and returns 0 or 1 depending on
|
||||
if the server responded correctly.
|
||||
|
||||
=cut
|
||||
|
||||
sub ping {
|
||||
my ($self) = @_;
|
||||
|
||||
my $remote = $self->_create_connection();
|
||||
|
||||
return 0 unless ($remote);
|
||||
|
||||
print $remote "PING $PROTOVERSION$EOL";
|
||||
print $remote "$EOL"; # bug 6187, bumps protocol version to 1.5
|
||||
|
||||
$! = 0; my $line = <$remote>;
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (7): $!")
|
||||
: die "error reading from spamd (7): $!";
|
||||
close $remote or die "error closing socket: $!";
|
||||
return unless defined $line;
|
||||
|
||||
my ($version, $resp_code, $resp_msg) = $self->_parse_response_line($line);
|
||||
return 0 unless ($resp_msg eq 'PONG');
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
=head1 PRIVATE METHODS
|
||||
|
||||
=head2 _create_connection
|
||||
|
||||
private instance (IO::Socket) _create_connection ()
|
||||
|
||||
Description:
|
||||
This method sets up a proper IO::Socket connection based on the arguments
|
||||
used when creating the client object.
|
||||
|
||||
On failure, it sets an internal error code and returns undef.
|
||||
|
||||
=cut
|
||||
|
||||
sub _create_connection {
|
||||
my ($self) = @_;
|
||||
|
||||
my $remote;
|
||||
|
||||
if ($self->{socketpath}) {
|
||||
$remote = IO::Socket::UNIX->new( Peer => $self->{socketpath},
|
||||
Type => SOCK_STREAM,
|
||||
Timeout => $self->{timeout},
|
||||
);
|
||||
}
|
||||
else {
|
||||
my %params = ( Proto => "tcp",
|
||||
PeerAddr => $self->{host},
|
||||
PeerPort => $self->{port},
|
||||
Timeout => $self->{timeout},
|
||||
);
|
||||
$remote = $io_socket_module_name->new(%params);
|
||||
}
|
||||
|
||||
unless ($remote) {
|
||||
print "Failed to create connection to spamd daemon: $!\n";
|
||||
return;
|
||||
}
|
||||
|
||||
$remote;
|
||||
}
|
||||
|
||||
=head2 _parse_response_line
|
||||
|
||||
private instance (@) _parse_response_line (String $line)
|
||||
|
||||
Description:
|
||||
This method parses the initial response line/header from the server
|
||||
and returns its parts.
|
||||
|
||||
We have this as a separate method in case we ever decide to get fancy
|
||||
with the response line.
|
||||
|
||||
=cut
|
||||
|
||||
sub _parse_response_line {
|
||||
my ($self, $line) = @_;
|
||||
|
||||
$line =~ s/\r?\n$//;
|
||||
return split(/\s+/, $line, 3);
|
||||
}
|
||||
|
||||
=head2 _clear_errors
|
||||
|
||||
private instance () _clear_errors ()
|
||||
|
||||
Description:
|
||||
This method clears out any current errors.
|
||||
|
||||
=cut
|
||||
|
||||
sub _clear_errors {
|
||||
my ($self) = @_;
|
||||
|
||||
$self->{resp_code} = undef;
|
||||
$self->{resp_msg} = undef;
|
||||
}
|
||||
|
||||
=head2 _filter
|
||||
|
||||
private instance (\%) _filter (String $msg, String $command)
|
||||
|
||||
Description:
|
||||
Makes the actual call to the spamd server for the various filter method
|
||||
(ie PROCESS, CHECK, HEADERS, etc). The command that is passed in is
|
||||
sent to the spamd server.
|
||||
|
||||
The return value is a hash reference containing several pieces of information,
|
||||
if available:
|
||||
|
||||
content_length
|
||||
|
||||
isspam
|
||||
|
||||
score
|
||||
|
||||
threshold
|
||||
|
||||
message (if available)
|
||||
|
||||
=cut
|
||||
|
||||
sub _filter {
|
||||
my ($self, $msg, $command) = @_;
|
||||
|
||||
my %data;
|
||||
|
||||
$self->_clear_errors();
|
||||
|
||||
my $remote = $self->_create_connection();
|
||||
|
||||
return 0 unless ($remote);
|
||||
|
||||
my $msgsize = length($msg.$EOL);
|
||||
|
||||
print $remote "$command $PROTOVERSION$EOL";
|
||||
print $remote "Content-length: $msgsize$EOL";
|
||||
print $remote "User: $self->{username}$EOL" if defined $self->{username};
|
||||
print $remote "$EOL";
|
||||
print $remote $msg;
|
||||
print $remote "$EOL";
|
||||
|
||||
$! = 0; my $line = <$remote>;
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (8): $!")
|
||||
: die "error reading from spamd (8): $!";
|
||||
return unless defined $line;
|
||||
|
||||
my ($version, $resp_code, $resp_msg) = $self->_parse_response_line($line);
|
||||
|
||||
$self->{resp_code} = $resp_code;
|
||||
$self->{resp_msg} = $resp_msg;
|
||||
|
||||
return unless $resp_code == 0;
|
||||
|
||||
for ($!=0; defined($line=<$remote>); $!=0) {
|
||||
local($1,$2,$3);
|
||||
if ($line =~ /Content-length: (\d+)/) {
|
||||
$data{content_length} = $1;
|
||||
}
|
||||
elsif ($line =~ m!Spam: (\S+) ; (\S+) / (\S+)!) {
|
||||
$data{isspam} = $1;
|
||||
$data{score} = $2 + 0;
|
||||
$data{threshold} = $3 + 0;
|
||||
}
|
||||
elsif ($line =~ /^${EOL}$/) {
|
||||
last;
|
||||
}
|
||||
}
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (9): $!")
|
||||
: die "error reading from spamd (9): $!";
|
||||
|
||||
my $return_msg;
|
||||
for ($!=0; defined($line=<$remote>); $!=0) {
|
||||
$return_msg .= $line;
|
||||
}
|
||||
defined $line || $!==0 or
|
||||
$!==EBADF ? dbg("error reading from spamd (10): $!")
|
||||
: die "error reading from spamd (10): $!";
|
||||
|
||||
$data{message} = $return_msg if ($return_msg);
|
||||
|
||||
close $remote or die "error closing socket: $!";
|
||||
|
||||
return \%data;
|
||||
}
|
||||
|
||||
1;
|
||||
|
5112
upstream/lib/Mail/SpamAssassin/Conf.pm
Normal file
5112
upstream/lib/Mail/SpamAssassin/Conf.pm
Normal file
File diff suppressed because it is too large
Load Diff
206
upstream/lib/Mail/SpamAssassin/Conf/LDAP.pm
Normal file
206
upstream/lib/Mail/SpamAssassin/Conf/LDAP.pm
Normal file
@ -0,0 +1,206 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Conf::LDAP - load SpamAssassin scores from LDAP database
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
(see Mail::SpamAssassin)
|
||||
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using text analysis and
|
||||
several internet-based realtime blacklists.
|
||||
|
||||
This class is used internally by SpamAssassin to load scores from an LDAP
|
||||
database. Please refer to the C<Mail::SpamAssassin> documentation for public
|
||||
interfaces.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Conf::LDAP;
|
||||
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my ($main) = @_;
|
||||
|
||||
my $self = {
|
||||
'main' => $main
|
||||
};
|
||||
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub load_modules { # static
|
||||
dbg("ldap: loading Net::LDAP and URI");
|
||||
eval {
|
||||
require Net::LDAP; # actual server connection
|
||||
require URI; # parse server connection dsn
|
||||
};
|
||||
|
||||
# do any other preloading that will speed up operation
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->load ($username)
|
||||
|
||||
Read configuration parameters from LDAP server and parse scores from it.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub load {
|
||||
my ($self, $username) = @_;
|
||||
|
||||
my $conf = $self->{main}->{conf};
|
||||
my $url = $conf->{user_scores_dsn}; # an ldap URI
|
||||
dbg("ldap: URL is $url");
|
||||
if(!defined($url) || $url eq '') {
|
||||
dbg("ldap: No URL defined; skipping LDAP");
|
||||
return;
|
||||
}
|
||||
|
||||
eval {
|
||||
# make sure we can see croak messages from DBI
|
||||
local $SIG{'__DIE__'} = sub { warn "$_[0]"; };
|
||||
require Net::LDAP;
|
||||
require URI;
|
||||
load_with_ldap($self, $username, $url);
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
if ($conf->{user_scores_fail_to_global}) {
|
||||
info("ldap: failed to load user (%s) scores from LDAP server, ".
|
||||
"using a global default: %s", $username, $eval_stat);
|
||||
return 1;
|
||||
} else {
|
||||
warn sprintf(
|
||||
"ldap: failed to load user (%s) scores from LDAP server: %s\n",
|
||||
$username, $eval_stat);
|
||||
return 0;
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
sub load_with_ldap {
|
||||
my ($self, $username, $url) = @_;
|
||||
|
||||
# ldapurl = scheme "://" [hostport] ["/"
|
||||
# [dn ["?" [attributes] ["?" [scope]
|
||||
# ["?" [filter] ["?" extensions]]]]]]
|
||||
|
||||
my $uri = URI->new("$url");
|
||||
|
||||
my $host = $uri->host;
|
||||
if (!defined($host) || $host eq '') {
|
||||
dbg("ldap: No server specified, assuming localhost");
|
||||
$host = "localhost";
|
||||
}
|
||||
my $port = $uri->port;
|
||||
my $base = $uri->dn;
|
||||
my @attr = $uri->attributes;
|
||||
my $scope = $uri->scope;
|
||||
my $filter = $uri->filter;
|
||||
my $scheme = $uri->scheme;
|
||||
my %extn = $uri->extensions; # unused
|
||||
|
||||
$filter =~ s/__USERNAME__/$username/g;
|
||||
dbg("ldap: host=$host, port=$port, base='$base', attr=${attr[0]}, scope=$scope, filter='$filter'");
|
||||
|
||||
my $main = $self->{main};
|
||||
my $conf = $main->{conf};
|
||||
my $ldapuser = $conf->{user_scores_ldap_username};
|
||||
my $ldappass = $conf->{user_scores_ldap_password};
|
||||
|
||||
if(!$ldapuser) {
|
||||
undef($ldapuser);
|
||||
} else {
|
||||
dbg("ldap: user='$ldapuser'");
|
||||
}
|
||||
|
||||
if(!$ldappass) {
|
||||
undef($ldappass);
|
||||
} else {
|
||||
# don't log this to avoid leaking sensitive info
|
||||
# dbg("ldap: pass='$ldappass'");
|
||||
}
|
||||
|
||||
my $f_attribute = $attr[0];
|
||||
|
||||
my $ldap = Net::LDAP->new ("$host:$port",
|
||||
onerror => "warn",
|
||||
scheme => $scheme);
|
||||
|
||||
if (!defined($ldapuser) && !defined($ldappass)) {
|
||||
$ldap->bind;
|
||||
} else {
|
||||
$ldap->bind($ldapuser, password => $ldappass);
|
||||
}
|
||||
|
||||
my $result = $ldap->search( base => $base,
|
||||
filter => $filter,
|
||||
scope => $scope,
|
||||
attrs => \@attr
|
||||
);
|
||||
|
||||
my $config_text = '';
|
||||
foreach my $entry ($result->all_entries) {
|
||||
my @v = $entry->get_value($f_attribute);
|
||||
foreach my $v (@v) {
|
||||
dbg("ldap: retrieving prefs for $username: $v");
|
||||
$config_text .= $v."\n";
|
||||
}
|
||||
}
|
||||
if ($config_text ne '') {
|
||||
$conf->{main} = $main;
|
||||
$conf->parse_scores_only($config_text);
|
||||
delete $conf->{main};
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub sa_die { Mail::SpamAssassin::sa_die(@_); }
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
1541
upstream/lib/Mail/SpamAssassin/Conf/Parser.pm
Normal file
1541
upstream/lib/Mail/SpamAssassin/Conf/Parser.pm
Normal file
File diff suppressed because it is too large
Load Diff
195
upstream/lib/Mail/SpamAssassin/Conf/SQL.pm
Normal file
195
upstream/lib/Mail/SpamAssassin/Conf/SQL.pm
Normal file
@ -0,0 +1,195 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Conf::SQL - load SpamAssassin scores from SQL database
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
(see Mail::SpamAssassin)
|
||||
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using text analysis and
|
||||
several internet-based realtime blacklists.
|
||||
|
||||
This class is used internally by SpamAssassin to load scores from an SQL
|
||||
database. Please refer to the C<Mail::SpamAssassin> documentation for public
|
||||
interfaces.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Conf::SQL;
|
||||
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my ($main) = @_;
|
||||
|
||||
my $self = {
|
||||
'main' => $main
|
||||
};
|
||||
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub load_modules { # static
|
||||
eval {
|
||||
require DBI;
|
||||
};
|
||||
|
||||
# do any other preloading that will speed up operation
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->load ($username)
|
||||
|
||||
Read configuration parameters from SQL database and parse scores from it.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub load {
|
||||
my ($self, $username) = @_;
|
||||
|
||||
my $conf = $self->{main}->{conf};
|
||||
my $dsn = $conf->{user_scores_dsn};
|
||||
if (!defined($dsn) || $dsn eq '') {
|
||||
dbg("config: no DSN defined; skipping sql");
|
||||
return 1;
|
||||
}
|
||||
|
||||
eval {
|
||||
# make sure we can see croak messages from DBI
|
||||
local $SIG{'__DIE__'} = sub { die "$_[0]"; };
|
||||
require DBI;
|
||||
load_with_dbi($self, $username, $dsn);
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
if ($conf->{user_scores_fail_to_global}) {
|
||||
info("config: failed to load user (%s) scores from SQL database, ".
|
||||
"using a global default: %s", $username, $eval_stat);
|
||||
return 1;
|
||||
} else {
|
||||
warn sprintf(
|
||||
"config: failed to load user (%s) scores from SQL database: %s\n",
|
||||
$username, $eval_stat);
|
||||
return 0;
|
||||
}
|
||||
};
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub load_with_dbi {
|
||||
my ($self, $username, $dsn) = @_;
|
||||
|
||||
my $main = $self->{main};
|
||||
my $conf = $main->{conf};
|
||||
my $dbuser = $conf->{user_scores_sql_username};
|
||||
my $dbpass = $conf->{user_scores_sql_password};
|
||||
my $custom_query = $conf->{user_scores_sql_custom_query};
|
||||
|
||||
my $f_preference = 'preference';
|
||||
my $f_value = 'value';
|
||||
my $f_username = 'username';
|
||||
my $f_table = 'userpref';
|
||||
|
||||
my $dbh = DBI->connect($dsn, $dbuser, $dbpass, {'PrintError' => 0});
|
||||
|
||||
if ($dbh) {
|
||||
my $sql;
|
||||
if (defined($custom_query)) {
|
||||
$sql = $custom_query;
|
||||
my $quoted_username = $dbh->quote($username);
|
||||
my ($mailbox, $domain) = split('@', $username);
|
||||
my $quoted_mailbox = $dbh->quote($mailbox);
|
||||
my $quoted_domain = $dbh->quote($domain);
|
||||
|
||||
$sql =~ s/_USERNAME_/$quoted_username/g;
|
||||
$sql =~ s/_TABLE_/$f_table/g;
|
||||
$sql =~ s/_MAILBOX_/$quoted_mailbox/g;
|
||||
$sql =~ s/_DOMAIN_/$quoted_domain/g;
|
||||
}
|
||||
else {
|
||||
$sql = "select $f_preference, $f_value from $f_table where ".
|
||||
"$f_username = ".$dbh->quote($username).
|
||||
" or $f_username = '\@GLOBAL' order by $f_username asc";
|
||||
}
|
||||
dbg("config: Conf::SQL: executing SQL: $sql");
|
||||
my $sth = $dbh->prepare($sql);
|
||||
if ($sth) {
|
||||
my $rv = $sth->execute();
|
||||
if ($rv) {
|
||||
dbg("config: retrieving prefs for $username from SQL server");
|
||||
my @row;
|
||||
my $config_text = '';
|
||||
while (@row = $sth->fetchrow_array()) {
|
||||
$config_text .= (defined($row[0]) ? $row[0] : '') . "\t" .
|
||||
(defined($row[1]) ? $row[1] : '') . "\n";
|
||||
}
|
||||
if ($config_text ne '') {
|
||||
$conf->{main} = $main;
|
||||
$conf->parse_scores_only($config_text);
|
||||
delete $conf->{main};
|
||||
}
|
||||
$sth->finish();
|
||||
undef $sth;
|
||||
}
|
||||
else {
|
||||
die "config: SQL error: $sql\n".$sth->errstr."\n";
|
||||
}
|
||||
}
|
||||
else {
|
||||
die "config: SQL error: " . $dbh->errstr . "\n";
|
||||
}
|
||||
$dbh->disconnect();
|
||||
}
|
||||
else {
|
||||
die "config: SQL error: " . DBI->errstr . "\n";
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub sa_die { Mail::SpamAssassin::sa_die(@_); }
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
414
upstream/lib/Mail/SpamAssassin/Constants.pm
Normal file
414
upstream/lib/Mail/SpamAssassin/Constants.pm
Normal file
@ -0,0 +1,414 @@
|
||||
# Constants used in many parts of the SpamAssassin codebase.
|
||||
#
|
||||
# TODO! we need to reimplement parts of the RESERVED regexp!
|
||||
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Constants;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Exporter ();
|
||||
our @ISA = qw(Exporter);
|
||||
|
||||
our(@BAYES_VARS, @IP_VARS, @SA_VARS, %EXPORT_TAGS, @EXPORT_OK);
|
||||
|
||||
# NOTE: Unless you need these to be available at BEGIN time, you're better with this out of a BEGIN block with a simple our statement.
|
||||
BEGIN {
|
||||
@IP_VARS = qw(
|
||||
IP_IN_RESERVED_RANGE IP_PRIVATE LOCALHOST IPV4_ADDRESS IP_ADDRESS
|
||||
);
|
||||
@BAYES_VARS = qw(
|
||||
DUMP_MAGIC DUMP_TOKEN DUMP_BACKUP
|
||||
);
|
||||
# These are generic constants that may be used across several modules
|
||||
@SA_VARS = qw(
|
||||
HARVEST_DNSBL_PRIORITY MBX_SEPARATOR
|
||||
MAX_BODY_LINE_LENGTH MAX_HEADER_KEY_LENGTH MAX_HEADER_VALUE_LENGTH
|
||||
MAX_HEADER_LENGTH ARITH_EXPRESSION_LEXER AI_TIME_UNKNOWN
|
||||
CHARSETS_LIKELY_TO_FP_AS_CAPS MAX_URI_LENGTH RULENAME_RE IS_RULENAME
|
||||
META_RULES_MATCHING_RE
|
||||
);
|
||||
|
||||
%EXPORT_TAGS = (
|
||||
bayes => [ @BAYES_VARS ],
|
||||
ip => [ @IP_VARS ],
|
||||
sa => [ @SA_VARS ],
|
||||
all => [ @BAYES_VARS, @IP_VARS, @SA_VARS ],
|
||||
);
|
||||
|
||||
@EXPORT_OK = ( @BAYES_VARS, @IP_VARS, @SA_VARS );
|
||||
}
|
||||
|
||||
# BAYES_VARS
|
||||
use constant DUMP_MAGIC => 1;
|
||||
use constant DUMP_TOKEN => 2;
|
||||
use constant DUMP_SEEN => 4;
|
||||
use constant DUMP_BACKUP => 8;
|
||||
|
||||
# IP_VARS
|
||||
# ---------------------------------------------------------------------------
|
||||
# Initialize a regexp for private IPs, i.e. ones that could be
|
||||
# used inside a company and be the first or second relay hit by
|
||||
# a message. Some companies use these internally and translate
|
||||
# them using a NAT firewall. These are listed in the RBL as invalid
|
||||
# originators -- which is true, if you receive the mail directly
|
||||
# from them; however we do not, so we should ignore them.
|
||||
#
|
||||
# sources:
|
||||
# IANA = <https://www.iana.org/numbers>,
|
||||
# 5735 = <https://tools.ietf.org/html/rfc5735>
|
||||
# 6598 = <https://tools.ietf.org/html/rfc6598>
|
||||
# 4193 = <https://tools.ietf.org/html/rfc4193>
|
||||
# CYMRU = <https://www.team-cymru.com/bogon-reference.html>
|
||||
#
|
||||
# This includes:
|
||||
# host-local address space 127.0.0.0/8 and ::1,
|
||||
# link-local address space 169.254.0.0/16 and fe80::/10,
|
||||
# private-use address space 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16,
|
||||
# TODO: Unique Local Unicast Addresses fc00::/7 (RFC 4193)
|
||||
# shared address space 100.64.0.0/10 (RFC 6598 - for use in CGN),
|
||||
# IPv4-mapped IPv6 address ::ffff:0:0/96 (RFC 3513)
|
||||
#
|
||||
use constant IP_PRIVATE => qr{^(?:
|
||||
(?: # IPv4 addresses
|
||||
10| # 10.0.0.0/8 Private Use (5735, 1918)
|
||||
127| # 127.0.0.0/8 Host-local (5735, 1122)
|
||||
169\.254| # 169.254.0.0/16 Link-local (5735, 3927)
|
||||
172\.(?:1[6-9]|2[0-9]|3[01])| # 172.16.0.0/12 Private Use (5735, 1918)
|
||||
192\.168| # 192.168.0.0/16 Private Use (5735, 1918)
|
||||
100\.(?:6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7]) # 100.64.0.0/10 CGN (6598)
|
||||
)\..*
|
||||
|
|
||||
(?: # IPv6 addresses
|
||||
# don't use \b here, it hits on :'s
|
||||
(?:IPv6: # with optional prefix
|
||||
| (?<![a-f0-9:])
|
||||
)
|
||||
(?:
|
||||
# IPv4 mapped in IPv6
|
||||
# note the colon after the 12th byte in each here
|
||||
(?:
|
||||
# first 6 (12 bytes) non-zero
|
||||
(?:0{1,4}:){5} ffff:
|
||||
|
|
||||
# leading zeros omitted (note {0,5} not {1,5})
|
||||
::(?:0{1,4}:){0,4} ffff:
|
||||
|
|
||||
# trailing zeros (in the first 6) omitted
|
||||
(?:0{1,4}:){1,4}: ffff:
|
||||
|
|
||||
# 0000 in second up to (including) fifth omitted
|
||||
0{1,4}::(?:0{1,4}:){1,3} ffff:
|
||||
|
|
||||
# 0000 in third up to (including) fifth omitted
|
||||
(?:0{1,4}:){2}:0{1,2}: ffff:
|
||||
|
|
||||
# 0000 in fourth up to (including) fifth omitted
|
||||
(?:0{1,4}:){3}:0: ffff:
|
||||
|
|
||||
# 0000 in fifth omitted
|
||||
(?:0{1,4}:){4}: ffff:
|
||||
)
|
||||
# and the IPv4 address appended to all of the 12 bytes above
|
||||
(?:
|
||||
10|
|
||||
127|
|
||||
169\.254|
|
||||
172\.(?:1[6-9]|2[0-9]|3[01])|
|
||||
192\.168|
|
||||
100\.(?:6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7])
|
||||
)\..*
|
||||
|
||||
| # or IPv6 link-local address space, fe80::/10
|
||||
fe[89ab][0-9a-f]:.*
|
||||
|
||||
| # or the host-local ::1 addr, as a pure IPv6 address
|
||||
|
||||
# all 8 (16 bytes) of them present
|
||||
(?:0{1,4}:){7} 0{0,3}1
|
||||
|
|
||||
# leading zeros omitted
|
||||
:(?::0{1,4}){0,6}: 0{0,3}1
|
||||
|
|
||||
# 0000 in second up to (including) seventh omitted
|
||||
0{1,4}:(?::0{1,4}){0,5}: 0{0,3}1
|
||||
|
|
||||
# 0000 in third up to (including) seventh omitted
|
||||
(?:0{1,4}:){2}(?::0{1,4}){0,4}: 0{0,3}1
|
||||
|
|
||||
# 0000 in fourth up to (including) seventh omitted
|
||||
(?:0{1,4}:){3}(?::0{1,4}){0,3}: 0{0,3}1
|
||||
|
|
||||
# 0000 in fifth up to (including) seventh omitted
|
||||
(?:0{1,4}:){4}(?::0{1,4}){0,2}: 0{0,3}1
|
||||
|
|
||||
# 0000 in sixth up to (including) seventh omitted
|
||||
(?:0{1,4}:){5}(?::0{1,4}){0,1}: 0{0,3}1
|
||||
|
|
||||
# 0000 in seventh omitted
|
||||
(?:0{1,4}:){6}: 0{0,3}1
|
||||
)
|
||||
(?![a-f0-9:])
|
||||
)
|
||||
)}oxi;
|
||||
|
||||
# backward compatibility
|
||||
use constant IP_IN_RESERVED_RANGE => IP_PRIVATE;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# match the various ways of saying "localhost".
|
||||
|
||||
use constant LOCALHOST => qr/
|
||||
(?:
|
||||
# as a string
|
||||
localhost(?:\.localdomain)?
|
||||
|
|
||||
\b(?<!:) # ensure no "::" IPv6 marker before this one
|
||||
# plain IPv4
|
||||
127\.0\.0\.1 \b
|
||||
|
|
||||
# IPv6 addresses
|
||||
# don't use \b here, it hits on :'s
|
||||
(?:IPv6: # with optional prefix
|
||||
| (?<![a-f0-9:])
|
||||
)
|
||||
(?:
|
||||
# IPv4 mapped in IPv6
|
||||
# note the colon after the 12th byte in each here
|
||||
(?:
|
||||
# first 6 (12 bytes) non-zero
|
||||
(?:0{1,4}:){5} ffff:
|
||||
|
|
||||
# leading zeros omitted (note {0,5} not {1,5})
|
||||
::(?:0{1,4}:){0,4} ffff:
|
||||
|
|
||||
# trailing zeros (in the first 6) omitted
|
||||
(?:0{1,4}:){1,4}: ffff:
|
||||
|
|
||||
# 0000 in second up to (including) fifth omitted
|
||||
0{1,4}::(?:0{1,4}:){1,3} ffff:
|
||||
|
|
||||
# 0000 in third up to (including) fifth omitted
|
||||
(?:0{1,4}:){2}:0{1,2}: ffff:
|
||||
|
|
||||
# 0000 in fourth up to (including) fifth omitted
|
||||
(?:0{1,4}:){3}:0: ffff:
|
||||
|
|
||||
# 0000 in fifth omitted
|
||||
(?:0{1,4}:){4}: ffff:
|
||||
)
|
||||
# and the IPv4 address appended to all of the 12 bytes above
|
||||
127\.0\.0\.1 # no \b, we check later
|
||||
|
||||
| # or (separately) a pure IPv6 address
|
||||
|
||||
# all 8 (16 bytes) of them present
|
||||
(?:0{1,4}:){7} 0{0,3}1
|
||||
|
|
||||
# leading zeros omitted
|
||||
:(?::0{1,4}){0,6}: 0{0,3}1
|
||||
|
|
||||
# 0000 in second up to (including) seventh omitted
|
||||
0{1,4}:(?::0{1,4}){0,5}: 0{0,3}1
|
||||
|
|
||||
# 0000 in third up to (including) seventh omitted
|
||||
(?:0{1,4}:){2}(?::0{1,4}){0,4}: 0{0,3}1
|
||||
|
|
||||
# 0000 in fourth up to (including) seventh omitted
|
||||
(?:0{1,4}:){3}(?::0{1,4}){0,3}: 0{0,3}1
|
||||
|
|
||||
# 0000 in fifth up to (including) seventh omitted
|
||||
(?:0{1,4}:){4}(?::0{1,4}){0,2}: 0{0,3}1
|
||||
|
|
||||
# 0000 in sixth up to (including) seventh omitted
|
||||
(?:0{1,4}:){5}(?::0{1,4}){0,1}: 0{0,3}1
|
||||
|
|
||||
# 0000 in seventh omitted
|
||||
(?:0{1,4}:){6}: 0{0,3}1
|
||||
)
|
||||
(?![a-f0-9:])
|
||||
)
|
||||
/oxi;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# an IP address, in IPv4 format only.
|
||||
#
|
||||
use constant IPV4_ADDRESS => qr/\b
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)
|
||||
\b/ox;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# an IP address, in IPv4, IPv4-mapped-in-IPv6, or IPv6 format. NOTE: cannot
|
||||
# just refer to $IPV4_ADDRESS, due to perl bug reported in nesting qr//s. :(
|
||||
#
|
||||
use constant IP_ADDRESS => qr/
|
||||
(?:
|
||||
\b(?<!:) # ensure no "::" IPv4 marker before this one
|
||||
# plain IPv4, as above
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\b
|
||||
|
|
||||
# IPv6 addresses
|
||||
# don't use \b here, it hits on :'s
|
||||
(?:IPv6: # with optional prefix
|
||||
| (?<![a-f0-9:])
|
||||
)
|
||||
(?:
|
||||
# IPv4 mapped in IPv6
|
||||
# note the colon after the 12th byte in each here
|
||||
(?:
|
||||
# first 6 (12 bytes) non-zero
|
||||
(?:[a-f0-9]{1,4}:){6}
|
||||
|
|
||||
# leading zeros omitted (note {0,5} not {1,5})
|
||||
::(?:[a-f0-9]{1,4}:){0,5}
|
||||
|
|
||||
# trailing zeros (in the first 6) omitted
|
||||
(?:[a-f0-9]{1,4}:){1,5}:
|
||||
|
|
||||
# 0000 in second up to (including) fifth omitted
|
||||
[a-f0-9]{1,4}::(?:[a-f0-9]{1,4}:){1,4}
|
||||
|
|
||||
# 0000 in third up to (including) fifth omitted
|
||||
(?:[a-f0-9]{1,4}:){2}:(?:[a-f0-9]{1,4}:){1,3}
|
||||
|
|
||||
# 0000 in fourth up to (including) fifth omitted
|
||||
(?:[a-f0-9]{1,4}:){3}:(?:[a-f0-9]{1,4}:){1,2}
|
||||
|
|
||||
# 0000 in fifth omitted
|
||||
(?:[a-f0-9]{1,4}:){4}:[a-f0-9]{1,4}:
|
||||
)
|
||||
# and the IPv4 address appended to all of the 12 bytes above
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d) # no \b, we check later
|
||||
|
||||
| # or (separately) a pure IPv6 address
|
||||
|
||||
# all 8 (16 bytes) of them present
|
||||
(?:[a-f0-9]{1,4}:){7}[a-f0-9]{1,4}
|
||||
|
|
||||
# leading zeros omitted
|
||||
:(?::[a-f0-9]{1,4}){1,7}
|
||||
|
|
||||
# trailing zeros omitted
|
||||
(?:[a-f0-9]{1,4}:){1,7}:
|
||||
|
|
||||
# 0000 in second up to (including) seventh omitted
|
||||
[a-f0-9]{1,4}:(?::[a-f0-9]{1,4}){1,6}
|
||||
|
|
||||
# 0000 in third up to (including) seventh omitted
|
||||
(?:[a-f0-9]{1,4}:){2}(?::[a-f0-9]{1,4}){1,5}
|
||||
|
|
||||
# 0000 in fourth up to (including) seventh omitted
|
||||
(?:[a-f0-9]{1,4}:){3}(?::[a-f0-9]{1,4}){1,4}
|
||||
|
|
||||
# 0000 in fifth up to (including) seventh omitted
|
||||
(?:[a-f0-9]{1,4}:){4}(?::[a-f0-9]{1,4}){1,3}
|
||||
|
|
||||
# 0000 in sixth up to (including) seventh omitted
|
||||
(?:[a-f0-9]{1,4}:){5}(?::[a-f0-9]{1,4}){1,2}
|
||||
|
|
||||
# 0000 in seventh omitted
|
||||
(?:[a-f0-9]{1,4}:){6}:[a-f0-9]{1,4}
|
||||
|
|
||||
# :: (the unspecified address 0:0:0:0:0:0:0:0)
|
||||
# dos: I don't expect to see this address in a header, and
|
||||
# it may cause non-address strings to match, but we'll
|
||||
# include it for now since it is valid
|
||||
::
|
||||
)
|
||||
(?![a-f0-9:])
|
||||
)
|
||||
/oxi;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
use constant HARVEST_DNSBL_PRIORITY => 500;
|
||||
|
||||
# regular expression that matches message separators in The University of
|
||||
# Washington's MBX mailbox format
|
||||
use constant MBX_SEPARATOR => qr/^([\s\d]\d-[a-zA-Z]{3}-\d{4}\s\d{2}:\d{2}:\d{2}.*),(\d+);([\da-f]{12})-(\w{8})\r?$/;
|
||||
# $1 = datestamp (str)
|
||||
# $2 = size of message in bytes (int)
|
||||
# $3 = message status - binary (hex)
|
||||
# $4 = message ID (hex)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# values used for internal message representations
|
||||
|
||||
# maximum byte length of lines in the body
|
||||
use constant MAX_BODY_LINE_LENGTH => 2048;
|
||||
# maximum byte length of a header key
|
||||
use constant MAX_HEADER_KEY_LENGTH => 256;
|
||||
# maximum byte length of a header value including continued lines
|
||||
use constant MAX_HEADER_VALUE_LENGTH => 8192;
|
||||
# maximum byte length of entire header
|
||||
use constant MAX_HEADER_LENGTH => 65536;
|
||||
|
||||
# maximum byte length of any given URI
|
||||
use constant MAX_URI_LENGTH => 8192;
|
||||
|
||||
# used for meta rules and "if" conditionals in Conf::Parser
|
||||
use constant ARITH_EXPRESSION_LEXER => qr/(?:
|
||||
[\-\+\d\.]+| # A Number
|
||||
\w[\w\:]*| # Rule or Class Name
|
||||
[\(\)]| # Parens
|
||||
\|\|| # Boolean OR
|
||||
\&\&| # Boolean AND
|
||||
\^| # Boolean XOR
|
||||
!(?!=)| # Boolean NOT
|
||||
>=?| # GT or EQ
|
||||
<=?| # LT or EQ
|
||||
==| # EQ
|
||||
!=| # NEQ
|
||||
[\+\-\*\/]| # Mathematical Operator
|
||||
[\?:] # ? : Operator
|
||||
)/ox;
|
||||
|
||||
# ArchiveIterator
|
||||
|
||||
# if AI doesn't read in the message in the first pass to see if the received
|
||||
# date makes the message useful or not, we need to mark it so that in the
|
||||
# second pass (when the message is actually read + processed) the received
|
||||
# date is calculated. this value signifies "unknown" from the first pass.
|
||||
use constant AI_TIME_UNKNOWN => 0;
|
||||
|
||||
# Charsets which use capital letters heavily in their encoded representation.
|
||||
use constant CHARSETS_LIKELY_TO_FP_AS_CAPS => qr{[-_a-z0-9]*(?:
|
||||
koi|jp|jis|euc|gb|big5|isoir|cp1251|windows-1251|georgianps|pt154|tis
|
||||
)[-_a-z0-9]*}ix;
|
||||
|
||||
# Allowed rulename format
|
||||
use constant RULENAME_RE => qr([_a-zA-Z][_a-zA-Z0-9]{0,127});
|
||||
# Exact match
|
||||
use constant IS_RULENAME => qr/^${\(RULENAME_RE)}$/;
|
||||
|
||||
# meta function rules_matching(), takes argument RULENAME_RE with glob *? characters
|
||||
use constant META_RULES_MATCHING_RE => qr/(?<!_)\brules_matching\(\s*([_a-zA-Z*?][_a-zA-Z0-9*?]{0,127})\s*\)/;
|
||||
|
||||
1;
|
179
upstream/lib/Mail/SpamAssassin/DBBasedAddrList.pm
Normal file
179
upstream/lib/Mail/SpamAssassin/DBBasedAddrList.pm
Normal file
@ -0,0 +1,179 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::DBBasedAddrList;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Fcntl;
|
||||
|
||||
use Mail::SpamAssassin::PersistentAddrList;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::PersistentAddrList);
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new(@_);
|
||||
$self->{class} = $class;
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new_checker {
|
||||
my ($factory, $main) = @_;
|
||||
my $class = $factory->{class};
|
||||
|
||||
my $self = {
|
||||
'main' => $main,
|
||||
'accum' => { },
|
||||
'is_locked' => 0,
|
||||
'locked_file' => ''
|
||||
};
|
||||
|
||||
my @order = split(/\s+/, $main->{conf}->{auto_whitelist_db_modules});
|
||||
untaint_var(\@order);
|
||||
my $dbm_module = Mail::SpamAssassin::Util::first_available_module (@order);
|
||||
if (!$dbm_module) {
|
||||
die "auto-whitelist: cannot find a usable DB package from auto_whitelist_db_modules: " .
|
||||
$main->{conf}->{auto_whitelist_db_modules}."\n";
|
||||
}
|
||||
|
||||
my $umask = umask ~ (oct($main->{conf}->{auto_whitelist_file_mode}));
|
||||
|
||||
# if undef then don't worry -- empty hash!
|
||||
if (defined($main->{conf}->{auto_whitelist_path})) {
|
||||
my $path = $main->sed_path($main->{conf}->{auto_whitelist_path});
|
||||
my ($mod1, $mod2);
|
||||
|
||||
if ($main->{locker}->safe_lock
|
||||
($path, 30, $main->{conf}->{auto_whitelist_file_mode}))
|
||||
{
|
||||
$self->{locked_file} = $path;
|
||||
$self->{is_locked} = 1;
|
||||
($mod1, $mod2) = ('R/W', O_RDWR | O_CREAT);
|
||||
}
|
||||
else {
|
||||
$self->{is_locked} = 0;
|
||||
($mod1, $mod2) = ('R/O', O_RDONLY);
|
||||
}
|
||||
|
||||
dbg("auto-whitelist: tie-ing to DB file of type $dbm_module $mod1 in $path");
|
||||
|
||||
($self->{is_locked} && $dbm_module eq 'DB_File') and
|
||||
Mail::SpamAssassin::Util::avoid_db_file_locking_bug($path);
|
||||
|
||||
if (! tie %{ $self->{accum} }, $dbm_module, $path, $mod2,
|
||||
oct($main->{conf}->{auto_whitelist_file_mode}) & 0666)
|
||||
{
|
||||
my $err = $!; # might get overwritten later
|
||||
if ($self->{is_locked}) {
|
||||
$self->{main}->{locker}->safe_unlock($self->{locked_file});
|
||||
$self->{is_locked} = 0;
|
||||
}
|
||||
die "auto-whitelist: cannot open auto_whitelist_path $path: $err\n";
|
||||
}
|
||||
}
|
||||
umask $umask;
|
||||
|
||||
bless ($self, $class);
|
||||
return $self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub finish {
|
||||
my $self = shift;
|
||||
dbg("auto-whitelist: DB addr list: untie-ing and unlocking");
|
||||
untie %{$self->{accum}};
|
||||
if ($self->{is_locked}) {
|
||||
dbg("auto-whitelist: DB addr list: file locked, breaking lock");
|
||||
$self->{main}->{locker}->safe_unlock ($self->{locked_file});
|
||||
$self->{is_locked} = 0;
|
||||
}
|
||||
# TODO: untrap signals to unlock the db file here
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub get_addr_entry {
|
||||
my ($self, $addr, $signedby) = @_;
|
||||
|
||||
my $entry = {
|
||||
addr => $addr,
|
||||
};
|
||||
|
||||
$entry->{msgcount} = $self->{accum}->{$addr} || 0;
|
||||
$entry->{totscore} = $self->{accum}->{$addr.'|totscore'} || 0;
|
||||
|
||||
dbg("auto-whitelist: db-based $addr scores ".$entry->{msgcount}.'/'.$entry->{totscore});
|
||||
return $entry;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub add_score {
|
||||
my($self, $entry, $score) = @_;
|
||||
|
||||
$entry->{msgcount} ||= 0;
|
||||
$entry->{addr} ||= '';
|
||||
|
||||
$entry->{msgcount}++;
|
||||
$entry->{totscore} += $score;
|
||||
|
||||
dbg("auto-whitelist: add_score: new count: ".$entry->{msgcount}.", new totscore: ".$entry->{totscore});
|
||||
|
||||
$self->{accum}->{$entry->{addr}} = $entry->{msgcount};
|
||||
$self->{accum}->{$entry->{addr}.'|totscore'} = $entry->{totscore};
|
||||
return $entry;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub remove_entry {
|
||||
my ($self, $entry) = @_;
|
||||
|
||||
my $addr = $entry->{addr};
|
||||
delete $self->{accum}->{$addr};
|
||||
delete $self->{accum}->{$addr.'|totscore'};
|
||||
|
||||
if ($addr =~ /^(.*)\|ip=none$/) {
|
||||
# it doesn't have an IP attached.
|
||||
# try to delete any per-IP entries for this addr as well.
|
||||
# could be slow...
|
||||
my $mailaddr = $1;
|
||||
|
||||
while (my ($key, $value) = each %{$self->{accum}}) {
|
||||
# regex will catch both key and key|totscore entries and delete them
|
||||
if ($key =~ /^\Q${mailaddr}\E\|/) {
|
||||
delete $self->{accum}->{$key};
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
730
upstream/lib/Mail/SpamAssassin/Dns.pm
Normal file
730
upstream/lib/Mail/SpamAssassin/Dns.pm
Normal file
@ -0,0 +1,730 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
use strict; # make Test::Perl::Critic happy
|
||||
package Mail::SpamAssassin::Dns; 1;
|
||||
|
||||
package Mail::SpamAssassin::PerMsgStatus;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Conf;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::AsyncLoop;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var am_running_on_windows);
|
||||
|
||||
use File::Spec;
|
||||
use IO::Socket;
|
||||
use POSIX ":sys_wait_h";
|
||||
|
||||
|
||||
our $KNOWN_BAD_DIALUP_RANGES; # Nothing uses this var???
|
||||
our $LAST_DNS_CHECK;
|
||||
|
||||
# use very well-connected domains (fast DNS response, many DNS servers,
|
||||
# geographical distribution is a plus, TTL of at least 3600s)
|
||||
our @EXISTING_DOMAINS = qw{
|
||||
adelphia.net
|
||||
akamai.com
|
||||
apache.org
|
||||
cingular.com
|
||||
colorado.edu
|
||||
comcast.net
|
||||
doubleclick.com
|
||||
ebay.com
|
||||
gmx.net
|
||||
google.com
|
||||
intel.com
|
||||
kernel.org
|
||||
linux.org
|
||||
mit.edu
|
||||
motorola.com
|
||||
msn.com
|
||||
sourceforge.net
|
||||
sun.com
|
||||
w3.org
|
||||
yahoo.com
|
||||
};
|
||||
|
||||
our $IS_DNS_AVAILABLE = undef;
|
||||
|
||||
#Removed $VERSION per BUG 6422
|
||||
#$VERSION = 'bogus'; # avoid CPAN.pm picking up razor ver
|
||||
|
||||
###########################################################################
|
||||
|
||||
BEGIN {
|
||||
# some trickery. Load these modules right here, if possible; that way, if
|
||||
# the module exists, we'll get it loaded now. Very useful to avoid attempted
|
||||
# loads later (which will happen). If we do a fork(), we could wind up
|
||||
# attempting to load these modules in *every* subprocess.
|
||||
#
|
||||
# # We turn off strict and warnings, because Net::DNS and Razor both contain
|
||||
# # crud that -w complains about (perl 5.6.0). Not that this seems to work,
|
||||
# # mind ;)
|
||||
# no strict;
|
||||
# local ($^W) = 0;
|
||||
|
||||
no warnings;
|
||||
eval {
|
||||
require Net::DNS;
|
||||
require Net::DNS::Resolver;
|
||||
};
|
||||
eval {
|
||||
require MIME::Base64;
|
||||
};
|
||||
eval {
|
||||
require IO::Socket::UNIX;
|
||||
};
|
||||
};
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub do_rbl_lookup {
|
||||
my ($self, $rule, $set, $type, $host, $subtest) = @_;
|
||||
|
||||
$host =~ s/\.\z//s; # strip a redundant trailing dot
|
||||
my $key = "dns:$type:$host";
|
||||
my $existing_ent = $self->{async}->get_lookup($key);
|
||||
|
||||
# only make a specific query once
|
||||
if (!$existing_ent) {
|
||||
my $ent = {
|
||||
key => $key,
|
||||
zone => $host, # serves to fetch other per-zone settings
|
||||
type => "DNSBL-".$type,
|
||||
sets => [ ], # filled in below
|
||||
rules => [ ], # filled in below
|
||||
# id is filled in after we send the query below
|
||||
};
|
||||
$existing_ent = $self->{async}->bgsend_and_start_lookup(
|
||||
$host, $type, undef, $ent,
|
||||
sub { my($ent, $pkt) = @_; $self->process_dnsbl_result($ent, $pkt) },
|
||||
master_deadline => $self->{master_deadline} );
|
||||
}
|
||||
|
||||
if ($existing_ent) {
|
||||
# always add set
|
||||
push @{$existing_ent->{sets}}, $set;
|
||||
|
||||
# sometimes match or always match
|
||||
if (defined $subtest) {
|
||||
$self->{dnspost}->{$set}->{$subtest} = $rule;
|
||||
} else {
|
||||
push @{$existing_ent->{rules}}, $rule;
|
||||
}
|
||||
|
||||
$self->{rule_to_rblkey}->{$rule} = $key;
|
||||
}
|
||||
}
|
||||
|
||||
# TODO: these are constant so they should only be added once at startup
|
||||
sub register_rbl_subtest {
|
||||
my ($self, $rule, $set, $subtest) = @_;
|
||||
|
||||
if ($subtest =~ /^sb:/) {
|
||||
warn("dns: ignored $rule, SenderBase rules are deprecated\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
$self->{dnspost}->{$set}->{$subtest} = $rule;
|
||||
}
|
||||
|
||||
sub do_dns_lookup {
|
||||
my ($self, $rule, $type, $host) = @_;
|
||||
|
||||
$host =~ s/\.\z//s; # strip a redundant trailing dot
|
||||
my $key = "dns:$type:$host";
|
||||
|
||||
my $ent = {
|
||||
key => $key,
|
||||
zone => $host, # serves to fetch other per-zone settings
|
||||
type => "DNSBL-".$type,
|
||||
rules => [ $rule ],
|
||||
# id is filled in after we send the query below
|
||||
};
|
||||
$ent = $self->{async}->bgsend_and_start_lookup(
|
||||
$host, $type, undef, $ent,
|
||||
sub { my($ent, $pkt) = @_; $self->process_dnsbl_result($ent, $pkt) },
|
||||
master_deadline => $self->{master_deadline} );
|
||||
$ent;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub dnsbl_hit {
|
||||
my ($self, $rule, $question, $answer) = @_;
|
||||
|
||||
my $log = "";
|
||||
if (substr($rule, 0, 2) eq "__") {
|
||||
# don't bother with meta rules
|
||||
} elsif ($answer->type eq 'TXT') {
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
$log = join('',$answer->txtdata);
|
||||
local $1;
|
||||
$log =~ s{ (?<! [<(\[] ) (https? : // \S+)}{<$1>}xgi;
|
||||
} else { # assuming $answer->type eq 'A'
|
||||
local($1,$2,$3,$4,$5);
|
||||
if ($question->string =~ m/^((?:[0-9a-fA-F]\.){32})(\S+\w)/) {
|
||||
$log = ' listed in ' . lc($2);
|
||||
my $ipv6addr = join('', reverse split(/\./, lc $1));
|
||||
$ipv6addr =~ s/\G(....)/$1:/g; chop $ipv6addr;
|
||||
$ipv6addr =~ s/:0{1,3}/:/g;
|
||||
$log = $ipv6addr . $log;
|
||||
} elsif ($question->string =~ m/^(\d+)\.(\d+)\.(\d+)\.(\d+)\.(\S+\w)/) {
|
||||
$log = "$4.$3.$2.$1 listed in " . lc($5);
|
||||
} else {
|
||||
$log = 'listed in ' . $question->string;
|
||||
}
|
||||
}
|
||||
|
||||
# TODO: this may result in some log messages appearing under the
|
||||
# wrong rules, since we could see this sequence: { test one hits,
|
||||
# test one's message is logged, test two hits, test one fires again
|
||||
# on another IP, test one's message is logged for that other IP --
|
||||
# but under test two's heading }. Right now though it's better
|
||||
# than just not logging at all.
|
||||
|
||||
$self->{already_logged} ||= { };
|
||||
if ($log && !$self->{already_logged}->{$log}) {
|
||||
$self->test_log($log);
|
||||
$self->{already_logged}->{$log} = 1;
|
||||
}
|
||||
|
||||
if (!$self->{tests_already_hit}->{$rule}) {
|
||||
$self->got_hit($rule, "RBL: ", ruletype => "dnsbl");
|
||||
}
|
||||
}
|
||||
|
||||
sub dnsbl_uri {
|
||||
my ($self, $question, $answer) = @_;
|
||||
|
||||
my $qname = $question->qname;
|
||||
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
#
|
||||
# rdatastr() is historical/undocumented, use rdstring() since Net::DNS 0.69
|
||||
my $rdatastr = $answer->UNIVERSAL::can('txtdata') ? join('',$answer->txtdata)
|
||||
: $answer->UNIVERSAL::can('rdstring') ? $answer->rdstring
|
||||
: $answer->rdatastr;
|
||||
if (defined $qname && defined $rdatastr) {
|
||||
my $qclass = $question->qclass;
|
||||
my $qtype = $question->qtype;
|
||||
my @vals;
|
||||
push(@vals, "class=$qclass") if $qclass ne "IN";
|
||||
push(@vals, "type=$qtype") if $qtype ne "A";
|
||||
my $uri = "dns:$qname" . (@vals ? "?" . join(";", @vals) : "");
|
||||
push @{ $self->{dnsuri}->{$uri} }, $rdatastr;
|
||||
|
||||
dbg("dns: hit <$uri> $rdatastr");
|
||||
}
|
||||
}
|
||||
|
||||
# called as a completion routine to bgsend by DnsResolver::poll_responses;
|
||||
# returns 1 on successful packet processing
|
||||
sub process_dnsbl_result {
|
||||
my ($self, $ent, $pkt) = @_;
|
||||
|
||||
return if !$pkt;
|
||||
my $question = ($pkt->question)[0];
|
||||
return if !$question;
|
||||
|
||||
my $sets = $ent->{sets} || [];
|
||||
my $rules = $ent->{rules};
|
||||
|
||||
# NO_DNS_FOR_FROM
|
||||
if ($self->{sender_host} &&
|
||||
# fishy, qname should have been "RFC 1035 zone format" -decoded first
|
||||
lc($question->qname) eq lc($self->{sender_host}) &&
|
||||
$question->qtype =~ /^(?:A|MX)$/ &&
|
||||
$pkt->header->rcode =~ /^(?:NXDOMAIN|SERVFAIL)$/ &&
|
||||
++$self->{sender_host_fail} == 2)
|
||||
{
|
||||
for my $rule (@{$rules}) {
|
||||
$self->got_hit($rule, "DNS: ", ruletype => "dns");
|
||||
}
|
||||
}
|
||||
|
||||
# DNSBL tests are here
|
||||
foreach my $answer ($pkt->answer) {
|
||||
next if !$answer;
|
||||
# track all responses
|
||||
$self->dnsbl_uri($question, $answer);
|
||||
my $answ_type = $answer->type;
|
||||
# TODO: there are some CNAME returns that might be useful
|
||||
next if ($answ_type ne 'A' && $answ_type ne 'TXT');
|
||||
if ($answ_type eq 'A') {
|
||||
# Net::DNS::RR::A::address() is available since Net::DNS 0.69
|
||||
my $ip_address = $answer->UNIVERSAL::can('address') ? $answer->address
|
||||
: $answer->rdatastr;
|
||||
# skip any A record that isn't on 127.0.0.0/8
|
||||
next if $ip_address !~ /^127\./;
|
||||
}
|
||||
for my $rule (@{$rules}) {
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
}
|
||||
for my $set (@{$sets}) {
|
||||
if ($self->{dnspost}->{$set}) {
|
||||
$self->process_dnsbl_set($set, $question, $answer);
|
||||
}
|
||||
}
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub process_dnsbl_set {
|
||||
my ($self, $set, $question, $answer) = @_;
|
||||
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
#
|
||||
# rdatastr() is historical/undocumented, use rdstring() since Net::DNS 0.69
|
||||
my $rdatastr = $answer->UNIVERSAL::can('txtdata') ? join('',$answer->txtdata)
|
||||
: $answer->UNIVERSAL::can('rdstring') ? $answer->rdstring
|
||||
: $answer->rdatastr;
|
||||
|
||||
while (my ($subtest, $rule) = each %{ $self->{dnspost}->{$set} }) {
|
||||
next if $self->{tests_already_hit}->{$rule};
|
||||
|
||||
if ($subtest =~ /^\d+\.\d+\.\d+\.\d+$/) {
|
||||
# test for exact equality, not a regexp (an IPv4 address)
|
||||
$self->dnsbl_hit($rule, $question, $answer) if $subtest eq $rdatastr;
|
||||
}
|
||||
# bitmask
|
||||
elsif ($subtest =~ /^\d+$/) {
|
||||
# Bug 6803: response should be within 127.0.0.0/8, ignore otherwise
|
||||
if ($rdatastr =~ m/^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/ &&
|
||||
Mail::SpamAssassin::Util::my_inet_aton($rdatastr) & $subtest)
|
||||
{
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
}
|
||||
}
|
||||
# regular expression
|
||||
else {
|
||||
my $test = qr/$subtest/;
|
||||
if ($rdatastr =~ /$test/) {
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
sub harvest_until_rule_completes {
|
||||
my ($self, $rule) = @_;
|
||||
|
||||
dbg("dns: harvest_until_rule_completes");
|
||||
my $result = 0;
|
||||
|
||||
for (my $first=1; ; $first=0) {
|
||||
# complete_lookups() may call completed_callback(), which may
|
||||
# call start_lookup() again (like in Plugin::URIDNSBL)
|
||||
my ($alldone,$anydone) =
|
||||
$self->{async}->complete_lookups($first ? 0 : 1.0, 1);
|
||||
|
||||
$result = 1 if $self->is_rule_complete($rule);
|
||||
last if $result || $alldone;
|
||||
|
||||
dbg("dns: harvest_until_rule_completes - check_tick");
|
||||
$self->{main}->call_plugins ("check_tick", { permsgstatus => $self });
|
||||
}
|
||||
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub harvest_dnsbl_queries {
|
||||
my ($self) = @_;
|
||||
|
||||
dbg("dns: harvest_dnsbl_queries");
|
||||
|
||||
for (my $first=1; ; $first=0) {
|
||||
# complete_lookups() may call completed_callback(), which may
|
||||
# call start_lookup() again (like in Plugin::URIDNSBL)
|
||||
|
||||
# the first time around we specify a 0 timeout, which gives
|
||||
# complete_lookups a chance to ripe any available results and
|
||||
# abort overdue requests, without needlessly waiting for more
|
||||
|
||||
my ($alldone,$anydone) =
|
||||
$self->{async}->complete_lookups($first ? 0 : 1.0, 1);
|
||||
|
||||
last if $alldone;
|
||||
|
||||
dbg("dns: harvest_dnsbl_queries - check_tick");
|
||||
$self->{main}->call_plugins ("check_tick", { permsgstatus => $self });
|
||||
}
|
||||
|
||||
# explicitly abort anything left
|
||||
$self->{async}->abort_remaining_lookups();
|
||||
$self->{async}->log_lookups_timing();
|
||||
$self->mark_all_async_rules_complete();
|
||||
1;
|
||||
}
|
||||
|
||||
# collect and process whatever DNS responses have already arrived,
|
||||
# don't waste time waiting for more, don't poll too often.
|
||||
# don't abort any queries even if overdue,
|
||||
sub harvest_completed_queries {
|
||||
my ($self) = @_;
|
||||
|
||||
# don't bother collecting responses too often
|
||||
my $last_poll_time = $self->{async}->last_poll_responses_time();
|
||||
return if defined $last_poll_time && time - $last_poll_time < 0.1;
|
||||
|
||||
my ($alldone,$anydone) = $self->{async}->complete_lookups(0, 0);
|
||||
if ($anydone) {
|
||||
dbg("dns: harvested completed queries");
|
||||
# $self->{main}->call_plugins ("check_tick", { permsgstatus => $self });
|
||||
}
|
||||
}
|
||||
|
||||
sub set_rbl_tag_data {
|
||||
my ($self) = @_;
|
||||
|
||||
# DNS URIs
|
||||
my $rbl_tag = $self->{tag_data}->{RBL}; # just in case, should be empty
|
||||
$rbl_tag = '' if !defined $rbl_tag;
|
||||
while (my ($dnsuri, $answers) = each %{ $self->{dnsuri} }) {
|
||||
# when parsing, look for elements of \".*?\" or \S+ with ", " as separator
|
||||
$rbl_tag .= "<$dnsuri>" . " [" . join(", ", @{ $answers }) . "]\n";
|
||||
}
|
||||
if (defined $rbl_tag && $rbl_tag ne '') {
|
||||
chomp $rbl_tag;
|
||||
$self->set_tag('RBL', $rbl_tag);
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub rbl_finish {
|
||||
my ($self) = @_;
|
||||
|
||||
$self->set_rbl_tag_data();
|
||||
|
||||
delete $self->{dnspost};
|
||||
delete $self->{dnsuri};
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub load_resolver {
|
||||
my ($self) = @_;
|
||||
$self->{resolver} = $self->{main}->{resolver};
|
||||
return $self->{resolver}->load_resolver();
|
||||
}
|
||||
|
||||
sub clear_resolver {
|
||||
my ($self) = @_;
|
||||
dbg("dns: clear_resolver");
|
||||
$self->{main}->{resolver}->{res} = undef;
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub lookup_ns {
|
||||
my ($self, $dom) = @_;
|
||||
|
||||
return unless $self->load_resolver();
|
||||
return if ($self->server_failed_to_respond_for_domain ($dom));
|
||||
|
||||
my $nsrecords;
|
||||
dbg("dns: looking up NS for '$dom'");
|
||||
|
||||
eval {
|
||||
my $query = $self->{resolver}->send($dom, 'NS');
|
||||
my @nses;
|
||||
if ($query) {
|
||||
foreach my $rr ($query->answer) {
|
||||
if ($rr->type eq "NS") { push (@nses, $rr->nsdname); }
|
||||
}
|
||||
}
|
||||
$nsrecords = [ @nses ];
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("dns: NS lookup failed horribly, perhaps bad resolv.conf setting? (%s)", $eval_stat);
|
||||
return;
|
||||
};
|
||||
|
||||
$nsrecords;
|
||||
}
|
||||
|
||||
sub is_dns_available {
|
||||
my ($self) = @_;
|
||||
my $dnsopt = $self->{conf}->{dns_available};
|
||||
my $dnsint = $self->{conf}->{dns_test_interval} || 600;
|
||||
my @domains;
|
||||
|
||||
$LAST_DNS_CHECK ||= 0;
|
||||
my $diff = time() - $LAST_DNS_CHECK;
|
||||
|
||||
# undef $IS_DNS_AVAILABLE if we should be testing for
|
||||
# working DNS and our check interval time has passed
|
||||
if ($dnsopt eq "test" && $diff > $dnsint) {
|
||||
$IS_DNS_AVAILABLE = undef;
|
||||
dbg("dns: is_dns_available() last checked %.1f seconds ago; re-checking",
|
||||
$diff);
|
||||
}
|
||||
|
||||
return $IS_DNS_AVAILABLE if (defined $IS_DNS_AVAILABLE);
|
||||
$LAST_DNS_CHECK = time();
|
||||
|
||||
$IS_DNS_AVAILABLE = 0;
|
||||
if ($dnsopt eq "no") {
|
||||
dbg("dns: dns_available set to no in config file, skipping test");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
|
||||
# Even if "dns_available" is explicitly set to "yes", we want to ignore
|
||||
# DNS if we're only supposed to be looking at local tests.
|
||||
goto done if ($self->{main}->{local_tests_only});
|
||||
|
||||
# Check version numbers - runtime check only
|
||||
if (defined $Net::DNS::VERSION) {
|
||||
if (am_running_on_windows()) {
|
||||
if ($Net::DNS::VERSION < 0.46) {
|
||||
warn("dns: Net::DNS version is $Net::DNS::VERSION, but need 0.46 for Win32");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
}
|
||||
else {
|
||||
if ($Net::DNS::VERSION < 0.34) {
|
||||
warn("dns: Net::DNS version is $Net::DNS::VERSION, but need 0.34");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
$self->clear_resolver();
|
||||
goto done unless $self->load_resolver();
|
||||
|
||||
if ($dnsopt eq "yes") {
|
||||
# optionally shuffle the list of nameservers to distribute the load
|
||||
if ($self->{conf}->{dns_options}->{rotate}) {
|
||||
my @nameservers = $self->{resolver}->available_nameservers();
|
||||
Mail::SpamAssassin::Util::fisher_yates_shuffle(\@nameservers);
|
||||
dbg("dns: shuffled NS list: " . join(", ", @nameservers));
|
||||
$self->{resolver}->available_nameservers(@nameservers);
|
||||
}
|
||||
$IS_DNS_AVAILABLE = 1;
|
||||
dbg("dns: dns_available set to yes in config file, skipping test");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
|
||||
if ($dnsopt =~ /^test:\s*(\S.*)$/) {
|
||||
@domains = split (/\s+/, $1);
|
||||
dbg("dns: looking up NS records for user specified domains: %s",
|
||||
join(", ", @domains));
|
||||
} else {
|
||||
@domains = @EXISTING_DOMAINS;
|
||||
dbg("dns: looking up NS records for built-in domains");
|
||||
}
|
||||
|
||||
# do the test with a full set of configured nameservers
|
||||
my @nameservers = $self->{resolver}->configured_nameservers();
|
||||
|
||||
# optionally shuffle the list of nameservers to distribute the load
|
||||
if ($self->{conf}->{dns_options}->{rotate}) {
|
||||
Mail::SpamAssassin::Util::fisher_yates_shuffle(\@nameservers);
|
||||
dbg("dns: shuffled NS list, testing: " . join(", ", @nameservers));
|
||||
} else {
|
||||
dbg("dns: testing resolver nameservers: " . join(", ", @nameservers));
|
||||
}
|
||||
|
||||
# Try the different nameservers here and collect a list of working servers
|
||||
my @good_nameservers;
|
||||
foreach my $ns (@nameservers) {
|
||||
$self->{resolver}->available_nameservers($ns); # try just this one
|
||||
for (my $retry = 3; $retry > 0 && @domains; $retry--) {
|
||||
my $domain = splice(@domains, rand(@domains), 1);
|
||||
dbg("dns: trying ($retry) $domain, server $ns ...");
|
||||
my $result = $self->lookup_ns($domain);
|
||||
$self->{resolver}->finish_socket();
|
||||
if (!$result) {
|
||||
dbg("dns: NS lookup of $domain using $ns failed horribly, ".
|
||||
"may not be a valid nameserver");
|
||||
last;
|
||||
} elsif (!@$result) {
|
||||
dbg("dns: NS lookup of $domain using $ns failed, no results found");
|
||||
} else {
|
||||
dbg("dns: NS lookup of $domain using $ns succeeded => DNS available".
|
||||
" (set dns_available to override)");
|
||||
push(@good_nameservers, $ns);
|
||||
last;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (!@good_nameservers) {
|
||||
dbg("dns: all NS queries failed => DNS unavailable ".
|
||||
"(set dns_available to override)");
|
||||
} else {
|
||||
$IS_DNS_AVAILABLE = 1;
|
||||
dbg("dns: NS list: ".join(", ", @good_nameservers));
|
||||
$self->{resolver}->available_nameservers(@good_nameservers);
|
||||
}
|
||||
|
||||
done:
|
||||
# jm: leaving this in!
|
||||
dbg("dns: is DNS available? " . $IS_DNS_AVAILABLE);
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub server_failed_to_respond_for_domain {
|
||||
my ($self, $dom) = @_;
|
||||
if ($self->{dns_server_too_slow}->{$dom}) {
|
||||
dbg("dns: server for '$dom' failed to reply previously, not asking again");
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub set_server_failed_to_respond_for_domain {
|
||||
my ($self, $dom) = @_;
|
||||
dbg("dns: server for '$dom' failed to reply, marking as bad");
|
||||
$self->{dns_server_too_slow}->{$dom} = 1;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub enter_helper_run_mode {
|
||||
my ($self) = @_;
|
||||
|
||||
dbg("dns: entering helper-app run mode");
|
||||
$self->{old_slash} = $/; # Razor pollutes this
|
||||
%{$self->{old_env}} = ();
|
||||
if ( %ENV ) {
|
||||
# undefined values in %ENV can result due to autovivification elsewhere,
|
||||
# this prevents later possible warnings when we restore %ENV
|
||||
while (my ($key, $value) = each %ENV) {
|
||||
$self->{old_env}->{$key} = $value if defined $value;
|
||||
}
|
||||
}
|
||||
|
||||
Mail::SpamAssassin::Util::clean_path_in_taint_mode();
|
||||
|
||||
my $newhome;
|
||||
if ($self->{main}->{home_dir_for_helpers}) {
|
||||
$newhome = $self->{main}->{home_dir_for_helpers};
|
||||
} else {
|
||||
# use spamd -u user's home dir
|
||||
$newhome = (Mail::SpamAssassin::Util::portable_getpwuid ($>))[7];
|
||||
}
|
||||
|
||||
if ($newhome) {
|
||||
$ENV{'HOME'} = Mail::SpamAssassin::Util::untaint_file_path ($newhome);
|
||||
}
|
||||
|
||||
# enforce SIGCHLD as DEFAULT; IGNORE causes spurious kernel warnings
|
||||
# on Red Hat NPTL kernels (bug 1536), and some users of the
|
||||
# Mail::SpamAssassin modules set SIGCHLD to be a fatal signal
|
||||
# for some reason! (bug 3507)
|
||||
$self->{old_sigchld_handler} = $SIG{CHLD};
|
||||
$SIG{CHLD} = 'DEFAULT';
|
||||
}
|
||||
|
||||
sub leave_helper_run_mode {
|
||||
my ($self) = @_;
|
||||
|
||||
dbg("dns: leaving helper-app run mode");
|
||||
$/ = $self->{old_slash};
|
||||
%ENV = %{$self->{old_env}};
|
||||
|
||||
if (defined $self->{old_sigchld_handler}) {
|
||||
$SIG{CHLD} = $self->{old_sigchld_handler};
|
||||
} else {
|
||||
# if SIGCHLD has never been explicitly set, it's returned as undef.
|
||||
# however, when *setting* SIGCHLD, using undef(%) or assigning to an
|
||||
# undef value produces annoying 'Use of uninitialized value in scalar
|
||||
# assignment' warnings. That's silly. workaround:
|
||||
$SIG{CHLD} = 'DEFAULT';
|
||||
}
|
||||
}
|
||||
|
||||
# note: this must be called before leave_helper_run_mode() is called,
|
||||
# as the SIGCHLD signal must be set to DEFAULT for it to work.
|
||||
sub cleanup_kids {
|
||||
my ($self, $pid) = @_;
|
||||
|
||||
if ($SIG{CHLD} && $SIG{CHLD} ne 'IGNORE') { # running from spamd
|
||||
waitpid ($pid, 0);
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub register_async_rule_start {
|
||||
my ($self, $rule) = @_;
|
||||
dbg("dns: $rule lookup start");
|
||||
$self->{rule_to_rblkey}->{$rule} = '*ASYNC_START';
|
||||
}
|
||||
|
||||
sub register_async_rule_finish {
|
||||
my ($self, $rule) = @_;
|
||||
dbg("dns: $rule lookup finished");
|
||||
delete $self->{rule_to_rblkey}->{$rule};
|
||||
}
|
||||
|
||||
sub mark_all_async_rules_complete {
|
||||
my ($self) = @_;
|
||||
$self->{rule_to_rblkey} = { };
|
||||
}
|
||||
|
||||
sub is_rule_complete {
|
||||
my ($self, $rule) = @_;
|
||||
|
||||
my $key = $self->{rule_to_rblkey}->{$rule};
|
||||
if (!defined $key) {
|
||||
# dbg("dns: $rule lookup complete, not in list");
|
||||
return 1;
|
||||
}
|
||||
|
||||
if ($key eq '*ASYNC_START') {
|
||||
dbg("dns: $rule lookup not yet complete");
|
||||
return 0; # not yet complete
|
||||
}
|
||||
|
||||
my $ent = $self->{async}->get_lookup($key);
|
||||
if (!defined $ent) {
|
||||
dbg("dns: $rule lookup complete, $key no longer pending");
|
||||
return 1;
|
||||
}
|
||||
|
||||
dbg("dns: $rule lookup not yet complete");
|
||||
return 0; # not yet complete
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
# interface called by SPF plugin
|
||||
sub check_for_from_dns {
|
||||
my ($self, $pms) = @_;
|
||||
if (defined $pms->{sender_host_fail}) {
|
||||
return ($pms->{sender_host_fail} == 2); # both MX and A need to fail
|
||||
}
|
||||
}
|
||||
|
||||
1;
|
1036
upstream/lib/Mail/SpamAssassin/DnsResolver.pm
Normal file
1036
upstream/lib/Mail/SpamAssassin/DnsResolver.pm
Normal file
File diff suppressed because it is too large
Load Diff
1243
upstream/lib/Mail/SpamAssassin/HTML.pm
Normal file
1243
upstream/lib/Mail/SpamAssassin/HTML.pm
Normal file
File diff suppressed because it is too large
Load Diff
110
upstream/lib/Mail/SpamAssassin/Locales.pm
Normal file
110
upstream/lib/Mail/SpamAssassin/Locales.pm
Normal file
@ -0,0 +1,110 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Locales;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
###########################################################################
|
||||
|
||||
# A mapping of known country codes to frequent charsets used therein.
|
||||
# note that the ISO and CP charsets will already have been permitted,
|
||||
# so only "unusual" charsets should be listed here.
|
||||
#
|
||||
# Country codes should be lowercase, charsets uppercase.
|
||||
#
|
||||
# A good listing is in /usr/share/config/charsets from KDE 2.2.1
|
||||
#
|
||||
our %charsets_for_locale = (
|
||||
|
||||
# Japanese: Peter Evans writes: iso-2022-jp = rfc approved, rfc 1468, created
|
||||
# by Jun Murai in 1993 back when he didn't have white hair! rfc approved.
|
||||
# (rfc 2237) <-- by M$.
|
||||
'ja' => 'EUCJP JISX020119760 JISX020819830 JISX020819900 JISX020819970 '.
|
||||
'JISX021219900 JISX021320001 JISX021320002 SHIFT_JIS SHIFTJIS '.
|
||||
'ISO2022JP SJIS JIS7 JISX0201 JISX0208 JISX0212',
|
||||
|
||||
# Korea
|
||||
'ko' => 'EUCKR KSC56011987',
|
||||
|
||||
# Cyrillic: Andrew Vasilyev notes CP866 is common (bug 2278)
|
||||
'ru' => 'KOI8R KOI8U KOI8T ISOIR111 CP1251 GEORGIANPS CP1251 PT154 CP866',
|
||||
'ka' => 'KOI8R KOI8U KOI8T ISOIR111 CP1251 GEORGIANPS CP1251 PT154 CP866',
|
||||
'tg' => 'KOI8R KOI8U KOI8T ISOIR111 CP1251 GEORGIANPS CP1251 PT154 CP866',
|
||||
'be' => 'KOI8R KOI8U KOI8T ISOIR111 CP1251 GEORGIANPS CP1251 PT154 CP866',
|
||||
'uk' => 'KOI8R KOI8U KOI8T ISOIR111 CP1251 GEORGIANPS CP1251 PT154 CP866',
|
||||
'bg' => 'KOI8R KOI8U KOI8T ISOIR111 CP1251 GEORGIANPS CP1251 PT154 CP866',
|
||||
|
||||
# Thai
|
||||
'th' => 'TIS620',
|
||||
|
||||
# Chinese (simplified and traditional). Peter Evans writes: new government
|
||||
# mandated chinese encoding = gb18030, chinese mail is supposed to be
|
||||
# iso-2022-cn (rfc 1922?)
|
||||
'zh' => 'GB1988 GB2312 GB231219800 GB18030 GBK BIG5HKSCS BIG5 EUCTW ISO2022CN',
|
||||
|
||||
# Chinese Traditional charsets only
|
||||
'zh.big5' => 'BIG5HKSCS BIG5 EUCTW',
|
||||
|
||||
# Chinese Simplified charsets only
|
||||
'zh.gb2312' => 'GB1988 GB2312 GB231219800 GB18030 GBK ISO2022CN',
|
||||
);
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub is_charset_ok_for_locales {
|
||||
my ($cs, @locales) = @_;
|
||||
|
||||
$cs = uc $cs; $cs =~ s/[^A-Z0-9]//g;
|
||||
$cs =~ s/^3D//gs; # broken by quoted-printable
|
||||
$cs =~ s/:.*$//gs; # trim off multiple charsets, just use 1st
|
||||
|
||||
study $cs; # study is a no-op since perl 5.16.0, eliminating related bugs
|
||||
#warn "JMD $cs";
|
||||
|
||||
# always OK (the net speaks mostly roman charsets)
|
||||
return 1 if ($cs eq 'USASCII');
|
||||
return 1 if ($cs =~ /^ISO8859/);
|
||||
return 1 if ($cs =~ /^ISO10646/);
|
||||
return 1 if ($cs =~ /^UTF/);
|
||||
return 1 if ($cs =~ /^UCS/);
|
||||
return 1 if ($cs =~ /^CP125/);
|
||||
return 1 if ($cs =~ /^WINDOWS/); # argh, Windows
|
||||
return 1 if ($cs eq 'IBM852');
|
||||
return 1 if ($cs =~ /^UNICODE11UTF[78]/); # wtf? never heard of it
|
||||
return 1 if ($cs eq 'XUNKNOWN'); # added by sendmail when converting to 8bit
|
||||
return 1 if ($cs eq 'ISO'); # Magellan, sending as 'charset=iso 8859-15'. grr
|
||||
|
||||
foreach my $locale (@locales) {
|
||||
if (!defined($locale) || $locale eq 'C') { $locale = 'en'; }
|
||||
$locale =~ s/^([a-z][a-z]).*$/$1/; # zh_TW... => zh
|
||||
|
||||
my $ok_for_loc = $charsets_for_locale{$locale};
|
||||
next if (!defined $ok_for_loc);
|
||||
|
||||
if ($ok_for_loc =~ /(?:^| )\Q${cs}\E(?:$| )/) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
1;
|
74
upstream/lib/Mail/SpamAssassin/Locker.pm
Normal file
74
upstream/lib/Mail/SpamAssassin/Locker.pm
Normal file
@ -0,0 +1,74 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Locker;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Fcntl;
|
||||
use Time::HiRes ();
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my $self = { };
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub safe_lock {
|
||||
my ($self, $path, $max_retries, $mode) = @_;
|
||||
# max_retries is optional, should default to about 30
|
||||
# mode is UNIX-style and optional, should default to 0700,
|
||||
# callers must specify --x bits
|
||||
die "locker: safe_lock not implemented by Locker subclass";
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub safe_unlock {
|
||||
my ($self, $path) = @_;
|
||||
die "locker: safe_unlock not implemented by Locker subclass";
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub refresh_lock {
|
||||
my ($self, $path) = @_;
|
||||
die "locker: refresh_lock not implemented by Locker subclass";
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub jittery_one_second_sleep {
|
||||
my ($self) = @_;
|
||||
Time::HiRes::sleep(rand(1.0) + 0.5);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
173
upstream/lib/Mail/SpamAssassin/Locker/Flock.pm
Normal file
173
upstream/lib/Mail/SpamAssassin/Locker/Flock.pm
Normal file
@ -0,0 +1,173 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Locker::Flock;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Locker;
|
||||
use Mail::SpamAssassin::Util;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use File::Spec;
|
||||
use IO::File;
|
||||
use Fcntl qw(:DEFAULT :flock);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Locker);
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $self = $class->SUPER::new(@_);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
# Attempt to create a file lock, using NFS-UNsafe locking techniques.
|
||||
|
||||
sub safe_lock {
|
||||
my ($self, $path, $max_retries, $mode) = @_;
|
||||
my $is_locked = 0;
|
||||
my @stat;
|
||||
|
||||
$max_retries ||= 30;
|
||||
$mode ||= "0700";
|
||||
$mode = (oct $mode) & 0666;
|
||||
dbg ("locker: mode is $mode");
|
||||
|
||||
my $lock_file = "$path.mutex";
|
||||
my $umask = umask(~$mode);
|
||||
my $fh = new IO::File();
|
||||
|
||||
if (!$fh->open ($lock_file, O_RDWR|O_CREAT)) {
|
||||
umask $umask; # just in case
|
||||
die "locker: safe_lock: cannot create lockfile $lock_file: $!\n";
|
||||
}
|
||||
umask $umask; # we've created the file, so reset umask
|
||||
|
||||
dbg("locker: safe_lock: created $lock_file");
|
||||
|
||||
my $unalarmed = 0;
|
||||
my $oldalarm = 0;
|
||||
|
||||
# use a SIGALRM-based timer -- more efficient than second-by-second
|
||||
# sleeps
|
||||
my $eval_stat;
|
||||
eval {
|
||||
local $SIG{ALRM} = sub { die "alarm\n" };
|
||||
dbg("locker: safe_lock: trying to get lock on $path with $max_retries timeout");
|
||||
|
||||
# max_retries is basically seconds! so use it for the timeout
|
||||
$oldalarm = alarm $max_retries;
|
||||
|
||||
# HELLO!?! IO::File doesn't have a flock() method?!
|
||||
if (!flock($fh, LOCK_EX)) {
|
||||
warn "locker: safe_lock: cannot obtain a lock on log file: $!";
|
||||
} else {
|
||||
alarm $oldalarm;
|
||||
$unalarmed = 1; # avoid calling alarm(0) twice
|
||||
|
||||
dbg("locker: safe_lock: link to $lock_file: link ok");
|
||||
$is_locked = 1;
|
||||
|
||||
# just to be nice: let people know when it was locked
|
||||
$fh->print("$$\n") or die "error writing to lock file: $!";
|
||||
$fh->flush or die "cannot flush lock file: $!";
|
||||
|
||||
# keep the FD around - we need to keep the lockfile open or the lock
|
||||
# is unlocked!
|
||||
$self->{lock_fhs} ||= { };
|
||||
$self->{lock_fhs}->{$path} = $fh;
|
||||
}
|
||||
1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
};
|
||||
|
||||
$unalarmed or alarm $oldalarm; # if we die'd above, need to reset here
|
||||
|
||||
if (defined $eval_stat) {
|
||||
if ($eval_stat =~ /alarm/) {
|
||||
dbg("locker: safe_lock: timed out after $max_retries seconds");
|
||||
} else {
|
||||
die "locker: safe_lock: $eval_stat\n";
|
||||
}
|
||||
}
|
||||
|
||||
return $is_locked;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub safe_unlock {
|
||||
my ($self, $path) = @_;
|
||||
|
||||
if (!exists $self->{lock_fhs} || !defined $self->{lock_fhs}->{$path}) {
|
||||
dbg("locker: safe_unlock: no lock handle for $path - already unlocked?");
|
||||
return;
|
||||
}
|
||||
|
||||
my $fh = $self->{lock_fhs}->{$path};
|
||||
delete $self->{lock_fhs}->{$path};
|
||||
|
||||
flock($fh, LOCK_UN) or die "cannot unlock a log file: $!";
|
||||
$fh->close or die "error closing a lock file: $!";
|
||||
|
||||
dbg("locker: safe_unlock: unlocked $path.mutex");
|
||||
|
||||
# do NOT unlink! this would open a race, whereby:
|
||||
#
|
||||
# procA: ....unlock (unlocked lockfile)
|
||||
# procB: lock (gets lock on lockfile)
|
||||
# procA: unlink (deletes lockfile)
|
||||
# (procB's lock is now deleted as well!)
|
||||
# procC: create, lock (gets lock on new file)
|
||||
#
|
||||
# both procB and procC would then think they had locks, and both
|
||||
# would write to the database file. this is bad.
|
||||
#
|
||||
# unlink ("$path.mutex");
|
||||
#
|
||||
# side-effect: we leave a .mutex file around. but hey!
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub refresh_lock {
|
||||
my($self, $path) = @_;
|
||||
|
||||
return unless $path;
|
||||
|
||||
if (!exists $self->{lock_fhs} || !defined $self->{lock_fhs}->{$path}) {
|
||||
warn "locker: refresh_lock: no lock handle for $path\n";
|
||||
return;
|
||||
}
|
||||
|
||||
my $fh = $self->{lock_fhs}->{$path};
|
||||
$fh->print("$$\n") or die "error writing to lock file: $!";
|
||||
$fh->flush or die "cannot flush lock file: $!";
|
||||
|
||||
dbg("locker: refresh_lock: refresh $path.mutex");
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
245
upstream/lib/Mail/SpamAssassin/Locker/UnixNFSSafe.pm
Normal file
245
upstream/lib/Mail/SpamAssassin/Locker/UnixNFSSafe.pm
Normal file
@ -0,0 +1,245 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Locker::UnixNFSSafe;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Locker;
|
||||
use Mail::SpamAssassin::Util;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use File::Spec;
|
||||
use Time::Local;
|
||||
use Fcntl qw(:DEFAULT :flock);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Locker);
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $self = $class->SUPER::new(@_);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
# NFS-safe locking (I hope!):
|
||||
# Attempt to create a file lock, using NFS-safe locking techniques.
|
||||
#
|
||||
# Locking code adapted from code by Alexis Rosen <alexis@panix.com>
|
||||
# by Kelsey Cummings <kgc@sonic.net>, with mods by jm and quinlan
|
||||
#
|
||||
# A good implementation of Alexis' code, for reference, is here:
|
||||
# http://mail-index.netbsd.org/netbsd-bugs/1996/04/17/0002.html
|
||||
|
||||
use constant LOCK_MAX_AGE => 600; # seconds
|
||||
|
||||
sub safe_lock {
|
||||
my ($self, $path, $max_retries, $mode) = @_;
|
||||
my $is_locked = 0;
|
||||
my @stat;
|
||||
|
||||
$max_retries ||= 30;
|
||||
$mode ||= "0700";
|
||||
$mode = (oct $mode) & 0666;
|
||||
dbg ("locker: mode is $mode");
|
||||
|
||||
my $lock_file = "$path.lock";
|
||||
my $hname = Mail::SpamAssassin::Util::fq_hostname();
|
||||
my $lock_tmp = Mail::SpamAssassin::Util::untaint_file_path
|
||||
($path.".lock.".$hname.".".$$);
|
||||
|
||||
# keep this for unlocking
|
||||
$self->{lock_tmp} = $lock_tmp;
|
||||
|
||||
my $umask = umask(~$mode);
|
||||
if (!open(LTMP, ">$lock_tmp")) {
|
||||
umask $umask; # just in case
|
||||
die "locker: safe_lock: cannot create tmp lockfile $lock_tmp for $lock_file: $!\n";
|
||||
}
|
||||
umask $umask;
|
||||
autoflush LTMP 1;
|
||||
dbg("locker: safe_lock: created $lock_tmp");
|
||||
|
||||
for (my $retries = 0; $retries < $max_retries; $retries++) {
|
||||
if ($retries > 0) { $self->jittery_one_second_sleep(); }
|
||||
print LTMP "$hname.$$\n" or warn "Error writing to $lock_tmp: $!";
|
||||
dbg("locker: safe_lock: trying to get lock on $path with $retries retries");
|
||||
if (link($lock_tmp, $lock_file)) {
|
||||
dbg("locker: safe_lock: link to $lock_file: link ok");
|
||||
$is_locked = 1;
|
||||
last;
|
||||
}
|
||||
# link _may_ return false even if the link _is_ created
|
||||
@stat = lstat($lock_tmp);
|
||||
@stat or warn "locker: error accessing $lock_tmp: $!";
|
||||
if (defined $stat[3] && $stat[3] > 1) {
|
||||
dbg("locker: safe_lock: link to $lock_file: stat ok");
|
||||
$is_locked = 1;
|
||||
last;
|
||||
}
|
||||
# check age of lockfile ctime
|
||||
my $now = ($#stat < 11 ? undef : $stat[10]);
|
||||
@stat = lstat($lock_file);
|
||||
@stat or warn "locker: error accessing $lock_file: $!";
|
||||
my $lock_age = ($#stat < 11 ? undef : $stat[10]);
|
||||
if (defined($lock_age) && defined($now) && ($now - $lock_age) > LOCK_MAX_AGE)
|
||||
{
|
||||
# we got a stale lock, break it
|
||||
dbg("locker: safe_lock: breaking stale $lock_file: age=" .
|
||||
(defined $lock_age ? $lock_age : "undef") . " now=$now");
|
||||
unlink($lock_file)
|
||||
or warn "locker: safe_lock: unlink of lock file $lock_file failed: $!\n";
|
||||
}
|
||||
}
|
||||
|
||||
close LTMP or die "error closing $lock_tmp: $!";
|
||||
unlink($lock_tmp)
|
||||
or warn "locker: safe_lock: unlink of temp lock $lock_tmp failed: $!\n";
|
||||
|
||||
# record this for safe unlocking
|
||||
if ($is_locked) {
|
||||
@stat = lstat($lock_file);
|
||||
@stat or warn "locker: error accessing $lock_file: $!";
|
||||
my $lock_ctime = ($#stat < 11 ? undef : $stat[10]);
|
||||
|
||||
$self->{lock_ctimes} ||= { };
|
||||
$self->{lock_ctimes}->{$path} = $lock_ctime;
|
||||
}
|
||||
|
||||
return $is_locked;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub safe_unlock {
|
||||
my ($self, $path) = @_;
|
||||
|
||||
my $lock_file = "$path.lock";
|
||||
my $lock_tmp = $self->{lock_tmp};
|
||||
if (!$lock_tmp) {
|
||||
dbg("locker: safe_unlock: $path.lock never locked");
|
||||
return;
|
||||
}
|
||||
|
||||
# 1. Build a temp file and stat that to get an idea of what the server
|
||||
# thinks the current time is (our_tmp.st_ctime). note: do not use time()
|
||||
# directly because the server's clock may be out of sync with the client's.
|
||||
|
||||
my @stat_ourtmp;
|
||||
if (!defined sysopen(LTMP, $lock_tmp, O_CREAT|O_WRONLY|O_EXCL, 0700)) {
|
||||
warn "locker: safe_unlock: failed to create lock tmpfile $lock_tmp: $!";
|
||||
return;
|
||||
} else {
|
||||
autoflush LTMP 1;
|
||||
print LTMP "\n" or warn "Error writing to $lock_tmp: $!";
|
||||
|
||||
if (!(@stat_ourtmp = stat(LTMP)) || (scalar(@stat_ourtmp) < 11)) {
|
||||
@stat_ourtmp or warn "locker: error accessing $lock_tmp: $!";
|
||||
warn "locker: safe_unlock: failed to create lock tmpfile $lock_tmp";
|
||||
close LTMP or die "error closing $lock_tmp: $!";
|
||||
unlink($lock_tmp)
|
||||
or warn "locker: safe_lock: unlink of lock file failed: $!\n";
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
my $ourtmp_ctime = $stat_ourtmp[10]; # paranoia
|
||||
if (!defined $ourtmp_ctime) {
|
||||
die "locker: safe_unlock: stat failed on $lock_tmp";
|
||||
}
|
||||
|
||||
close LTMP or die "error closing $lock_tmp: $!";
|
||||
unlink($lock_tmp)
|
||||
or warn "locker: safe_lock: unlink of lock file failed: $!\n";
|
||||
|
||||
# 2. If the ctime hasn't been modified, unlink the file and return. If the
|
||||
# lock has expired, sleep the usual random interval before returning. If we
|
||||
# didn't sleep, there could be a race if the caller immediately tries to
|
||||
# relock the file.
|
||||
|
||||
my $lock_ctime = $self->{lock_ctimes}->{$path};
|
||||
if (!defined $lock_ctime) {
|
||||
warn "locker: safe_unlock: no ctime recorded for $lock_file";
|
||||
return;
|
||||
}
|
||||
|
||||
my @stat_lock = lstat($lock_file);
|
||||
@stat_lock or warn "locker: error accessing $lock_file: $!";
|
||||
|
||||
my $now_ctime = $stat_lock[10];
|
||||
|
||||
if (defined $now_ctime && $now_ctime == $lock_ctime)
|
||||
{
|
||||
# things are good: the ctimes match so it was our lock
|
||||
unlink($lock_file)
|
||||
or warn "locker: safe_unlock: unlink failed: $lock_file\n";
|
||||
dbg("locker: safe_unlock: unlink $lock_file");
|
||||
|
||||
if ($ourtmp_ctime >= $lock_ctime + LOCK_MAX_AGE) {
|
||||
# the lock has expired, so sleep a bit; use some randomness
|
||||
# to avoid race conditions.
|
||||
dbg("locker: safe_unlock: lock expired on $lock_file expired safely; sleeping");
|
||||
my $i; for ($i = 0; $i < 5; $i++) {
|
||||
$self->jittery_one_second_sleep();
|
||||
}
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
# 4. Either ctime has been modified, or the entire lock file is missing.
|
||||
# If the lock should still be ours, based on the ctime of the temp
|
||||
# file, warn it was stolen. If not, then our lock is expired and
|
||||
# someone else has grabbed the file, so warn it was lost.
|
||||
if ($ourtmp_ctime < $lock_ctime + LOCK_MAX_AGE) {
|
||||
warn "locker: safe_unlock: lock on $lock_file was stolen";
|
||||
} else {
|
||||
warn "locker: safe_unlock: lock on $lock_file was lost due to expiry";
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub refresh_lock {
|
||||
my($self, $path) = @_;
|
||||
|
||||
return unless $path;
|
||||
|
||||
# this could arguably read the lock and make sure the same process
|
||||
# owns it, but this shouldn't, in theory, be an issue.
|
||||
# TODO: in NFS, it definitely may be one :(
|
||||
|
||||
my $lock_file = "$path.lock";
|
||||
utime time, time, $lock_file;
|
||||
|
||||
# update the lock_ctimes entry
|
||||
my @stat = lstat($lock_file);
|
||||
@stat or warn "locker: error accessing $lock_file: $!";
|
||||
|
||||
my $lock_ctime = ($#stat < 11 ? undef : $stat[10]);
|
||||
$self->{lock_ctimes}->{$path} = $lock_ctime;
|
||||
|
||||
dbg("locker: refresh_lock: refresh $path.lock");
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
116
upstream/lib/Mail/SpamAssassin/Locker/Win32.pm
Normal file
116
upstream/lib/Mail/SpamAssassin/Locker/Win32.pm
Normal file
@ -0,0 +1,116 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Locker::Win32;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Fcntl;
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Locker;
|
||||
use Mail::SpamAssassin::Util;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use File::Spec;
|
||||
use Time::Local;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Locker);
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $self = $class->SUPER::new(@_);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
use constant LOCK_MAX_AGE => 600; # seconds
|
||||
|
||||
sub safe_lock {
|
||||
my ($self, $path, $max_retries, $mode) = @_;
|
||||
my @stat;
|
||||
|
||||
$max_retries ||= 30;
|
||||
# $mode is ignored on win32
|
||||
|
||||
my $lock_file = "$path.lock";
|
||||
|
||||
if (-e $lock_file && -M $lock_file > (LOCK_MAX_AGE / 86400)) {
|
||||
dbg("locker: safe_lock: breaking stale lock: $lock_file");
|
||||
unlink($lock_file)
|
||||
or warn "locker: safe_lock: unlink of lock file $lock_file failed: $!\n";
|
||||
}
|
||||
for (my $retries = 0; $retries < $max_retries; $retries++) {
|
||||
if ($retries > 0) {
|
||||
sleep(1);
|
||||
# TODO: $self->jittery_one_second_sleep();?
|
||||
}
|
||||
dbg("locker: safe_lock: trying to get lock on $path with $retries retries");
|
||||
if (!defined sysopen(LOCKFILE, $lock_file, O_RDWR|O_CREAT|O_EXCL)) {
|
||||
dbg("locker: safe_lock: failed to create lock tmpfile $lock_file: $!");
|
||||
} else {
|
||||
dbg("locker: safe_lock: link to $lock_file: sysopen ok");
|
||||
close(LOCKFILE) or warn "error closing a lock file: $!";
|
||||
return 1;
|
||||
}
|
||||
my @stat = stat($lock_file);
|
||||
@stat or warn "locker: error accessing $lock_file: $!";
|
||||
|
||||
# check age of lockfile ctime
|
||||
my $age = ($#stat < 11 ? undef : $stat[10]);
|
||||
if ((!defined($age) && $retries > $max_retries / 2) ||
|
||||
(defined($age) && (time - $age > LOCK_MAX_AGE)))
|
||||
{
|
||||
dbg("locker: safe_lock: breaking stale lock: $lock_file");
|
||||
unlink($lock_file)
|
||||
or warn "locker: safe_lock: unlink of lock file $lock_file failed: $!\n";
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub safe_unlock {
|
||||
my ($self, $path) = @_;
|
||||
|
||||
unlink("$path.lock")
|
||||
or warn "locker: safe_unlock: unlink failed: $path.lock\n";
|
||||
dbg("locker: safe_unlock: unlink $path.lock");
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub refresh_lock {
|
||||
my($self, $path) = @_;
|
||||
|
||||
return unless $path;
|
||||
|
||||
# this could arguably read the lock and make sure the same process
|
||||
# owns it, but this shouldn't, in theory, be an issue.
|
||||
utime time, time, "$path.lock";
|
||||
|
||||
dbg("locker: refresh_lock: refresh $path.lock");
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
412
upstream/lib/Mail/SpamAssassin/Logger.pm
Normal file
412
upstream/lib/Mail/SpamAssassin/Logger.pm
Normal file
@ -0,0 +1,412 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Logger - SpamAssassin logging module
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
$SIG{__WARN__} = sub {
|
||||
log_message("warn", $_[0]);
|
||||
};
|
||||
|
||||
$SIG{__DIE__} = sub {
|
||||
log_message("error", $_[0]) if !$^S;
|
||||
};
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Logger;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Exporter ();
|
||||
use Time::HiRes ();
|
||||
|
||||
our @ISA = qw(Exporter);
|
||||
our @EXPORT = qw(dbg info would_log);
|
||||
our @EXPORT_OK = qw(log_message);
|
||||
|
||||
use constant ERROR => 0;
|
||||
use constant WARNING => 1;
|
||||
use constant INFO => 2;
|
||||
use constant DBG => 3;
|
||||
|
||||
my %log_level = (
|
||||
0 => 'ERROR',
|
||||
1 => 'WARNING',
|
||||
2 => 'INFO',
|
||||
3 => 'DBG',
|
||||
);
|
||||
|
||||
# global shared object
|
||||
our %LOG_SA;
|
||||
our $LOG_ENTERED; # to avoid recursion on die or warn from within logging
|
||||
# duplicate message line suppressor
|
||||
our $LOG_DUPMIN = 10; # only start suppressing after x duplicate lines
|
||||
our $LOG_DUPLINE = ''; # remembers last log line
|
||||
our $LOG_DUPLEVEL = ''; # remembers last log level
|
||||
our $LOG_DUPTIME; # remembers last log line timestamp
|
||||
our $LOG_DUPCNT = 0; # counts duplicates
|
||||
|
||||
# defaults
|
||||
$LOG_SA{level} = WARNING; # log info, warnings and errors
|
||||
$LOG_SA{facility} = {}; # no dbg facilities turned on
|
||||
|
||||
# always log to stderr initially
|
||||
use Mail::SpamAssassin::Logger::Stderr;
|
||||
$LOG_SA{method}->{stderr} = Mail::SpamAssassin::Logger::Stderr->new();
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=item add_facilities(facilities)
|
||||
|
||||
Enable debug logging for specific facilities. Each facility is the area
|
||||
of code to debug. Facilities can be specified as a hash reference (the
|
||||
key names are used), an array reference, an array, or a comma-separated
|
||||
scalar string. Facility names are case-sensitive.
|
||||
|
||||
If "all" is listed, then all debug facilities are implicitly enabled,
|
||||
except for those explicitly disabled. A facility name may be preceded
|
||||
by a "no" (case-insensitive), which explicitly disables it, overriding
|
||||
the "all". For example: all,norules,noconfig,nodcc. When facility names
|
||||
are given as an ordered list (array or scalar, not a hash), the last entry
|
||||
applies, e.g. 'nodcc,dcc,dcc,noddc' is equivalent to 'nodcc'. Note that
|
||||
currently no facility name starts with a "no", it is advised to keep this
|
||||
practice with newly added facility names to make life easier.
|
||||
|
||||
Higher priority informational messages that are suitable for logging in
|
||||
normal circumstances are available with an area of "info". Some very
|
||||
verbose messages require the facility to be specifically enabled (see
|
||||
C<would_log> below).
|
||||
|
||||
=cut
|
||||
|
||||
sub add_facilities {
|
||||
my ($facilities) = @_;
|
||||
|
||||
my @facilities;
|
||||
if (ref ($facilities) eq '') {
|
||||
if (defined $facilities && $facilities ne '0') {
|
||||
@facilities = split(/,/, $facilities);
|
||||
}
|
||||
}
|
||||
elsif (ref ($facilities) eq 'ARRAY') {
|
||||
@facilities = @{ $facilities };
|
||||
}
|
||||
elsif (ref ($facilities) eq 'HASH') {
|
||||
@facilities = keys %{ $facilities };
|
||||
}
|
||||
@facilities = grep(/^\S+$/, @facilities);
|
||||
if (@facilities) {
|
||||
for my $fac (@facilities) {
|
||||
local ($1,$2);
|
||||
$LOG_SA{facility}->{$2} = !defined($1) if $fac =~ /^(no)?(.+)\z/si;
|
||||
}
|
||||
# turn on debugging if facilities other than "info" are enabled
|
||||
if (grep { !/^info\z/ && !/^no./si } keys %{ $LOG_SA{facility} }) {
|
||||
$LOG_SA{level} = DBG if $LOG_SA{level} < DBG;
|
||||
}
|
||||
else {
|
||||
$LOG_SA{level} = INFO if $LOG_SA{level} < INFO;
|
||||
}
|
||||
# debug statement last so we might see it
|
||||
dbg("logger: adding facilities: " . join(", ", @facilities));
|
||||
dbg("logger: logging level is " . $log_level{$LOG_SA{level}});
|
||||
}
|
||||
}
|
||||
|
||||
=item log_message($level, @message)
|
||||
|
||||
Log a message at a specific level. Levels are specified as strings:
|
||||
"warn", "error", "info", and "dbg". The first element of the message
|
||||
must be prefixed with a facility name followed directly by a colon.
|
||||
|
||||
=cut
|
||||
|
||||
sub log_message {
|
||||
my ($level, @message) = @_;
|
||||
|
||||
# too many die and warn messages out there, don't log the ones that we don't
|
||||
# own. jm: off: this makes no sense -- if a dependency module dies or warns,
|
||||
# we want to know about it, unless we're *SURE* it's not something worth
|
||||
# worrying about.
|
||||
# if ($level eq "error" or $level eq "warn") {
|
||||
# return unless $message[0] =~ /^\S+:/;
|
||||
# }
|
||||
|
||||
if ($level eq "error") {
|
||||
# don't log alarm timeouts or broken pipes of various plugins' network checks
|
||||
return if (index($message[0], '__ignore__') != -1);
|
||||
|
||||
# dos: we can safely ignore any die's that we eval'd in our own modules so
|
||||
# don't log them -- this is caller 0, the use'ing package is 1, the eval is 2
|
||||
my @caller = caller 2;
|
||||
return if (defined $caller[3] && defined $caller[0] &&
|
||||
$caller[3] =~ /^\(eval\)$/ &&
|
||||
$caller[0] =~ m#^Mail::SpamAssassin(?:$|::)#);
|
||||
}
|
||||
|
||||
return if $LOG_ENTERED; # avoid recursion on die or warn from within logging
|
||||
$LOG_ENTERED = 1; # no 'returns' from this point on, must clear the flag
|
||||
|
||||
my $message = join(" ", @message);
|
||||
$message =~ s/[\r\n]+$//; # remove any trailing newlines
|
||||
|
||||
my $now = Time::HiRes::time;
|
||||
|
||||
# suppress duplicate loglines
|
||||
if ($message eq $LOG_DUPLINE) {
|
||||
$LOG_DUPCNT++;
|
||||
$LOG_DUPTIME = $now;
|
||||
# only start suppressing after x identical lines
|
||||
if ($LOG_DUPCNT >= $LOG_DUPMIN) {
|
||||
$LOG_ENTERED = 0;
|
||||
return;
|
||||
}
|
||||
} else {
|
||||
if ($LOG_DUPCNT >= $LOG_DUPMIN) {
|
||||
$LOG_DUPCNT -= $LOG_DUPMIN - 1;
|
||||
if ($LOG_DUPCNT > 1) {
|
||||
_log_message($LOG_DUPLEVEL,
|
||||
"$LOG_DUPLINE [... logline repeated $LOG_DUPCNT times]",
|
||||
$LOG_DUPTIME);
|
||||
} else {
|
||||
_log_message($LOG_DUPLEVEL, $LOG_DUPLINE, $LOG_DUPTIME);
|
||||
}
|
||||
}
|
||||
$LOG_DUPCNT = 0;
|
||||
$LOG_DUPLINE = $message;
|
||||
$LOG_DUPLEVEL = $level;
|
||||
}
|
||||
|
||||
_log_message($level, $message, $now);
|
||||
|
||||
$LOG_ENTERED = 0;
|
||||
}
|
||||
|
||||
# Private helper
|
||||
sub _log_message {
|
||||
# split on newlines and call log_message multiple times; saves
|
||||
# the subclasses having to understand multi-line logs
|
||||
my $first = 1;
|
||||
foreach my $line (split(/\n/, $_[1])) {
|
||||
# replace control characters with "_", tabs and spaces get
|
||||
# replaced with a single space.
|
||||
$line =~ tr/\x09\x20\x00-\x1f/ _/s;
|
||||
if ($first) {
|
||||
$first = 0;
|
||||
} else {
|
||||
local $1;
|
||||
$line =~ s/^([^:]+?):/$1: [...]/;
|
||||
}
|
||||
while (my ($name, $object) = each %{ $LOG_SA{method} }) {
|
||||
$object->log_message($_[0], $line, $_[2]);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
=item dbg("facility: message")
|
||||
|
||||
This is used for all low priority debugging messages.
|
||||
|
||||
=cut
|
||||
|
||||
sub dbg {
|
||||
_log(DBG, @_) if $LOG_SA{level} >= DBG;
|
||||
1; # always return the same simple value, regardless of log level
|
||||
}
|
||||
|
||||
=item info("facility: message")
|
||||
|
||||
This is used for informational messages indicating a normal, but
|
||||
significant, condition. This should be infrequently called. These
|
||||
messages are typically logged when SpamAssassin is run as a daemon.
|
||||
|
||||
=cut
|
||||
|
||||
sub info {
|
||||
_log(INFO, @_) if $LOG_SA{level} >= INFO;
|
||||
1; # always return the same simple value, regardless of log level
|
||||
}
|
||||
|
||||
# remember to avoid deep recursion, my friend
|
||||
sub _log {
|
||||
my $facility;
|
||||
local ($1);
|
||||
|
||||
# it's faster to access this as the $_[1] alias, and not to perform
|
||||
# string mods until we're sure we actually want to log anything
|
||||
if ($_[1] =~ /^([a-z0-9_-]*):/i) {
|
||||
$facility = $1;
|
||||
} else {
|
||||
$facility = "generic";
|
||||
}
|
||||
|
||||
# log all info, warn, and error messages;
|
||||
# only debug if asked to
|
||||
if ($_[0] == DBG) {
|
||||
return unless
|
||||
exists $LOG_SA{facility}->{$facility} ? $LOG_SA{facility}->{$facility}
|
||||
: $LOG_SA{facility}->{all};
|
||||
}
|
||||
|
||||
my ($level, $message, @args) = @_;
|
||||
$message =~ s/^(?:[a-z0-9_-]*):\s*//i;
|
||||
|
||||
$message = sprintf($message,@args) if @args;
|
||||
$message =~ s/\n+$//s;
|
||||
$message =~ s/^/${facility}: /mg;
|
||||
|
||||
# no reason to go through warn()
|
||||
log_message(($level == INFO ? "info" : "dbg"), $message);
|
||||
}
|
||||
|
||||
=item add(method => 'syslog', socket => $socket, facility => $facility)
|
||||
|
||||
C<socket> is the type the syslog ("unix" or "inet"). C<facility> is the
|
||||
syslog facility (typically "mail").
|
||||
|
||||
=item add(method => 'file', filename => $file)
|
||||
|
||||
C<filename> is the name of the log file.
|
||||
|
||||
=item add(method => 'stderr')
|
||||
|
||||
No options are needed for stderr logging, just don't close stderr first.
|
||||
|
||||
=cut
|
||||
|
||||
sub add {
|
||||
my %params = @_;
|
||||
|
||||
my $name = lc($params{method});
|
||||
my $class = ucfirst($name);
|
||||
|
||||
return 0 if $class !~ /^\w+$/; # be paranoid
|
||||
|
||||
eval 'use Mail::SpamAssassin::Logger::'.$class.'; 1'
|
||||
or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
die "logger: add $class failed: $eval_stat\n";
|
||||
};
|
||||
|
||||
if (!exists $LOG_SA{method}->{$name}) {
|
||||
my $object;
|
||||
my $eval_stat;
|
||||
eval '$object = Mail::SpamAssassin::Logger::'.$class.'->new(%params); 1'
|
||||
or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
undef $object; # just in case
|
||||
};
|
||||
if (!$object) {
|
||||
if (!defined $eval_stat) {
|
||||
$eval_stat = "Mail::SpamAssassin::Logger::$class->new ".
|
||||
"failed to return an object";
|
||||
}
|
||||
warn "logger: failed to add $name method: $eval_stat\n";
|
||||
}
|
||||
else {
|
||||
$LOG_SA{method}->{$name} = $object;
|
||||
dbg("logger: successfully added $name method\n");
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
warn "logger: $name method already added\n";
|
||||
return 1;
|
||||
}
|
||||
|
||||
=item remove(method)
|
||||
|
||||
Remove a logging method. Only the method name needs to be passed as a
|
||||
scalar.
|
||||
|
||||
=cut
|
||||
|
||||
sub remove {
|
||||
my ($method) = @_;
|
||||
|
||||
my $name = lc($method);
|
||||
if (exists $LOG_SA{method}->{$name}) {
|
||||
delete $LOG_SA{method}->{$name};
|
||||
info("logger: removing $name method");
|
||||
return 1;
|
||||
}
|
||||
warn "logger: unable to remove $name method, not present to be removed\n";
|
||||
return 1;
|
||||
}
|
||||
|
||||
=item would_log($level, $facility)
|
||||
|
||||
Returns false if a message at the given level and with the given facility
|
||||
would not be logged. Returns 1 if a message at a given level and facility
|
||||
would be logged normally. Returns 2 if the facility was specifically
|
||||
enabled.
|
||||
|
||||
The facility argument is optional.
|
||||
|
||||
=cut
|
||||
|
||||
sub would_log {
|
||||
my ($level, $facility) = @_;
|
||||
|
||||
if ($level eq 'dbg') {
|
||||
return 0 if $LOG_SA{level} < DBG;
|
||||
return 1 if !$facility;
|
||||
return ($LOG_SA{facility}->{$facility} ? 2 : 0)
|
||||
if exists $LOG_SA{facility}->{$facility};
|
||||
return 1 if $LOG_SA{facility}->{all};
|
||||
return 0;
|
||||
} elsif ($level eq 'info') {
|
||||
return $LOG_SA{level} >= INFO;
|
||||
}
|
||||
|
||||
warn "logger: would_log called with unknown level: $level\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
=item close_log()
|
||||
|
||||
Close all logs.
|
||||
|
||||
=cut
|
||||
|
||||
sub close_log {
|
||||
while (my ($name, $object) = each %{ $LOG_SA{method} }) {
|
||||
$object->close_log();
|
||||
}
|
||||
}
|
||||
|
||||
END {
|
||||
close_log();
|
||||
}
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
117
upstream/lib/Mail/SpamAssassin/Logger/File.pm
Normal file
117
upstream/lib/Mail/SpamAssassin/Logger/File.pm
Normal file
@ -0,0 +1,117 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Logger::File - log to file
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Logger::File
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Logger::File;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use POSIX ();
|
||||
use Time::HiRes ();
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(am_running_on_windows);
|
||||
|
||||
our @ISA = ();
|
||||
|
||||
# ADDING OS-DEPENDENT LINE TERMINATOR - BUG 6456
|
||||
my $eol = "\n";
|
||||
if (am_running_on_windows()) {
|
||||
$eol = "\r\n";
|
||||
}
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = { };
|
||||
bless ($self, $class);
|
||||
|
||||
# parameters
|
||||
my %params = @_;
|
||||
$self->{filename} = $params{filename} || 'spamassassin.log';
|
||||
$self->{timestamp_fmt} = $params{timestamp_fmt};
|
||||
|
||||
if (! $self->init()) {
|
||||
die "logger: file initialization failed$eol";
|
||||
}
|
||||
|
||||
return($self);
|
||||
}
|
||||
|
||||
# logging via file is requested
|
||||
sub init {
|
||||
my ($self) = @_;
|
||||
|
||||
if (open(STDLOG, ">> $self->{filename}")) {
|
||||
dbg("logger: successfully opened file $self->{filename}");
|
||||
|
||||
# ensure it's unbuffered
|
||||
my $oldfh = select STDLOG;
|
||||
$| = 1;
|
||||
select $oldfh;
|
||||
|
||||
return 1;
|
||||
}
|
||||
else {
|
||||
warn "logger: failed to open file $self->{filename}: $!$eol";
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
sub log_message {
|
||||
my ($self, $level, $msg, $ts) = @_;
|
||||
|
||||
my $timestamp;
|
||||
my $fmt = $self->{timestamp_fmt};
|
||||
my $now = defined $ts ? $ts : Time::HiRes::time;
|
||||
if (!defined $fmt) {
|
||||
$timestamp = scalar localtime($now); # default, backward compatibility
|
||||
} elsif ($fmt eq '') {
|
||||
$timestamp = '';
|
||||
} else {
|
||||
$timestamp = POSIX::strftime($fmt, localtime($now));
|
||||
}
|
||||
$timestamp .= ' ' if $timestamp ne '';
|
||||
|
||||
my($nwrite) = syswrite(STDLOG, sprintf("%s[%s] %s: %s%s",
|
||||
$timestamp, $$, $level, $msg, $eol));
|
||||
defined $nwrite or warn "error writing to log file: $!";
|
||||
}
|
||||
|
||||
sub close_log {
|
||||
my ($self) = @_;
|
||||
|
||||
if (defined $self->{filename}) {
|
||||
close(STDLOG) or die "error closing log file: $!";
|
||||
}
|
||||
}
|
||||
|
||||
1;
|
95
upstream/lib/Mail/SpamAssassin/Logger/Stderr.pm
Normal file
95
upstream/lib/Mail/SpamAssassin/Logger/Stderr.pm
Normal file
@ -0,0 +1,95 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Logger::Stderr - log to standard error
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Logger::Stderr
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Logger::Stderr;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use POSIX ();
|
||||
use Time::HiRes ();
|
||||
|
||||
our @ISA = ();
|
||||
|
||||
# ADDING OS-DEPENDENT LINE TERMINATOR - BUG 6456
|
||||
|
||||
# Using Mail::SpamAssassin::Util::am_running_on_windows() leads to circular
|
||||
# dependencies. So, we are duplicating the code instead.
|
||||
use constant RUNNING_ON_WINDOWS => ($^O =~ /^(?:mswin|dos|os2)/oi);
|
||||
|
||||
my $eol = "\n";
|
||||
if (RUNNING_ON_WINDOWS) {
|
||||
$eol = "\r\n";
|
||||
}
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = { };
|
||||
bless ($self, $class);
|
||||
|
||||
my %params = @_;
|
||||
$self->{timestamp_fmt} = $params{timestamp_fmt};
|
||||
|
||||
return($self);
|
||||
}
|
||||
|
||||
sub log_message {
|
||||
my ($self, $level, $msg, $ts) = @_;
|
||||
|
||||
my $timestamp;
|
||||
my $fmt = $self->{timestamp_fmt};
|
||||
my $now = defined $ts ? $ts : Time::HiRes::time;
|
||||
if (!defined $fmt) {
|
||||
# default since 3.3.0
|
||||
my $datetime = POSIX::strftime("%b %d %H:%M", localtime($now));
|
||||
utf8::encode($datetime) if utf8::is_utf8($datetime); # Bug 7305
|
||||
$timestamp = sprintf("%s:%06.3f", $datetime, $now-int($now/60)*60);
|
||||
# Bug 6329: %e is not in a POSIX standard, use %d instead and edit
|
||||
local $1; $timestamp =~ s/^(\S+\s+)0/$1 /;
|
||||
} elsif ($fmt eq '') {
|
||||
$timestamp = '';
|
||||
} else {
|
||||
$timestamp = POSIX::strftime($fmt, localtime($now));
|
||||
}
|
||||
$timestamp .= ' ' if $timestamp ne '';
|
||||
|
||||
my($nwrite) = syswrite(STDERR, sprintf("%s[%d] %s: %s%s",
|
||||
$timestamp, $$, $level, $msg, $eol));
|
||||
defined $nwrite or warn "error writing to log file: $!";
|
||||
}
|
||||
|
||||
sub close_log {
|
||||
my ($self) = @_;
|
||||
}
|
||||
|
||||
1;
|
265
upstream/lib/Mail/SpamAssassin/Logger/Syslog.pm
Normal file
265
upstream/lib/Mail/SpamAssassin/Logger/Syslog.pm
Normal file
@ -0,0 +1,265 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Logger::Syslog - log to syslog
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Logger::Syslog
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Logger::Syslog;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use POSIX qw(:sys_wait_h setsid sigprocmask);
|
||||
use Time::HiRes ();
|
||||
use Sys::Syslog qw(:DEFAULT setlogsock);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = ();
|
||||
|
||||
# %prio_map maps Logger.pm log level names (warn, error, info, dbg)
|
||||
# into standard Sys::Syslog::syslog() log level names
|
||||
#
|
||||
our %prio_map = (dbg => 'debug', debug => 'debug', info => 'info',
|
||||
notice => 'notice', warn => 'warning', warning => 'warning',
|
||||
error => 'err', err => 'err', crit => 'crit', alert => 'alert',
|
||||
emerg => 'emerg');
|
||||
|
||||
# make sure never to hit the CPAN-RT#56826 bug (memory corruption
|
||||
# when closelog() is called twice), fixed in Sys-Syslog 0.28
|
||||
our $syslog_open = 0;
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = { };
|
||||
bless ($self, $class);
|
||||
|
||||
# initialization
|
||||
$self->{already_done_failure_warning} = 0;
|
||||
$self->{disabled} = 0;
|
||||
$self->{consecutive_failures} = 0;
|
||||
$self->{failure_threshold} = 10;
|
||||
$self->{SIGPIPE_RECEIVED} = 0;
|
||||
|
||||
# parameters
|
||||
my %params = @_;
|
||||
$self->{ident} = $params{ident} || 'spamassassin';
|
||||
$self->{log_socket} = $params{socket};
|
||||
$self->{log_facility} = $params{facility};
|
||||
$self->{timestamp_fmt} = $params{timestamp_fmt};
|
||||
|
||||
if (! $self->init()) {
|
||||
die "logger: syslog initialization failed\n";
|
||||
}
|
||||
|
||||
return($self);
|
||||
}
|
||||
|
||||
# logging via syslog is requested
|
||||
sub init {
|
||||
my ($self) = @_;
|
||||
|
||||
my $log_socket = $self->{log_socket};
|
||||
$log_socket = '' if !defined $log_socket;
|
||||
|
||||
my $eval_stat;
|
||||
eval {
|
||||
if ($log_socket eq '') {
|
||||
# calling setlogsock is optional, let Sys::Syslog choose a default
|
||||
} else {
|
||||
dbg("logger: calling setlogsock($log_socket)");
|
||||
setlogsock($log_socket) or die "setlogsock($log_socket) failed: $!";
|
||||
}
|
||||
dbg("logger: opening syslog with $log_socket socket");
|
||||
# the next call is required to actually open the socket
|
||||
openlog($self->{ident}, 'cons,pid,ndelay', $self->{log_facility});
|
||||
$syslog_open = 1;
|
||||
1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("logger: connection to syslog/$log_socket failed: $eval_stat");
|
||||
};
|
||||
|
||||
# Solaris sometimes doesn't support UNIX-domain syslog sockets apparently;
|
||||
# the same is true for perl 5.6.0 build on an early version of Red Hat 7!
|
||||
# In these cases we try it with INET instead.
|
||||
# See also Bug 6267 and Bug 6331.
|
||||
|
||||
if (defined($eval_stat) && $log_socket ne 'inet') {
|
||||
dbg("logger: trying setlogsock('inet')");
|
||||
undef $eval_stat;
|
||||
eval {
|
||||
setlogsock('inet') or die "setlogsock('inet') failed: $!";
|
||||
dbg("logger: opening syslog using inet socket");
|
||||
openlog($self->{ident}, 'cons,pid,ndelay', $self->{log_facility});
|
||||
$syslog_open = 1;
|
||||
1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("logger: connection to syslog/inet failed: $eval_stat");
|
||||
};
|
||||
}
|
||||
|
||||
# we failed!
|
||||
if (defined $eval_stat) {
|
||||
return 0;
|
||||
}
|
||||
else {
|
||||
dbg("logger: successfully connected to syslog/$log_socket");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
sub log_message {
|
||||
my ($self, $level, $msg, $ts) = @_;
|
||||
|
||||
return if $self->{disabled};
|
||||
|
||||
# map level names
|
||||
$level = $prio_map{$level};
|
||||
if (!defined $level) { # just in case
|
||||
$level = 'err';
|
||||
$msg = '(bad prio: ' . $_[1] . ') ' . $msg;
|
||||
}
|
||||
|
||||
# install a new handler for SIGPIPE -- this signal has been
|
||||
# found to occur with syslog-ng after syslog-ng restarts.
|
||||
local $SIG{'PIPE'} = sub {
|
||||
$self->{SIGPIPE_RECEIVED}++;
|
||||
# force a log-close. trap possible die() calls
|
||||
eval { closelog() } if $syslog_open;
|
||||
$syslog_open = 0;
|
||||
};
|
||||
|
||||
my $timestamp = '';
|
||||
my $fmt = $self->{timestamp_fmt};
|
||||
if (defined $fmt && $fmt ne '') { # for completeness, rarely used
|
||||
my $now = defined $ts ? $ts : Time::HiRes::time;
|
||||
$timestamp = POSIX::strftime($fmt, localtime($now));
|
||||
}
|
||||
$msg = $timestamp . ' ' . $msg if $timestamp ne '';
|
||||
|
||||
# no longer needed since a patch to bug 6745:
|
||||
# # important: do not call syslog() from the SIGCHLD handler
|
||||
# # child_handler(). otherwise we can get into a loop if syslog()
|
||||
# # forks a process -- as it does in syslog-ng apparently! (bug 3625)
|
||||
# $Mail::SpamAssassin::Logger::LOG_SA{INHIBIT_LOGGING_IN_SIGCHLD_HANDLER} = 1;
|
||||
|
||||
my $eval_stat;
|
||||
eval {
|
||||
syslog($level, "%s", $msg); 1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
};
|
||||
|
||||
# no longer needed since a patch to bug 6745:
|
||||
# $Mail::SpamAssassin::Logger::LOG_SA{INHIBIT_LOGGING_IN_SIGCHLD_HANDLER} = 0;
|
||||
|
||||
if (defined $eval_stat) {
|
||||
if ($self->check_syslog_sigpipe($msg)) {
|
||||
# dealt with
|
||||
}
|
||||
else {
|
||||
warn "logger: syslog failed: $eval_stat\n";
|
||||
|
||||
# only write this warning once, it gets annoying fast
|
||||
if (!$self->{already_done_failure_warning}) {
|
||||
warn "logger: try using --syslog-socket={unix,inet} or --syslog=file\n";
|
||||
$self->{already_done_failure_warning} = 1;
|
||||
}
|
||||
}
|
||||
$self->syslog_incr_failure_counter();
|
||||
}
|
||||
else {
|
||||
$self->{consecutive_failures} = 0;
|
||||
$self->check_syslog_sigpipe($msg); # check for SIGPIPE anyway (bug 3625)
|
||||
}
|
||||
|
||||
$SIG{PIPE} = 'IGNORE'; # this may have been reset (bug 4026)
|
||||
}
|
||||
|
||||
sub check_syslog_sigpipe {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
if (!$self->{SIGPIPE_RECEIVED}) {
|
||||
return 0; # didn't have a SIGPIPE
|
||||
}
|
||||
|
||||
eval {
|
||||
# SIGPIPE received when writing to syslog -- close and reopen
|
||||
# the log handle, then try again.
|
||||
closelog() if $syslog_open;
|
||||
$syslog_open = 0;
|
||||
openlog($self->{ident}, 'cons,pid,ndelay', $self->{log_facility});
|
||||
$syslog_open = 1;
|
||||
syslog('debug', "%s", "syslog reopened");
|
||||
syslog('info', "%s", $msg);
|
||||
|
||||
# now report what happened
|
||||
$msg = "SIGPIPE received, reopening log socket";
|
||||
dbg("log: $msg");
|
||||
syslog('info', "%s", $msg);
|
||||
|
||||
# if we've received multiple sigpipes, logging is probably still broken.
|
||||
if ($self->{SIGPIPE_RECEIVED}) {
|
||||
warn "logger: syslog failure: multiple SIGPIPEs received\n";
|
||||
$self->{disabled} = 1;
|
||||
}
|
||||
|
||||
$self->{SIGPIPE_RECEIVED} = 0;
|
||||
return 1;
|
||||
1; # just to not forget a good habit
|
||||
} or do { # something died? that's not good.
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("log: failure in check_syslog_sigpipe: $eval_stat");
|
||||
$self->syslog_incr_failure_counter();
|
||||
}
|
||||
}
|
||||
|
||||
sub syslog_incr_failure_counter {
|
||||
my ($self) = @_;
|
||||
|
||||
$self->{consecutive_failures}++;
|
||||
if ($self->{consecutive_failures}++ > $self->{failure_threshold}) {
|
||||
warn("logger: syslog() failed " . $self->{consecutive_failures} .
|
||||
" times in a row, disabled\n");
|
||||
$self->{disabled} = 1;
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub close_log {
|
||||
my ($self) = @_;
|
||||
|
||||
closelog() if $syslog_open;
|
||||
$syslog_open = 0;
|
||||
}
|
||||
|
||||
1;
|
151
upstream/lib/Mail/SpamAssassin/MailingList.pm
Normal file
151
upstream/lib/Mail/SpamAssassin/MailingList.pm
Normal file
@ -0,0 +1,151 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
# Eval Tests to detect genuine mailing lists.
|
||||
|
||||
use strict; # make Test::Perl::Critic happy
|
||||
package Mail::SpamAssassin::MailingList; 1;
|
||||
|
||||
package Mail::SpamAssassin::PerMsgStatus;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
sub detect_mailing_list {
|
||||
my ($self) = @_;
|
||||
return 1 if $self->detect_ml_ezmlm();
|
||||
return 1 if $self->detect_ml_mailman();
|
||||
return 1 if $self->detect_ml_sympa();
|
||||
return 0;
|
||||
}
|
||||
|
||||
# EZMLM
|
||||
# Mailing-List: .*run by ezmlm
|
||||
# Precedence: bulk
|
||||
# List-Post: <mailto:
|
||||
# List-Help: <mailto:
|
||||
# List-Unsubscribe: <mailto:[a-zA-Z\.-]+-unsubscribe@
|
||||
# List-Subscribe: <mailto:[a-zA-Z\.-]+-subscribe@
|
||||
sub detect_ml_ezmlm {
|
||||
my ($self) = @_;
|
||||
return 0 unless $self->get('mailing-list') =~ /ezmlm$/;
|
||||
return 0 unless $self->get('precedence') eq "bulk\n";
|
||||
return 0 unless $self->get('list-post') =~ /^<mailto:/i;
|
||||
return 0 unless $self->get('list-help') =~ /^<mailto:/i;
|
||||
return 0 unless $self->get('list-unsubscribe') =~ /<mailto:[a-zA-Z\.-]+-unsubscribe\@/i;
|
||||
return 0 unless $self->get('list-subscribe') =~ /<mailto:[a-zA-Z\.-]+-subscribe\@/i;
|
||||
return 1; # assume ezmlm then.
|
||||
}
|
||||
|
||||
# MailMan (the gnu mailing list manager)
|
||||
# Precedence: bulk [or list for v2]
|
||||
# List-Help: <mailto:
|
||||
# List-Post: <mailto:
|
||||
# List-Subscribe: .*<mailto:.*=subscribe>
|
||||
# List-Id:
|
||||
# List-Unsubscribe: .*<mailto:.*=unsubscribe>
|
||||
# List-Archive:
|
||||
# X-Mailman-Version: \d
|
||||
#
|
||||
# However, for mailing list membership reminders, most of
|
||||
# those headers are gone, so we identify on the following:
|
||||
#
|
||||
# Subject: ...... mailing list memberships reminder (v1)
|
||||
# or X-List-Administrivia: yes (only in version 2)
|
||||
# X-Mailman-Version: \d
|
||||
# Precedence: bulk [or list for v2]
|
||||
# X-No-Archive: yes
|
||||
# Errors-To:
|
||||
# X-BeenThere:
|
||||
sub detect_ml_mailman {
|
||||
my ($self) = @_;
|
||||
return 0 unless $self->get('x-mailman-version') =~ /^\d/;
|
||||
return 0 unless $self->get('precedence') =~ /^(?:bulk|list)$/;
|
||||
|
||||
if ($self->get('x-list-administrivia') =~ /yes/ ||
|
||||
$self->get('subject') =~ /mailing list memberships reminder$/)
|
||||
{
|
||||
return 0 unless defined $self->get('errors-to',undef);
|
||||
return 0 unless defined $self->get('x-beenthere',undef);
|
||||
return 0 unless $self->get('x-no-archive') =~ /yes/;
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0 unless defined $self->get('list-id',undef);
|
||||
return 0 unless $self->get('list-help') =~ /^<mailto:/i;
|
||||
return 0 unless $self->get('list-post') =~ /^<mailto:/i;
|
||||
return 0 unless $self->get('list-subscribe') =~ /<mailto:.*=subscribe>/i;
|
||||
return 0 unless $self->get('list-unsubscribe') =~ /<mailto:.*=unsubscribe>/i;
|
||||
return 1; # assume this is a valid mailman list
|
||||
}
|
||||
|
||||
# Sympa
|
||||
# Return-Path: somelist-owner@somedomain.com [...]
|
||||
# Precedence: list [...]
|
||||
# List-Id: <somelist@somedomain.com>
|
||||
# List-Help: <mailto:sympa@somedomain.com?subject=help>
|
||||
# List-Subscribe: <mailto:somedomain.com?subject=subscribe%20somelist>
|
||||
# List-Unsubscribe: <mailto:sympa@somedomain.com?subject=unsubscribe%somelist>
|
||||
# List-Post: <mailto:somelist@somedomain.com>
|
||||
# List-Owner: <mailto:somelist-request@somedomain.com>
|
||||
# X-Mailer: Sympa 6.2.22
|
||||
# X-Sympa-To: somelist@somedomain.com
|
||||
# [and optionally] List-Archive: <http://www.somedomain.com/wws/arc/somelist>
|
||||
|
||||
sub detect_ml_sympa {
|
||||
my ($self) = @_;
|
||||
return 1 if $self->get('X-Mailer') =~ /^Sympa \d\.\d\.\d/;
|
||||
return 1 if defined ($self->get('X-Sympa-To',undef));
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Lyris
|
||||
# Not implemented - need headers
|
||||
sub detect_ml_lyris {
|
||||
}
|
||||
|
||||
# ListBuilder
|
||||
# Sep 17 2002 jm: turned off due to bad S/O ratio
|
||||
|
||||
# sub detect_ml_listbuilder {
|
||||
# my ($self, $full) = @_;
|
||||
#
|
||||
# my $reply = $self->get('Reply-To:addr');
|
||||
# if ($reply !~ /\@lb.bcentral.com/) { return 0; }
|
||||
#
|
||||
# # Received: from unknown (HELO lbrout14.listbuilder.com) (204.71.191.9)
|
||||
# my $rcvd = $self->get('received');
|
||||
# return 0 unless ($rcvd =~ /\blbrout\d+\.listbuilder\.com\b/i);
|
||||
# return 0 unless ($rcvd =~ /\b204\.71\.191\.\d+\b/);
|
||||
#
|
||||
# # _______________________________________________________________________
|
||||
# # Powered by List Builder
|
||||
# # To unsubscribe follow the link:
|
||||
# # http://lb.bcentral.com/ex/sp?c=19511&s=76CA511711046877&m=14
|
||||
# $full = join ("\n", @{$full});
|
||||
#
|
||||
# if ($full !~ /__________________{40,}\s+Powered by List Builder\s/) { return 0; }
|
||||
# if ($full !~
|
||||
# m,\shttp://lb\.bcentral\.com/ex/sp\?c=[0-9A-Z]*&s=[0-9A-Z]*&m=[0-9A-Z]*\s,)
|
||||
# { return 0; }
|
||||
#
|
||||
# return 1;
|
||||
# }
|
||||
|
||||
1;
|
1344
upstream/lib/Mail/SpamAssassin/Message.pm
Normal file
1344
upstream/lib/Mail/SpamAssassin/Message.pm
Normal file
File diff suppressed because it is too large
Load Diff
112
upstream/lib/Mail/SpamAssassin/Message/Metadata.pm
Normal file
112
upstream/lib/Mail/SpamAssassin/Message/Metadata.pm
Normal file
@ -0,0 +1,112 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Message::Metadata - extract metadata from a message
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This class is tasked with extracting "metadata" from messages for use as
|
||||
Bayes tokens, fodder for eval tests, or other rules. Metadata is
|
||||
supplemental data inferred from the message, like the examples below.
|
||||
|
||||
It is held in two forms:
|
||||
|
||||
1. as name-value pairs of strings, presented in mail header format. For
|
||||
example, "X-Languages" => "en". This is the general form for simple
|
||||
metadata that's useful as Bayes tokens, can be added to marked-up
|
||||
messages using "add_header", etc., such as the trusted-relay inference
|
||||
and language detection.
|
||||
|
||||
2. as more complex data structures on the $msg->{metadata} object. This
|
||||
is the form used for metadata like the HTML parse data, which is stored
|
||||
there for access by eval rule code. Because it's not simple strings,
|
||||
it's not added as a Bayes token by default (Bayes needs simple strings).
|
||||
|
||||
=head1 PUBLIC METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Message::Metadata;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address);
|
||||
use Mail::SpamAssassin::Message::Metadata::Received;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
=item new()
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub new {
|
||||
my ($class, $msg) = @_;
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my $self = {
|
||||
msg => $msg,
|
||||
strings => { }
|
||||
};
|
||||
|
||||
bless($self,$class);
|
||||
$self;
|
||||
}
|
||||
|
||||
sub extract {
|
||||
my ($self, $msg, $permsgstatus) = @_;
|
||||
|
||||
# pre-chew Received headers
|
||||
$self->parse_received_headers ($permsgstatus, $msg);
|
||||
|
||||
foreach my $tuple (
|
||||
[$self->{relays_trusted}, 'RELAYSTRUSTEDREVIP' ],
|
||||
[$self->{relays_untrusted}, 'RELAYSUNTRUSTEDREVIP'],
|
||||
[$self->{relays_internal}, 'RELAYSINTERNALREVIP' ],
|
||||
[$self->{relays_external}, 'RELAYSEXTERNALREVIP' ])
|
||||
{ my($rly, $tag) = @$tuple;
|
||||
my @revips;
|
||||
@revips = map {
|
||||
my($ip,$revip);
|
||||
$ip = $_->{ip} if ref $_ && !$_->{ip_private};
|
||||
$revip = reverse_ip_address($ip) if defined $ip && $ip ne '';
|
||||
defined $revip && $revip ne '' ? $revip : ();
|
||||
} @$rly if $rly;
|
||||
$permsgstatus->set_tag($tag,
|
||||
@revips == 1 ? $revips[0] : \@revips) if @revips;
|
||||
}
|
||||
|
||||
$permsgstatus->{main}->call_plugins("extract_metadata",
|
||||
{ msg => $msg, permsgstatus => $permsgstatus,
|
||||
conf => $permsgstatus->{main}->{conf} });
|
||||
}
|
||||
|
||||
sub finish {
|
||||
my ($self) = @_;
|
||||
%{$self} = ();
|
||||
}
|
||||
|
||||
1;
|
1443
upstream/lib/Mail/SpamAssassin/Message/Metadata/Received.pm
Normal file
1443
upstream/lib/Mail/SpamAssassin/Message/Metadata/Received.pm
Normal file
File diff suppressed because it is too large
Load Diff
1029
upstream/lib/Mail/SpamAssassin/Message/Node.pm
Normal file
1029
upstream/lib/Mail/SpamAssassin/Message/Node.pm
Normal file
File diff suppressed because it is too large
Load Diff
337
upstream/lib/Mail/SpamAssassin/NetSet.pm
Normal file
337
upstream/lib/Mail/SpamAssassin/NetSet.pm
Normal file
@ -0,0 +1,337 @@
|
||||
# Mail::SpamAssassin::NetSet - object to manipulate CIDR net IP addrs
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::NetSet;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Time::HiRes qw(time);
|
||||
use NetAddr::IP 4.000;
|
||||
|
||||
use Mail::SpamAssassin::Util;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our $have_patricia;
|
||||
BEGIN {
|
||||
eval {
|
||||
require Net::Patricia;
|
||||
Net::Patricia->VERSION(1.16); # need AF_INET6 support
|
||||
import Net::Patricia;
|
||||
$have_patricia = 1;
|
||||
};
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my ($class,$netset_name) = @_;
|
||||
$class = ref($class) || $class;
|
||||
|
||||
$netset_name = '' if !defined $netset_name; # object name for debugging
|
||||
my $self = {
|
||||
name => $netset_name, num_nets => 0,
|
||||
cache_hits => 0, cache_attempts => 0,
|
||||
};
|
||||
$self->{pt} = Net::Patricia->new(&AF_INET6) if $have_patricia;
|
||||
|
||||
bless $self, $class;
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub DESTROY {
|
||||
my($self) = shift;
|
||||
if (exists $self->{cache}) {
|
||||
local($@, $!, $_); # protect outer layers from a potential surprise
|
||||
my($hits, $attempts) = ($self->{cache_hits}, $self->{cache_attempts});
|
||||
dbg("netset: cache %s hits/attempts: %d/%d, %.1f %%",
|
||||
$self->{name}, $hits, $attempts, 100*$hits/$attempts) if $attempts > 0;
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub add_cidr {
|
||||
my ($self, @nets) = @_;
|
||||
|
||||
$self->{nets} ||= [ ];
|
||||
my $numadded = 0;
|
||||
delete $self->{cache}; # invalidate cache (in case of late additions)
|
||||
|
||||
foreach my $cidr_orig (@nets) {
|
||||
my $cidr = $cidr_orig; # leave original unchanged, useful for logging
|
||||
|
||||
# recognizes syntax:
|
||||
# [IPaddr%scope]/len or IPaddr%scope/len or IPv4addr/mask
|
||||
# optionally prefixed by a '!' to indicate negation (exclusion);
|
||||
# the %scope (i.e. interface), /len or /mask are optional
|
||||
|
||||
local($1,$2,$3,$4);
|
||||
$cidr =~ s/^\s+//;
|
||||
my $exclude = ($cidr =~ s/^!\s*//) ? 1 : 0;
|
||||
|
||||
my $masklen; # netmask or a prefix length
|
||||
$masklen = $1 if $cidr =~ s{ / (.*) \z }{}xs;
|
||||
|
||||
# discard optional brackets
|
||||
$cidr = $1 if $cidr =~ /^ \[ ( [^\]]* ) \] \z/xs;
|
||||
|
||||
my $scope;
|
||||
# IPv6 Scoped Address (RFC 4007, RFC 6874, RFC 3986 "unreserved" charset)
|
||||
if ($cidr =~ s/ % ( [A-Z0-9._~-]* ) \z //xsi) { # scope <zone_id> ?
|
||||
$scope = $1; # interface specification
|
||||
# discard interface specification, currently just ignored
|
||||
info("netset: ignoring interface scope '%%%s' in IP address %s",
|
||||
$scope, $cidr_orig);
|
||||
}
|
||||
|
||||
my $is_ip4 = 0;
|
||||
if ($cidr =~ /^ \d+ (\. | \z) /x) { # looks like an IPv4 address
|
||||
if ($cidr =~ /^ (\d+) \. (\d+) \. (\d+) \. (\d+) \z/x) {
|
||||
# also strips leading zeroes, not liked by inet_pton
|
||||
$cidr = sprintf('%d.%d.%d.%d', $1,$2,$3,$4);
|
||||
$masklen = 32 if !defined $masklen;
|
||||
} elsif ($cidr =~ /^ (\d+) \. (\d+) \. (\d+) \.? \z/x) {
|
||||
$cidr = sprintf('%d.%d.%d.0', $1,$2,$3);
|
||||
$masklen = 24 if !defined $masklen;
|
||||
} elsif ($cidr =~ /^ (\d+) \. (\d+) \.? \z/x) {
|
||||
$cidr = sprintf('%d.%d.0.0', $1,$2);
|
||||
$masklen = 16 if !defined $masklen;
|
||||
} elsif ($cidr =~ /^ (\d+) \.? \z/x) {
|
||||
$cidr = sprintf('%d.0.0.0', $1);
|
||||
$masklen = 8 if !defined $masklen;
|
||||
} else {
|
||||
warn "netset: illegal IPv4 address given: '$cidr_orig'\n";
|
||||
next;
|
||||
}
|
||||
$is_ip4 = 1;
|
||||
}
|
||||
|
||||
if ($self->{pt}) {
|
||||
if (defined $masklen) {
|
||||
$masklen =~ /^\d{1,3}\z/
|
||||
or die "Network mask not supported, use a CIDR syntax: '$cidr_orig'";
|
||||
}
|
||||
my $key = $cidr;
|
||||
my $prefix_len = $masklen;
|
||||
if ($is_ip4) {
|
||||
$key = '::ffff:' . $key; # turn it into an IPv4-mapped IPv6 addresses
|
||||
$prefix_len += 96 if defined $prefix_len;
|
||||
}
|
||||
$prefix_len = 128 if !defined $prefix_len;
|
||||
$key .= '/' . $prefix_len;
|
||||
# dbg("netset: add_cidr (patricia trie) %s => %s",
|
||||
# $cidr_orig, $exclude ? '!'.$key : $key);
|
||||
defined eval {
|
||||
$self->{pt}->add_string($key, $exclude ? '!'.$key : $key)
|
||||
} or warn "netset: illegal IP address given (patricia trie): ".
|
||||
"'$key': $@\n";
|
||||
}
|
||||
|
||||
$cidr .= '/' . $masklen if defined $masklen;
|
||||
|
||||
my $ip = NetAddr::IP->new($cidr);
|
||||
if (!defined $ip) {
|
||||
warn "netset: illegal IP address given: '$cidr_orig'\n";
|
||||
next;
|
||||
}
|
||||
# dbg("netset: add_cidr %s => %s => %s", $cidr_orig, $cidr, $ip);
|
||||
|
||||
# if this is an IPv4 address, create an IPv6 representation, too
|
||||
my ($ip4, $ip6);
|
||||
if ($is_ip4) {
|
||||
$ip4 = $ip;
|
||||
$ip6 = $self->_convert_ipv4_cidr_to_ipv6($cidr);
|
||||
} else {
|
||||
$ip6 = $ip;
|
||||
}
|
||||
|
||||
# bug 5931: this is O(n^2). bad if there are lots of nets. There are good
|
||||
# reasons to keep it for linting purposes, though, so don't start skipping
|
||||
# it until we have over 200 nets in our list
|
||||
if (scalar @{$self->{nets}} < 200) {
|
||||
next if ($self->is_net_declared($ip4, $ip6, $exclude, 0));
|
||||
}
|
||||
|
||||
# note: it appears a NetAddr::IP object takes up about 279 bytes
|
||||
push @{$self->{nets}}, {
|
||||
exclude => $exclude,
|
||||
ip4 => $ip4,
|
||||
ip6 => $ip6,
|
||||
as_string => $cidr_orig,
|
||||
};
|
||||
$numadded++;
|
||||
}
|
||||
|
||||
$self->{num_nets} += $numadded;
|
||||
$numadded;
|
||||
}
|
||||
|
||||
sub get_num_nets {
|
||||
my ($self) = @_;
|
||||
return $self->{num_nets};
|
||||
}
|
||||
|
||||
sub _convert_ipv4_cidr_to_ipv6 {
|
||||
my ($self, $cidr) = @_;
|
||||
|
||||
# only do this for IPv4 addresses
|
||||
return unless $cidr =~ /^\d+[.\/]/;
|
||||
|
||||
if ($cidr !~ /\//) { # no mask
|
||||
return NetAddr::IP->new6("::ffff:".$cidr);
|
||||
}
|
||||
|
||||
# else we have a CIDR mask specified. use new6() to do this
|
||||
#
|
||||
my $ip6 = NetAddr::IP->new6($cidr)->cidr;
|
||||
# 127.0.0.1 -> 0:0:0:0:0:0:7F00:0001/128
|
||||
# 127/8 -> 0:0:0:0:0:0:7F00:0/104
|
||||
|
||||
# now, move that from 0:0:0:0:0:0: space to 0:0:0:0:0:ffff: space
|
||||
if (!defined $ip6 || $ip6 !~ /^0:0:0:0:0:0:(.*)$/) {
|
||||
warn "oops! unparseable IPv6 address for $cidr: $ip6";
|
||||
return;
|
||||
}
|
||||
|
||||
return NetAddr::IP->new6("::ffff:$1");
|
||||
}
|
||||
|
||||
sub _nets_contains_network {
|
||||
my ($self, $net4, $net6, $exclude, $quiet, $netname, $declared) = @_;
|
||||
|
||||
return 0 unless (defined $self->{nets});
|
||||
|
||||
foreach my $net (@{$self->{nets}}) {
|
||||
# check to see if the new network is contained by the old network
|
||||
my $in4 = defined $net4 && defined $net->{ip4} && $net->{ip4}->contains($net4);
|
||||
my $in6 = defined $net6 && defined $net->{ip6} && $net->{ip6}->contains($net6);
|
||||
if ($in4 || $in6) {
|
||||
warn sprintf("netset: cannot %s %s as it has already been %s\n",
|
||||
$exclude ? "exclude" : "include",
|
||||
$netname,
|
||||
$net->{exclude} ? "excluded" : "included") unless $quiet;
|
||||
# a network that matches an excluded network isn't contained by "nets"
|
||||
# return 0 if we're not just looking to see if the network was declared
|
||||
return 0 if (!$declared && $net->{exclude});
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub is_net_declared {
|
||||
my ($self, $net4, $net6, $exclude, $quiet) = @_;
|
||||
return $self->_nets_contains_network($net4, $net6, $exclude,
|
||||
$quiet, $net4 || $net6, 1);
|
||||
}
|
||||
|
||||
sub contains_ip {
|
||||
my ($self, $ip) = @_;
|
||||
my $result = 0;
|
||||
|
||||
if (!$self->{num_nets}) { return 0 }
|
||||
|
||||
$self->{cache_attempts}++;
|
||||
if ($self->{cache} && exists $self->{cache}{$ip}) {
|
||||
dbg("netset: %s cached lookup on %s, %d networks, result: %s",
|
||||
$self->{name}, $ip, $self->{num_nets}, $self->{cache}{$ip});
|
||||
$self->{cache_hits}++;
|
||||
return $self->{cache}{$ip};
|
||||
|
||||
} elsif ($self->{pt}) {
|
||||
# do a quick lookup on a Patricia Trie
|
||||
my $t0 = time;
|
||||
local($1,$2,$3,$4); local $_ = $ip;
|
||||
$_ = $1 if /^ \[ ( [^\]]* ) \] \z/xs; # discard optional brackets
|
||||
s/%[A-Z0-9:._-]+\z//si; # discard interface specification
|
||||
if (m{^ (\d+) \. (\d+) \. (\d+) \. (\d+) \z}x) {
|
||||
$_ = sprintf('::ffff:%d.%d.%d.%d', $1,$2,$3,$4);
|
||||
} else {
|
||||
s/^IPv6://si; # discard optional 'IPv6:' prefix
|
||||
}
|
||||
eval { $result = $self->{pt}->match_string($_); 1 } or undef $result;
|
||||
$result = defined $result && $result !~ /^!/ ? 1 : 0;
|
||||
dbg("netset: %s patricia lookup on %s, %d networks, result: %s, %.3f ms",
|
||||
$self->{name}, $ip, $self->{num_nets}, $result, 1000*(time - $t0));
|
||||
} else {
|
||||
# do a sequential search on a list of NetAddr::IP objects
|
||||
my $t0 = time;
|
||||
my ($ip4, $ip6);
|
||||
if ($ip =~ /^\d+\./) {
|
||||
$ip4 = NetAddr::IP->new($ip);
|
||||
$ip6 = $self->_convert_ipv4_cidr_to_ipv6($ip);
|
||||
} else {
|
||||
$ip6 = NetAddr::IP->new($ip);
|
||||
}
|
||||
foreach my $net (@{$self->{nets}}) {
|
||||
if ((defined $ip4 && defined $net->{ip4} && $net->{ip4}->contains($ip4))
|
||||
|| (defined $ip6 && defined $net->{ip6} && $net->{ip6}->contains($ip6))){
|
||||
$result = !$net->{exclude};
|
||||
last;
|
||||
}
|
||||
}
|
||||
dbg("netset: %s lookup on %s, %d networks, result: %s, %.3f ms",
|
||||
$self->{name}, $ip, $self->{num_nets}, $result, 1000*(time - $t0));
|
||||
}
|
||||
|
||||
$self->{cache}{$ip} = $result;
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub contains_net {
|
||||
my ($self, $net) = @_;
|
||||
my $exclude = $net->{exclude};
|
||||
my $net4 = $net->{ip4};
|
||||
my $net6 = $net->{ip6};
|
||||
return $self->_nets_contains_network($net4, $net6, $exclude, 1, "", 0);
|
||||
}
|
||||
|
||||
sub ditch_cache {
|
||||
my ($self) = @_;
|
||||
if (exists $self->{cache}) {
|
||||
dbg("netset: ditch cache on %s", $self->{name});
|
||||
delete $self->{cache};
|
||||
}
|
||||
}
|
||||
|
||||
sub clone {
|
||||
my ($self) = @_;
|
||||
my $dup = Mail::SpamAssassin::NetSet->new($self->{name});
|
||||
if ($self->{nets}) {
|
||||
@{$dup->{nets}} = @{$self->{nets}};
|
||||
}
|
||||
if ($self->{pt}) {
|
||||
my $dup_pt = $dup->{pt};
|
||||
$self->{pt}->climb(sub {
|
||||
my $key = $_[0]; $key =~ s/^!//;
|
||||
defined eval { $dup_pt->add_string($key, $_[0]) }
|
||||
or die "Adding a network $_[0] to a patricia trie failed: $@";
|
||||
1;
|
||||
});
|
||||
}
|
||||
$dup->{num_nets} = $self->{num_nets};
|
||||
return $dup;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
195
upstream/lib/Mail/SpamAssassin/PerMsgLearner.pm
Normal file
195
upstream/lib/Mail/SpamAssassin/PerMsgLearner.pm
Normal file
@ -0,0 +1,195 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::PerMsgLearner - per-message status (spam or not-spam)
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
my $spamtest = new Mail::SpamAssassin ({
|
||||
'rules_filename' => '/etc/spamassassin.rules',
|
||||
'userprefs_filename' => $ENV{HOME}.'/.spamassassin/user_prefs'
|
||||
});
|
||||
my $mail = $spamtest->parse();
|
||||
|
||||
my $status = $spamtest->learn($mail,$id,$isspam,$forget);
|
||||
my $didlearn = $status->did_learn();
|
||||
$status->finish();
|
||||
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
The Mail::SpamAssassin C<learn()> method returns an object of this
|
||||
class. This object encapsulates all the per-message state for
|
||||
the learning process.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::PerMsgLearner;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::Bayes;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my ($main, $msg) = @_;
|
||||
|
||||
my $self = {
|
||||
'main' => $main,
|
||||
'msg' => $msg,
|
||||
'learned' => 0,
|
||||
'master_deadline' => $msg->{master_deadline}, # dflt inherited from msg
|
||||
};
|
||||
|
||||
$self->{conf} = $self->{main}->{conf};
|
||||
|
||||
$self->{bayes_scanner} = $self->{main}->{bayes_scanner};
|
||||
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
# $status->learn_spam($id)
|
||||
#
|
||||
# Learn the message as spam.
|
||||
#
|
||||
# C<$id> is an optional message-identification string, used internally
|
||||
# to tag the message. If it is C<undef>, one will be generated.
|
||||
# It should be unique to that message.
|
||||
#
|
||||
# This is a semi-private API; callers should use
|
||||
# C<$spamtest-E<gt>learn($mail,$id,$isspam,$forget)> instead.
|
||||
|
||||
sub learn_spam {
|
||||
my ($self, $id) = @_;
|
||||
|
||||
# bug 4096
|
||||
# if ($self->{main}->{learn_with_whitelist}) {
|
||||
# $self->{main}->add_all_addresses_to_blacklist ($self->{msg});
|
||||
# }
|
||||
|
||||
# use the real message-id here instead of mass-check's idea of an "id",
|
||||
# as we may deliver the msg into another mbox format but later need
|
||||
# to forget it's training.
|
||||
$self->{learned} = $self->{bayes_scanner}->learn (1, $self->{msg}, $id);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
# $status->learn_ham($id)
|
||||
#
|
||||
# Learn the message as ham.
|
||||
#
|
||||
# C<$id> is an optional message-identification string, used internally
|
||||
# to tag the message. If it is C<undef>, one will be generated.
|
||||
# It should be unique to that message.
|
||||
#
|
||||
# This is a semi-private API; callers should use
|
||||
# C<$spamtest-E<gt>learn($mail,$id,$isspam,$forget)> instead.
|
||||
|
||||
sub learn_ham {
|
||||
my ($self, $id) = @_;
|
||||
|
||||
# bug 4096
|
||||
# if ($self->{main}->{learn_with_whitelist}) {
|
||||
# $self->{main}->add_all_addresses_to_whitelist ($self->{msg});
|
||||
# }
|
||||
|
||||
$self->{learned} = $self->{bayes_scanner}->learn (0, $self->{msg}, $id);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
# $status->forget($id)
|
||||
#
|
||||
# Forget about a previously-learned message.
|
||||
#
|
||||
# C<$id> is an optional message-identification string, used internally
|
||||
# to tag the message. If it is C<undef>, one will be generated.
|
||||
# It should be unique to that message.
|
||||
#
|
||||
# This is a semi-private API; callers should use
|
||||
# C<$spamtest-E<gt>learn($mail,$id,$isspam,$forget)> instead.
|
||||
|
||||
sub forget {
|
||||
my ($self, $id) = @_;
|
||||
|
||||
# bug 4096
|
||||
# if ($self->{main}->{learn_with_whitelist}) {
|
||||
# $self->{main}->remove_all_addresses_from_whitelist ($self->{msg});
|
||||
# }
|
||||
|
||||
$self->{learned} = $self->{bayes_scanner}->forget ($self->{msg}, $id);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $didlearn = $status->did_learn()
|
||||
|
||||
Returns C<1> if the message was learned from or forgotten successfully.
|
||||
|
||||
=cut
|
||||
|
||||
sub did_learn {
|
||||
my ($self) = @_;
|
||||
return ($self->{learned});
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $status->finish()
|
||||
|
||||
Finish with the object.
|
||||
|
||||
=cut
|
||||
|
||||
sub finish {
|
||||
my $self = shift;
|
||||
%{$self} = ();
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
||||
__END__
|
||||
|
||||
=back
|
||||
|
||||
=head1 SEE ALSO
|
||||
|
||||
Mail::SpamAssassin(3)
|
||||
spamassassin(1)
|
||||
|
3317
upstream/lib/Mail/SpamAssassin/PerMsgStatus.pm
Normal file
3317
upstream/lib/Mail/SpamAssassin/PerMsgStatus.pm
Normal file
File diff suppressed because it is too large
Load Diff
162
upstream/lib/Mail/SpamAssassin/PersistentAddrList.pm
Normal file
162
upstream/lib/Mail/SpamAssassin/PersistentAddrList.pm
Normal file
@ -0,0 +1,162 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::PersistentAddrList - persistent address list base class
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
my $factory = PersistentAddrListSubclass->new();
|
||||
$spamtest->set_persistent_addr_list_factory ($factory);
|
||||
... call into SpamAssassin classes...
|
||||
|
||||
SpamAssassin will call:
|
||||
|
||||
my $addrlist = $factory->new_checker($spamtest);
|
||||
$entry = $addrlist->get_addr_entry ($addr);
|
||||
...
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
All persistent address list implementations, used by the auto-whitelist
|
||||
code to track known-good email addresses, use this as a base class.
|
||||
|
||||
See C<Mail::SpamAssassin::DBBasedAddrList> for an example.
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
=over 4
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::PersistentAddrList;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $factory = PersistentAddrListSubclass->new();
|
||||
|
||||
This creates a factory object, which SpamAssassin will call to create
|
||||
a new checker object for the persistent address list.
|
||||
|
||||
=cut
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my $self = { };
|
||||
bless ($self, $class);
|
||||
$self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item my $addrlist = $factory->new_checker();
|
||||
|
||||
Create a new address-list checker object from the factory. Called by the
|
||||
SpamAssassin classes.
|
||||
|
||||
=cut
|
||||
|
||||
sub new_checker {
|
||||
my ($factory, $main) = @_;
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $entry = $addrlist->get_addr_entry ($addr);
|
||||
|
||||
Given an email address C<$addr>, return an entry object with the details of
|
||||
that address.
|
||||
|
||||
The entry object is a reference to a hash, which must contain at least
|
||||
two keys: C<count>, which is the count of times that address has been
|
||||
encountered before; and C<totscore>, which is the total of all scores for
|
||||
messages associated with that address. From these two fields, an average
|
||||
score will be calculated, and the score for the current message will be
|
||||
regressed towards that mean message score.
|
||||
|
||||
The hash can contain whatever other data your back-end needs to store,
|
||||
under other keys.
|
||||
|
||||
The method should never return C<undef>, or a hash that does not contain
|
||||
a C<count> key and a C<totscore> key.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_addr_entry {
|
||||
my ($self, $addr, $signedby) = @_;
|
||||
my $entry = { };
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
return $entry;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $entry = $addrlist->add_score($entry, $score);
|
||||
|
||||
This method should add the given score to the whitelist database for the
|
||||
given entry, and then return the new entry.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_score {
|
||||
my ($self, $entry, $score) = @_;
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $entry = $addrlist->remove_entry ($entry);
|
||||
|
||||
This method should remove the given entry from the whitelist database.
|
||||
|
||||
=cut
|
||||
|
||||
sub remove_entry {
|
||||
my ($self, $entry) = @_;
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $entry = $addrlist->finish ();
|
||||
|
||||
Clean up, if necessary. Called by SpamAssassin when it has finished
|
||||
checking, or adding to, the auto-whitelist database.
|
||||
|
||||
=cut
|
||||
|
||||
sub finish {
|
||||
my ($self) = @_;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
1223
upstream/lib/Mail/SpamAssassin/Plugin.pm
Normal file
1223
upstream/lib/Mail/SpamAssassin/Plugin.pm
Normal file
File diff suppressed because it is too large
Load Diff
491
upstream/lib/Mail/SpamAssassin/Plugin/ASN.pm
Normal file
491
upstream/lib/Mail/SpamAssassin/Plugin/ASN.pm
Normal file
@ -0,0 +1,491 @@
|
||||
# SpamAssassin - ASN Lookup Plugin
|
||||
#
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
#
|
||||
###########################################################################
|
||||
#
|
||||
# Originated by Matthias Leisi, 2006-12-15 (SpamAssassin enhancement #4770).
|
||||
# Modifications by D. Stussy, 2010-12-15 (SpamAssassin enhancement #6484):
|
||||
#
|
||||
# Since SA 3.4.0 a fixed text prefix (such as AS) to each ASN is configurable
|
||||
# through an asn_prefix directive. Its value is 'AS' by default for backward
|
||||
# compatibility with SA 3.3.*, but is rather redundant and can be set to an
|
||||
# empty string for clarity if desired.
|
||||
#
|
||||
# Enhanced TXT-RR decoding for alternative formats from other DNS zones.
|
||||
# Some of the supported formats of TXT RR are (quoted strings here represent
|
||||
# individual string fields in a TXT RR):
|
||||
# "1103" "192.88.99.0" "24"
|
||||
# "559 1103 1239 1257 1299 | 192.88.99.0/24 | US | iana | 2001-06-01"
|
||||
# "192.88.99.0/24 | AS1103 | SURFnet, The Netherlands | 2002-10-15 | EU"
|
||||
# "15169 | 2a00:1450::/32 | IE | ripencc | 2009-10-05"
|
||||
# "as1103"
|
||||
# Multiple routes are sometimes provided by returning multiple TXT records
|
||||
# (e.g. from cymru.com). This form of a response is handled as well.
|
||||
#
|
||||
# Some zones also support IPv6 lookups, for example:
|
||||
# asn_lookup_ipv6 origin6.asn.cymru.com [_ASN_ _ASNCIDR_]
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::ASN - SpamAssassin plugin to look up the
|
||||
Autonomous System Number (ASN) of the connecting IP address.
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::ASN
|
||||
|
||||
asn_lookup asn.routeviews.org _ASN_ _ASNCIDR_
|
||||
|
||||
asn_lookup_ipv6 origin6.asn.cymru.com _ASN_ _ASNCIDR_
|
||||
|
||||
add_header all ASN _ASN_ _ASNCIDR_
|
||||
|
||||
header TEST_AS1234 X-ASN =~ /^1234$/
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin uses DNS lookups to the services of an external DNS zone such
|
||||
as at C<http://www.routeviews.org/> to do the actual work. Please make
|
||||
sure that your use of the plugin does not overload their infrastructure -
|
||||
this generally means that B<you should not use this plugin in a
|
||||
high-volume environment> or that you should use a local mirror of the
|
||||
zone (see C<ftp://ftp.routeviews.org/dnszones/>). Other similar zones
|
||||
may also be used.
|
||||
|
||||
=head1 TEMPLATE TAGS
|
||||
|
||||
This plugin allows you to create template tags containing the connecting
|
||||
IP's AS number and route info for that AS number.
|
||||
|
||||
The default config will add a header field that looks like this:
|
||||
|
||||
X-Spam-ASN: AS24940 213.239.192.0/18
|
||||
|
||||
where "24940" is the ASN and "213.239.192.0/18" is the route
|
||||
announced by that ASN where the connecting IP address came from.
|
||||
If the AS announces multiple networks (more/less specific), they will
|
||||
all be added to the C<_ASNCIDR_> tag, separated by spaces, eg:
|
||||
|
||||
X-Spam-ASN: AS1680 89.138.0.0/15 89.139.0.0/16
|
||||
|
||||
Note that the literal "AS" before the ASN in the _ASN_ tag is configurable
|
||||
through the I<asn_prefix> directive and may be set to an empty string.
|
||||
|
||||
=head1 CONFIGURATION
|
||||
|
||||
The standard ruleset contains a configuration that will add a header field
|
||||
containing ASN data to scanned messages. The bayes tokenizer will use the
|
||||
added header field for bayes calculations, and thus affect which BAYES_* rule
|
||||
will trigger for a particular message.
|
||||
|
||||
B<Note> that in most cases you should not score on the ASN data directly.
|
||||
Bayes learning will probably trigger on the _ASNCIDR_ tag, but probably not
|
||||
very well on the _ASN_ tag alone.
|
||||
|
||||
=head1 SEE ALSO
|
||||
|
||||
http://www.routeviews.org/ - all data regarding routing, ASNs, etc....
|
||||
|
||||
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770 -
|
||||
SpamAssassin Issue #4770 concerning this plugin
|
||||
|
||||
=head1 STATUS
|
||||
|
||||
No in-depth analysis of the usefulness of bayes tokenization of ASN data has
|
||||
been performed.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::ASN;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address);
|
||||
use Mail::SpamAssassin::Dns;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
our $txtdata_can_provide_a_list;
|
||||
|
||||
my $IPV4_ADDRESS = IPV4_ADDRESS;
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsa->{conf});
|
||||
|
||||
#$txtdata_can_provide_a_list = Net::DNS->VERSION >= 0.69;
|
||||
#more robust version check from Damyan Ivanov - Bug 7095
|
||||
$txtdata_can_provide_a_list = version->parse(Net::DNS->VERSION) >= version->parse('0.69');
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
=head1 ADMINISTRATOR SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item asn_lookup asn-zone.example.com [ _ASNTAG_ _ASNCIDRTAG_ ]
|
||||
|
||||
Use this to lookup the ASN info in the specified zone for the first external
|
||||
IPv4 address and add the AS number to the first specified tag and routing info
|
||||
to the second specified tag.
|
||||
|
||||
If no tags are specified the AS number will be added to the _ASN_ tag and the
|
||||
routing info will be added to the _ASNCIDR_ tag. You must specify either none
|
||||
or both of the tag names. Tag names must start and end with an underscore.
|
||||
|
||||
If two or more I<asn_lookup>s use the same set of template tags, the results of
|
||||
their lookups will be appended to each other in the template tag values in no
|
||||
particular order. Duplicate results will be omitted when combining results.
|
||||
In a similar fashion, you can also use the same template tag for both the AS
|
||||
number tag and the routing info tag.
|
||||
|
||||
Examples:
|
||||
|
||||
asn_lookup asn.routeviews.org
|
||||
|
||||
asn_lookup asn.routeviews.org _ASN_ _ASNCIDR_
|
||||
asn_lookup myview.example.com _MYASN_ _MYASNCIDR_
|
||||
|
||||
asn_lookup asn.routeviews.org _COMBINEDASN_ _COMBINEDASNCIDR_
|
||||
asn_lookup myview.example.com _COMBINEDASN_ _COMBINEDASNCIDR_
|
||||
|
||||
asn_lookup in1tag.example.net _ASNDATA_ _ASNDATA_
|
||||
|
||||
=item asn_lookup_ipv6 asn-zone6.example.com [_ASN_ _ASNCIDR_]
|
||||
|
||||
Use specified zone for lookups of IPv6 addresses. If zone supports both
|
||||
IPv4 and IPv6 queries, use both asn_lookup and asn_lookup_ipv6 for the same
|
||||
zone.
|
||||
|
||||
=item clear_asn_lookups
|
||||
|
||||
Removes any previously declared I<asn_lookup> entries from a list of queries.
|
||||
|
||||
=item asn_prefix 'prefix_string' (default: 'AS')
|
||||
|
||||
The string specified in the argument is prepended to each ASN when storing
|
||||
it as a tag. This prefix is rather redundant, but its default value 'AS'
|
||||
is kept for backward compatibility with versions of SpamAssassin earlier
|
||||
than 3.4.0. A sensible setting is an empty string. The argument may be (but
|
||||
need not be) enclosed in single or double quotes for clarity.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'asn_lookup',
|
||||
is_admin => 1,
|
||||
code => sub {
|
||||
my ($conf, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
local($1,$2,$3);
|
||||
unless ($value =~ /^(\S+?)\.?(?:\s+_(\S+)_\s+_(\S+)_)?$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
my ($zone, $asn_tag, $route_tag) = ($1, $2, $3);
|
||||
$asn_tag = 'ASN' if !defined $asn_tag;
|
||||
$route_tag = 'ASNCIDR' if !defined $route_tag;
|
||||
push @{$conf->{asnlookups}},
|
||||
{ zone=>$zone, asn_tag=>$asn_tag, route_tag=>$route_tag };
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'asn_lookup_ipv6',
|
||||
is_admin => 1,
|
||||
code => sub {
|
||||
my ($conf, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
local($1,$2,$3);
|
||||
unless ($value =~ /^(\S+?)\.?(?:\s+_(\S+)_\s+_(\S+)_)?$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
my ($zone, $asn_tag, $route_tag) = ($1, $2, $3);
|
||||
$asn_tag = 'ASN' if !defined $asn_tag;
|
||||
$route_tag = 'ASNCIDR' if !defined $route_tag;
|
||||
push @{$conf->{asnlookups_ipv6}},
|
||||
{ zone=>$zone, asn_tag=>$asn_tag, route_tag=>$route_tag };
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'clear_asn_lookups',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NOARGS,
|
||||
code => sub {
|
||||
my ($conf, $key, $value, $line) = @_;
|
||||
if (defined $value && $value ne '') {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
delete $conf->{asnlookups};
|
||||
delete $conf->{asnlookups_ipv6};
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'asn_prefix',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
default => 'AS',
|
||||
code => sub {
|
||||
my ($conf, $key, $value, $line) = @_;
|
||||
$value = '' if !defined $value;
|
||||
local($1,$2);
|
||||
$value = $2 if $value =~ /^(['"])(.*)\1\z/; # strip quotes if any
|
||||
$conf->{$key} = $value; # keep tainted
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $conf = $self->{main}->{conf};
|
||||
|
||||
if (!$pms->is_dns_available()) {
|
||||
dbg("asn: DNS is not available, skipping ASN checks");
|
||||
return;
|
||||
}
|
||||
|
||||
if (!$conf->{asnlookups} && !$conf->{asnlookups_ipv6}) {
|
||||
dbg("asn: no asn_lookups configured, skipping ASN lookups");
|
||||
return;
|
||||
}
|
||||
|
||||
# initialize the tag data so that if no result is returned from the DNS
|
||||
# query we won't end up with a missing tag. Don't use $pms->set_tag()
|
||||
# here to avoid triggering any tag-dependent action unnecessarily
|
||||
if ($conf->{asnlookups}) {
|
||||
foreach my $entry (@{$conf->{asnlookups}}) {
|
||||
$pms->{tag_data}->{$entry->{asn_tag}} ||= '';
|
||||
$pms->{tag_data}->{$entry->{route_tag}} ||= '';
|
||||
}
|
||||
}
|
||||
if ($conf->{asnlookups_ipv6}) {
|
||||
foreach my $entry (@{$conf->{asnlookups_ipv6}}) {
|
||||
$pms->{tag_data}->{$entry->{asn_tag}} ||= '';
|
||||
$pms->{tag_data}->{$entry->{route_tag}} ||= '';
|
||||
}
|
||||
}
|
||||
|
||||
# get reversed IP address of last external relay to lookup
|
||||
# don't return until we've initialized the template tags
|
||||
my $relay = $pms->{relays_external}->[0];
|
||||
if (!defined $relay) {
|
||||
dbg("asn: no first external relay IP available, skipping ASN check");
|
||||
return;
|
||||
} elsif ($relay->{ip_private}) {
|
||||
dbg("asn: first external relay is a private IP, skipping ASN check");
|
||||
return;
|
||||
}
|
||||
|
||||
my $ip = $relay->{ip};
|
||||
my $reversed_ip = reverse_ip_address($ip);
|
||||
if (defined $reversed_ip) {
|
||||
dbg("asn: using first external relay IP for lookups: %s", $ip);
|
||||
} else {
|
||||
dbg("asn: could not parse first external relay IP: %s, skipping", $ip);
|
||||
return;
|
||||
}
|
||||
|
||||
my $lookup_zone;
|
||||
if ($ip =~ /^$IPV4_ADDRESS$/o) {
|
||||
if (!defined $conf->{asnlookups}) {
|
||||
dbg("asn: asn_lookup for IPv4 not defined, skipping");
|
||||
return;
|
||||
}
|
||||
$lookup_zone = "asnlookups";
|
||||
} else {
|
||||
if (!defined $conf->{asnlookups_ipv6}) {
|
||||
dbg("asn: asn_lookup_ipv6 for IPv6 not defined, skipping");
|
||||
return;
|
||||
}
|
||||
$lookup_zone = "asnlookups_ipv6";
|
||||
}
|
||||
|
||||
# we use arrays and array indices rather than hashes and hash keys
|
||||
# in case someone wants the same zone added to multiple sets of tags
|
||||
my $index = 0;
|
||||
foreach my $entry (@{$conf->{$lookup_zone}}) {
|
||||
# do the DNS query, have the callback process the result
|
||||
my $zone_index = $index;
|
||||
my $zone = $reversed_ip . '.' . $entry->{zone};
|
||||
my $key = "asnlookup-${lookup_zone}-${zone_index}-".$entry->{zone};
|
||||
my $ent = $pms->{async}->bgsend_and_start_lookup($zone, 'TXT', undef,
|
||||
{ type => 'ASN', key => $key, zone => $lookup_zone },
|
||||
sub { my($ent, $pkt) = @_;
|
||||
$self->process_dns_result($pms, $pkt, $zone_index, $lookup_zone) },
|
||||
master_deadline => $pms->{master_deadline}
|
||||
);
|
||||
$pms->register_async_rule_start($key) if $ent;
|
||||
$index++;
|
||||
}
|
||||
}
|
||||
|
||||
#
|
||||
# TXT-RR format of response:
|
||||
# 3 fields, each as one TXT RR <character-string> (RFC 1035): ASN IP MASK
|
||||
# The latter two fields are combined to create a CIDR.
|
||||
# or: At least 2 fields made of a single or multiple
|
||||
# <character-string>s, fields are separated by a vertical bar.
|
||||
# They will be the ASN and CIDR fields in any order.
|
||||
# If only one field is returned, it is the ASN. There will
|
||||
# be no CIDR field in that case.
|
||||
#
|
||||
sub process_dns_result {
|
||||
my ($self, $pms, $pkt, $zone_index, $lookup_zone) = @_;
|
||||
|
||||
my $conf = $self->{main}->{conf};
|
||||
|
||||
my $zone = $conf->{$lookup_zone}[$zone_index]->{zone};
|
||||
my $asn_tag = $conf->{$lookup_zone}[$zone_index]->{asn_tag};
|
||||
my $route_tag = $conf->{$lookup_zone}[$zone_index]->{route_tag};
|
||||
|
||||
my($any_asn_updates, $any_route_updates, $tag_value);
|
||||
|
||||
my(@asn_tag_data, %asn_tag_data_seen);
|
||||
$tag_value = $pms->get_tag($asn_tag);
|
||||
if (defined $tag_value) {
|
||||
my $prefix = $pms->{conf}->{asn_prefix};
|
||||
if (defined $prefix && $prefix ne '') {
|
||||
# must strip prefix before splitting on whitespace
|
||||
$tag_value =~ s/(^| )\Q$prefix\E(?=\d+)/$1/gs;
|
||||
}
|
||||
@asn_tag_data = split(/ /,$tag_value);
|
||||
%asn_tag_data_seen = map(($_,1), @asn_tag_data);
|
||||
}
|
||||
|
||||
my(@route_tag_data, %route_tag_data_seen);
|
||||
$tag_value = $pms->get_tag($route_tag);
|
||||
if (defined $tag_value) {
|
||||
@route_tag_data = split(/ /,$tag_value);
|
||||
%route_tag_data_seen = map(($_,1), @route_tag_data);
|
||||
}
|
||||
|
||||
# NOTE: $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
my @answer = !defined $pkt ? () : $pkt->answer;
|
||||
|
||||
foreach my $rr (@answer) {
|
||||
#dbg("asn: %s: lookup result packet: %s", $zone, $rr->string);
|
||||
next if $rr->type ne 'TXT';
|
||||
my @strings = $txtdata_can_provide_a_list ? $rr->txtdata :
|
||||
$rr->char_str_list; # historical
|
||||
next if !@strings;
|
||||
for (@strings) { utf8::encode($_) if utf8::is_utf8($_) }
|
||||
|
||||
my @items;
|
||||
if (@strings > 1 && join('',@strings) !~ m{\|}) {
|
||||
# routeviews.org style, multiple string fields in a TXT RR
|
||||
@items = @strings;
|
||||
if (@items >= 3 && $items[1] !~ m{/} && $items[2] =~ /^\d+\z/) {
|
||||
$items[1] .= '/' . $items[2]; # append the net mask length to route
|
||||
}
|
||||
} else {
|
||||
# cymru.com and spameatingmonkey.net style, or just a single field
|
||||
@items = split(/\s*\|\s*/, join(' ',@strings));
|
||||
}
|
||||
|
||||
my(@route_value, @asn_value);
|
||||
if (@items && $items[0] =~ /(?: (?:^|\s+) (?:AS)? \d+ )+ \z/xsi) {
|
||||
# routeviews.org and cymru.com style, ASN is the first field,
|
||||
# possibly a whitespace-separated list (e.g. cymru.com)
|
||||
@asn_value = split(' ',$items[0]);
|
||||
@route_value = split(' ',$items[1]) if @items >= 2;
|
||||
} elsif (@items > 1 && $items[1] =~ /(?: (?:^|\s+) (?:AS)? \d+ )+ \z/xsi) {
|
||||
# spameatingmonkey.net style, ASN is the second field
|
||||
@asn_value = split(' ',$items[1]);
|
||||
@route_value = split(' ',$items[0]);
|
||||
} else {
|
||||
dbg("asn: unparseable response: %s", join(' ', map("\"$_\"",@strings)));
|
||||
}
|
||||
|
||||
foreach my $route (@route_value) {
|
||||
if (!defined $route || $route eq '') {
|
||||
# ignore, just in case
|
||||
} elsif ($route =~ m{/0+\z}) {
|
||||
# unassigned/unannounced address space
|
||||
} elsif ($route_tag_data_seen{$route}) {
|
||||
dbg("asn: %s duplicate route %s", $route_tag, $route);
|
||||
} else {
|
||||
dbg("asn: %s added route %s", $route_tag, $route);
|
||||
push(@route_tag_data, $route);
|
||||
$route_tag_data_seen{$route} = 1;
|
||||
$any_route_updates = 1;
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $asn (@asn_value) {
|
||||
$asn =~ s/^AS(?=\d+)//si;
|
||||
if (!$asn || $asn == 4294967295) {
|
||||
# unassigned/unannounced address space
|
||||
} elsif ($asn_tag_data_seen{$asn}) {
|
||||
dbg("asn: %s duplicate asn %s", $asn_tag, $asn);
|
||||
} else {
|
||||
dbg("asn: %s added asn %s", $asn_tag, $asn);
|
||||
push(@asn_tag_data, $asn);
|
||||
$asn_tag_data_seen{$asn} = 1;
|
||||
$any_asn_updates = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if ($any_asn_updates && @asn_tag_data) {
|
||||
$pms->{msg}->put_metadata('X-ASN', join(' ',@asn_tag_data));
|
||||
my $prefix = $pms->{conf}->{asn_prefix};
|
||||
if (defined $prefix && $prefix ne '') { s/^/$prefix/ for @asn_tag_data }
|
||||
$pms->set_tag($asn_tag,
|
||||
@asn_tag_data == 1 ? $asn_tag_data[0] : \@asn_tag_data);
|
||||
}
|
||||
if ($any_route_updates && @route_tag_data) {
|
||||
# Bayes already has X-ASN, Route is pointless duplicate, skip
|
||||
#$pms->{msg}->put_metadata('X-ASN-Route', join(' ',@route_tag_data));
|
||||
$pms->set_tag($route_tag,
|
||||
@route_tag_data == 1 ? $route_tag_data[0] : \@route_tag_data);
|
||||
}
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_asn_lookup_ipv6 { 1 }
|
||||
|
||||
1;
|
633
upstream/lib/Mail/SpamAssassin/Plugin/AWL.pm
Normal file
633
upstream/lib/Mail/SpamAssassin/Plugin/AWL.pm
Normal file
@ -0,0 +1,633 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::AWL - Normalize scores via auto-whitelist
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
To try this out, add this or uncomment this line in init.pre:
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::AWL
|
||||
|
||||
Use the supplied 60_awl.cf file (ie you don't have to do anything) or
|
||||
add these lines to a .cf file:
|
||||
|
||||
header AWL eval:check_from_in_auto_whitelist()
|
||||
describe AWL From: address is in the auto white-list
|
||||
tflags AWL userconf noautolearn
|
||||
priority AWL 1000
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin module provides support for the auto-whitelist. It keeps
|
||||
track of the average SpamAssassin score for senders. Senders are
|
||||
tracked using a combination of their From: address and their IP address.
|
||||
It then uses that average score to reduce the variability in scoring
|
||||
from message to message and modifies the final score by pushing the
|
||||
result towards the historical average. This improves the accuracy of
|
||||
filtering for most email.
|
||||
|
||||
=head1 TEMPLATE TAGS
|
||||
|
||||
This plugin module adds the following C<tags> that can be used as
|
||||
placeholders in certain options. See C<Mail::SpamAssassin::Conf>
|
||||
for more information on TEMPLATE TAGS.
|
||||
|
||||
_AWL_ AWL modifier
|
||||
_AWLMEAN_ Mean score on which AWL modification is based
|
||||
_AWLCOUNT_ Number of messages on which AWL modification is based
|
||||
_AWLPRESCORE_ Score before AWL
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::AWL;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::AutoWhitelist;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_from_in_auto_whitelist");
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
=head1 USER PREFERENCES
|
||||
|
||||
The following options can be used in both site-wide (C<local.cf>) and
|
||||
user-specific (C<user_prefs>) configuration files to customize how
|
||||
SpamAssassin handles incoming email messages.
|
||||
|
||||
=over 4
|
||||
|
||||
=item use_auto_whitelist ( 0 | 1 ) (default: 1)
|
||||
|
||||
Whether to use auto-whitelists. Auto-whitelists track the long-term
|
||||
average score for each sender and then shift the score of new messages
|
||||
toward that long-term average. This can increase or decrease the score
|
||||
for messages, depending on the long-term behavior of the particular
|
||||
correspondent.
|
||||
|
||||
For more information about the auto-whitelist system, please look
|
||||
at the the C<Automatic Whitelist System> section of the README file.
|
||||
The auto-whitelist is not intended as a general-purpose replacement
|
||||
for static whitelist entries added to your config files.
|
||||
|
||||
Note that certain tests are ignored when determining the final
|
||||
message score:
|
||||
|
||||
- rules with tflags set to 'noautolearn'
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'use_auto_whitelist',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
|
||||
=item auto_whitelist_factor n (default: 0.5, range [0..1])
|
||||
|
||||
How much towards the long-term mean for the sender to regress a message.
|
||||
Basically, the algorithm is to track the long-term mean score of messages for
|
||||
the sender (C<mean>), and then once we have otherwise fully calculated the
|
||||
score for this message (C<score>), we calculate the final score for the
|
||||
message as:
|
||||
|
||||
C<finalscore> = C<score> + (C<mean> - C<score>) * C<factor>
|
||||
|
||||
So if C<factor> = 0.5, then we'll move to half way between the calculated
|
||||
score and the mean. If C<factor> = 0.3, then we'll move about 1/3 of the way
|
||||
from the score toward the mean. C<factor> = 1 means just use the long-term
|
||||
mean; C<factor> = 0 mean just use the calculated score.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_factor',
|
||||
default => 0.5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=item auto_whitelist_ipv4_mask_len n (default: 16, range [0..32])
|
||||
|
||||
The AWL database keeps only the specified number of most-significant bits
|
||||
of an IPv4 address in its fields, so that different individual IP addresses
|
||||
within a subnet belonging to the same owner are managed under a single
|
||||
database record. As we have no information available on the allocated
|
||||
address ranges of senders, this CIDR mask length is only an approximation.
|
||||
The default is 16 bits, corresponding to a former class B. Increase the
|
||||
number if a finer granularity is desired, e.g. to 24 (class C) or 32.
|
||||
A value 0 is allowed but is not particularly useful, as it would treat the
|
||||
whole internet as a single organization. The number need not be a multiple
|
||||
of 8, any split is allowed.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_ipv4_mask_len',
|
||||
default => 16,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || $value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
} elsif ($value !~ /^\d+$/ || $value < 0 || $value > 32) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_ipv4_mask_len} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=item auto_whitelist_ipv6_mask_len n (default: 48, range [0..128])
|
||||
|
||||
The AWL database keeps only the specified number of most-significant bits
|
||||
of an IPv6 address in its fields, so that different individual IP addresses
|
||||
within a subnet belonging to the same owner are managed under a single
|
||||
database record. As we have no information available on the allocated address
|
||||
ranges of senders, this CIDR mask length is only an approximation. The default
|
||||
is 48 bits, corresponding to an address range commonly allocated to individual
|
||||
(smaller) organizations. Increase the number for a finer granularity, e.g.
|
||||
to 64 or 96 or 128, or decrease for wider ranges, e.g. 32. A value 0 is
|
||||
allowed but is not particularly useful, as it would treat the whole internet
|
||||
as a single organization. The number need not be a multiple of 4, any split
|
||||
is allowed.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_ipv6_mask_len',
|
||||
default => 48,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || $value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
} elsif ($value !~ /^\d+$/ || $value < 0 || $value > 128) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_ipv6_mask_len} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=item user_awl_sql_override_username
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
If this option is set the SQLBasedAddrList module will override the set
|
||||
username with the value given. This can be useful for implementing global
|
||||
or group based auto-whitelist databases.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'user_awl_sql_override_username',
|
||||
default => '',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item auto_whitelist_distinguish_signed
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
If this option is set the SQLBasedAddrList module will keep separate
|
||||
database entries for DKIM-validated e-mail addresses and for non-validated
|
||||
ones. A pre-requisite when setting this option is that a field awl.signedby
|
||||
exists in a SQL table, otherwise SQL operations will fail (which is why we
|
||||
need this option at all - for compatibility with pre-3.3.0 database schema).
|
||||
A plugin DKIM should also be enabled, as otherwise there is no benefit from
|
||||
turning on this option.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_distinguish_signed',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
|
||||
=back
|
||||
|
||||
=head1 ADMINISTRATOR SETTINGS
|
||||
|
||||
These settings differ from the ones above, in that they are considered 'more
|
||||
privileged' -- even more than the ones in the B<PRIVILEGED SETTINGS> section.
|
||||
No matter what C<allow_user_rules> is set to, these can never be set from a
|
||||
user's C<user_prefs> file.
|
||||
|
||||
=over 4
|
||||
|
||||
=item auto_whitelist_factory module (default: Mail::SpamAssassin::DBBasedAddrList)
|
||||
|
||||
Select alternative whitelist factory module.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_factory',
|
||||
is_admin => 1,
|
||||
default => 'Mail::SpamAssassin::DBBasedAddrList',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item auto_whitelist_path /path/filename (default: ~/.spamassassin/auto-whitelist)
|
||||
|
||||
This is the automatic-whitelist directory and filename. By default, each user
|
||||
has their own whitelist database in their C<~/.spamassassin> directory with
|
||||
mode 0700. For system-wide SpamAssassin use, you may want to share this
|
||||
across all users, although that is not recommended.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_path',
|
||||
is_admin => 1,
|
||||
default => '__userstate__/auto-whitelist',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
if (-d $value) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_path} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=item auto_whitelist_db_modules Module ... (default: see below)
|
||||
|
||||
What database modules should be used for the auto-whitelist storage database
|
||||
file. The first named module that can be loaded from the perl include path
|
||||
will be used. The format is:
|
||||
|
||||
PreferredModuleName SecondBest ThirdBest ...
|
||||
|
||||
ie. a space-separated list of perl module names. The default is:
|
||||
|
||||
DB_File GDBM_File SDBM_File
|
||||
|
||||
NDBM_File is no longer supported, since it appears to have bugs that
|
||||
preclude its use for the AWL (see SpamAssassin bug 4353).
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_db_modules',
|
||||
is_admin => 1,
|
||||
default => 'DB_File GDBM_File SDBM_File',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item auto_whitelist_file_mode (default: 0700)
|
||||
|
||||
The file mode bits used for the automatic-whitelist directory or file.
|
||||
|
||||
Make sure you specify this using the 'x' mode bits set, as it may also be used
|
||||
to create directories. However, if a file is created, the resulting file will
|
||||
not have any execute bits set (the umask is set to 0111).
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_file_mode',
|
||||
is_admin => 1,
|
||||
default => '0700',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value !~ /^0?[0-7]{3}$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_file_mode} = untaint_var($value);
|
||||
}
|
||||
});
|
||||
|
||||
=item user_awl_dsn DBI:databasetype:databasename:hostname:port
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
This will set the DSN used to connect. Example:
|
||||
C<DBI:mysql:spamassassin:localhost>
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'user_awl_dsn',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item user_awl_sql_username username
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
The authorized username to connect to the above DSN.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'user_awl_sql_username',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item user_awl_sql_password password
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
The password for the database username, for the above DSN.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'user_awl_sql_password',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item user_awl_sql_table tablename
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
The table user auto-whitelists are stored in, for the above DSN.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'user_awl_sql_table',
|
||||
is_admin => 1,
|
||||
default => 'awl',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub check_from_in_auto_whitelist {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return 0 unless ($pms->{conf}->{use_auto_whitelist});
|
||||
|
||||
my $timer = $self->{main}->time_method("total_awl");
|
||||
|
||||
my $from = lc $pms->get('From:addr');
|
||||
# dbg("auto-whitelist: From: $from");
|
||||
return 0 unless $from =~ /\S/;
|
||||
|
||||
# find the earliest usable "originating IP". ignore private nets
|
||||
my $origip;
|
||||
foreach my $rly (reverse (@{$pms->{relays_trusted}}, @{$pms->{relays_untrusted}}))
|
||||
{
|
||||
next if ($rly->{ip_private});
|
||||
if ($rly->{ip}) {
|
||||
$origip = $rly->{ip}; last;
|
||||
}
|
||||
}
|
||||
|
||||
my $scores = $pms->{conf}->{scores};
|
||||
my $tflags = $pms->{conf}->{tflags};
|
||||
my $points = 0;
|
||||
my $signedby = $pms->get_tag('DKIMDOMAIN');
|
||||
undef $signedby if defined $signedby && $signedby eq '';
|
||||
|
||||
foreach my $test (@{$pms->{test_names_hit}}) {
|
||||
# ignore tests with 0 score in this scoreset,
|
||||
# or if the test is marked as "noautolearn"
|
||||
next if !$scores->{$test};
|
||||
next if exists $tflags->{$test} && $tflags->{$test} =~ /\bnoautolearn\b/;
|
||||
return 0 if exists $tflags->{$test} && $tflags->{$test} =~ /\bnoawl\b/;
|
||||
$points += $scores->{$test};
|
||||
}
|
||||
|
||||
my $awlpoints = (sprintf "%0.3f", $points) + 0;
|
||||
|
||||
# Create the AWL object
|
||||
my $whitelist;
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($pms->{main});
|
||||
|
||||
my $meanscore;
|
||||
{ # check
|
||||
my $timer = $self->{main}->time_method("check_awl");
|
||||
$meanscore = $whitelist->check_address($from, $origip, $signedby);
|
||||
}
|
||||
my $delta = 0;
|
||||
|
||||
dbg("auto-whitelist: AWL active, pre-score: %s, autolearn score: %s, ".
|
||||
"mean: %s, IP: %s, address: %s %s",
|
||||
$pms->{score}, $awlpoints,
|
||||
!defined $meanscore ? 'undef' : sprintf("%.3f",$meanscore),
|
||||
$origip || 'undef',
|
||||
$from, $signedby ? "signed by $signedby" : '(not signed)');
|
||||
|
||||
if (defined $meanscore) {
|
||||
$delta = $meanscore - $awlpoints;
|
||||
$delta *= $pms->{main}->{conf}->{auto_whitelist_factor};
|
||||
|
||||
$pms->set_tag('AWL', sprintf("%2.1f",$delta));
|
||||
if (defined $meanscore) {
|
||||
$pms->set_tag('AWLMEAN', sprintf("%2.1f", $meanscore));
|
||||
}
|
||||
$pms->set_tag('AWLCOUNT', sprintf("%2.1f", $whitelist->count()));
|
||||
$pms->set_tag('AWLPRESCORE', sprintf("%2.1f", $pms->{score}));
|
||||
}
|
||||
|
||||
# Update the AWL *before* adding the new score, otherwise
|
||||
# early high-scoring messages are reinforced compared to
|
||||
# later ones. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=159704
|
||||
if (!$pms->{disable_auto_learning}) {
|
||||
my $timer = $self->{main}->time_method("update_awl");
|
||||
$whitelist->add_score($awlpoints);
|
||||
}
|
||||
|
||||
# now redundant, got_hit() takes care of it
|
||||
# for my $set (0..3) { # current AWL score changes with each hit
|
||||
# $pms->{conf}->{scoreset}->[$set]->{"AWL"} = sprintf("%0.3f", $delta);
|
||||
# }
|
||||
|
||||
if ($delta != 0) {
|
||||
$pms->got_hit("AWL", "AWL: ", ruletype => 'eval',
|
||||
score => sprintf("%0.3f", $delta));
|
||||
}
|
||||
|
||||
$whitelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
# try an unlock, in case we got that far
|
||||
eval { $whitelist->finish(); } if $whitelist;
|
||||
return 0;
|
||||
};
|
||||
|
||||
dbg("auto-whitelist: post auto-whitelist score: %.3f", $pms->{score});
|
||||
|
||||
# test hit is above
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub blacklist_address {
|
||||
my ($self, $args) = @_;
|
||||
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_whitelist});
|
||||
|
||||
unless ($args->{address}) {
|
||||
print "SpamAssassin auto-whitelist: failed to add address to blacklist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: failed to add address to blacklist");
|
||||
return;
|
||||
}
|
||||
|
||||
my $whitelist;
|
||||
my $status;
|
||||
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($self->{main});
|
||||
|
||||
if ($whitelist->add_known_bad_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-whitelist: adding address to blacklist: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: adding address to blacklist: " . $args->{address});
|
||||
$status = 0;
|
||||
}
|
||||
else {
|
||||
print "SpamAssassin auto-whitelist: error adding address to blacklist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: error adding address to blacklist");
|
||||
$status = 1;
|
||||
}
|
||||
$whitelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
eval { $whitelist->finish(); };
|
||||
return 0;
|
||||
};
|
||||
|
||||
return $status;
|
||||
}
|
||||
|
||||
sub whitelist_address {
|
||||
my ($self, $args) = @_;
|
||||
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_whitelist});
|
||||
|
||||
unless ($args->{address}) {
|
||||
print "SpamAssassin auto-whitelist: failed to add address to whitelist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: failed to add address to whitelist");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $whitelist;
|
||||
my $status;
|
||||
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($self->{main});
|
||||
|
||||
if ($whitelist->add_known_good_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-whitelist: adding address to whitelist: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: adding address to whitelist: " . $args->{address});
|
||||
$status = 1;
|
||||
}
|
||||
else {
|
||||
print "SpamAssassin auto-whitelist: error adding address to whitelist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: error adding address to whitelist");
|
||||
$status = 0;
|
||||
}
|
||||
|
||||
$whitelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
eval { $whitelist->finish(); };
|
||||
return 0;
|
||||
};
|
||||
|
||||
return $status;
|
||||
}
|
||||
|
||||
sub remove_address {
|
||||
my ($self, $args) = @_;
|
||||
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_whitelist});
|
||||
|
||||
unless ($args->{address}) {
|
||||
print "SpamAssassin auto-whitelist: failed to remove address\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: failed to remove address");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $whitelist;
|
||||
my $status;
|
||||
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($self->{main});
|
||||
|
||||
if ($whitelist->remove_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-whitelist: removing address: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: removing address: " . $args->{address});
|
||||
$status = 1;
|
||||
}
|
||||
else {
|
||||
print "SpamAssassin auto-whitelist: error removing address\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: error removing address");
|
||||
$status = 0;
|
||||
}
|
||||
|
||||
$whitelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
eval { $whitelist->finish(); };
|
||||
return 0;
|
||||
};
|
||||
|
||||
return $status;
|
||||
}
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
173
upstream/lib/Mail/SpamAssassin/Plugin/AccessDB.pm
Normal file
173
upstream/lib/Mail/SpamAssassin/Plugin/AccessDB.pm
Normal file
@ -0,0 +1,173 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::AccessDB - check message against Access Database
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::AccessDB
|
||||
|
||||
header ACCESSDB eval:check_access_database('/etc/mail/access.db')
|
||||
describe ACCESSDB Message would have been caught by accessdb
|
||||
tflags ACCESSDB userconf
|
||||
score ACCESSDB 2
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Many MTAs support access databases, such as Sendmail, Postfix, etc.
|
||||
This plugin does similar checks to see whether a message would have
|
||||
been flagged.
|
||||
|
||||
The rule returns false if an entry isn't found, or the entry has a RHS of
|
||||
I<OK> or I<SKIP>.
|
||||
|
||||
The rule returns true if an entry exists and has a RHS of I<REJECT>, I<ERROR>,
|
||||
or I<DISCARD>.
|
||||
|
||||
Note: only the first word (split on non-word characters) of the RHS
|
||||
is checked, so C<error:5.7.1:...> means C<ERROR>.
|
||||
|
||||
B<AccessDB Pointers:>
|
||||
|
||||
http://www.faqs.org/docs/securing/chap22sec178.html
|
||||
http://www.postfix.org/access.5.html
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::AccessDB;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Fcntl;
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
use constant HAS_DB_FILE => eval { require DB_File; };
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule("check_access_database");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub check_access_database {
|
||||
my ($self, $pms, $path) = @_;
|
||||
|
||||
if (!HAS_DB_FILE) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
my %access;
|
||||
my %ok = map { $_ => 1 } qw/ OK SKIP /;
|
||||
my %bad = map { $_ => 1 } qw/ REJECT ERROR DISCARD /;
|
||||
|
||||
$path = $self->{main}->sed_path ($path);
|
||||
dbg("accessdb: tie-ing to DB file R/O in $path");
|
||||
if (tie %access,"DB_File",$path, O_RDONLY) {
|
||||
my @lookfor;
|
||||
|
||||
# Look for "From:" versions as well!
|
||||
foreach my $from ($pms->all_from_addrs()) {
|
||||
# $user."\@"
|
||||
# rotate through $domain and check
|
||||
my ($user,$domain) = split(/\@/, $from,2);
|
||||
push(@lookfor, "From:$from",$from);
|
||||
if ($user) {
|
||||
push(@lookfor, "From:$user\@", "$user\@");
|
||||
}
|
||||
if ($domain) {
|
||||
while ($domain =~ /\./) {
|
||||
push(@lookfor, "From:$domain", $domain);
|
||||
$domain =~ s/^[^.]*\.//;
|
||||
}
|
||||
push(@lookfor, "From:$domain", $domain);
|
||||
}
|
||||
}
|
||||
|
||||
# we can only match this if we have at least 1 untrusted header
|
||||
if ($pms->{num_relays_untrusted} > 0) {
|
||||
my $lastunt = $pms->{relays_untrusted}->[0];
|
||||
|
||||
# If there was a reverse lookup, use it in a lookup
|
||||
if (! $lastunt->{no_reverse_dns}) {
|
||||
my $rdns = $lastunt->{lc_rdns};
|
||||
while($rdns =~ /\./) {
|
||||
push(@lookfor, "From:$rdns", $rdns);
|
||||
$rdns =~ s/^[^.]*\.//;
|
||||
}
|
||||
push(@lookfor, "From:$rdns", $rdns);
|
||||
}
|
||||
|
||||
# do both IP and net (rotate over IP)
|
||||
my ($ip) = $lastunt->{ip};
|
||||
$ip =~ tr/0-9.//cd;
|
||||
while($ip =~ /\./) {
|
||||
push(@lookfor, "From:$ip", $ip);
|
||||
$ip =~ s/\.[^.]*$//;
|
||||
}
|
||||
push(@lookfor, "From:$ip", $ip);
|
||||
}
|
||||
|
||||
my $retval = 0;
|
||||
my %cache;
|
||||
foreach (@lookfor) {
|
||||
next if ($cache{$_}++);
|
||||
dbg("accessdb: looking for $_");
|
||||
|
||||
# Some systems put a null at the end of the key, most don't...
|
||||
my $result = $access{$_} || $access{"$_\000"} || next;
|
||||
|
||||
my ($type) = split(/\W/,$result);
|
||||
$type = uc $type;
|
||||
|
||||
if (exists $ok{$type}) {
|
||||
dbg("accessdb: hit OK: $type, $_");
|
||||
$retval = 0;
|
||||
last;
|
||||
}
|
||||
if (exists $bad{$type} || $type =~ /^\d+$/) {
|
||||
$retval = 1;
|
||||
dbg("accessdb: hit not-OK: $type, $_");
|
||||
}
|
||||
}
|
||||
|
||||
dbg("accessdb: untie-ing DB file $path");
|
||||
untie %access;
|
||||
|
||||
return $retval;
|
||||
}
|
||||
else {
|
||||
dbg("accessdb: cannot open accessdb $path R/O: $!");
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
1;
|
164
upstream/lib/Mail/SpamAssassin/Plugin/AntiVirus.pm
Normal file
164
upstream/lib/Mail/SpamAssassin/Plugin/AntiVirus.pm
Normal file
@ -0,0 +1,164 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
AntiVirus - simple anti-virus tests
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::AntiVirus
|
||||
|
||||
body MICROSOFT_EXECUTABLE eval:check_microsoft_executable()
|
||||
body MIME_SUSPECT_NAME eval:check_suspect_name()
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
The MICROSOFT_EXECUTABLE rule works by checking for 3 possibilities in
|
||||
the message in any application/* or text/* part in the message:
|
||||
|
||||
=over 4
|
||||
|
||||
=item - in text parts, look for a uuencoded executable start string
|
||||
|
||||
=item - in application parts, look for filenames ending in an executable extension
|
||||
|
||||
=item - in application parts, look for a base64 encoded executable start string
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::AntiVirus;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Util;
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule("check_microsoft_executable");
|
||||
$self->register_eval_rule("check_suspect_name");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub check_microsoft_executable {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{antivirus_microsoft_exe};
|
||||
|
||||
return $pms->{antivirus_microsoft_exe};
|
||||
}
|
||||
|
||||
sub check_suspect_name {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{antivirus_suspect_name};
|
||||
|
||||
return $pms->{antivirus_suspect_name};
|
||||
}
|
||||
|
||||
sub _check_attachments {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
$pms->{antivirus_microsoft_exe} = 0;
|
||||
$pms->{antivirus_suspect_name} = 0;
|
||||
|
||||
# MICROSOFT_EXECUTABLE triggered here
|
||||
foreach my $p ($pms->{msg}->find_parts(qr/./, 1)) {
|
||||
my ($ctype, $boundary, $charset, $name) =
|
||||
Mail::SpamAssassin::Util::parse_content_type($p->get_header('content-type'));
|
||||
|
||||
$name = lc($name || '');
|
||||
|
||||
my $cte = lc($p->get_header('content-transfer-encoding') || '');
|
||||
$ctype = lc $ctype;
|
||||
|
||||
if ($name && $name =~ /\.(?:ade|adp|asx|bas|bat|chm|cmd|com|cpl|crt|dll|exe|hlp|hta|inf|ins|isp|js|jse|lnk|mda|mdb|mde|mdt|mdw|mdz|msc|msi|msp|mst|nws|ops|pcd|pif|prf|reg|scf|scr\??|sct|shb|shs|shm|swf|url|vb|vbe|vbs|vbx|vxd|wsc|wsf|wsh)$/)
|
||||
{
|
||||
# file extension indicates an executable
|
||||
$pms->{antivirus_microsoft_exe} = 1;
|
||||
}
|
||||
elsif ($cte =~ /base64/ && defined $p->raw()->[0] &&
|
||||
$p->raw()->[0] =~ /^TV[opqr].A..[AB].[AQgw][A-H].A/)
|
||||
{
|
||||
# base64-encoded executable
|
||||
$pms->{antivirus_microsoft_exe} = 1;
|
||||
}
|
||||
elsif ($ctype =~ /^text\b/) {
|
||||
# uuencoded executable
|
||||
for (@{$p->raw()}) {
|
||||
if (/^M35[GHIJK].`..`..*````/) {
|
||||
# uuencoded executable
|
||||
$pms->{antivirus_microsoft_exe} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# MIME_SUSPECT_NAME triggered here
|
||||
if ($name && $ctype ne "application/octet-stream") {
|
||||
$name =~ s/.*\.//;
|
||||
$ctype =~ s@/(x-|vnd\.)@/@;
|
||||
|
||||
if (
|
||||
# text
|
||||
(($name =~ /^(?:txt|[px]?html?|xml)$/) &&
|
||||
($ctype !~ m@^(?:text/(?:plain|[px]?html?|english|sgml|xml|enriched|richtext)|message/external-body)@)) ||
|
||||
|
||||
# image
|
||||
(($name =~ /^(?:jpe?g|tiff?|gif|png)$/) &&
|
||||
($ctype !~ m@^(?:image/|application/mac-binhex)@)) ||
|
||||
|
||||
# vcard
|
||||
(($name eq "vcf") && $ctype ne "text/vcard") ||
|
||||
|
||||
# application
|
||||
(($name =~ /^(?:bat|com|exe|pif|scr|swf|vbs)$/) &&
|
||||
($ctype !~ m@^application/@)) ||
|
||||
|
||||
# msword
|
||||
(($name eq "doc") && ($ctype !~ m@^application/.*word$@)) ||
|
||||
|
||||
# powerpoint
|
||||
(($name eq "ppt") &&
|
||||
($ctype !~ m@^application/.*(?:powerpoint|ppt)$@)) ||
|
||||
|
||||
# excel
|
||||
(($name eq "xls") && ($ctype !~ m@^application/.*excel$@))
|
||||
)
|
||||
{
|
||||
$pms->{antivirus_suspect_name} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
1;
|
660
upstream/lib/Mail/SpamAssassin/Plugin/AskDNS.pm
Normal file
660
upstream/lib/Mail/SpamAssassin/Plugin/AskDNS.pm
Normal file
@ -0,0 +1,660 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
AskDNS - form a DNS query using tag values, and look up the DNSxL lists
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::AskDNS
|
||||
askdns D_IN_DWL _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT /\b(transaction|list|all)\b/
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Using a DNS query template as specified in a parameter of a askdns rule,
|
||||
the plugin replaces tag names as found in the template with their values
|
||||
and launches DNS queries as soon as tag values become available. When DNS
|
||||
responses trickle in, filters them according to the requested DNS resource
|
||||
record type and optional subrule filtering expression, yielding a rule hit
|
||||
if a response meets filtering conditions.
|
||||
|
||||
=head1 USER SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item rbl_timeout t [t_min] [zone] (default: 15 3)
|
||||
|
||||
The rbl_timeout setting is common to all DNS querying rules (as implemented
|
||||
by other plugins). It can specify a DNS query timeout globally, or individually
|
||||
for each zone. When the zone parameter is specified, the settings affects DNS
|
||||
queries when their query domain equals the specified zone, or is its subdomain.
|
||||
See the C<Mail::SpamAssassin::Conf> POD for details on C<rbl_timeout>.
|
||||
|
||||
=back
|
||||
|
||||
=head1 RULE DEFINITIONS
|
||||
|
||||
=over 4
|
||||
|
||||
=item askdns NAME_OF_RULE query_template [rr_type [subqueryfilter]]
|
||||
|
||||
A query template is a string which will be expanded to produce a domain name
|
||||
to be used in a DNS query. The template may include SpamAssassin tag names,
|
||||
which will be replaced by their values to form a final query domain.
|
||||
The final query domain must adhere to rules governing DNS domains, i.e.
|
||||
must consist of fields each up to 63 characters long, delimited by dots.
|
||||
There may be a trailing dot at the end, but it is redundant / carries
|
||||
no semantics, because SpamAssassin uses a Net::DSN::Resolver::send method
|
||||
for querying DNS, which ignores any 'search' or 'domain' DNS resolver options.
|
||||
Domain names in DNS queries are case-insensitive.
|
||||
|
||||
A tag name is a string of capital letters, preceded and followed by an
|
||||
underscore character. This syntax mirrors the add_header setting, except that
|
||||
tags cannot have parameters in parenthesis when used in askdns templates.
|
||||
Tag names may appear anywhere in the template - each queried DNS zone
|
||||
prescribes how a query should be formed.
|
||||
|
||||
A query template may contain any number of tag names including none,
|
||||
although in the most common anticipated scenario exactly one tag name would
|
||||
appear in each askdns rule. Specified tag names are considered dependencies.
|
||||
Askdns rules with dependencies on the same set of tags are grouped, and all
|
||||
queries in a group are launched as soon as all their dependencies are met,
|
||||
i.e. when the last of the awaited tag values becomes available by a call
|
||||
to set_tag() from some other plugin or elsewhere in the SpamAssassin code.
|
||||
|
||||
Launched queries from all askdns rules are grouped too according to a pair
|
||||
of: query type and an expanded query domain name. Even if there are multiple
|
||||
rules producing the same type/domain pair, only one DNS query is launched,
|
||||
and a reply to such query contributes to all the constituent rules.
|
||||
|
||||
A tag may produce none, one or multiple values. Askdns rules awaiting for
|
||||
a tag which never receives its value never result in a DNS query. Tags which
|
||||
produce multiple values will result in multiple queries launched, each with
|
||||
an expanded template using one of the tag values. An example is a DKIMDOMAIN
|
||||
tag which yields a list of signing domains, one for each valid signature in
|
||||
a signed message.
|
||||
|
||||
When more than one distinct tag name appears in a template, each potentially
|
||||
resulting in multiple values, a Cartesian product is formed, and each tuple
|
||||
results in a launch of one DNS query (duplicates excluded). For example,
|
||||
a query template _A_._B_.example._A_.com where tag A is a list (11,22)
|
||||
and B is (xx,yy,zz), will result in queries: 11.xx.example.11.com,
|
||||
22.xx.example.22.com, 11.yy.example.11.com, 22.yy.example.22.com,
|
||||
11.zz.example.11.com, 22.zz.example.22.com .
|
||||
|
||||
A parameter rr_type following the query template is a comma-separated list
|
||||
of expected DNS resource record (RR) types. Missing rr_type parameter implies
|
||||
an 'A'. A DNS result may bring resource records of multiple types, but only
|
||||
resource records of a type found in the rr_type parameter list are considered,
|
||||
other resource records found in the answer section of a DNS reply are ignored
|
||||
for this rule. A value ANY in the rr_type parameter list matches any resource
|
||||
record type. An empty DNS answer section does not match ANY.
|
||||
|
||||
The rr_type parameter not only provides a filter for RR types found in
|
||||
the DNS answer, but also determines the DNS query type. If only a single
|
||||
RR type is specified in the parameter (e.g. TXT), than this is also the RR
|
||||
type of a query. When more than one RR type is specified (e.g. A, AAAA, TXT)
|
||||
or if ANY is specified, then the DNS query type will be ANY and the rr_type
|
||||
parameter will only act as a filter on a result.
|
||||
|
||||
Currently recognized RR types in the rr_type parameter are: ANY, A, AAAA,
|
||||
MX, TXT, PTR, NAPTR, NS, SOA, CERT, CNAME, DNAME, DHCID, HINFO, MINFO,
|
||||
RP, HIP, IPSECKEY, KX, LOC, SRV, SSHFP, SPF.
|
||||
|
||||
https://www.iana.org/assignments/dns-parameters/dns-parameters.xml
|
||||
|
||||
The last optional parameter of a rule is a filtering expression, a.k.a. a
|
||||
subrule. Its function is much like the subrule in URIDNSBL plugin rules,
|
||||
or in the check_rbl eval rules. The main difference is that with askdns
|
||||
rules there is no need to manually group rules according to their queried
|
||||
zone, as the grouping is automatic and duplicate queries are implicitly
|
||||
eliminated.
|
||||
|
||||
The subrule filtering parameter can be: a plain string, a regular expression,
|
||||
a single numerical value or a pair of numerical values, or a list of rcodes
|
||||
(DNS status codes of a response). Absence of the filtering parameter implies
|
||||
no filtering, i.e. any positive DNS response (rcode=NOERROR) of the requested
|
||||
RR type will result in a rule hit, regardless of the RR value returned with
|
||||
the response.
|
||||
|
||||
When a plain string is used as a filter, it must be enclosed in single or
|
||||
double quotes. For the rule to hit, the response must match the filtering
|
||||
string exactly, and a RR type of a response must match the query type.
|
||||
Typical use is an exact text string for TXT queries, or an exact quad-dotted
|
||||
IPv4 address. In case of a TXT or SPF resource record which can return
|
||||
multiple character-strings (as defined in Section 3.3 of [RFC1035]), these
|
||||
strings are concatenated with no delimiters before comparing the result
|
||||
to the filtering string. This follows requirements of several documents,
|
||||
such as RFC 5518, RFC 7208, RFC 4871, RFC 5617. Examples of a plain text
|
||||
filtering parameter: "127.0.0.1", "transaction", 'list' .
|
||||
|
||||
A regular expression follows a familiar perl syntax like /.../ or m{...}
|
||||
optionally followed by regexp flags (such as 'i' for case-insensitivity).
|
||||
If a DNS response matches the requested RR type and the regular expression,
|
||||
the rule hits. Examples: /^127\.0\.0\.\d+$/, m{\bdial up\b}i .
|
||||
|
||||
A single numerical value can be a decimal number, or a hexadecimal number
|
||||
prefixed by 0x. Such numeric filtering expression is typically used with
|
||||
RR type-A DNS queries. The returned value (an IPv4 address) is masked
|
||||
with a specified filtering value and tested to fall within a 127.0.0.0/8
|
||||
network range - the rule hits if the result is nonzero:
|
||||
((r & n) != 0) && ((r & 0xff000000) == 0x7f000000). An example: 0x10 .
|
||||
|
||||
A pair of numerical values (each a decimal, hexadecimal or quad-dotted)
|
||||
delimited by a '-' specifies an IPv4 address range, and a pair of values
|
||||
delimited by a '/' specifies an IPv4 address followed by a bitmask. Again,
|
||||
this type of filtering expression is primarily intended with RR type-A
|
||||
DNS queries. The rule hits if the RR type matches, and the returned IP
|
||||
address falls within the specified range: (r >= n1 && r <= n2), or
|
||||
masked with a bitmask matches the specified value: (r & m) == (n & m) .
|
||||
|
||||
As a shorthand notation, a single quad-dotted value is equivalent to
|
||||
a n-n form, i.e. it must match the returned value exactly with all its bits.
|
||||
|
||||
Some typical examples of a numeric filtering parameter are: 127.0.1.2,
|
||||
127.0.1.20-127.0.1.39, 127.0.1.0/255.255.255.0, 0.0.0.16/0.0.0.16,
|
||||
0x10/0x10, 16, 0x10 .
|
||||
|
||||
Lastly, the filtering parameter can be a comma-separated list of DNS status
|
||||
codes (rcode), enclosed in square brackets. Rcodes can be represented either
|
||||
by their numeric decimal values (0=NOERROR, 3=NXDOMAIN, ...), or their names.
|
||||
See https://www.iana.org/assignments/dns-parameters for the list of names. When
|
||||
testing for a rcode where rcode is nonzero, a RR type parameter is ignored
|
||||
as a filter, as there is typically no answer section in a DNS reply when
|
||||
rcode indicates an error. Example: [NXDOMAIN], or [FormErr,ServFail,4,5] .
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::AskDNS;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Util qw(decode_dns_question_entry);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use version 0.77;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
our %rcode_value = ( # https://www.iana.org/assignments/dns-parameters, RFC 6195
|
||||
NOERROR => 0, FORMERR => 1, SERVFAIL => 2, NXDOMAIN => 3, NOTIMP => 4,
|
||||
REFUSED => 5, YXDOMAIN => 6, YXRRSET => 7, NXRRSET => 8, NOTAUTH => 9,
|
||||
NOTZONE => 10, BADVERS => 16, BADSIG => 16, BADKEY => 17, BADTIME => 18,
|
||||
BADMODE => 19, BADNAME => 20, BADALG => 21, BADTRUNC => 22,
|
||||
);
|
||||
|
||||
our $txtdata_can_provide_a_list;
|
||||
|
||||
sub new {
|
||||
my($class,$sa_main) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($sa_main);
|
||||
bless($self, $class);
|
||||
|
||||
$self->set_config($sa_main->{conf});
|
||||
|
||||
#$txtdata_can_provide_a_list = Net::DNS->VERSION >= 0.69;
|
||||
#more robust version check from Damyan Ivanov - Bug 7095
|
||||
$txtdata_can_provide_a_list = version->parse(Net::DNS->VERSION) >= version->parse('0.69');
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Accepts argument as a string in single or double quotes, or as a regular
|
||||
# expression in // or m{} notation, or as a numerical value or a pair of
|
||||
# numerical values, or as a bracketed and comma-separated list of DNS rcode
|
||||
# names or their numerical codes. Recognized numerical forms are: m, n1-n2,
|
||||
# or n/m, where n,n1,n2,m can be any of: decimal digits, 0x followed by
|
||||
# up to 8 hexadecimal digits, or an IPv4 address in quad-dotted notation.
|
||||
# The argument is checked for syntax, undef is returned on syntax errors.
|
||||
# A string that looks like a regular expression is converted to a compiled
|
||||
# Regexp object and returned as a result. Otherwise, numeric components of
|
||||
# the remaining three forms are converted as follows: hex or decimal numeric
|
||||
# strings are converted to a number and a quad-dot is converted to a number,
|
||||
# then components are reassembled into a string delimited by '-' or '/'.
|
||||
# As a special backward compatibility measure, a single quad-dot (with no
|
||||
# second number) is converted into n-n, to distinguish it from a traditional
|
||||
# mask-only form. A list or rcodes is returned as a hashref, where keys
|
||||
# represent specified numerical rcodes.
|
||||
#
|
||||
# Arguments like the following are anticipated:
|
||||
# "127.0.0.1", "some text", 'some "more" text',
|
||||
# /regexp/flags, m{regexp}flags,
|
||||
# 127.0.1.2 (same as 127.0.1.2-127.0.1.2 or 127.0.1.2/255.255.255.255)
|
||||
# 127.0.1.20-127.0.1.39 (= 0x7f000114-0x7f000127 or 2130706708-2130706727)
|
||||
# 0.0.0.16/0.0.0.16 (same as 0x10/0x10 or 16/0x10 or 16/16)
|
||||
# 16 (traditional style mask-only, same as 0x10)
|
||||
# [NXDOMAIN], [FormErr,ServFail,4,5]
|
||||
#
|
||||
sub parse_and_canonicalize_subtest {
|
||||
my($subtest) = @_;
|
||||
my $result;
|
||||
|
||||
local($1,$2,$3);
|
||||
# modifiers /a, /d, /l, /u in suffix form were added with perl 5.13.10 (5.14)
|
||||
# currently known modifiers are [msixoadlu], but let's not be too picky here
|
||||
if ( $subtest =~ m{^ / (.+) / ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* \( (.+) \) ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* \[ (.+) \] ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* \{ (.+) \} ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* < (.+) > ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* (\S) (.+) \1 ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ (["']) (.*) \1 \z}xs) { # quoted string
|
||||
$result = $2;
|
||||
} elsif ($subtest =~ m{^ \[ ( (?:[A-Z]+|\d+)
|
||||
(?: \s* , \s* (?:[A-Z]+|\d+) )* ) \] \z}xis) {
|
||||
# a comma-separated list of rcode names or their decimal values
|
||||
my @rcodes = split(/\s*,\s*/, uc $1);
|
||||
for (@rcodes) { $_ = $rcode_value{$_} if exists $rcode_value{$_} }
|
||||
return if grep(!/^\d+\z/, @rcodes);
|
||||
# a hashref indicates a list of DNS rcodes (stored as hash keys)
|
||||
$result = { map( ($_,1), @rcodes) };
|
||||
} elsif ($subtest =~ m{^ ([^/-]+) (?: ([/-]) (.+) )? \z}xs) {
|
||||
my($n1,$delim,$n2) = ($1,$2,$3);
|
||||
my $any_quad_dot;
|
||||
for ($n1,$n2) {
|
||||
if (!defined $_) {
|
||||
# ok, $n2 may not exist
|
||||
} elsif (/^\d{1,10}\z/) {
|
||||
$_ = 0 + $_; # decimal string -> number
|
||||
} elsif (/^0x[0-9a-zA-Z]{1,8}\z/) {
|
||||
$_ = hex($_); # hex string -> number
|
||||
} elsif (/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/) {
|
||||
$_ = Mail::SpamAssassin::Util::my_inet_aton($_); # quad-dot -> number
|
||||
$any_quad_dot = 1;
|
||||
} else {
|
||||
return;
|
||||
}
|
||||
}
|
||||
$result = defined $n2 ? $n1.$delim.$n2
|
||||
: $any_quad_dot ? $n1.'-'.$n1 : "$n1";
|
||||
}
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'askdns',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my($self, $key, $value, $line) = @_;
|
||||
local($1,$2,$3,$4);
|
||||
if (!defined $value || $value =~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
} elsif ($value !~ /^ (\S+) \s+ (\S+)
|
||||
(?: \s+ ([A-Za-z0-9,]+)
|
||||
(?: \s+ (.*?) )? )? \s* $/xs) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
} else {
|
||||
my($rulename,$query_template,$query_type,$subtest) = ($1,$2,$3,$4);
|
||||
$query_type = 'A' if !defined $query_type;
|
||||
$query_type = uc $query_type;
|
||||
my @answer_types = split(/,/, $query_type);
|
||||
# https://www.iana.org/assignments/dns-parameters/dns-parameters.xml
|
||||
if (grep(!/^(?:ANY|A|AAAA|MX|TXT|PTR|NAPTR|NS|SOA|CERT|CNAME|DNAME|
|
||||
DHCID|HINFO|MINFO|RP|HIP|IPSECKEY|KX|LOC|SRV|
|
||||
SSHFP|SPF)\z/x, @answer_types)) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$query_type = 'ANY' if @answer_types > 1 || $answer_types[0] eq 'ANY';
|
||||
if (defined $subtest) {
|
||||
$subtest = parse_and_canonicalize_subtest($subtest);
|
||||
defined $subtest or return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
# collect tag names as used in each query template
|
||||
my @tags = $query_template =~ /_([A-Z][A-Z0-9]*)_/g;
|
||||
my %seen; @tags = grep(!$seen{$_}++, @tags); # filter out duplicates
|
||||
|
||||
# group rules by tag names used in them (to be used as a hash key)
|
||||
my $depends_on_tags = !@tags ? '' : join(',',@tags);
|
||||
|
||||
# subgroup rules by a DNS RR type and a nonexpanded query template
|
||||
my $query_template_key = $query_type . ':' . $query_template;
|
||||
|
||||
$self->{askdns}{$depends_on_tags}{$query_template_key} ||=
|
||||
{ query => $query_template, rules => {}, q_type => $query_type,
|
||||
a_types => # optimization: undef means "same as q_type"
|
||||
@answer_types == 1 && $answer_types[0] eq $query_type ? undef
|
||||
: \@answer_types };
|
||||
$self->{askdns}{$depends_on_tags}{$query_template_key}{rules}{$rulename}
|
||||
= $subtest;
|
||||
# dbg("askdns: rule: %s, config dep: %s, domkey: %s, subtest: %s",
|
||||
# $rulename, $depends_on_tags, $query_template_key, $subtest);
|
||||
|
||||
# just define the test so that scores and lint works
|
||||
$self->{parser}->add_test($rulename, undef,
|
||||
$Mail::SpamAssassin::Conf::TYPE_EMPTY_TESTS);
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
# run as early as possible, launching DNS queries as soon as their
|
||||
# dependencies are fulfilled
|
||||
#
|
||||
sub parsed_metadata {
|
||||
my($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
return if !$pms->is_dns_available;
|
||||
$pms->{askdns_map_dnskey_to_rules} = {};
|
||||
|
||||
# walk through all collected askdns rules, obtain tag values whenever
|
||||
# they may become available, and launch DNS queries right after
|
||||
#
|
||||
for my $depends_on_tags (keys %{$conf->{askdns}}) {
|
||||
my @tags;
|
||||
@tags = split(/,/, $depends_on_tags) if $depends_on_tags ne '';
|
||||
|
||||
if (would_log("dbg","askdns")) {
|
||||
while ( my($query_template_key, $struct) =
|
||||
each %{$conf->{askdns}{$depends_on_tags}} ) {
|
||||
my($query_template, $query_type, $answer_types_ref, $rules) =
|
||||
@$struct{qw(query q_type a_types rules)};
|
||||
dbg("askdns: depend on tags %s, rules: %s ",
|
||||
$depends_on_tags, join(', ', keys %$rules));
|
||||
}
|
||||
}
|
||||
|
||||
if (!@tags) {
|
||||
# no dependencies on tags, just call directly
|
||||
$self->launch_queries($pms,$depends_on_tags);
|
||||
} else {
|
||||
# enqueue callback for tags needed
|
||||
$pms->action_depends_on_tags(@tags == 1 ? $tags[0] : \@tags,
|
||||
sub { my($pms,@args) = @_;
|
||||
$self->launch_queries($pms,$depends_on_tags) }
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# generate DNS queries - called for each set of rules
|
||||
# when their tag dependencies are met
|
||||
#
|
||||
sub launch_queries {
|
||||
my($self, $pms, $depends_on_tags) = @_;
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
my %tags;
|
||||
# obtain tag/value pairs of tags we depend upon in this set of rules
|
||||
if ($depends_on_tags ne '') {
|
||||
%tags = map( ($_,$pms->get_tag($_)), split(/,/,$depends_on_tags) );
|
||||
}
|
||||
dbg("askdns: preparing queries which depend on tags: %s",
|
||||
join(', ', map($_.' => '.$tags{$_}, keys %tags)));
|
||||
|
||||
# replace tag names in a query template with actual tag values
|
||||
# and launch DNS queries
|
||||
while ( my($query_template_key, $struct) =
|
||||
each %{$conf->{askdns}{$depends_on_tags}} ) {
|
||||
my($query_template, $query_type, $answer_types_ref, $rules) =
|
||||
@$struct{qw(query q_type a_types rules)};
|
||||
|
||||
my @rulenames = keys %$rules;
|
||||
if (grep($conf->{scores}->{$_}, @rulenames)) {
|
||||
dbg("askdns: query template %s, type %s, rules: %s",
|
||||
$query_template,
|
||||
!$answer_types_ref ? $query_type
|
||||
: $query_type.'/'.join(',',@$answer_types_ref),
|
||||
join(', ', @rulenames));
|
||||
} else {
|
||||
dbg("askdns: query template %s, type %s, all rules disabled: %s",
|
||||
$query_template, $query_type, join(', ', @rulenames));
|
||||
next;
|
||||
}
|
||||
|
||||
# collect all tag names from a template, each may occur more than once
|
||||
my @templ_tags = $query_template =~ /_([A-Z][A-Z0-9]*)_/gs;
|
||||
|
||||
# filter out duplicate tag names, and tags with undefined or empty value
|
||||
my %seen;
|
||||
@templ_tags = grep(!$seen{$_}++ && defined $tags{$_} && $tags{$_} ne '',
|
||||
@templ_tags);
|
||||
|
||||
my %templ_vals; # values that each tag takes
|
||||
for my $t (@templ_tags) {
|
||||
my %seen;
|
||||
# a tag value may be a space-separated list,
|
||||
# store it as an arrayref, removing duplicate values
|
||||
$templ_vals{$t} = [ grep(!$seen{$_}++, split(' ',$tags{$t})) ];
|
||||
}
|
||||
|
||||
# count through all tag value tuples
|
||||
my @digit = (0) x @templ_tags; # counting accumulator
|
||||
OUTER:
|
||||
for (;;) {
|
||||
my %current_tag_val; # maps a tag name to its current iteration value
|
||||
for my $j (0 .. $#templ_tags) {
|
||||
my $t = $templ_tags[$j];
|
||||
$current_tag_val{$t} = $templ_vals{$t}[$digit[$j]];
|
||||
}
|
||||
local $1;
|
||||
my $query_domain = $query_template;
|
||||
$query_domain =~ s{_([A-Z][A-Z0-9]*)_}
|
||||
{ defined $current_tag_val{$1} ? $current_tag_val{$1}
|
||||
: '' }ge;
|
||||
|
||||
# the $dnskey identifies this query in AsyncLoop's pending_lookups
|
||||
my $dnskey = join(':', 'askdns', $query_type, $query_domain);
|
||||
dbg("askdns: expanded query %s, dns key %s", $query_domain, $dnskey);
|
||||
|
||||
if ($query_domain eq '') {
|
||||
# ignore, just in case
|
||||
} else {
|
||||
if (!exists $pms->{askdns_map_dnskey_to_rules}{$dnskey}) {
|
||||
$pms->{askdns_map_dnskey_to_rules}{$dnskey} =
|
||||
[ [$query_type, $answer_types_ref, $rules] ];
|
||||
} else {
|
||||
push(@{$pms->{askdns_map_dnskey_to_rules}{$dnskey}},
|
||||
[$query_type, $answer_types_ref, $rules] );
|
||||
}
|
||||
# launch a new DNS query for $query_type and $query_domain
|
||||
my $ent = $pms->{async}->bgsend_and_start_lookup(
|
||||
$query_domain, $query_type, undef,
|
||||
{ key => $dnskey, zone => $query_domain },
|
||||
sub { my ($ent2,$pkt) = @_;
|
||||
$self->process_response_packet($pms, $ent2, $pkt, $dnskey) },
|
||||
master_deadline => $pms->{master_deadline} );
|
||||
# these rules are now underway; unless the rule hits, these will
|
||||
# not be considered "finished" until harvest_dnsbl_queries() completes
|
||||
$pms->register_async_rule_start($dnskey) if $ent;
|
||||
}
|
||||
|
||||
last if !@templ_tags;
|
||||
# increment accumulator, little-endian
|
||||
for (my $j = 0; ; $j++) {
|
||||
last if ++$digit[$j] <= $#{$templ_vals{$templ_tags[$j]}};
|
||||
$digit[$j] = 0; # and carry
|
||||
last OUTER if $j >= $#templ_tags;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
sub process_response_packet {
|
||||
my($self, $pms, $ent, $pkt, $dnskey) = @_;
|
||||
|
||||
my $conf = $pms->{conf};
|
||||
my %rulenames_hit;
|
||||
|
||||
# map a dnskey back to info on queries which caused this DNS lookup
|
||||
my $queries_ref = $pms->{askdns_map_dnskey_to_rules}{$dnskey};
|
||||
|
||||
my($header, @question, @answer, $qtype, $rcode);
|
||||
# NOTE: $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
if ($pkt) {
|
||||
@answer = $pkt->answer;
|
||||
$header = $pkt->header;
|
||||
@question = $pkt->question;
|
||||
$qtype = uc $question[0]->qtype if @question;
|
||||
$rcode = uc $header->rcode if $header; # 'NOERROR', 'NXDOMAIN', ...
|
||||
|
||||
# NOTE: qname is encoded in RFC 1035 zone format, decode it
|
||||
dbg("askdns: answer received, rcode %s, query %s, answer has %d records",
|
||||
$rcode,
|
||||
join(', ', map(join('/', decode_dns_question_entry($_)), @question)),
|
||||
scalar @answer);
|
||||
|
||||
if (defined $rcode && exists $rcode_value{$rcode}) {
|
||||
# Net::DNS return a rcode name for codes it knows about,
|
||||
# and returns a number for the rest; we deal with numbers from here on
|
||||
$rcode = $rcode_value{$rcode} if exists $rcode_value{$rcode};
|
||||
}
|
||||
}
|
||||
if (!@answer) {
|
||||
# a trick to make the following loop run at least once, so that we can
|
||||
# evaluate also rules which only care for rcode status
|
||||
@answer = ( undef );
|
||||
}
|
||||
|
||||
# NOTE: $rr->rdstring returns the result encoded in a DNS zone file
|
||||
# format, i.e. enclosed in double quotes if a result contains whitespace
|
||||
# (or other funny characters), and may use \DDD encoding or \X quoting as
|
||||
# per RFC 1035. Using $rr->txtdata instead avoids this unnecessary encoding
|
||||
# step and a need for decoding by a caller, returning an unmodified string.
|
||||
# Caveat: in case of multiple RDATA <character-string> fields contained
|
||||
# in a resource record (TXT, SPF, HINFO), starting with Net::DNS 0.69
|
||||
# the $rr->txtdata in a list context returns these strings as a list.
|
||||
# The $rr->txtdata in a scalar context always returns a single string
|
||||
# with <character-string> fields joined by a single space character as
|
||||
# a separator. The $rr->txtdata in Net::DNS 0.68 and older returned
|
||||
# such joined space-separated string even in a list context.
|
||||
|
||||
# RFC 5518: If the RDATA in a TXT record contains multiple
|
||||
# character-strings (as defined in Section 3.3 of [RFC1035]),
|
||||
# the code handling such reply from DNS MUST assemble all of these
|
||||
# marshaled text blocks into a single one before any syntactical
|
||||
# verification takes place.
|
||||
# The same goes for RFC 4408 (SPF), RFC 4871 (DKIM), RFC 5617 (ADSP),
|
||||
# draft-kucherawy-dmarc-base (DMARC), ...
|
||||
|
||||
for my $rr (@answer) {
|
||||
my($rr_rdatastr, $rdatanum, $rr_type);
|
||||
if (!$rr) {
|
||||
# special case, no answer records, only rcode can be tested
|
||||
} else {
|
||||
$rr_type = uc $rr->type;
|
||||
if ($rr_type eq 'A') {
|
||||
# Net::DNS::RR::A::address() is available since Net::DNS 0.69
|
||||
$rr_rdatastr = $rr->UNIVERSAL::can('address') ? $rr->address
|
||||
: $rr->rdatastr;
|
||||
if ($rr_rdatastr =~ m/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/) {
|
||||
$rdatanum = Mail::SpamAssassin::Util::my_inet_aton($rr_rdatastr);
|
||||
}
|
||||
|
||||
} elsif ($rr->UNIVERSAL::can('txtdata')) {
|
||||
# TXT, SPF: join with no intervening spaces, as per RFC 5518
|
||||
if ($txtdata_can_provide_a_list || $rr_type ne 'TXT') {
|
||||
$rr_rdatastr = join('', $rr->txtdata); # txtdata() in list context!
|
||||
} else { # char_str_list() is only available for TXT records
|
||||
$rr_rdatastr = join('', $rr->char_str_list); # historical
|
||||
}
|
||||
} else {
|
||||
# rdatastr() is historical, use rdstring() since Net::DNS 0.69
|
||||
$rr_rdatastr = $rr->UNIVERSAL::can('rdstring') ? $rr->rdstring
|
||||
: $rr->rdatastr;
|
||||
utf8::encode($rr_rdatastr) if utf8::is_utf8($rr_rdatastr);
|
||||
}
|
||||
# dbg("askdns: received rr type %s, data: %s", $rr_type, $rr_rdatastr);
|
||||
}
|
||||
|
||||
my $j = 0;
|
||||
for my $q_tuple (!ref $queries_ref ? () : @$queries_ref) {
|
||||
next if !$q_tuple;
|
||||
my($query_type, $answer_types_ref, $rules) = @$q_tuple;
|
||||
|
||||
next if !defined $qtype || $query_type ne $qtype;
|
||||
$answer_types_ref = [$query_type] if !defined $answer_types_ref;
|
||||
|
||||
# mark rule as done
|
||||
$pms->{askdns_map_dnskey_to_rules}{$dnskey}[$j++] = undef;
|
||||
|
||||
while (my($rulename,$subtest) = each %$rules) {
|
||||
my $match;
|
||||
local($1,$2,$3);
|
||||
if (ref $subtest eq 'HASH') { # a list of DNS rcodes (as hash keys)
|
||||
$match = 1 if $subtest->{$rcode};
|
||||
} elsif ($rcode != 0) {
|
||||
# skip remaining tests on DNS error
|
||||
} elsif (!defined($rr_type) ||
|
||||
!grep($_ eq 'ANY' || $_ eq $rr_type, @$answer_types_ref) ) {
|
||||
# skip remaining tests on wrong RR type
|
||||
} elsif (!defined $subtest) {
|
||||
$match = 1; # any valid response of the requested RR type matches
|
||||
} elsif (ref $subtest eq 'Regexp') { # a regular expression
|
||||
$match = 1 if $rr_rdatastr =~ $subtest;
|
||||
} elsif ($rr_rdatastr eq $subtest) { # exact equality
|
||||
$match = 1;
|
||||
} elsif (defined $rdatanum &&
|
||||
$subtest =~ m{^ (\d+) (?: ([/-]) (\d+) )? \z}x) {
|
||||
my($n1,$delim,$n2) = ($1,$2,$3);
|
||||
$match =
|
||||
!defined $n2 ? ($rdatanum & $n1) && # mask only
|
||||
(($rdatanum & 0xff000000) == 0x7f000000) # 127/8
|
||||
: $delim eq '-' ? $rdatanum >= $n1 && $rdatanum <= $n2 # range
|
||||
: $delim eq '/' ? ($rdatanum & $n2) == (int($n1) & $n2) # value/mask
|
||||
: 0; # notice int($n1) to fix perl ~5.14 taint bug (Bug 7725)
|
||||
}
|
||||
if ($match) {
|
||||
$self->askdns_hit($pms, $ent->{query_domain}, $qtype,
|
||||
$rr_rdatastr, $rulename);
|
||||
$rulenames_hit{$rulename} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
# these rules have completed (since they got at least 1 hit)
|
||||
$pms->register_async_rule_finish($_) for keys %rulenames_hit;
|
||||
}
|
||||
|
||||
sub askdns_hit {
|
||||
my($self, $pms, $query_domain, $qtype, $rr_rdatastr, $rulename) = @_;
|
||||
|
||||
$rr_rdatastr = '' if !defined $rr_rdatastr; # e.g. with rules testing rcode
|
||||
dbg('askdns: domain "%s" listed (%s): %s',
|
||||
$query_domain, $rulename, $rr_rdatastr);
|
||||
|
||||
# only the first hit will show in the test log report, even if
|
||||
# an answer section matches more than once - got_hit() handles this
|
||||
$pms->clear_test_state;
|
||||
$pms->test_log(sprintf("%s %s:%s", $query_domain,$qtype,$rr_rdatastr));
|
||||
$pms->got_hit($rulename, 'ASKDNS: ', ruletype => 'askdns'); # score=>$score
|
||||
}
|
||||
|
||||
1;
|
261
upstream/lib/Mail/SpamAssassin/Plugin/AutoLearnThreshold.pm
Normal file
261
upstream/lib/Mail/SpamAssassin/Plugin/AutoLearnThreshold.pm
Normal file
@ -0,0 +1,261 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::AutoLearnThreshold - threshold-based discriminator for Bayes auto-learning
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin implements the threshold-based auto-learning discriminator
|
||||
for SpamAssassin's Bayes subsystem. Auto-learning is a mechanism
|
||||
whereby high-scoring mails (or low-scoring mails, for non-spam) are fed
|
||||
into its learning systems without user intervention, during scanning.
|
||||
|
||||
Note that certain tests are ignored when determining whether a message
|
||||
should be trained upon:
|
||||
|
||||
=over 4
|
||||
|
||||
=item * rules with tflags set to 'learn' (the Bayesian rules)
|
||||
|
||||
=item * rules with tflags set to 'userconf' (user configuration)
|
||||
|
||||
=item * rules with tflags set to 'noautolearn'
|
||||
|
||||
=back
|
||||
|
||||
Also note that auto-learning occurs using scores from either scoreset 0
|
||||
or 1, depending on what scoreset is used during message check. It is
|
||||
likely that the message check and auto-learn scores will be different.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::AutoLearnThreshold;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
=head1 USER OPTIONS
|
||||
|
||||
The following configuration settings are used to control auto-learning:
|
||||
|
||||
=over 4
|
||||
|
||||
=item bayes_auto_learn_threshold_nonspam n.nn (default: 0.1)
|
||||
|
||||
The score threshold below which a mail has to score, to be fed into
|
||||
SpamAssassin's learning systems automatically as a non-spam message.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'bayes_auto_learn_threshold_nonspam',
|
||||
default => 0.1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=item bayes_auto_learn_threshold_spam n.nn (default: 12.0)
|
||||
|
||||
The score threshold above which a mail has to score, to be fed into
|
||||
SpamAssassin's learning systems automatically as a spam message.
|
||||
|
||||
Note: SpamAssassin requires at least 3 points from the header, and 3
|
||||
points from the body to auto-learn as spam. Therefore, the minimum
|
||||
working value for this option is 6.
|
||||
|
||||
If the test option autolearn_force is set, the minimum value will
|
||||
remain at 6 points but there is no requirement that the points come
|
||||
from body and header rules. This option is useful for autolearning
|
||||
with rules that are considered to be extremely safe indicators of
|
||||
the spaminess of a message.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'bayes_auto_learn_threshold_spam',
|
||||
default => 12.0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=item bayes_auto_learn_on_error (0 | 1) (default: 0)
|
||||
|
||||
With C<bayes_auto_learn_on_error> off, autolearning will be performed
|
||||
even if bayes classifier already agrees with the new classification (i.e.
|
||||
yielded BAYES_00 for what we are now trying to teach it as ham, or yielded
|
||||
BAYES_99 for spam). This is a traditional setting, the default was chosen
|
||||
to retain backward compatibility.
|
||||
|
||||
With C<bayes_auto_learn_on_error> turned on, autolearning will be performed
|
||||
only when a bayes classifier had a different opinion from what the autolearner
|
||||
is now trying to teach it (i.e. it made an error in judgement). This strategy
|
||||
may or may not produce better future classifications, but usually works
|
||||
very well, while also preventing unnecessary overlearning and slows down
|
||||
database growth.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'bayes_auto_learn_on_error',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub autolearn_discriminator {
|
||||
my ($self, $params) = @_;
|
||||
|
||||
my $scan = $params->{permsgstatus};
|
||||
my $conf = $scan->{conf};
|
||||
|
||||
# Figure out min/max for autolearning.
|
||||
# Default to specified auto_learn_threshold settings
|
||||
my $min = $conf->{bayes_auto_learn_threshold_nonspam};
|
||||
my $max = $conf->{bayes_auto_learn_threshold_spam};
|
||||
|
||||
# Find out what score we should consider this message to have ...
|
||||
my $score = $scan->get_autolearn_points();
|
||||
my $body_only_points = $scan->get_body_only_points();
|
||||
my $head_only_points = $scan->get_head_only_points();
|
||||
my $learned_points = $scan->get_learned_points();
|
||||
|
||||
# find out if any of the tests added an autolearn_force status
|
||||
my $force_autolearn = $scan->get_autolearn_force_status();
|
||||
my $force_autolearn_names = $scan->get_autolearn_force_names();
|
||||
|
||||
dbg("learn: auto-learn? ham=$min, spam=$max, ".
|
||||
"body-points=".$body_only_points.", ".
|
||||
"head-points=".$head_only_points.", ".
|
||||
"learned-points=".$learned_points);
|
||||
|
||||
my $isspam;
|
||||
if ($score < $min) {
|
||||
$isspam = 0;
|
||||
} elsif ($score >= $max) {
|
||||
$isspam = 1;
|
||||
} else {
|
||||
dbg("learn: auto-learn? no: inside auto-learn thresholds, not considered ham or spam");
|
||||
return;
|
||||
}
|
||||
|
||||
my $learner_said_ham_points = -1.0;
|
||||
my $learner_said_spam_points = 1.0;
|
||||
|
||||
if ($isspam) {
|
||||
my $required_body_points = 3;
|
||||
my $required_head_points = 3;
|
||||
|
||||
#Set a lower threshold of "just has to be spam" if autolearn_force was set on a rule
|
||||
if ($force_autolearn) {
|
||||
$required_body_points = -99;
|
||||
$required_head_points = -99;
|
||||
dbg("learn: auto-learn: autolearn_force flagged for a rule. Removing separate body and head point threshold. Body Only Points: $body_only_points ($required_body_points req'd) / Head Only Points: $head_only_points ($required_head_points req'd)");
|
||||
dbg("learn: auto-learn: autolearn_force flagged because of rule(s): $force_autolearn_names");
|
||||
} else {
|
||||
dbg("learn: auto-learn: autolearn_force not flagged for a rule. Body Only Points: $body_only_points ($required_body_points req'd) / Head Only Points: $head_only_points ($required_head_points req'd)");
|
||||
}
|
||||
|
||||
if ($body_only_points < $required_body_points) {
|
||||
dbg("learn: auto-learn? no: scored as spam but too few body points (".
|
||||
$body_only_points." < ".$required_body_points.")");
|
||||
return;
|
||||
}
|
||||
if ($head_only_points < $required_head_points) {
|
||||
dbg("learn: auto-learn? no: scored as spam but too few head points (".
|
||||
$head_only_points." < ".$required_head_points.")");
|
||||
return;
|
||||
}
|
||||
if ($learned_points < $learner_said_ham_points) {
|
||||
dbg("learn: auto-learn? no: scored as spam but learner indicated ham (".
|
||||
$learned_points." < ".$learner_said_ham_points.")");
|
||||
return;
|
||||
}
|
||||
|
||||
if (!$scan->is_spam()) {
|
||||
dbg("learn: auto-learn? no: scored as ham but autolearn wanted spam");
|
||||
return;
|
||||
}
|
||||
|
||||
} else {
|
||||
if ($learned_points > $learner_said_spam_points) {
|
||||
dbg("learn: auto-learn? no: scored as ham but learner indicated spam (".
|
||||
$learned_points." > ".$learner_said_spam_points.")");
|
||||
return;
|
||||
}
|
||||
|
||||
if ($scan->is_spam()) {
|
||||
dbg("learn: auto-learn? no: scored as spam but autolearn wanted ham");
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
if ($conf->{bayes_auto_learn_on_error}) {
|
||||
# learn-on-error strategy chosen:
|
||||
# only allow learning if the autolearning classifier was unsure or
|
||||
# had a different opinion from what we are trying to make it learn
|
||||
#
|
||||
my $tests = $scan->get_tag('TESTS');
|
||||
if (defined $tests && $tests ne 'none') {
|
||||
my %t = map { ($_,1) } split(/,/, $tests);
|
||||
if ($isspam && $t{'BAYES_99'} || !$isspam && $t{'BAYES_00'}) {
|
||||
dbg("learn: auto-learn? no: learn-on-error, %s, already classified ".
|
||||
"as such", $isspam ? 'spam' : 'ham');
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
dbg("learn: auto-learn? yes, ".($isspam?"spam ($score > $max)":"ham ($score < $min)")." autolearn_force=".($force_autolearn?"yes":"no"));
|
||||
|
||||
#Return an array reference because call_plugins only carry's one return value
|
||||
return [$isspam, $force_autolearn, $force_autolearn_names];
|
||||
}
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
1725
upstream/lib/Mail/SpamAssassin/Plugin/Bayes.pm
Normal file
1725
upstream/lib/Mail/SpamAssassin/Plugin/Bayes.pm
Normal file
File diff suppressed because it is too large
Load Diff
301
upstream/lib/Mail/SpamAssassin/Plugin/BodyEval.pm
Normal file
301
upstream/lib/Mail/SpamAssassin/Plugin/BodyEval.pm
Normal file
@ -0,0 +1,301 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Plugin::BodyEval;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("multipart_alternative_difference");
|
||||
$self->register_eval_rule("multipart_alternative_difference_count");
|
||||
$self->register_eval_rule("check_blank_line_ratio");
|
||||
$self->register_eval_rule("tvd_vertical_words");
|
||||
$self->register_eval_rule("check_stock_info");
|
||||
$self->register_eval_rule("check_body_length");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub multipart_alternative_difference {
|
||||
my ($self, $pms, $fulltext, $min, $max) = @_;
|
||||
|
||||
$self->_multipart_alternative_difference($pms) unless (exists $pms->{madiff});
|
||||
|
||||
if (($min == 0 || $pms->{madiff} > $min) &&
|
||||
($max eq "undef" || $pms->{madiff} <= $max)) {
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub multipart_alternative_difference_count {
|
||||
my ($self, $pms, $fulltext, $ratio, $minhtml) = @_;
|
||||
$self->_multipart_alternative_difference($pms) unless (exists $pms->{madiff});
|
||||
return 0 unless $pms->{madiff_html} > $minhtml;
|
||||
return(($pms->{madiff_text} / $pms->{madiff_html}) > $ratio);
|
||||
}
|
||||
|
||||
sub _multipart_alternative_difference {
|
||||
my ($self, $pms) = @_;
|
||||
$pms->{madiff} = 0;
|
||||
$pms->{madiff_html} = 0;
|
||||
$pms->{madiff_text} = 0;
|
||||
|
||||
my $msg = $pms->{msg};
|
||||
|
||||
# Find all multipart/alternative parts in the message
|
||||
my @ma = $msg->find_parts(qr@^multipart/alternative\b@i);
|
||||
|
||||
# If there are no multipart/alternative sections, skip this test.
|
||||
return if (!@ma);
|
||||
|
||||
# Figure out what the MIME content of the message looks like
|
||||
my @content = $msg->content_summary();
|
||||
|
||||
# Exchange meeting requests come in as m/a text/html text/calendar,
|
||||
# which we want to ignore because of the high FP rate it would cause.
|
||||
#
|
||||
if (@content == 3 && $content[2] eq 'text/calendar' &&
|
||||
$content[1] eq 'text/html' &&
|
||||
$content[0] eq 'multipart/alternative') {
|
||||
return;
|
||||
}
|
||||
|
||||
# Go through each of the multipart parts
|
||||
foreach my $part (@ma) {
|
||||
my %html;
|
||||
my %text;
|
||||
|
||||
# limit our search to text-based parts
|
||||
my @txt = $part->find_parts(qr@^text\b@i);
|
||||
foreach my $text (@txt) {
|
||||
# we only care about the rendered version of the part
|
||||
my ($type, $rnd) = $text->rendered();
|
||||
next unless defined $type;
|
||||
|
||||
# parse the rendered text into tokens. assume they are whitespace
|
||||
# separated, and ignore anything that doesn't have a word-character
|
||||
# in it (0-9a-zA-Z_) since those are probably things like bullet
|
||||
# points, horizontal lines, etc. this assumes that punctuation
|
||||
# in one part will be the same in other parts.
|
||||
#
|
||||
if ($type eq 'text/html') {
|
||||
foreach my $w (grep(/\w/, split(/\s+/, $rnd))) {
|
||||
#dbg("eval: HTML: $w");
|
||||
$html{$w}++;
|
||||
}
|
||||
|
||||
# If there are no words, mark if there's at least 1 image ...
|
||||
if (!%html && exists $pms->{html}{inside}{img}) {
|
||||
# Use "\n" as the mark since it can't ever occur normally
|
||||
$html{"\n"}=1;
|
||||
}
|
||||
}
|
||||
else {
|
||||
foreach my $w (grep(/\w/, split(/\s+/, $rnd))) {
|
||||
#dbg("eval: TEXT: $w");
|
||||
$text{$w}++;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# How many HTML tokens do we have at the start?
|
||||
my $orig = keys %html;
|
||||
next if ($orig == 0);
|
||||
|
||||
$pms->{madiff_html} = $orig;
|
||||
$pms->{madiff_text} = keys %text;
|
||||
dbg('eval: text words: ' . $pms->{madiff_text} . ', html words: ' . $pms->{madiff_html});
|
||||
|
||||
# If the token appears at least as many times in the text part as
|
||||
# in the html part, remove it from the list of html tokens.
|
||||
while(my ($k,$v) = each %text) {
|
||||
delete $html{$k} if (exists $html{$k} && $html{$k}-$text{$k} < 1);
|
||||
}
|
||||
|
||||
#map { dbg("eval: LEFT: $_") } keys %html;
|
||||
|
||||
# In theory, the tokens should be the same in both text and html
|
||||
# parts, so there would be 0 tokens left in the html token list, for
|
||||
# a 0% difference rate. Calculate it here, and record the difference
|
||||
# if it's been the highest so far in this message.
|
||||
my $diff = scalar(keys %html)/$orig*100;
|
||||
$pms->{madiff} = $diff if ($diff > $pms->{madiff});
|
||||
|
||||
dbg("eval: " . sprintf "madiff: left: %d, orig: %d, max-difference: %0.2f%%", scalar(keys %html), $orig, $pms->{madiff});
|
||||
}
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
sub check_blank_line_ratio {
|
||||
my ($self, $pms, $fulltext, $min, $max, $minlines) = @_;
|
||||
|
||||
if (!defined $minlines || $minlines < 1) {
|
||||
$minlines = 1;
|
||||
}
|
||||
|
||||
my $blank_line_ratio_ref = $pms->{blank_line_ratio};
|
||||
|
||||
if (! exists $blank_line_ratio_ref->{$minlines}) {
|
||||
$fulltext = $pms->get_decoded_body_text_array();
|
||||
|
||||
my $blank = 0;
|
||||
my $nlines = 0;
|
||||
foreach my $chunk (@$fulltext) {
|
||||
foreach (split(/^/m, $chunk, -1)) {
|
||||
$nlines++;
|
||||
$blank++ if !/\S/;
|
||||
}
|
||||
}
|
||||
|
||||
# report -1 if it's a blank message ...
|
||||
$blank_line_ratio_ref->{$minlines} =
|
||||
$nlines < $minlines ? -1 : 100 * $blank / $nlines;
|
||||
}
|
||||
|
||||
return (($min == 0 && $blank_line_ratio_ref->{$minlines} <= $max) ||
|
||||
($blank_line_ratio_ref->{$minlines} > $min &&
|
||||
$blank_line_ratio_ref->{$minlines} <= $max));
|
||||
}
|
||||
|
||||
sub tvd_vertical_words {
|
||||
my ($self, $pms, $text, $min, $max) = @_;
|
||||
|
||||
# klugy
|
||||
$max = 101 if ($max >= 100);
|
||||
|
||||
if (!defined $pms->{tvd_vertical_words}) {
|
||||
$pms->{tvd_vertical_words} = -1;
|
||||
|
||||
foreach (@{$text}) {
|
||||
my $l = length $_;
|
||||
next unless ($l > 5);
|
||||
my $spaces = tr/ / /;
|
||||
my $nonspaces = $l - $spaces;
|
||||
my $pct;
|
||||
if ($spaces > $nonspaces || $nonspaces == 0) {
|
||||
$pct = 100;
|
||||
}
|
||||
else {
|
||||
$pct = int(100*$spaces/$nonspaces);
|
||||
}
|
||||
$pms->{tvd_vertical_words} = $pct if ($pct > $pms->{tvd_vertical_words});
|
||||
}
|
||||
}
|
||||
|
||||
dbg("eval: tvd_vertical_words value: $pms->{tvd_vertical_words} / min: $min / max: $max - value must be >= min and < max");
|
||||
return 1 if ($pms->{tvd_vertical_words} >= $min && $pms->{tvd_vertical_words} < $max);
|
||||
}
|
||||
|
||||
sub check_stock_info {
|
||||
my ($self, $pms, $fulltext, $min) = @_;
|
||||
|
||||
$self->_check_stock_info($pms) unless (exists $pms->{stock_info});
|
||||
|
||||
if ($min == 0 || $pms->{stock_info} >= $min) {
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _check_stock_info {
|
||||
my ($self, $pms) = @_;
|
||||
$pms->{stock_info} = 0;
|
||||
|
||||
# Find all multipart/alternative parts in the message
|
||||
my @parts = $pms->{msg}->find_parts(qr@^text/plain$@i);
|
||||
return if (!@parts);
|
||||
|
||||
# Go through each of the multipart parts
|
||||
my %hits;
|
||||
my $part = $parts[0];
|
||||
my ($type, $rnd) = $part->rendered();
|
||||
return unless $type;
|
||||
|
||||
# bug 5644,5717: avoid pathological cases where a regexp takes massive amount
|
||||
# of time by applying the regexp to limited-size text chunks, one at a time
|
||||
|
||||
foreach my $rnd_chunk (
|
||||
Mail::SpamAssassin::Message::split_into_array_of_short_paragraphs($rnd))
|
||||
{
|
||||
foreach ( $rnd_chunk =~ /^\s*([^:\s][^:\n]{2,29})\s*:\s*\S/mg ) {
|
||||
my $str = lc $_;
|
||||
$str =~ tr/a-z//cd;
|
||||
#$str =~ s/([a-z])0([a-z])/$1o$2/g;
|
||||
|
||||
if ($str =~ /(
|
||||
^trad(?:e|ing)date|
|
||||
company(?:name)?|
|
||||
s\w?(?:t\w?o\w?c\w?k|y\w?m(?:\w?b\w?o\w?l)?)|
|
||||
t(?:arget|icker)|
|
||||
(?:opening|current)p(?:rice)?|
|
||||
p(?:rojected|osition)|
|
||||
expectations|
|
||||
weeks?high|
|
||||
marketperformance|
|
||||
(?:year|week|month|day|price)(?:target|estimates?)|
|
||||
sector|
|
||||
r(?:ecommendation|ating)
|
||||
)$/x) {
|
||||
$hits{$1}++;
|
||||
dbg("eval: stock info hit: $1");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
$pms->{stock_info} = scalar keys %hits;
|
||||
dbg("eval: stock info total: ".$pms->{stock_info});
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
sub check_body_length {
|
||||
my ($self, $pms, undef, $min) = @_;
|
||||
|
||||
my $body_length = $pms->{msg}->{pristine_body_length};
|
||||
dbg("eval: body_length - %s - check for min of %s", $body_length, $min);
|
||||
|
||||
return (defined $body_length && $body_length <= $min) ? 1 : 0;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# capability checks for "if can()":
|
||||
#
|
||||
sub has_check_body_length { 1 }
|
||||
|
||||
1;
|
1138
upstream/lib/Mail/SpamAssassin/Plugin/BodyRuleBaseExtractor.pm
Normal file
1138
upstream/lib/Mail/SpamAssassin/Plugin/BodyRuleBaseExtractor.pm
Normal file
File diff suppressed because it is too large
Load Diff
1389
upstream/lib/Mail/SpamAssassin/Plugin/Check.pm
Normal file
1389
upstream/lib/Mail/SpamAssassin/Plugin/Check.pm
Normal file
File diff suppressed because it is too large
Load Diff
1092
upstream/lib/Mail/SpamAssassin/Plugin/DCC.pm
Normal file
1092
upstream/lib/Mail/SpamAssassin/Plugin/DCC.pm
Normal file
File diff suppressed because it is too large
Load Diff
1329
upstream/lib/Mail/SpamAssassin/Plugin/DKIM.pm
Normal file
1329
upstream/lib/Mail/SpamAssassin/Plugin/DKIM.pm
Normal file
File diff suppressed because it is too large
Load Diff
662
upstream/lib/Mail/SpamAssassin/Plugin/DNSEval.pm
Normal file
662
upstream/lib/Mail/SpamAssassin/Plugin/DNSEval.pm
Normal file
@ -0,0 +1,662 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
DNSEVAL - look up URLs against DNS blocklists
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::DNSEval
|
||||
|
||||
rbl_headers EnvelopeFrom,Reply-To,Disposition-Notification-To
|
||||
header RBL_IP eval:check_rbl_headers('rbl', 'rbl.example.com.', '127.0.0.2')
|
||||
describe RBL_IP From address associated with spam domains
|
||||
tflags RBL_IP net
|
||||
reuse RBL_IP
|
||||
|
||||
Supported extra tflags from SpamAssassin 3.4.3:
|
||||
domains_only - only non-IP-address "host" components are queried
|
||||
ips_only - only IP addresses as the "host" component will be queried
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
The DNSEval plugin queries dns to see if a domain or an ip address
|
||||
present on one of email's headers is on a particular rbl.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::DNSEval;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address);
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
# this is done this way so that the same list can be used here and in
|
||||
# check_start()
|
||||
$self->{'evalrules'} = [
|
||||
'check_rbl_accreditor',
|
||||
'check_rbl',
|
||||
'check_rbl_ns_from',
|
||||
'check_rbl_txt',
|
||||
'check_rbl_sub',
|
||||
'check_rbl_results_for',
|
||||
'check_rbl_from_host',
|
||||
'check_rbl_from_domain',
|
||||
'check_rbl_envfrom',
|
||||
'check_rbl_headers',
|
||||
'check_rbl_rcvd',
|
||||
'check_dns_sender',
|
||||
];
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
foreach(@{$self->{'evalrules'}}) {
|
||||
$self->register_eval_rule($_);
|
||||
}
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
=head1 USER PREFERENCES
|
||||
|
||||
The following options can be used in both site-wide (C<local.cf>) and
|
||||
user-specific (C<user_prefs>) configuration files to customize how
|
||||
SpamAssassin handles incoming email messages.
|
||||
|
||||
=over
|
||||
|
||||
=item rbl_headers
|
||||
|
||||
This option tells SpamAssassin in which headers to check for content
|
||||
used to query the specified rbl.
|
||||
If on the headers content there is an email address, an ip address
|
||||
or a domain name, it will be checked on the specified rbl.
|
||||
The configuration option can be overridden by passing an headers list as
|
||||
last parameter to check_rbl_headers.
|
||||
The default headers checked are:
|
||||
|
||||
=back
|
||||
|
||||
=over
|
||||
|
||||
=item *
|
||||
|
||||
EnvelopeFrom
|
||||
|
||||
=item *
|
||||
|
||||
Reply-To
|
||||
|
||||
=item *
|
||||
|
||||
Disposition-Notification-To
|
||||
|
||||
=item *
|
||||
|
||||
X-WebmailclientIP
|
||||
|
||||
=item *
|
||||
|
||||
X-Source-IP
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
push(@cmds, {
|
||||
setting => 'rbl_headers',
|
||||
default => 'EnvelopeFrom,Reply-To,Disposition-Notification-To,X-WebmailclientIP,X-Source-IP',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
}
|
||||
);
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
# this is necessary because PMS::run_rbl_eval_tests() calls these functions
|
||||
# directly as part of PMS
|
||||
sub check_start {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
foreach(@{$self->{'evalrules'}}) {
|
||||
$opts->{'permsgstatus'}->register_plugin_eval_glue($_);
|
||||
}
|
||||
}
|
||||
|
||||
sub ip_list_uniq_and_strip_private {
|
||||
my ($self, @origips) = @_;
|
||||
my @ips;
|
||||
my %seen;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
foreach my $ip (@origips) {
|
||||
next unless $ip;
|
||||
next if (exists ($seen{$ip})); $seen{$ip} = 1;
|
||||
next if ($ip =~ /$IP_PRIVATE/o);
|
||||
push(@ips, $ip);
|
||||
}
|
||||
return @ips;
|
||||
}
|
||||
|
||||
# check an RBL if the message contains an "accreditor assertion,"
|
||||
# that is, the message contains the name of a service that will vouch
|
||||
# for their practices.
|
||||
#
|
||||
sub check_rbl_accreditor {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest, $accreditor) = @_;
|
||||
|
||||
if (!defined $pms->{accreditor_tag}) {
|
||||
$self->message_accreditor_tag($pms);
|
||||
}
|
||||
if ($pms->{accreditor_tag}->{$accreditor}) {
|
||||
$self->check_rbl_backend($pms, $rule, $set, $rbl_server, 'A', $subtest);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Check for an Accreditor Assertion within the message, that is, the name of
|
||||
# a third-party who will vouch for the sender's practices. The accreditor
|
||||
# can be asserted in the EnvelopeFrom like this:
|
||||
#
|
||||
# listowner@a--accreditor.mail.example.com
|
||||
#
|
||||
# or in an 'Accreditor" Header field, like this:
|
||||
#
|
||||
# Accreditor: accreditor1, parm=value; accreditor2, parm-value
|
||||
#
|
||||
# This implementation supports multiple accreditors, but ignores any
|
||||
# parameters in the header field.
|
||||
#
|
||||
sub message_accreditor_tag {
|
||||
my ($self, $pms) = @_;
|
||||
my %acctags;
|
||||
|
||||
if ($pms->get('EnvelopeFrom:addr') =~ /[@.]a--([a-z0-9]{3,})\./i) {
|
||||
(my $tag = $1) =~ tr/A-Z/a-z/;
|
||||
$acctags{$tag} = -1;
|
||||
}
|
||||
my $accreditor_field = $pms->get('Accreditor',undef);
|
||||
if (defined $accreditor_field) {
|
||||
my @accreditors = split(/,/, $accreditor_field);
|
||||
foreach my $accreditor (@accreditors) {
|
||||
my @terms = split(' ', $accreditor);
|
||||
if ($#terms >= 0) {
|
||||
my $tag = $terms[0];
|
||||
$tag =~ tr/A-Z/a-z/;
|
||||
$acctags{$tag} = -1;
|
||||
}
|
||||
}
|
||||
}
|
||||
$pms->{accreditor_tag} = \%acctags;
|
||||
}
|
||||
|
||||
sub check_rbl_backend {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $type, $subtest) = @_;
|
||||
local ($_);
|
||||
|
||||
# First check that DNS is available, if not do not perform this check
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
|
||||
if (($rbl_server !~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) &&
|
||||
(index($rbl_server, '.') >= 0) &&
|
||||
($rbl_server !~ /\.$/)) {
|
||||
$rbl_server .= ".";
|
||||
}
|
||||
|
||||
dbg("dns: checking RBL $rbl_server, set $set");
|
||||
|
||||
# ok, make a list of all the IPs in the untrusted set
|
||||
my @fullips = map { $_->{ip} } @{$pms->{relays_untrusted}};
|
||||
|
||||
# now, make a list of all the IPs in the external set, for use in
|
||||
# notfirsthop testing. This will often be more IPs than found
|
||||
# in @fullips. It includes the IPs that are trusted, but
|
||||
# not in internal_networks.
|
||||
my @fullexternal = map {
|
||||
(!$_->{internal}) ? ($_->{ip}) : ()
|
||||
} @{$pms->{relays_trusted}};
|
||||
push (@fullexternal, @fullips); # add untrusted set too
|
||||
|
||||
# Make sure a header significantly improves results before adding here
|
||||
# X-Sender-Ip: could be worth using (very low occurence for me)
|
||||
# X-Sender: has a very low bang-for-buck for me
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my @originating;
|
||||
for my $header (@{$pms->{conf}->{originating_ip_headers}}) {
|
||||
my $str = $pms->get($header,undef);
|
||||
next unless defined $str && $str ne '';
|
||||
push (@originating, ($str =~ m/($IP_ADDRESS)/g));
|
||||
}
|
||||
|
||||
# Let's go ahead and trim away all private ips (KLC)
|
||||
# also uniq the list and strip dups. (jm)
|
||||
my @ips = $self->ip_list_uniq_and_strip_private(@fullips);
|
||||
|
||||
# if there's no untrusted IPs, it means we trust all the open-internet
|
||||
# relays, so we can return right now.
|
||||
return 0 unless (scalar @ips + scalar @originating > 0);
|
||||
|
||||
dbg("dns: IPs found: full-external: ".join(", ", @fullexternal).
|
||||
" untrusted: ".join(", ", @ips).
|
||||
" originating: ".join(", ", @originating));
|
||||
|
||||
my $trusted = $self->{main}->{conf}->{trusted_networks};
|
||||
|
||||
# If name is foo-notfirsthop, check all addresses except for
|
||||
# the originating one. Suitable for use with dialup lists, like the PDL.
|
||||
# note that if there's only 1 IP in the untrusted set, do NOT pop the
|
||||
# list, since it'd remove that one, and a legit user is supposed to
|
||||
# use their SMTP server (ie. have at least 1 more hop)!
|
||||
# If name is foo-lastexternal, check only the Received header just before
|
||||
# it enters our internal networks; we can trust it and it's the one that
|
||||
# passed mail between networks
|
||||
if ($set =~ /-(notfirsthop|lastexternal)$/)
|
||||
{
|
||||
# use the external IP set, instead of the trusted set; the user may have
|
||||
# specified some third-party relays as trusted. Also, don't use
|
||||
# @originating; those headers are added by a phase of relaying through
|
||||
# a server like Hotmail, which is not going to be in dialup lists anyway.
|
||||
@ips = $self->ip_list_uniq_and_strip_private(@fullexternal);
|
||||
if ($1 eq "lastexternal") {
|
||||
@ips = (defined $ips[0]) ? ($ips[0]) : ();
|
||||
} else {
|
||||
pop @ips if (scalar @ips > 1);
|
||||
}
|
||||
}
|
||||
# If name is foo-firsttrusted, check only the Received header just
|
||||
# after it enters our trusted networks; that's the only one we can
|
||||
# trust the IP address from (since our relay added that header).
|
||||
# And if name is foo-untrusted, check any untrusted IP address.
|
||||
elsif ($set =~ /-(first|un)trusted$/)
|
||||
{
|
||||
my @tips;
|
||||
foreach my $ip (@originating) {
|
||||
if ($ip && !$trusted->contains_ip($ip)) {
|
||||
push(@tips, $ip);
|
||||
}
|
||||
}
|
||||
@ips = $self->ip_list_uniq_and_strip_private (@ips, @tips);
|
||||
if ($1 eq "first") {
|
||||
@ips = (defined $ips[0]) ? ($ips[0]) : ();
|
||||
} else {
|
||||
shift @ips;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
my @tips;
|
||||
foreach my $ip (@originating) {
|
||||
if ($ip && !$trusted->contains_ip($ip)) {
|
||||
push(@tips, $ip);
|
||||
}
|
||||
}
|
||||
|
||||
# add originating IPs as untrusted IPs (if they are untrusted)
|
||||
@ips = reverse $self->ip_list_uniq_and_strip_private (@ips, @tips);
|
||||
}
|
||||
|
||||
# How many IPs max you check in the received lines
|
||||
my $checklast=$self->{main}->{conf}->{num_check_received};
|
||||
|
||||
if (scalar @ips > $checklast) {
|
||||
splice (@ips, $checklast); # remove all others
|
||||
}
|
||||
|
||||
my $tflags = $pms->{conf}->{tflags}->{$rule};
|
||||
|
||||
# Trusted relays should only be checked against nice rules (dnswls)
|
||||
if (defined $tflags && $tflags !~ /\bnice\b/) {
|
||||
# remove trusted hosts from beginning
|
||||
while (@ips && $trusted->contains_ip($ips[0])) { shift @ips }
|
||||
}
|
||||
|
||||
unless (scalar @ips > 0) {
|
||||
dbg("dns: no untrusted IPs to check");
|
||||
return 0;
|
||||
}
|
||||
|
||||
dbg("dns: only inspecting the following IPs: ".join(", ", @ips));
|
||||
|
||||
eval {
|
||||
foreach my $ip (@ips) {
|
||||
my $revip = reverse_ip_address($ip);
|
||||
$pms->do_rbl_lookup($rule, $set, $type,
|
||||
$revip.'.'.$rbl_server, $subtest) if defined $revip;
|
||||
}
|
||||
};
|
||||
|
||||
# note that results are not handled here, hits are handled directly
|
||||
# as DNS responses are harvested
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_rbl {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
$self->check_rbl_backend($pms, $rule, $set, $rbl_server, 'A', $subtest);
|
||||
}
|
||||
|
||||
sub check_rbl_txt {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
$self->check_rbl_backend($pms, $rule, $set, $rbl_server, 'TXT', $subtest);
|
||||
}
|
||||
|
||||
# run for first message
|
||||
sub check_rbl_sub {
|
||||
my ($self, $pms, $rule, $set, $subtest) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
|
||||
$pms->register_rbl_subtest($rule, $set, $subtest);
|
||||
}
|
||||
|
||||
# backward compatibility
|
||||
sub check_rbl_results_for {
|
||||
#warn "dns: check_rbl_results_for() is deprecated, use check_rbl_sub()\n";
|
||||
check_rbl_sub(@_);
|
||||
}
|
||||
|
||||
# this only checks the address host name and not the domain name because
|
||||
# using the domain name had much worse results for dsn.rfc-ignorant.org
|
||||
sub check_rbl_from_host {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
_check_rbl_addresses($self, $pms, $rule, $set, $rbl_server, $subtest, $_[1]->all_from_addrs());
|
||||
}
|
||||
|
||||
sub check_rbl_headers {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest, $test_headers) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
my @env_hdr;
|
||||
my $conf = $self->{main}->{conf};
|
||||
|
||||
if ( defined $test_headers ) {
|
||||
@env_hdr = split(/,/, $test_headers);
|
||||
} else {
|
||||
@env_hdr = split(/,/, $conf->{rbl_headers});
|
||||
}
|
||||
|
||||
foreach my $rbl_headers (@env_hdr) {
|
||||
my $addr = $_[1]->get($rbl_headers.':addr', undef);
|
||||
if ( defined $addr && $addr =~ /\@([^\@\s]+)/ ) {
|
||||
$self->_check_rbl_addresses($pms, $rule, $set, $rbl_server,
|
||||
$subtest, $addr);
|
||||
} else {
|
||||
my $host = $pms->get($rbl_headers);
|
||||
chomp($host);
|
||||
if($host =~ /^$IP_ADDRESS$/ ) {
|
||||
return if ($conf->{tflags}->{$rule}||'') =~ /\bdomains_only\b/;
|
||||
$host = reverse_ip_address($host);
|
||||
} else {
|
||||
return if ($conf->{tflags}->{$rule}||'') =~ /\bips_only\b/;
|
||||
}
|
||||
$pms->do_rbl_lookup($rule, $set, 'A',
|
||||
"$host.$rbl_server", $subtest) if ( defined $host and $host ne "");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
=over 4
|
||||
|
||||
=item check_rbl_from_domain
|
||||
|
||||
This checks all the from addrs domain names as an alternate to check_rbl_from_host. As of v3.4.1, it has been improved to include a subtest for a specific octet.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub check_rbl_from_domain {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
_check_rbl_addresses($self, $pms, $rule, $set, $rbl_server, $subtest, $_[1]->all_from_addrs_domains());
|
||||
}
|
||||
|
||||
=over 4
|
||||
|
||||
=item check_rbl_ns_from
|
||||
|
||||
This checks the dns server of the from addrs domain name.
|
||||
It is possible to include a subtest for a specific octet.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub check_rbl_ns_from {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
my $domain;
|
||||
my @nshost = ();
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
|
||||
for my $from ($pms->get('EnvelopeFrom:addr')) {
|
||||
next unless defined $from;
|
||||
$from =~ tr/././s; # bug 3366
|
||||
if ($from =~ m/ \@ ( [^\@\s]+ \. [^\@\s]+ )/x ) {
|
||||
$domain = lc($1);
|
||||
last;
|
||||
}
|
||||
}
|
||||
return 0 unless defined $domain;
|
||||
|
||||
dbg("dns: checking NS for host $domain");
|
||||
|
||||
my $key = "NS:" . $domain;
|
||||
my $obj = { dom => $domain, rule => $rule, set => $set, rbl_server => $rbl_server, subtest => $subtest };
|
||||
my $ent = {
|
||||
key => $key, zone => $domain, obj => $obj, type => "URI-NS",
|
||||
};
|
||||
# dig $dom ns
|
||||
$ent = $pms->{async}->bgsend_and_start_lookup(
|
||||
$domain, 'NS', undef, $ent,
|
||||
sub { my ($ent2,$pkt) = @_;
|
||||
$self->complete_ns_lookup($pms, $ent2, $pkt, $domain) },
|
||||
master_deadline => $pms->{master_deadline} );
|
||||
return $ent;
|
||||
}
|
||||
|
||||
sub complete_ns_lookup {
|
||||
my ($self, $pms, $ent, $pkt, $host) = @_;
|
||||
|
||||
my $rule = $ent->{obj}->{rule};
|
||||
my $set = $ent->{obj}->{set};
|
||||
my $rbl_server = $ent->{obj}->{rbl_server};
|
||||
my $subtest = $ent->{obj}->{subtest};
|
||||
|
||||
if (!$pkt) {
|
||||
# $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
dbg("DNSEval: complete_ns_lookup aborted %s", $ent->{key});
|
||||
return;
|
||||
}
|
||||
|
||||
dbg("DNSEval: complete_ns_lookup %s", $ent->{key});
|
||||
my @ns = $pkt->authority;
|
||||
|
||||
foreach my $rr (@ns) {
|
||||
my $nshost = $rr->mname;
|
||||
if(defined($nshost)) {
|
||||
chomp($nshost);
|
||||
if ( defined $subtest ) {
|
||||
dbg("dns: checking [$nshost] / $rule / $set / $rbl_server / $subtest");
|
||||
} else {
|
||||
dbg("dns: checking [$nshost] / $rule / $set / $rbl_server");
|
||||
}
|
||||
$pms->do_rbl_lookup($rule, $set, 'A',
|
||||
"$nshost.$rbl_server", $subtest) if ( defined $nshost and $nshost ne "");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
=over 4
|
||||
|
||||
=item check_rbl_rcvd
|
||||
|
||||
This checks all received headers domains or ip addresses against a specific rbl.
|
||||
It is possible to include a subtest for a specific octet.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub check_rbl_rcvd {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
my %seen;
|
||||
my @udnsrcvd = ();
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
my $rcvd = $pms->{relays_untrusted}->[$pms->{num_relays_untrusted} - 1];
|
||||
my @dnsrcvd = ( $rcvd->{ip}, $rcvd->{by}, $rcvd->{helo}, $rcvd->{rdns} );
|
||||
# unique values
|
||||
foreach my $value (@dnsrcvd) {
|
||||
if ( ( defined $value ) && (! $seen{$value}++ ) ) {
|
||||
push @udnsrcvd, $value;
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $host ( @udnsrcvd ) {
|
||||
if((defined $host) and ($host ne "")) {
|
||||
chomp($host);
|
||||
if($host =~ /^$IP_ADDRESS$/ ) {
|
||||
next if ($pms->{conf}->{tflags}->{$rule}||'') =~ /\bdomains_only\b/;
|
||||
$host = reverse_ip_address($host);
|
||||
} else {
|
||||
next if ($pms->{conf}->{tflags}->{$rule}||'') =~ /\bips_only\b/;
|
||||
$host =~ s/\.$//;
|
||||
}
|
||||
if ( defined $subtest ) {
|
||||
dbg("dns: checking [$host] / $rule / $set / $rbl_server / $subtest");
|
||||
} else {
|
||||
dbg("dns: checking [$host] / $rule / $set / $rbl_server");
|
||||
}
|
||||
$pms->do_rbl_lookup($rule, $set, 'A',
|
||||
"$host.$rbl_server", $subtest) if ( defined $host and $host ne "");
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
# this only checks the address host name and not the domain name because
|
||||
# using the domain name had much worse results for dsn.rfc-ignorant.org
|
||||
sub check_rbl_envfrom {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
_check_rbl_addresses($self, $pms, $rule, $set, $rbl_server, $subtest, $_[1]->get('EnvelopeFrom:addr',undef));
|
||||
}
|
||||
|
||||
sub _check_rbl_addresses {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest, @addresses) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
|
||||
my %hosts;
|
||||
for (@addresses) {
|
||||
next if !defined($_) || !/ \@ ( [^\@\s]+ )/x;
|
||||
my $address = $1;
|
||||
# strip leading & trailing dots (as seen in some e-mail addresses)
|
||||
$address =~ s/^\.+//; $address =~ s/\.+\z//;
|
||||
# squash duplicate dots to avoid an invalid DNS query with a null label
|
||||
$address =~ tr/.//s;
|
||||
$hosts{lc($address)} = 1 if $address =~ /\./; # must by a FQDN
|
||||
}
|
||||
return unless scalar keys %hosts;
|
||||
|
||||
if (($rbl_server !~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) &&
|
||||
(index($rbl_server, '.') >= 0) &&
|
||||
($rbl_server !~ /\.$/)) {
|
||||
$rbl_server .= ".";
|
||||
}
|
||||
|
||||
for my $host (keys %hosts) {
|
||||
if ($host =~ /^$IP_ADDRESS$/) {
|
||||
next if ($pms->{conf}->{tflags}->{$rule}||'') =~ /\bdomains_only\b/;
|
||||
$host = reverse_ip_address($host);
|
||||
} else {
|
||||
next if ($pms->{conf}->{tflags}->{$rule}||'') =~ /\bips_only\b/;
|
||||
}
|
||||
dbg("dns: checking [$host] / $rule / $set / $rbl_server");
|
||||
$pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
}
|
||||
}
|
||||
|
||||
sub check_dns_sender {
|
||||
my ($self, $pms, $rule) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
|
||||
my $host;
|
||||
for my $from ($pms->get('EnvelopeFrom:addr',undef)) {
|
||||
next unless defined $from;
|
||||
|
||||
$from =~ tr/././s; # bug 3366
|
||||
if ($from =~ m/ \@ ( [^\@\s]+ \. [^\@\s]+ )/x ) {
|
||||
$host = lc($1);
|
||||
last;
|
||||
}
|
||||
}
|
||||
return 0 unless defined $host;
|
||||
|
||||
if ($host eq 'compiling.spamassassin.taint.org') {
|
||||
# only used when compiling
|
||||
return 0;
|
||||
}
|
||||
|
||||
dbg("dns: checking A and MX for host $host");
|
||||
|
||||
$pms->do_dns_lookup($rule, 'A', $host);
|
||||
$pms->do_dns_lookup($rule, 'MX', $host);
|
||||
|
||||
# cache name of host for later checking
|
||||
$pms->{sender_host} = $host;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
# capability checks for "if can(Mail::SpamAssassin::Plugin::DNSEval::XXX)":
|
||||
#
|
||||
sub has_tflags_domains_only { 1 }
|
||||
sub has_tflags_ips_only { 1 }
|
||||
|
||||
1;
|
650
upstream/lib/Mail/SpamAssassin/Plugin/FreeMail.pm
Normal file
650
upstream/lib/Mail/SpamAssassin/Plugin/FreeMail.pm
Normal file
@ -0,0 +1,650 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Plugin::FreeMail;
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
my $VERSION = 2.003;
|
||||
|
||||
=head1 NAME
|
||||
|
||||
FreeMail - check message headers/body for freemail-domains
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
If for example From-address is freemail, and Reply-To or address found in mail body is
|
||||
different freemail address, return success. Good sign of Nigerian scams
|
||||
etc. Test idea from Marc Perkel.
|
||||
|
||||
Also separate functions to check various portions of message for freemails.
|
||||
|
||||
=head1 CONFIGURATION
|
||||
|
||||
freemail_domains domain ...
|
||||
|
||||
List of domains to be used in checks.
|
||||
|
||||
Regexp is not supported, but following wildcards work:
|
||||
|
||||
? for single character (does not match a dot)
|
||||
* for multiple characters (does not match a dot)
|
||||
|
||||
For example:
|
||||
freemail_domains hotmail.com hotmail.co.?? yahoo.* yahoo.*.*
|
||||
|
||||
freemail_whitelist email/domain ...
|
||||
|
||||
Emails or domains listed here are ignored (pretend they aren't
|
||||
freemail). No wildcards!
|
||||
|
||||
freemail_import_whitelist_auth 1/0
|
||||
|
||||
Entries in whitelist_auth will also be used to whitelist emails
|
||||
or domains from being freemail. Default is 0.
|
||||
|
||||
freemail_import_def_whitelist_auth 1/0
|
||||
|
||||
Entries in def_whitelist_auth will also be used to whitelist emails
|
||||
or domains from being freemail. Default is 0.
|
||||
|
||||
header FREEMAIL_REPLYTO eval:check_freemail_replyto(['option'])
|
||||
|
||||
Checks/compares freemail addresses found from headers and body.
|
||||
|
||||
Possible options:
|
||||
|
||||
replyto From: or body address is different than Reply-To
|
||||
(this is the default)
|
||||
reply as above, but if no Reply-To header is found,
|
||||
compares From: and body
|
||||
|
||||
header FREEMAIL_FROM eval:check_freemail_from(['regex'])
|
||||
|
||||
Checks all possible "from" headers to see if sender is freemail.
|
||||
Uses SA all_from_addrs() function (includes 'Resent-From', 'From',
|
||||
'EnvelopeFrom' etc).
|
||||
|
||||
Add optional regex to match the found email address(es). For example,
|
||||
to see if user ends in digit: check_freemail_from('\d@')
|
||||
|
||||
If you use multiple check_freemail_from rules with regexes, remember
|
||||
that they might hit different emails from different heades. To match
|
||||
a certain header only, use check_freemail_header.
|
||||
|
||||
header FREEMAIL_HDRX eval:check_freemail_header('header' [, 'regex'])
|
||||
|
||||
Searches defined header for freemail address. Optional regex to match
|
||||
the found address (like in check_freemail_from).
|
||||
|
||||
header FREEMAIL_BODY eval:check_freemail_body(['regex'])
|
||||
|
||||
Searches body for freemail address. With optional regex to match.
|
||||
|
||||
=head1 CHANGELOG
|
||||
|
||||
1.996 - fix freemail_skip_bulk_envfrom
|
||||
1.997 - set freemail_skip_when_over_max to 1 by default
|
||||
1.998 - don't warn about missing freemail_domains when linting
|
||||
1.999 - default whitelist undisclosed-recipient@yahoo.com etc
|
||||
2.000 - some cleaning up
|
||||
2.001 - fix freemail_whitelist
|
||||
2.002 - _add_desc -> _got_hit, fix description email append bug
|
||||
2.003 - freemail_import_(def_)whitelist_auth
|
||||
|
||||
=cut
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# default email whitelist
|
||||
our $email_whitelist = qr/
|
||||
^(?:
|
||||
abuse|support|sales|info|helpdesk|contact|kontakt
|
||||
| (?:post|host|domain)master
|
||||
| undisclosed.* # yahoo.com etc(?)
|
||||
| request-[a-f0-9]{16} # live.com
|
||||
| bounced?- # yahoo.com etc
|
||||
| [a-f0-9]{8}(?:\.[a-f0-9]{8}|-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}) # gmail msgids?
|
||||
| .+=.+=.+ # gmail forward
|
||||
)\@
|
||||
/xi;
|
||||
|
||||
# skip replyto check when envelope sender is
|
||||
# allow <> for now
|
||||
# no re "strict"; # since perl 5.21.8: Ranges of ASCII printables...
|
||||
our $skip_replyto_envfrom = qr/
|
||||
(?:
|
||||
^(?:post|host|domain)master
|
||||
| ^double-bounce
|
||||
| ^(?:sentto|owner|return|(?:gr)?bounced?)-.+
|
||||
| -(?:request|bounces?|admin|owner)
|
||||
| \b(?:do[._-t]?)?no[._-t]?repl(?:y|ies)
|
||||
| .+=.+
|
||||
)\@
|
||||
/xi;
|
||||
|
||||
sub dbg { Mail::SpamAssassin::Plugin::dbg ("FreeMail: @_"); }
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->{freemail_available} = 1;
|
||||
$self->set_config($mailsa->{conf});
|
||||
$self->register_eval_rule("check_freemail_replyto");
|
||||
$self->register_eval_rule("check_freemail_from");
|
||||
$self->register_eval_rule("check_freemail_header");
|
||||
$self->register_eval_rule("check_freemail_body");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub _init_email_regex {
|
||||
my ($self) = @_;
|
||||
|
||||
dbg("initializing email regex");
|
||||
|
||||
# Some regexp tips courtesy of http://www.regular-expressions.info/email.html
|
||||
# full email regex v0.02
|
||||
$self->{email_regex} = qr/
|
||||
(?=.{0,64}\@) # limit userpart to 64 chars (and speed up searching?)
|
||||
(?<![a-z0-9!#\$%&'*+\/=?^_`{|}~-]) # start boundary
|
||||
( # capture email
|
||||
[a-z0-9!#\$%&'*+\/=?^_`{|}~-]+ # no dot in beginning
|
||||
(?:\.[a-z0-9!#\$%&'*+\/=?^_`{|}~-]+)* # no consecutive dots, no ending dot
|
||||
\@
|
||||
(?:[a-z0-9](?:[a-z0-9-]{0,59}[a-z0-9])?\.){1,4} # max 4x61 char parts (should be enough?)
|
||||
$self->{main}->{registryboundaries}->{valid_tlds_re} # ends with valid tld
|
||||
)
|
||||
/xi;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
push(@cmds, {
|
||||
setting => 'freemail_max_body_emails',
|
||||
default => 5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_max_body_freemails',
|
||||
default => 3,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_skip_when_over_max',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_skip_bulk_envfrom',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_add_describe_email',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_import_whitelist_auth',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_import_def_whitelist_auth',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub parse_config {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
if ($opts->{key} eq "freemail_domains") {
|
||||
foreach my $temp (split(/\s+/, $opts->{value})) {
|
||||
if ($temp =~ /^[a-z0-9.*?-]+$/i) {
|
||||
my $value = lc($temp);
|
||||
if ($value =~ /[*?]/) { # separate wildcard list
|
||||
$self->{freemail_temp_wc}{$value} = 1;
|
||||
}
|
||||
else {
|
||||
$self->{freemail_domains}{$value} = 1;
|
||||
}
|
||||
}
|
||||
else {
|
||||
warn("invalid freemail_domains: $temp");
|
||||
}
|
||||
}
|
||||
$self->inhibit_further_callbacks();
|
||||
return 1;
|
||||
}
|
||||
|
||||
if ($opts->{key} eq "freemail_whitelist") {
|
||||
foreach my $temp (split(/\s+/, $opts->{value})) {
|
||||
my $value = lc($temp);
|
||||
if ($value =~ /\w[.@]\w/) {
|
||||
$self->{freemail_whitelist}{$value} = 1;
|
||||
}
|
||||
else {
|
||||
warn("invalid freemail_whitelist: $temp");
|
||||
}
|
||||
}
|
||||
$self->inhibit_further_callbacks();
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub finish_parsing_end {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $wcount = 0;
|
||||
if (defined $self->{freemail_temp_wc}) {
|
||||
my @domains;
|
||||
foreach my $value (keys %{$self->{freemail_temp_wc}}) {
|
||||
$value =~ s/\./\\./g;
|
||||
$value =~ s/\?/./g;
|
||||
$value =~ s/\*/[^.]*/g;
|
||||
push(@domains, $value);
|
||||
}
|
||||
my $doms = join('|', @domains);
|
||||
$self->{freemail_domains_re} = qr/\@(?:${doms})$/;
|
||||
$wcount = scalar @domains;
|
||||
undef $self->{freemail_temp_wc};
|
||||
delete $self->{freemail_temp_wc};
|
||||
}
|
||||
|
||||
my $count = scalar keys %{$self->{freemail_domains}};
|
||||
if ($count + $wcount) {
|
||||
dbg("loaded freemail_domains entries: $count normal, $wcount wildcard");
|
||||
}
|
||||
else {
|
||||
if ($self->{main}->{lint_rules} ||1) {
|
||||
dbg("no freemail_domains entries defined, disabling plugin");
|
||||
}
|
||||
else {
|
||||
warn("no freemail_domains entries defined, disabling plugin");
|
||||
}
|
||||
$self->{freemail_available} = 0;
|
||||
}
|
||||
|
||||
# valid_tlds_re will be available at finish_parsing_end, compile it now,
|
||||
# we only need to do it once and before possible forking
|
||||
if ($self->{freemail_available} && !$self->{email_regex}) {
|
||||
$self->_init_email_regex();
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _is_freemail {
|
||||
my ($self, $email, $pms) = @_;
|
||||
|
||||
return 0 if $email eq '';
|
||||
|
||||
if (defined $self->{freemail_whitelist}{$email}) {
|
||||
dbg("whitelisted email: $email");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $domain = $email;
|
||||
$domain =~ s/.*\@//;
|
||||
|
||||
if (defined $self->{freemail_whitelist}{$domain}) {
|
||||
dbg("whitelisted domain: $domain");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($email =~ $email_whitelist) {
|
||||
dbg("whitelisted email, default: $email");
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach my $list ('whitelist_auth','def_whitelist_auth') {
|
||||
if ($pms->{conf}->{"freemail_import_$list"}) {
|
||||
foreach my $regexp (values %{$pms->{conf}->{$list}}) {
|
||||
if ($email =~ /$regexp/o) {
|
||||
dbg("whitelisted email, $list: $email");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (defined $self->{freemail_domains}{$domain}
|
||||
or ( defined $self->{freemail_domains_re}
|
||||
and $email =~ $self->{freemail_domains_re} )) {
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _parse_body {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Parse body
|
||||
if (not defined $pms->{freemail_cache}{body}) {
|
||||
%{$pms->{freemail_cache}{body}} = ();
|
||||
my %seen;
|
||||
my @body_emails;
|
||||
# get all <a href="mailto:", since they don't show up on stripped_body
|
||||
my $parsed = $pms->get_uri_detail_list();
|
||||
while (my($uri, $info) = each %{$parsed}) {
|
||||
if (defined $info->{types}->{a} and not defined $info->{types}->{parsed}) {
|
||||
if ($uri =~ /^(?:(?i)mailto):$self->{email_regex}/o) {
|
||||
my $email = lc($1);
|
||||
push(@body_emails, $email) unless defined $seen{$email};
|
||||
$seen{$email} = 1;
|
||||
last if scalar @body_emails >= 20; # sanity
|
||||
}
|
||||
}
|
||||
}
|
||||
# scan stripped normalized body
|
||||
# have to do this way since get_uri_detail_list doesn't know what mails are inside <>
|
||||
my $body = $pms->get_decoded_stripped_body_text_array();
|
||||
BODY: foreach (@$body) {
|
||||
# strip urls with possible emails inside
|
||||
s{<?https?://\S{0,255}(?:\@|%40)\S{0,255}}{ }gi;
|
||||
# strip emails contained in <>, not mailto:
|
||||
# also strip ones followed by quote-like "wrote:" (but not fax: and tel: etc)
|
||||
s{<?(?<!mailto:)$self->{email_regex}(?:>|\s{1,10}(?!(?:fa(?:x|csi)|tel|phone|e?-?mail))[a-z]{2,11}:)}{ }gi;
|
||||
while (/$self->{email_regex}/g) {
|
||||
my $email = lc($1);
|
||||
utf8::encode($email) if utf8::is_utf8($email); # chars to UTF-8
|
||||
push(@body_emails, $email) unless $seen{$email};
|
||||
$seen{$email} = 1;
|
||||
last BODY if @body_emails >= 40; # sanity
|
||||
}
|
||||
}
|
||||
my $count_all = 0;
|
||||
my $count_fm = 0;
|
||||
foreach my $email (@body_emails) { # as UTF-8 octets
|
||||
if (++$count_all == $pms->{main}->{conf}->{freemail_max_body_emails}) {
|
||||
if ($pms->{main}->{conf}->{freemail_skip_when_over_max}) {
|
||||
$pms->{freemail_skip_body} = 1;
|
||||
dbg("too many unique emails found from body");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
next unless $self->_is_freemail($email, $pms);
|
||||
if (++$count_fm == $pms->{main}->{conf}->{freemail_max_body_freemails}) {
|
||||
if ($pms->{main}->{conf}->{freemail_skip_when_over_max}) {
|
||||
$pms->{freemail_skip_body} = 1;
|
||||
dbg("too many unique freemails found from body");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
$pms->{freemail_cache}{body}{$email} = 1;
|
||||
}
|
||||
dbg("all body freemails: ".join(', ', keys %{$pms->{freemail_cache}{body}}))
|
||||
if scalar keys %{$pms->{freemail_cache}{body}};
|
||||
}
|
||||
|
||||
if (defined $pms->{freemail_skip_body}) {
|
||||
dbg("[cached] body email limit exceeded, skipping");
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub _got_hit {
|
||||
my ($self, $pms, $email, $desc) = @_;
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
|
||||
if (defined $pms->{conf}->{descriptions}->{$rulename}) {
|
||||
$desc = $pms->{conf}->{descriptions}->{$rulename};
|
||||
}
|
||||
|
||||
if ($pms->{main}->{conf}->{freemail_add_describe_email}) {
|
||||
$email =~ s/\@/[at]/g;
|
||||
$pms->test_log($email);
|
||||
}
|
||||
|
||||
$pms->got_hit($rulename, "", description => $desc, ruletype => 'eval');
|
||||
}
|
||||
|
||||
sub check_freemail_header {
|
||||
my ($self, $pms, $header, $regex) = @_;
|
||||
|
||||
return 0 unless $self->{freemail_available};
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
dbg("RULE ($rulename) check_freemail_header".(defined $regex ? " regex:$regex" : ""));
|
||||
|
||||
unless (defined $header) {
|
||||
warn("check_freemail_header needs argument");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $re;
|
||||
if (defined $regex) {
|
||||
my ($rec, $err) = compile_regexp($regex, 0);
|
||||
if (!$rec) {
|
||||
warn "freemail: invalid regexp for $rulename '$regex': $err\n";
|
||||
return 0;
|
||||
}
|
||||
$re = $rec;
|
||||
}
|
||||
|
||||
my @emails = map (lc, $pms->{main}->find_all_addrs_in_line ($pms->get($header)));
|
||||
|
||||
if (!scalar (@emails)) {
|
||||
dbg("header $header not found from mail");
|
||||
return 0;
|
||||
}
|
||||
dbg("addresses from header $header: ".join(';',@emails));
|
||||
|
||||
foreach my $email (@emails) {
|
||||
if ($self->_is_freemail($email, $pms)) {
|
||||
if (defined $re) {
|
||||
next unless $email =~ /$re/o;
|
||||
dbg("HIT! $email is freemail and matches regex");
|
||||
}
|
||||
else {
|
||||
dbg("HIT! $email is freemail");
|
||||
}
|
||||
$self->_got_hit($pms, $email, "Header $header is freemail");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_freemail_body {
|
||||
my ($self, $pms, $regex) = @_;
|
||||
|
||||
return 0 unless $self->{freemail_available};
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
dbg("RULE ($rulename) check_freemail_body".(defined $regex ? " regex:$regex" : ""));
|
||||
|
||||
return 0 unless $self->_parse_body($pms);
|
||||
|
||||
my $re;
|
||||
if (defined $regex) {
|
||||
my ($rec, $err) = compile_regexp($regex, 0);
|
||||
if (!$rec) {
|
||||
warn "freemail: invalid regexp for $rulename '$regex': $err\n";
|
||||
return 0;
|
||||
}
|
||||
$re = $rec;
|
||||
}
|
||||
|
||||
if (defined $re) {
|
||||
foreach my $email (keys %{$pms->{freemail_cache}{body}}) {
|
||||
if ($email =~ /$re/o) {
|
||||
dbg("HIT! email from body is freemail and matches regex: $email");
|
||||
$self->_got_hit($pms, $email, "Email from body is freemail");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
}
|
||||
elsif (scalar keys %{$pms->{freemail_cache}{body}}) {
|
||||
my $emails = join(', ', keys %{$pms->{freemail_cache}{body}});
|
||||
dbg("HIT! body has freemails: $emails");
|
||||
$self->_got_hit($pms, $emails, "Body contains freemails");
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_freemail_from {
|
||||
my ($self, $pms, $regex) = @_;
|
||||
|
||||
return 0 unless $self->{freemail_available};
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
dbg("RULE ($rulename) check_freemail_from".(defined $regex ? " regex:$regex" : ""));
|
||||
|
||||
my $re;
|
||||
if (defined $regex) {
|
||||
my ($rec, $err) = compile_regexp($regex, 0);
|
||||
if (!$rec) {
|
||||
warn "freemail: invalid regexp for $rulename '$regex': $err\n";
|
||||
return 0;
|
||||
}
|
||||
$re = $rec;
|
||||
}
|
||||
|
||||
my %from_addrs = map { lc($_) => 1 } ($pms->all_from_addrs());
|
||||
delete $from_addrs{''}; # no empty ones thx
|
||||
|
||||
unless (scalar keys %from_addrs) {
|
||||
dbg("no from-addresses found to check");
|
||||
return 0;
|
||||
}
|
||||
|
||||
dbg("all from-addresses: ".join(', ', keys %from_addrs));
|
||||
|
||||
foreach my $email (keys %from_addrs) {
|
||||
next unless $self->_is_freemail($email, $pms);
|
||||
if (defined $re) {
|
||||
next unless $email =~ /$re/o;
|
||||
dbg("HIT! $email is freemail and matches regex");
|
||||
}
|
||||
else {
|
||||
dbg("HIT! $email is freemail");
|
||||
}
|
||||
$self->_got_hit($pms, $email, "Sender address is freemail");
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_freemail_replyto {
|
||||
my ($self, $pms, $what) = @_;
|
||||
|
||||
return 0 unless $self->{freemail_available};
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
dbg("RULE ($rulename) check_freemail_replyto");
|
||||
|
||||
if (defined $what) {
|
||||
if ($what ne 'replyto' and $what ne 'reply') {
|
||||
warn("invalid check_freemail_replyto option: $what");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
else {
|
||||
$what = 'replyto';
|
||||
}
|
||||
|
||||
# Skip mailing-list etc looking requests, mostly FPs from them
|
||||
if ($pms->{main}->{conf}->{freemail_skip_bulk_envfrom}) {
|
||||
my $envfrom = lc($pms->get("EnvelopeFrom"));
|
||||
if ($envfrom =~ $skip_replyto_envfrom) {
|
||||
dbg("envelope sender looks bulk, skipping check: $envfrom");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
my $from = lc($pms->get("From:addr"));
|
||||
my $replyto = lc($pms->get("Reply-To:addr"));
|
||||
my $from_is_fm = $self->_is_freemail($from, $pms);
|
||||
my $replyto_is_fm = $self->_is_freemail($replyto, $pms);
|
||||
|
||||
dbg("From address: $from") if $from ne '';
|
||||
dbg("Reply-To address: $replyto") if $replyto ne '';
|
||||
|
||||
if ($from_is_fm and $replyto_is_fm and ($from ne $replyto)) {
|
||||
dbg("HIT! From and Reply-To are different freemails");
|
||||
$self->_got_hit($pms, "$from, $replyto", "From and Reply-To are different freemails");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($what eq 'replyto') {
|
||||
if (!$replyto_is_fm) {
|
||||
dbg("Reply-To is not freemail, skipping check");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
elsif ($what eq 'reply') {
|
||||
if ($replyto ne '' and !$replyto_is_fm) {
|
||||
dbg("Reply-To defined and is not freemail, skipping check");
|
||||
return 0;
|
||||
}
|
||||
elsif (!$from_is_fm) {
|
||||
dbg("No Reply-To and From is not freemail, skipping check");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
my $reply = $replyto_is_fm ? $replyto : $from;
|
||||
|
||||
return 0 unless $self->_parse_body($pms);
|
||||
|
||||
# Compare body to headers
|
||||
if (scalar keys %{$pms->{freemail_cache}{body}}) {
|
||||
my $check = $what eq 'replyto' ? $replyto : $reply;
|
||||
dbg("comparing $check to body freemails");
|
||||
foreach my $email (keys %{$pms->{freemail_cache}{body}}) {
|
||||
if ($email ne $check) {
|
||||
dbg("HIT! $check and $email are different freemails");
|
||||
$self->_got_hit($pms, "$check, $email", "Different freemails in reply header and body");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
1;
|
437
upstream/lib/Mail/SpamAssassin/Plugin/FromNameSpoof.pm
Normal file
437
upstream/lib/Mail/SpamAssassin/Plugin/FromNameSpoof.pm
Normal file
@ -0,0 +1,437 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
FromNameSpoof - perform various tests to detect spoof attempts using the From header name section
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::FromNameSpoof
|
||||
|
||||
# Does the From:name look like it contains an email address
|
||||
header __PLUGIN_FROMNAME_EMAIL eval:check_fromname_contains_email()
|
||||
|
||||
# Is the From:name different to the From:addr header
|
||||
header __PLUGIN_FROMNAME_DIFFERENT eval:check_fromname_different()
|
||||
|
||||
# From:name and From:addr owners differ
|
||||
header __PLUGIN_FROMNAME_OWNERS_DIFFER eval:check_fromname_owners_differ()
|
||||
|
||||
# From:name domain differs to from header
|
||||
header __PLUGIN_FROMNAME_DOMAIN_DIFFER eval:check_fromname_domain_differ()
|
||||
|
||||
# From:name and From:address don't match and owners differ
|
||||
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
|
||||
|
||||
# From:name address matches To:address
|
||||
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Perform various tests against From:name header to detect spoofing. Steps in place to
|
||||
ensure minimal FPs.
|
||||
|
||||
=head1 CONFIGURATION
|
||||
|
||||
The plugin allows you to skip emails that have been DKIM signed by specific senders:
|
||||
|
||||
fns_ignore_dkim googlegroups.com
|
||||
|
||||
FromNameSpoof allows for a configurable closeness when matching the From:addr and From:name,
|
||||
the closeness can be adjusted with:
|
||||
|
||||
fns_extrachars 50
|
||||
|
||||
B<Note> that FromNameSpoof detects the "owner" of a domain by the following search:
|
||||
|
||||
<owner>.<tld>
|
||||
|
||||
By default FromNameSpoof will ignore the TLD when testing if From:addr is spoofed.
|
||||
Default 1
|
||||
|
||||
fns_check 1
|
||||
|
||||
Check levels:
|
||||
|
||||
0 - Strict checking of From:name != From:addr
|
||||
1 - Allow for different tlds
|
||||
2 - Allow for different aliases but same domain
|
||||
|
||||
=head1 TAGS
|
||||
|
||||
The following tags are added to the set if a spoof is detected. They are available for
|
||||
use in reports, header fields, other plugins, etc.:
|
||||
|
||||
_FNSFNAMEADDR_
|
||||
Detected spoof address from From:name header
|
||||
|
||||
_FNSFNAMEDOMAIN_
|
||||
Detected spoof domain from From:name header
|
||||
|
||||
_FNSFNAMEOWNER_
|
||||
Detected spoof owner from From:name header
|
||||
|
||||
_FNSFADDRADDR_
|
||||
Actual From:addr address
|
||||
|
||||
_FNSFADDRDOMAIN_
|
||||
Actual From:addr domain
|
||||
|
||||
_FNSFADDROWNER_
|
||||
Actual From:addr detected owner
|
||||
|
||||
=head1 EXAMPLE
|
||||
|
||||
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
|
||||
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
|
||||
|
||||
meta FROMNAME_SPOOF_EQUALS_TO (__PLUGIN_FROMNAME_SPOOF && __PLUGIN_FROMNAME_EQUALS_TO)
|
||||
describe FROMNAME_SPOOF_EQUALS_TO From:name is spoof to look like To: address
|
||||
score FROMNAME_SPOOF_EQUALS_TO 1.2
|
||||
|
||||
=cut
|
||||
|
||||
use strict;
|
||||
|
||||
package Mail::SpamAssassin::Plugin::FromNameSpoof;
|
||||
my $VERSION = 0.9;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use List::Util ();
|
||||
use Mail::SpamAssassin::Util;
|
||||
|
||||
use vars qw(@ISA);
|
||||
@ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub dbg { Mail::SpamAssassin::Plugin::dbg ("FromNameSpoof: @_"); }
|
||||
|
||||
sub uri_to_domain {
|
||||
my ($self, $domain) = @_;
|
||||
|
||||
return unless defined $domain;
|
||||
|
||||
if ($Mail::SpamAssassin::VERSION <= 3.004000) {
|
||||
Mail::SpamAssassin::Util::uri_to_domain($domain);
|
||||
} else {
|
||||
$self->{main}->{registryboundaries}->uri_to_domain($domain);
|
||||
}
|
||||
}
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new
|
||||
{
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_fromname_spoof");
|
||||
$self->register_eval_rule("check_fromname_different");
|
||||
$self->register_eval_rule("check_fromname_domain_differ");
|
||||
$self->register_eval_rule("check_fromname_contains_email");
|
||||
$self->register_eval_rule("check_fromname_equals_to");
|
||||
$self->register_eval_rule("check_fromname_owners_differ");
|
||||
$self->register_eval_rule("check_fromname_equals_replyto");
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds = ();
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'fns_add_addrlist',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my($self, $key, $value, $line) = @_;
|
||||
local($1,$2);
|
||||
if ($value !~ /^ \( (.*?) \) \s+ (.*) \z/sx) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my $listname = "FNS_$1";
|
||||
$value = $2;
|
||||
$self->{parser}->add_to_addrlist ($listname, split(/\s+/, lc($value)));
|
||||
$self->{fns_addrlists}{$listname} = 1;
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'fns_remove_addrlist',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my($self, $key, $value, $line) = @_;
|
||||
local($1,$2);
|
||||
if ($value !~ /^ \( (.*?) \) \s+ (.*) \z/sx) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my $listname = "FNS_$1";
|
||||
$value = $2;
|
||||
$self->{parser}->remove_from_addrlist ($listname, split (/\s+/, $value));
|
||||
}
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'fns_extrachars',
|
||||
default => 50,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'fns_ignore_dkim',
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
$self->{fns_ignore_dkim}->{$_} = 1 foreach (split(/\s+/, lc($value)));
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'fns_ignore_headers',
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
$self->{fns_ignore_header}->{$_} = 1 foreach (split(/\s+/, $value));
|
||||
}
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'fns_check',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
$pms->action_depends_on_tags('DKIMDOMAIN',
|
||||
sub { my($pms,@args) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
}
|
||||
);
|
||||
1;
|
||||
}
|
||||
|
||||
sub check_fromname_different
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_address_different};
|
||||
}
|
||||
|
||||
sub check_fromname_domain_differ
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_domain_different};
|
||||
}
|
||||
|
||||
sub check_fromname_spoof
|
||||
{
|
||||
my ($self, $pms, $check_lvl) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
|
||||
if ( not defined $check_lvl ) {
|
||||
$check_lvl = $pms->{conf}->{fns_check};
|
||||
}
|
||||
|
||||
my @array = (
|
||||
($pms->{fromname_address_different}) ,
|
||||
($pms->{fromname_address_different} && $pms->{fromname_owner_different}) ,
|
||||
($pms->{fromname_address_different} && $pms->{fromname_domain_different})
|
||||
);
|
||||
|
||||
return $array[$check_lvl];
|
||||
|
||||
}
|
||||
|
||||
sub check_fromname_contains_email
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_contains_email};
|
||||
}
|
||||
|
||||
sub check_fromname_equals_replyto
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_equals_replyto};
|
||||
}
|
||||
|
||||
sub check_fromname_equals_to
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_equals_to_addr};
|
||||
}
|
||||
|
||||
sub check_fromname_owners_differ
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_owner_different};
|
||||
}
|
||||
|
||||
sub _check_fromnamespoof
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return if (defined $pms->{fromname_contains_email});
|
||||
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
$pms->{fromname_contains_email} = 0;
|
||||
$pms->{fromname_address_different} = 0;
|
||||
$pms->{fromname_equals_to_addr} = 0;
|
||||
$pms->{fromname_domain_different} = 0;
|
||||
$pms->{fromname_owner_different} = 0;
|
||||
$pms->{fromname_equals_replyto} = 0;
|
||||
|
||||
foreach my $addr (split / /, $pms->get_tag('DKIMDOMAIN') || '') {
|
||||
if ($conf->{fns_ignore_dkim}->{lc($addr)}) {
|
||||
dbg("ignoring, DKIM signed: $addr");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $iheader (keys %{$conf->{fns_ignore_header}}) {
|
||||
if ($pms->get($iheader)) {
|
||||
dbg("ignoring, header $iheader found");
|
||||
return 0 if ($pms->get($iheader));
|
||||
}
|
||||
}
|
||||
|
||||
my $list_refs = {};
|
||||
|
||||
if ($conf->{fns_addrlists}) {
|
||||
my @lists = keys %{$conf->{fns_addrlists}};
|
||||
foreach my $list (@lists) {
|
||||
$list_refs->{$list} = $conf->{$list};
|
||||
}
|
||||
s/^FNS_// foreach (@lists);
|
||||
dbg("using addrlists: ".join(', ', @lists));
|
||||
}
|
||||
|
||||
my %fnd = ();
|
||||
my %fad = ();
|
||||
my %tod = ();
|
||||
|
||||
$fnd{'addr'} = $pms->get("From:name");
|
||||
|
||||
if ($fnd{'addr'} =~ /\b((?>[\w\.\!\#\$\%\&\'\*\+\/\=\?\^\_\`\{\|\}\~\-]+@[\w\-\.]+\.[\w\-\.]+))\b/i) {
|
||||
my $nochar = ($fnd{'addr'} =~ y/A-Za-z0-9//c);
|
||||
$nochar -= ($1 =~ y/A-Za-z0-9//c);
|
||||
|
||||
return 0 unless ((length($fnd{'addr'})+$nochar) - length($1) <= $conf->{'fns_extrachars'});
|
||||
|
||||
$fnd{'addr'} = lc $1;
|
||||
} else {
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $replyto = lc $pms->get("Reply-To:addr");
|
||||
|
||||
$fad{'addr'} = lc $pms->get("From:addr");
|
||||
my @toaddrs = $pms->all_to_addrs();
|
||||
return 0 unless @toaddrs;
|
||||
|
||||
$tod{'addr'} = lc $toaddrs[0];
|
||||
|
||||
$fnd{'domain'} = $self->uri_to_domain($fnd{'addr'});
|
||||
$fad{'domain'} = $self->uri_to_domain($fad{'addr'});
|
||||
$tod{'domain'} = $self->uri_to_domain($tod{'addr'});
|
||||
|
||||
return 0 unless (defined $fnd{'domain'} && defined $fad{'domain'});
|
||||
|
||||
$pms->{fromname_contains_email} = 1;
|
||||
|
||||
$fnd{'owner'} = $self->_find_address_owner($fnd{'addr'}, $list_refs);
|
||||
|
||||
$fad{'owner'} = $self->_find_address_owner($fad{'addr'}, $list_refs);
|
||||
|
||||
$tod{'owner'} = $self->_find_address_owner($tod{'addr'}, $list_refs);
|
||||
|
||||
$pms->{fromname_address_different} = 1 if ($fnd{'addr'} ne $fad{'addr'});
|
||||
|
||||
$pms->{fromname_domain_different} = 1 if ($fnd{'domain'} ne $fad{'domain'});
|
||||
|
||||
$pms->{fromname_equals_to_addr} = 1 if ($fnd{'addr'} eq $tod{addr});
|
||||
|
||||
$pms->{fromname_equals_replyto} = 1 if ($fnd{'addr'} eq $replyto);
|
||||
|
||||
if ($fnd{'owner'} ne $fad{'owner'}) {
|
||||
$pms->{fromname_owner_different} = 1;
|
||||
}
|
||||
|
||||
if ($pms->{fromname_address_different}) {
|
||||
$pms->set_tag("FNSFNAMEADDR", $fnd{'addr'});
|
||||
$pms->set_tag("FNSFADDRADDR", $fad{'addr'});
|
||||
$pms->set_tag("FNSFNAMEOWNER", $fnd{'owner'});
|
||||
$pms->set_tag("FNSFADDROWNER", $fad{'owner'});
|
||||
$pms->set_tag("FNSFNAMEDOMAIN", $fnd{'domain'});
|
||||
$pms->set_tag("FNSFADDRDOMAIN", $fad{'domain'});
|
||||
|
||||
dbg("From name spoof: $fnd{addr} $fnd{domain} $fnd{owner}");
|
||||
dbg("Actual From: $fad{addr} $fad{domain} $fad{owner}");
|
||||
dbg("To Address: $tod{addr} $tod{domain} $tod{owner}");
|
||||
}
|
||||
}
|
||||
|
||||
sub _find_address_owner
|
||||
{
|
||||
my ($self, $check, $list_refs) = @_;
|
||||
foreach my $owner (keys %{$list_refs}) {
|
||||
foreach my $white_addr (keys %{$list_refs->{$owner}}) {
|
||||
my $regexp = qr/$list_refs->{$owner}{$white_addr}/i;
|
||||
if ($check =~ /$regexp/) {
|
||||
$owner =~ s/^FNS_//i;
|
||||
return lc $owner;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
my $owner = $self->uri_to_domain($check);
|
||||
|
||||
$check =~ /^([^\@]+)\@(.*)$/;
|
||||
|
||||
if ($owner ne $2) {
|
||||
return $self->_find_address_owner("$1\@$owner", $list_refs);
|
||||
}
|
||||
|
||||
$owner =~ /^([^\.]+)\./;
|
||||
return lc $1;
|
||||
}
|
||||
|
||||
1;
|
216
upstream/lib/Mail/SpamAssassin/Plugin/HTMLEval.pm
Normal file
216
upstream/lib/Mail/SpamAssassin/Plugin/HTMLEval.pm
Normal file
@ -0,0 +1,216 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Plugin::HTMLEval;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Locales;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var compile_regexp);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("html_tag_balance");
|
||||
$self->register_eval_rule("html_image_only");
|
||||
$self->register_eval_rule("html_image_ratio");
|
||||
$self->register_eval_rule("html_charset_faraway");
|
||||
$self->register_eval_rule("html_tag_exists");
|
||||
$self->register_eval_rule("html_test");
|
||||
$self->register_eval_rule("html_eval");
|
||||
$self->register_eval_rule("html_text_match");
|
||||
$self->register_eval_rule("html_title_subject_ratio");
|
||||
$self->register_eval_rule("html_text_not_match");
|
||||
$self->register_eval_rule("html_range");
|
||||
$self->register_eval_rule("check_iframe_src");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub html_tag_balance {
|
||||
my ($self, $pms, undef, $rawtag, $rawexpr) = @_;
|
||||
|
||||
return 0 if $rawtag !~ /^([a-zA-Z0-9]+)$/;
|
||||
my $tag = $1;
|
||||
|
||||
return 0 unless exists $pms->{html}{inside}{$tag};
|
||||
|
||||
return 0 if $rawexpr !~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
my $expr = untaint_var($1);
|
||||
|
||||
$pms->{html}{inside}{$tag} =~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
my $val = untaint_var($1);
|
||||
|
||||
return eval "\$val $expr";
|
||||
}
|
||||
|
||||
sub html_image_only {
|
||||
my ($self, $pms, undef, $min, $max) = @_;
|
||||
|
||||
return (exists $pms->{html}{inside}{img} &&
|
||||
exists $pms->{html}{length} &&
|
||||
$pms->{html}{length} > $min &&
|
||||
$pms->{html}{length} <= $max);
|
||||
}
|
||||
|
||||
sub html_image_ratio {
|
||||
my ($self, $pms, undef, $min, $max) = @_;
|
||||
|
||||
return 0 unless (exists $pms->{html}{non_space_len} &&
|
||||
exists $pms->{html}{image_area} &&
|
||||
$pms->{html}{image_area} > 0);
|
||||
my $ratio = $pms->{html}{non_space_len} / $pms->{html}{image_area};
|
||||
return ($ratio > $min && $ratio <= $max);
|
||||
}
|
||||
|
||||
sub html_charset_faraway {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return 0 unless exists $pms->{html}{charsets};
|
||||
|
||||
my @locales = Mail::SpamAssassin::Util::get_my_locales($pms->{conf}->{ok_locales});
|
||||
return 0 if grep { $_ eq "all" } @locales;
|
||||
|
||||
my $okay = 0;
|
||||
my $bad = 0;
|
||||
for my $c (split(' ', $pms->{html}{charsets})) {
|
||||
if (Mail::SpamAssassin::Locales::is_charset_ok_for_locales($c, @locales)) {
|
||||
$okay++;
|
||||
}
|
||||
else {
|
||||
$bad++;
|
||||
}
|
||||
}
|
||||
return ($bad && ($bad >= $okay));
|
||||
}
|
||||
|
||||
sub html_tag_exists {
|
||||
my ($self, $pms, undef, $tag) = @_;
|
||||
return exists $pms->{html}{inside}{$tag};
|
||||
}
|
||||
|
||||
sub html_test {
|
||||
my ($self, $pms, undef, $test) = @_;
|
||||
return $pms->{html}{$test};
|
||||
}
|
||||
|
||||
sub html_eval {
|
||||
my ($self, $pms, undef, $test, $rawexpr) = @_;
|
||||
|
||||
return 0 if $rawexpr !~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
my $expr = untaint_var($1);
|
||||
|
||||
# workaround bug 3320: weird perl bug where additional, very explicit
|
||||
# untainting into a new var is required.
|
||||
my $tainted = $pms->{html}{$test};
|
||||
return 0 unless defined($tainted);
|
||||
my $val = $tainted;
|
||||
|
||||
# just use the value in $val, don't copy it needlessly
|
||||
return eval "\$val $expr";
|
||||
}
|
||||
|
||||
sub html_text_match {
|
||||
my ($self, $pms, undef, $text, $regexp) = @_;
|
||||
my ($rec, $err) = compile_regexp($regexp, 0);
|
||||
if (!$rec) {
|
||||
warn "htmleval: html_text_match invalid regexp '$regexp': $err";
|
||||
return 0;
|
||||
}
|
||||
foreach my $string (@{$pms->{html}{$text}}) {
|
||||
next unless defined $string;
|
||||
if ($string =~ $rec) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_title_subject_ratio {
|
||||
my ($self, $pms, undef, $ratio) = @_;
|
||||
|
||||
my $subject = $pms->get('Subject');
|
||||
if ($subject eq '') {
|
||||
return 0;
|
||||
}
|
||||
my $max = 0;
|
||||
for my $string (@{ $pms->{html}{title} }) {
|
||||
if ($string) {
|
||||
my $ratio = length($string) / length($subject);
|
||||
$max = $ratio if $ratio > $max;
|
||||
}
|
||||
}
|
||||
return $max > $ratio;
|
||||
}
|
||||
|
||||
sub html_text_not_match {
|
||||
my ($self, $pms, undef, $text, $regexp) = @_;
|
||||
for my $string (@{ $pms->{html}{$text} }) {
|
||||
if (defined $string && $string !~ /${regexp}/) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_range {
|
||||
my ($self, $pms, undef, $test, $min, $max) = @_;
|
||||
|
||||
return 0 unless exists $pms->{html}{$test};
|
||||
|
||||
$test = $pms->{html}{$test};
|
||||
|
||||
# not all perls understand what "inf" means, so we need to do
|
||||
# non-numeric tests! urg!
|
||||
if (!defined $max || $max eq "inf") {
|
||||
return ($test eq "inf") ? 1 : ($test > $min);
|
||||
}
|
||||
elsif ($test eq "inf") {
|
||||
# $max < inf, so $test == inf means $test > $max
|
||||
return 0;
|
||||
}
|
||||
else {
|
||||
# if we get here everything should be a number
|
||||
return ($test > $min && $test <= $max);
|
||||
}
|
||||
}
|
||||
|
||||
sub check_iframe_src {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
foreach my $v ( values %{$pms->{html}->{uri_detail}} ) {
|
||||
return 1 if $v->{types}->{iframe};
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
1;
|
110
upstream/lib/Mail/SpamAssassin/Plugin/HTTPSMismatch.pm
Normal file
110
upstream/lib/Mail/SpamAssassin/Plugin/HTTPSMismatch.pm
Normal file
@ -0,0 +1,110 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
package Mail::SpamAssassin::Plugin::HTTPSMismatch;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule ("check_https_http_mismatch");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# [lt]a href="http://baboz-njeryz.de/"[gt]https://bankofamerica.com/[lt]/a[gt]
|
||||
# ("<" and ">" replaced with "[lt]" and "[gt]" to avoid Kaspersky Desktop AV
|
||||
# false positive ;)
|
||||
sub check_https_http_mismatch {
|
||||
my ($self, $permsgstatus, undef, $minanchors, $maxanchors) = @_;
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
|
||||
$minanchors ||= 1;
|
||||
|
||||
if (!exists $permsgstatus->{chhm_hit}) {
|
||||
$permsgstatus->{chhm_hit} = 0;
|
||||
$permsgstatus->{chhm_anchors} = 0;
|
||||
|
||||
foreach my $k ( keys %{$permsgstatus->{html}->{uri_detail}} ) {
|
||||
my %uri_detail = %{$permsgstatus->{html}->{uri_detail}};
|
||||
my $v = ${uri_detail}{$k};
|
||||
# if the URI wasn't used for an anchor tag, or the anchor text didn't
|
||||
# exist, skip this.
|
||||
next unless (exists $v->{anchor_text} && @{$v->{anchor_text}});
|
||||
|
||||
my $uri;
|
||||
if ($k =~ m@^https?://([^/:]+)@i) {
|
||||
$uri = $1;
|
||||
# Skip IPs since there's another rule to catch that already
|
||||
if ($uri =~ /^$IP_ADDRESS+$/) {
|
||||
undef $uri;
|
||||
next;
|
||||
}
|
||||
# want to compare whole hostnames instead of domains?
|
||||
# comment this next section to the blank line.
|
||||
$uri = $self->{main}->{registryboundaries}->trim_domain($uri);
|
||||
undef $uri unless ($self->{main}->{registryboundaries}->is_domain_valid($uri));
|
||||
}
|
||||
|
||||
next unless $uri;
|
||||
$permsgstatus->{chhm_anchors}++ if exists $v->{anchor_text};
|
||||
|
||||
foreach (@{$v->{anchor_text}}) {
|
||||
if (m@https://([^/:]+)@i) {
|
||||
my $https = $1;
|
||||
|
||||
# want to compare whole hostnames instead of domains?
|
||||
# comment this next section to the blank line.
|
||||
if ($https !~ /^$IP_ADDRESS+$/) {
|
||||
$https = $self->{main}->{registryboundaries}->trim_domain($https);
|
||||
undef $https unless ($self->{main}->{registryboundaries}->is_domain_valid($https));
|
||||
}
|
||||
next unless $https;
|
||||
|
||||
dbg("https_http_mismatch: domains $uri -> $https");
|
||||
|
||||
next if $uri eq $https;
|
||||
$permsgstatus->{chhm_hit} = 1;
|
||||
last;
|
||||
}
|
||||
}
|
||||
}
|
||||
dbg("https_http_mismatch: anchors ".$permsgstatus->{chhm_anchors});
|
||||
}
|
||||
|
||||
return ( $permsgstatus->{chhm_hit} && $permsgstatus->{chhm_anchors} >= $minanchors && (defined $maxanchors && $permsgstatus->{chhm_anchors} < $maxanchors) );
|
||||
}
|
||||
|
||||
1;
|
682
upstream/lib/Mail/SpamAssassin/Plugin/HashBL.pm
Normal file
682
upstream/lib/Mail/SpamAssassin/Plugin/HashBL.pm
Normal file
@ -0,0 +1,682 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
HashBL - query hashed (and unhashed) DNS blocklists
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::HashBL
|
||||
|
||||
header HASHBL_EMAIL eval:check_hashbl_emails('ebl.msbl.org')
|
||||
describe HASHBL_EMAIL Message contains email address found on EBL
|
||||
|
||||
hashbl_acl_freemail gmail.com
|
||||
header HASHBL_OSENDR eval:check_hashbl_emails('rbl.example.com/A', 'md5/max=10/shuffle', 'X-Original-Sender', '^127\.', 'freemail')
|
||||
describe HASHBL_OSENDR Message contains email address found on HASHBL
|
||||
tflags HASHBL_OSENDR net
|
||||
|
||||
body HASHBL_BTC eval:check_hashbl_bodyre('btcbl.foo.bar', 'sha1/max=10/shuffle', '\b([13][a-km-zA-HJ-NP-Z1-9]{25,34})\b')
|
||||
describe HASHBL_BTC Message contains BTC address found on BTCBL
|
||||
priority HASHBL_BTC -100 # required priority to launch async lookups
|
||||
|
||||
header HASHBL_URI eval:check_hashbl_uris('rbl.foo.bar', 'sha1', '127.0.0.32')
|
||||
describe HASHBL_URI Message contains uri found on rbl
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin support multiple types of hashed or unhashed DNS blocklists.
|
||||
|
||||
OPTS refers to multiple generic options:
|
||||
|
||||
raw do not hash data, query as is
|
||||
md5 hash query with MD5
|
||||
sha1 hash query with SHA1
|
||||
case keep case before hashing, default is to lowercase
|
||||
max=x maximum number of queries
|
||||
shuffle if max exceeded, random shuffle queries before truncating to limit
|
||||
|
||||
Multiple options can be separated with slash or other non-word character.
|
||||
If OPTS is empty ('') or missing, default is used.
|
||||
|
||||
HEADERS refers to slash separated list of Headers to process:
|
||||
|
||||
ALL all headers
|
||||
ALLFROM all From headers as returned by $pms->all_from_addrs()
|
||||
EnvelopeFrom message envelope from (Return-Path etc)
|
||||
HeaderName any header as used with $pms->get()
|
||||
|
||||
if HEADERS is empty ('') or missing, default is used.
|
||||
|
||||
=over 4
|
||||
|
||||
=item header RULE check_hashbl_emails('bl.example.com/A', 'OPTS', 'HEADERS/body', '^127\.')
|
||||
|
||||
Check email addresses from DNS list, "body" can be specified along with
|
||||
headers to search body for emails. Optional subtest regexp to match DNS
|
||||
answer. Note that eval rule type must always be "header".
|
||||
|
||||
DNS query type can be appended to list with /A (default) or /TXT.
|
||||
|
||||
Additional supported OPTS:
|
||||
|
||||
nodot strip username dots from email
|
||||
notag strip username tags from email
|
||||
nouri ignore emails inside uris
|
||||
noquote ignore emails inside < > or possible quotings
|
||||
|
||||
Default OPTS: sha1/notag/noquote/max=10/shuffle
|
||||
|
||||
Default HEADERS: ALLFROM/Reply-To/body
|
||||
|
||||
For existing public email blacklist, see: http://msbl.org/ebl.html
|
||||
|
||||
header HASHBL_EBL check_hashbl_emails('ebl.msbl.org')
|
||||
priority HASHBL_EBL -100 # required for async query
|
||||
|
||||
=over 4
|
||||
|
||||
=item header RULE check_hashbl_uris('bl.example.com/A', 'OPTS', '^127\.')
|
||||
|
||||
Check uris from DNS list, optional subtest regexp to match DNS
|
||||
answer.
|
||||
|
||||
DNS query type can be appended to list with /A (default) or /TXT.
|
||||
|
||||
Default OPTS: sha1/max=10/shuffle
|
||||
|
||||
=back
|
||||
|
||||
=item body RULE check_hashbl_bodyre('bl.example.com/A', 'OPTS', '\b(match)\b', '^127\.')
|
||||
|
||||
Search body for matching regexp and query the string captured. Regexp must
|
||||
have a single capture ( ) for the string ($1). Optional subtest regexp to
|
||||
match DNS answer. Note that eval rule type must be "body" or "rawbody".
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::HashBL;
|
||||
use strict;
|
||||
use warnings;
|
||||
|
||||
my $VERSION = 0.101;
|
||||
|
||||
use Digest::MD5 qw(md5_hex);
|
||||
use Digest::SHA qw(sha1_hex);
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub dbg {
|
||||
my $msg = shift;
|
||||
Mail::SpamAssassin::Plugin::dbg("HashBL: $msg", @_);
|
||||
}
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
# are network tests enabled?
|
||||
if ($mailsa->{local_tests_only}) {
|
||||
$self->{hashbl_available} = 0;
|
||||
dbg("local tests only, disabling HashBL");
|
||||
} else {
|
||||
$self->{hashbl_available} = 1;
|
||||
}
|
||||
|
||||
$self->register_eval_rule("check_hashbl_emails");
|
||||
$self->register_eval_rule("check_hashbl_uris");
|
||||
$self->register_eval_rule("check_hashbl_bodyre");
|
||||
$self->set_config($mailsa->{conf});
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'hashbl_ignore',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
default => {},
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || $value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
foreach my $str (split (/\s+/, $value)) {
|
||||
$self->{hashbl_ignore}->{lc $str} = 1;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub _parse_args {
|
||||
my ($self, $acl) = @_;
|
||||
|
||||
if (not defined $acl) {
|
||||
return ();
|
||||
}
|
||||
$acl =~ s/\s+//g;
|
||||
if ($acl !~ /^[a-z0-9]{1,32}$/) {
|
||||
warn("invalid acl name: $acl");
|
||||
return ();
|
||||
}
|
||||
if ($acl eq 'all') {
|
||||
return ();
|
||||
}
|
||||
if (defined $self->{hashbl_acl}{$acl}) {
|
||||
warn("no such acl defined: $acl");
|
||||
return ();
|
||||
}
|
||||
}
|
||||
|
||||
sub parse_config {
|
||||
my ($self, $opt) = @_;
|
||||
|
||||
if ($opt->{key} =~ /^hashbl_acl_([a-z0-9]{1,32})$/i) {
|
||||
$self->inhibit_further_callbacks();
|
||||
return 1 unless $self->{hashbl_available};
|
||||
|
||||
my $acl = lc($1);
|
||||
my @opts = split(/\s+/, $opt->{value});
|
||||
foreach my $tmp (@opts)
|
||||
{
|
||||
if ($tmp =~ /^(\!)?(\S+)$/i) {
|
||||
my $neg = $1;
|
||||
my $value = lc($2);
|
||||
|
||||
if (defined $neg) {
|
||||
$self->{hashbl_acl}{$acl}{$value} = 0;
|
||||
} else {
|
||||
next if $acl eq 'all';
|
||||
# exclusions overrides
|
||||
if ( not defined $self->{hashbl_acl}{$acl}{$value} ) {
|
||||
$self->{hashbl_acl}{$acl}{$value} = 1
|
||||
}
|
||||
}
|
||||
} else {
|
||||
warn("invalid acl: $tmp");
|
||||
}
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub finish_parsing_end {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
return 0 if !$self->{hashbl_available};
|
||||
|
||||
# valid_tlds_re will be available at finish_parsing_end, compile it now,
|
||||
# we only need to do it once and before possible forking
|
||||
if (!exists $self->{email_re}) {
|
||||
$self->_init_email_re();
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _init_email_re {
|
||||
my ($self) = @_;
|
||||
|
||||
# Some regexp tips courtesy of http://www.regular-expressions.info/email.html
|
||||
# full email regex v0.02
|
||||
$self->{email_re} = qr/
|
||||
(?=.{0,64}\@) # limit userpart to 64 chars (and speed up searching?)
|
||||
(?<![a-z0-9!#\$%&'*+\/=?^_`{|}~-]) # start boundary
|
||||
( # capture email
|
||||
[a-z0-9!#\$%&'*+\/=?^_`{|}~-]+ # no dot in beginning
|
||||
(?:\.[a-z0-9!#\$%&'*+\/=?^_`{|}~-]+)* # no consecutive dots, no ending dot
|
||||
\@
|
||||
(?:[a-z0-9](?:[a-z0-9-]{0,59}[a-z0-9])?\.){1,4} # max 4x61 char parts (should be enough?)
|
||||
$self->{main}->{registryboundaries}->{valid_tlds_re} # ends with valid tld
|
||||
)
|
||||
/xi;
|
||||
|
||||
# default email whitelist
|
||||
$self->{email_whitelist} = qr/
|
||||
^(?:
|
||||
abuse|support|sales|info|helpdesk|contact|kontakt
|
||||
| (?:post|host|domain)master
|
||||
| undisclosed.* # yahoo.com etc(?)
|
||||
| request-[a-f0-9]{16} # live.com
|
||||
| bounced?- # yahoo.com etc
|
||||
| [a-f0-9]{8}(?:\.[a-f0-9]{8}|-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}) # gmail msgids?
|
||||
| .+=.+=.+ # gmail forward
|
||||
)\@
|
||||
/xi;
|
||||
}
|
||||
|
||||
sub _get_emails {
|
||||
my ($self, $pms, $opts, $from, $acl) = @_;
|
||||
|
||||
my @emails; # keep find order
|
||||
my %seen;
|
||||
my @tmp_email;
|
||||
my $domain;
|
||||
|
||||
foreach my $hdr (split(/\//, $from)) {
|
||||
my $parsed_emails = $self->_parse_emails($pms, $opts, $hdr);
|
||||
foreach (@$parsed_emails) {
|
||||
next if exists $seen{$_};
|
||||
my @tmp_email = split('@', $_);
|
||||
my $domain = $tmp_email[1];
|
||||
if (defined($acl) and ($acl ne "all") and defined($domain)) {
|
||||
if (defined($self->{hashbl_acl}{$acl}{$domain}) and ($self->{hashbl_acl}{$acl}{$domain} eq 1)) {
|
||||
push @emails, $_;
|
||||
$seen{$_} = 1;
|
||||
}
|
||||
} else {
|
||||
push @emails, $_;
|
||||
$seen{$_} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return \@emails;
|
||||
}
|
||||
|
||||
sub _parse_emails {
|
||||
my ($self, $pms, $opts, $hdr) = @_;
|
||||
|
||||
if (exists $pms->{hashbl_email_cache}{$hdr}) {
|
||||
return $pms->{hashbl_email_cache}{$hdr};
|
||||
}
|
||||
|
||||
if ($hdr eq 'ALLFROM') {
|
||||
my @emails = $pms->all_from_addrs();
|
||||
return $pms->{hashbl_email_cache}{$hdr} = \@emails;
|
||||
}
|
||||
|
||||
if (not defined $pms->{hashbl_whitelist}) {
|
||||
%{$pms->{hashbl_whitelist}} = map { lc($_) => 1 }
|
||||
( $pms->get("X-Original-To:addr"),
|
||||
$pms->get("Apparently-To:addr"),
|
||||
$pms->get("Delivered-To:addr"),
|
||||
$pms->get("Envelope-To:addr"),
|
||||
);
|
||||
if ( defined $pms->{hashbl_whitelist}{''} ) {
|
||||
delete $pms->{hashbl_whitelist}{''};
|
||||
}
|
||||
}
|
||||
|
||||
my $str = '';
|
||||
if ($hdr eq 'ALL') {
|
||||
$str = join("\n", $pms->get('ALL'));
|
||||
} elsif ($hdr eq 'body') {
|
||||
# get all <a href="mailto:", since they don't show up on stripped_body
|
||||
my $uris = $pms->get_uri_detail_list();
|
||||
while (my($uri, $info) = each %{$uris}) {
|
||||
if (defined $info->{types}->{a} && !defined $info->{types}->{parsed}) {
|
||||
if ($uri =~ /^mailto:(.+)/i) {
|
||||
$str .= "$1\n";
|
||||
}
|
||||
}
|
||||
}
|
||||
my $body = join('', $pms->get_decoded_stripped_body_text_array());
|
||||
if ($opts =~ /\bnouri\b/) {
|
||||
# strip urls with possible emails inside
|
||||
$body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
|
||||
}
|
||||
if ($opts =~ /\bnoquote\b/) {
|
||||
# strip emails contained in <>, not mailto:
|
||||
# also strip ones followed by quote-like "wrote:" (but not fax: and tel: etc)
|
||||
$body =~ s#<?(?<!mailto:)$self->{email_re}(?:>|\s{1,10}(?!(?:fa(?:x|csi)|tel|phone|e?-?mail))[a-z]{2,11}:)# #gi;
|
||||
}
|
||||
$str .= $body;
|
||||
} else {
|
||||
$str .= join("\n", $pms->get($hdr));
|
||||
}
|
||||
|
||||
my @emails; # keep find order
|
||||
my %seen;
|
||||
|
||||
while ($str =~ /($self->{email_re})/g) {
|
||||
next if exists $seen{$1};
|
||||
push @emails, $1;
|
||||
}
|
||||
|
||||
return $pms->{hashbl_email_cache}{$hdr} = \@emails;
|
||||
}
|
||||
|
||||
sub check_hashbl_emails {
|
||||
my ($self, $pms, $list, $opts, $from, $subtest, $acl) = @_;
|
||||
|
||||
return 0 if !$self->{hashbl_available};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
return 0 if !$self->{email_re};
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
|
||||
if (!defined $list) {
|
||||
warn "HashBL: $rulename blocklist argument missing\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($subtest) {
|
||||
my ($rec, $err) = compile_regexp($subtest, 0);
|
||||
if (!$rec) {
|
||||
warn "HashBL: $rulename invalid subtest regex: $@\n";
|
||||
return 0;
|
||||
}
|
||||
$subtest = $rec;
|
||||
}
|
||||
|
||||
# Defaults
|
||||
$opts = 'sha1/notag/noquote/max=10/shuffle' if !$opts;
|
||||
|
||||
$from = 'ALLFROM/Reply-To/body' if !$from;
|
||||
|
||||
# Find all emails
|
||||
my $emails = $self->_get_emails($pms, $opts, $from, $acl);
|
||||
if (!@$emails) {
|
||||
if(defined $acl) {
|
||||
dbg("$rulename: no emails found ($from) on acl $acl");
|
||||
} else {
|
||||
dbg("$rulename: no emails found ($from)");
|
||||
}
|
||||
return 0;
|
||||
} else {
|
||||
dbg("$rulename: raw emails found: ".join(', ', @$emails));
|
||||
}
|
||||
|
||||
# Filter list
|
||||
my $keep_case = $opts =~ /\bcase\b/i;
|
||||
my $nodot = $opts =~ /\bnodot\b/i;
|
||||
my $notag = $opts =~ /\bnotag\b/i;
|
||||
my @filtered_emails; # keep order
|
||||
my %seen;
|
||||
foreach my $email (@$emails) {
|
||||
next if exists $seen{$email};
|
||||
if (($email =~ $self->{email_whitelist}) or defined ($pms->{hashbl_whitelist}{$email})) {
|
||||
dbg("Address whitelisted: $email");
|
||||
next;
|
||||
}
|
||||
if ($nodot || $notag) {
|
||||
my ($username, $domain) = ($email =~ /(.*)(\@.*)/);
|
||||
$username =~ tr/.//d if $nodot;
|
||||
$username =~ s/\+.*// if $notag;
|
||||
$email = $username.$domain;
|
||||
}
|
||||
push @filtered_emails, $keep_case ? $email : lc($email);
|
||||
$seen{$email} = 1;
|
||||
}
|
||||
|
||||
# Randomize order
|
||||
if ($opts =~ /\bshuffle\b/) {
|
||||
Mail::SpamAssassin::Util::fisher_yates_shuffle(\@filtered_emails);
|
||||
}
|
||||
|
||||
# Truncate list
|
||||
my $max = $opts =~ /\bmax=(\d+)\b/ ? $1 : 10;
|
||||
$#filtered_emails = $max-1 if scalar @filtered_emails > $max;
|
||||
|
||||
foreach my $email (@filtered_emails) {
|
||||
$self->_submit_query($pms, $rulename, $email, $list, $opts, $subtest);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_hashbl_uris {
|
||||
my ($self, $pms, $list, $opts, $subtest) = @_;
|
||||
|
||||
return 0 if !$self->{hashbl_available};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
|
||||
if (!defined $list) {
|
||||
warn "HashBL: $rulename blocklist argument missing\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($subtest) {
|
||||
my ($rec, $err) = compile_regexp($subtest, 0);
|
||||
if (!$rec) {
|
||||
warn "HashBL: $rulename invalid subtest regex: $@\n";
|
||||
return 0;
|
||||
}
|
||||
$subtest = $rec;
|
||||
}
|
||||
|
||||
# Defaults
|
||||
$opts = 'sha1/max=10/shuffle' if !$opts;
|
||||
|
||||
# Filter list
|
||||
my $keep_case = $opts =~ /\bcase\b/i;
|
||||
|
||||
if ($opts =~ /raw/) {
|
||||
warn "HashBL: $rulename raw option invalid\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $uris = $pms->get_uri_detail_list();
|
||||
my %seen;
|
||||
my @filtered_uris;
|
||||
|
||||
while (my($uri, $info) = each %{$uris}) {
|
||||
# we want to skip mailto: uris
|
||||
next if ($uri =~ /^mailto:/i);
|
||||
next if exists $seen{$uri};
|
||||
|
||||
# no hosts/domains were found via this uri, so skip
|
||||
next unless $info->{hosts};
|
||||
next unless $info->{cleaned};
|
||||
next unless $info->{types}->{a} || $info->{types}->{parsed};
|
||||
foreach my $uri (@{$info->{cleaned}}) {
|
||||
# check url
|
||||
push @filtered_uris, $keep_case ? $uri : lc($uri);
|
||||
}
|
||||
$seen{$uri} = 1;
|
||||
}
|
||||
|
||||
# Randomize order
|
||||
if ($opts =~ /\bshuffle\b/) {
|
||||
Mail::SpamAssassin::Util::fisher_yates_shuffle(\@filtered_uris);
|
||||
}
|
||||
|
||||
# Truncate list
|
||||
my $max = $opts =~ /\bmax=(\d+)\b/ ? $1 : 10;
|
||||
$#filtered_uris = $max-1 if scalar @filtered_uris > $max;
|
||||
|
||||
foreach my $furi (@filtered_uris) {
|
||||
$self->_submit_query($pms, $rulename, $furi, $list, $opts, $subtest);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_hashbl_bodyre {
|
||||
my ($self, $pms, $bodyref, $list, $opts, $re, $subtest) = @_;
|
||||
|
||||
return 0 if !$self->{hashbl_available};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
|
||||
if (!defined $list) {
|
||||
warn "HashBL: $rulename blocklist argument missing\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (!$re) {
|
||||
warn "HashBL: $rulename missing body regex\n";
|
||||
return 0;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($re, 0);
|
||||
if (!$rec) {
|
||||
warn "HashBL: $rulename invalid body regex: $@\n";
|
||||
return 0;
|
||||
}
|
||||
$re = $rec;
|
||||
|
||||
if ($subtest) {
|
||||
my ($rec, $err) = compile_regexp($subtest, 0);
|
||||
if (!$rec) {
|
||||
warn "HashBL: $rulename invalid subtest regex: $@\n";
|
||||
return 0;
|
||||
}
|
||||
$subtest = $rec;
|
||||
}
|
||||
|
||||
# Defaults
|
||||
$opts = 'sha1/max=10/shuffle' if !$opts;
|
||||
|
||||
my $keep_case = $opts =~ /\bcase\b/i;
|
||||
|
||||
# Search body
|
||||
my @matches;
|
||||
my %seen;
|
||||
if (ref($bodyref) eq 'ARRAY') {
|
||||
# body, rawbody
|
||||
foreach (@$bodyref) {
|
||||
while ($_ =~ /$re/gs) {
|
||||
next if !defined $1;
|
||||
my $match = $keep_case ? $1 : lc($1);
|
||||
next if exists $seen{$match};
|
||||
$seen{$match} = 1;
|
||||
push @matches, $match;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
# full
|
||||
while ($$bodyref =~ /$re/gs) {
|
||||
next if !defined $1;
|
||||
my $match = $keep_case ? $1 : lc($1);
|
||||
next if exists $seen{$match};
|
||||
$seen{$match} = 1;
|
||||
push @matches, $match;
|
||||
}
|
||||
}
|
||||
|
||||
if (!@matches) {
|
||||
dbg("$rulename: no matches found");
|
||||
return 0;
|
||||
} else {
|
||||
dbg("$rulename: matches found: '".join("', '", @matches)."'");
|
||||
}
|
||||
|
||||
# Randomize order
|
||||
if ($opts =~ /\bshuffle\b/) {
|
||||
Mail::SpamAssassin::Util::fisher_yates_shuffle(\@matches);
|
||||
}
|
||||
|
||||
# Truncate list
|
||||
my $max = $opts =~ /\bmax=(\d+)\b/ ? $1 : 10;
|
||||
$#matches = $max-1 if scalar @matches > $max;
|
||||
|
||||
foreach my $match (@matches) {
|
||||
$self->_submit_query($pms, $rulename, $match, $list, $opts, $subtest);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _hash {
|
||||
my ($self, $opts, $value) = @_;
|
||||
|
||||
my $hashtype = $opts =~ /\b(raw|sha1|md5)\b/i ? lc($1) : 'sha1';
|
||||
if ($hashtype eq 'sha1') {
|
||||
return sha1_hex($value);
|
||||
} elsif ($hashtype eq 'md5') {
|
||||
return md5_hex($value);
|
||||
} else {
|
||||
return $value;
|
||||
}
|
||||
}
|
||||
|
||||
sub _submit_query {
|
||||
my ($self, $pms, $rulename, $value, $list, $opts, $subtest) = @_;
|
||||
|
||||
if (exists $pms->{conf}->{hashbl_ignore}->{lc $value}) {
|
||||
dbg("query skipped, ignored string: $value");
|
||||
return 1;
|
||||
}
|
||||
|
||||
my $hash = $self->_hash($opts, $value);
|
||||
dbg("querying $value ($hash) from $list");
|
||||
|
||||
if (exists $pms->{conf}->{hashbl_ignore}->{$hash}) {
|
||||
dbg("query skipped, ignored hash: $value");
|
||||
return 1;
|
||||
}
|
||||
|
||||
my $type = $list =~ s,/(A|TXT)$,,i ? uc($1) : 'A';
|
||||
my $lookup = "$hash.$list";
|
||||
|
||||
my $key = "HASHBL_EMAIL:$lookup";
|
||||
my $ent = {
|
||||
key => $key,
|
||||
zone => $list,
|
||||
rulename => $rulename,
|
||||
type => "HASHBL",
|
||||
hash => $hash,
|
||||
value => $value,
|
||||
subtest => $subtest,
|
||||
};
|
||||
$ent = $pms->{async}->bgsend_and_start_lookup($lookup, $type, undef, $ent,
|
||||
sub { my ($ent, $pkt) = @_; $self->_finish_query($pms, $ent, $pkt); },
|
||||
master_deadline => $pms->{master_deadline}
|
||||
);
|
||||
$pms->register_async_rule_start($rulename) if $ent;
|
||||
}
|
||||
|
||||
sub _finish_query {
|
||||
my ($self, $pms, $ent, $pkt) = @_;
|
||||
|
||||
if (!$pkt) {
|
||||
# $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
dbg("lookup was aborted: $ent->{rulename} $ent->{key}");
|
||||
return;
|
||||
}
|
||||
|
||||
my $dnsmatch = $ent->{subtest} ? $ent->{subtest} : qr/^127\./;
|
||||
my @answer = $pkt->answer;
|
||||
foreach my $rr (@answer) {
|
||||
if ($rr->address =~ $dnsmatch) {
|
||||
dbg("$ent->{rulename}: $ent->{zone} hit '$ent->{value}'");
|
||||
$ent->{value} =~ s/\@/[at]/g;
|
||||
$pms->test_log($ent->{value});
|
||||
$pms->got_hit($ent->{rulename}, '', ruletype => 'eval');
|
||||
$pms->register_async_rule_finish($ent->{rulename});
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_hashbl_bodyre { 1 }
|
||||
sub has_hashbl_emails { 1 }
|
||||
sub has_hashbl_uris { 1 }
|
||||
sub has_hashbl_ignore { 1 }
|
||||
|
||||
1;
|
352
upstream/lib/Mail/SpamAssassin/Plugin/Hashcash.pm
Normal file
352
upstream/lib/Mail/SpamAssassin/Plugin/Hashcash.pm
Normal file
@ -0,0 +1,352 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::Hashcash - perform hashcash verification tests
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::Hashcash
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Hashcash is a payment system for email where CPU cycles used as the
|
||||
basis for an e-cash system. This plugin makes it possible to use valid
|
||||
hashcash tokens added by mail programs as a bonus for messages.
|
||||
|
||||
=cut
|
||||
|
||||
=head1 USER SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item use_hashcash { 1 | 0 } (default: 1)
|
||||
|
||||
Whether to use hashcash, if it is available.
|
||||
|
||||
=cut
|
||||
|
||||
=item hashcash_accept user@example.com ...
|
||||
|
||||
Used to specify addresses that we accept HashCash tokens for. You should set
|
||||
it to match all the addresses that you may receive mail at.
|
||||
|
||||
Like whitelist and blacklist entries, the addresses are file-glob-style
|
||||
patterns, so C<friend@somewhere.com>, C<*@isp.com>, or C<*.domain.net> will all
|
||||
work. Specifically, C<*> and C<?> are allowed, but all other metacharacters
|
||||
are not. Regular expressions are not used for security reasons.
|
||||
|
||||
The sequence C<%u> is replaced with the current user's username, which
|
||||
is useful for ISPs or multi-user domains.
|
||||
|
||||
Multiple addresses per line, separated by spaces, is OK. Multiple
|
||||
C<hashcash_accept> lines is also OK.
|
||||
|
||||
=cut
|
||||
|
||||
=item hashcash_doublespend_path /path/to/file (default: ~/.spamassassin/hashcash_seen)
|
||||
|
||||
Path for HashCash double-spend database. HashCash tokens are only usable once,
|
||||
so their use is tracked in this database to avoid providing a loophole.
|
||||
|
||||
By default, each user has their own, in their C<~/.spamassassin> directory with
|
||||
mode 0700/0600. Note that once a token is 'spent' it is written to this file,
|
||||
and double-spending of a hashcash token makes it invalid, so this is not
|
||||
suitable for sharing between multiple users.
|
||||
|
||||
=cut
|
||||
|
||||
=item hashcash_doublespend_file_mode (default: 0700)
|
||||
|
||||
The file mode bits used for the HashCash double-spend database file.
|
||||
|
||||
Make sure you specify this using the 'x' mode bits set, as it may also be used
|
||||
to create directories. However, if a file is created, the resulting file will
|
||||
not have any execute bits set (the umask is set to 111).
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::Hashcash;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
|
||||
use Errno qw(ENOENT EACCES);
|
||||
use Fcntl;
|
||||
use File::Path;
|
||||
use File::Basename;
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1) }
|
||||
}
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
use constant HAS_DB_FILE => eval { require DB_File; };
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("check_hashcash_value");
|
||||
$self->register_eval_rule ("check_hashcash_double_spend");
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'use_hashcash',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'hashcash_doublespend_path',
|
||||
default => '__userstate__/hashcash_seen',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'hashcash_doublespend_file_mode',
|
||||
default => "0700",
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'hashcash_accept',
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub check_hashcash_value {
|
||||
my ($self, $scanner, $valmin, $valmax) = @_;
|
||||
my $val = $self->_run_hashcash($scanner);
|
||||
return ($val >= $valmin && $val < $valmax);
|
||||
}
|
||||
|
||||
sub check_hashcash_double_spend {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_run_hashcash($scanner);
|
||||
return ($scanner->{hashcash_double_spent});
|
||||
}
|
||||
|
||||
############################################################################
|
||||
|
||||
sub _run_hashcash {
|
||||
my ($self, $scanner) = @_;
|
||||
|
||||
if (defined $scanner->{hashcash_value}) { return $scanner->{hashcash_value}; }
|
||||
|
||||
$scanner->{hashcash_value} = 0;
|
||||
|
||||
# X-Hashcash: 0:031118:camram-spam@camram.org:c068b58ade6dcbaf
|
||||
# or:
|
||||
# X-hashcash: 1:20:040803:hashcash@freelists.org::6dcdb3a3ad4e1b86:1519d
|
||||
# X-hashcash: 1:20:040803:jm@jmason.org::6b484d06469ccb28:8838a
|
||||
# X-hashcash: 1:20:040803:adam@cypherspace.org::a1cbc54bf0182ea8:5d6a0
|
||||
|
||||
# call down to {msg} so that we can get it as an array of
|
||||
# individual headers
|
||||
my @hdrs = $scanner->{msg}->get_header ("X-Hashcash");
|
||||
if (scalar @hdrs == 0) {
|
||||
@hdrs = $scanner->{msg}->get_header ("Hashcash");
|
||||
}
|
||||
|
||||
foreach my $hc (@hdrs) {
|
||||
my $value = $self->_run_hashcash_for_one_string($scanner, $hc);
|
||||
if ($value) {
|
||||
# remove the "double-spend" bool if we did find a usable string;
|
||||
# this happens when one string is already spent, but another
|
||||
# string has not yet been.
|
||||
delete $scanner->{hashcash_double_spent};
|
||||
return $value;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _run_hashcash_for_one_string {
|
||||
my ($self, $scanner, $hc) = @_;
|
||||
|
||||
if (!$hc) { return 0; }
|
||||
$hc =~ s/\s+//gs; # remove whitespace from multiline, folded tokens
|
||||
|
||||
# untaint the string for paranoia, making sure not to allow \n \0 \' \"
|
||||
if ($hc =~ /^[-A-Za-z0-9\xA0-\xFF:_\/\%\@\.\,\= \*\+\;]+$/) {
|
||||
$hc = untaint_var($hc);
|
||||
}
|
||||
if (!$hc) { return 0; }
|
||||
|
||||
my ($ver, $bits, $date, $rsrc, $exts, $rand, $trial);
|
||||
if ($hc =~ /^0:/) {
|
||||
($ver, $date, $rsrc, $trial) = split (/:/, $hc, 4);
|
||||
}
|
||||
elsif ($hc =~ /^1:/) {
|
||||
($ver, $bits, $date, $rsrc, $exts, $rand, $trial) =
|
||||
split (/:/, $hc, 7);
|
||||
# extensions are, as yet, unused by SpamAssassin
|
||||
}
|
||||
else {
|
||||
dbg("hashcash: version $ver stamps not yet supported");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (!$trial) {
|
||||
dbg("hashcash: no trial in stamp '$hc'");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $accept = $scanner->{conf}->{hashcash_accept};
|
||||
if (!$self->_check_hashcash_resource ($scanner, $accept, $rsrc)) {
|
||||
dbg("hashcash: resource $rsrc not accepted here");
|
||||
return 0;
|
||||
}
|
||||
|
||||
# get the hash collision from the token. Computing the hash collision
|
||||
# is very easy (great!) -- just get SHA1(token) and count the 0 bits at
|
||||
# the start of the SHA1 hash, according to the draft at
|
||||
# http://www.hashcash.org/draft-hashcash.txt .
|
||||
my $value = 0;
|
||||
my $bitstring = unpack ("B*", sha1($hc));
|
||||
$bitstring =~ /^(0+)/ and $value = length $1;
|
||||
|
||||
# hashcash v1 tokens: if the "claimed value" of the token is less than
|
||||
# what the token actually contains (ie. token was accidentally generated
|
||||
# with 24 bits instead of the claimed 20), then cut it down to just the
|
||||
# claimed value. that way it's a bit tidier and more deterministic.
|
||||
if ($bits && $value > $bits) {
|
||||
$value = $bits;
|
||||
}
|
||||
|
||||
dbg("hashcash: token value: $value");
|
||||
|
||||
if ($self->was_hashcash_token_double_spent ($scanner, $hc)) {
|
||||
$scanner->{hashcash_double_spent} = 1;
|
||||
return 0;
|
||||
}
|
||||
|
||||
$scanner->{hashcash_value} = $value;
|
||||
return $value;
|
||||
}
|
||||
|
||||
sub was_hashcash_token_double_spent {
|
||||
my ($self, $scanner, $token) = @_;
|
||||
|
||||
my $main = $self->{main};
|
||||
if (!$main->{conf}->{hashcash_doublespend_path}) {
|
||||
dbg("hashcash: hashcash_doublespend_path not defined or empty");
|
||||
return 0;
|
||||
}
|
||||
if (!HAS_DB_FILE) {
|
||||
dbg("hashcash: DB_File module not installed, cannot use double-spend db");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $path = $main->sed_path ($main->{conf}->{hashcash_doublespend_path});
|
||||
my $parentdir = dirname ($path);
|
||||
my $stat_errn = stat($parentdir) ? 0 : 0+$!;
|
||||
if ($stat_errn == 0 && !-d _) {
|
||||
dbg("hashcash: parent dir $parentdir exists but is not a directory");
|
||||
} elsif ($stat_errn == ENOENT) {
|
||||
# run in an eval(); if mkpath has no perms, it calls die()
|
||||
eval {
|
||||
mkpath ($parentdir, 0, (oct ($main->{conf}->{hashcash_doublespend_file_mode}) & 0777));
|
||||
};
|
||||
}
|
||||
|
||||
my %spenddb;
|
||||
if (!tie %spenddb, "DB_File", $path, O_RDWR|O_CREAT,
|
||||
(oct ($main->{conf}->{hashcash_doublespend_file_mode}) & 0666))
|
||||
{
|
||||
dbg("hashcash: failed to tie to $path: $@ $!");
|
||||
# not a serious error. TODO?
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (exists $spenddb{$token}) {
|
||||
untie %spenddb;
|
||||
dbg("hashcash: token '$token' spent already");
|
||||
return 1;
|
||||
}
|
||||
|
||||
$spenddb{$token} = time;
|
||||
dbg("hashcash: marking token '$token' as spent");
|
||||
|
||||
# TODO: expiry?
|
||||
|
||||
untie %spenddb;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _check_hashcash_resource {
|
||||
my ($self, $scanner, $list, $addr) = @_;
|
||||
$addr = lc $addr;
|
||||
if (defined ($list->{$addr})) { return 1; }
|
||||
study $addr; # study is a no-op since perl 5.16.0, eliminating related bugs
|
||||
|
||||
foreach my $regexp (values %{$list})
|
||||
{
|
||||
# allow %u == current username
|
||||
# \\ is added by $conf->add_to_addrlist()
|
||||
$regexp =~ s/\\\%u/$scanner->{main}->{username}/gs;
|
||||
|
||||
if ($addr =~ /$regexp/i) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
# TODO: use "To" and "Cc" addresses gleaned from the mails in the Bayes
|
||||
# database trained as ham, as well.
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
############################################################################
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
1113
upstream/lib/Mail/SpamAssassin/Plugin/HeaderEval.pm
Normal file
1113
upstream/lib/Mail/SpamAssassin/Plugin/HeaderEval.pm
Normal file
File diff suppressed because it is too large
Load Diff
409
upstream/lib/Mail/SpamAssassin/Plugin/ImageInfo.pm
Normal file
409
upstream/lib/Mail/SpamAssassin/Plugin/ImageInfo.pm
Normal file
@ -0,0 +1,409 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
#
|
||||
# -------------------------------------------------------
|
||||
# ImageInfo Plugin for SpamAssassin
|
||||
# Version: 0.7
|
||||
# Created: 2006-08-02
|
||||
# Modified: 2007-01-17
|
||||
#
|
||||
# Changes:
|
||||
# 0.7 - added image_name_regex to allow pattern matching on the image name
|
||||
# - added support for image/pjpeg content types (progressive jpeg)
|
||||
# - updated imageinfo.cf with a few sample rules for using image_name_regex()
|
||||
# 0.6 - fixed dems_ bug in image_size_range_
|
||||
# 0.5 - added image_named and image_to_text_ratio
|
||||
# 0.4 - added image_size_exact and image_size_range
|
||||
# 0.3 - added jpeg support
|
||||
# 0.2 - optimized by theo
|
||||
# 0.1 - added gif/png support
|
||||
#
|
||||
#
|
||||
# Usage:
|
||||
# image_count()
|
||||
#
|
||||
# body RULENAME eval:image_count(<type>,<min>,[max])
|
||||
# type: 'all','gif','png', or 'jpeg'
|
||||
# min: required, message contains at least this
|
||||
# many images
|
||||
# max: optional, if specified, message must not
|
||||
# contain more than this number of images
|
||||
#
|
||||
# image_count() examples
|
||||
#
|
||||
# body ONE_IMAGE eval:image_count('all',1,1)
|
||||
# body ONE_OR_MORE_IMAGES eval:image_count('all',1)
|
||||
# body ONE_PNG eval:image_count('png',1,1)
|
||||
# body TWO_GIFS eval:image_count('gif',2,2)
|
||||
# body MANY_JPEGS eval:image_count('gif',5)
|
||||
#
|
||||
# pixel_coverage()
|
||||
#
|
||||
# body RULENAME eval:pixel_coverage(<type>,<min>,[max])
|
||||
# type: 'all','gif','png', or 'jpeg'
|
||||
# min: required, message contains at least this
|
||||
# much pixel area
|
||||
# max: optional, if specified, message must not
|
||||
# contain more than this much pixel area
|
||||
#
|
||||
# pixel_coverage() examples
|
||||
#
|
||||
# body LARGE_IMAGE_AREA eval:pixel_coverage('all',150000) # catches any images that are 150k pixel/sq or higher
|
||||
# body SMALL_GIF_AREA eval:pixel_coverage('gif',1,40000) # catches only gifs that 1 to 40k pixel/sql
|
||||
#
|
||||
# image_name_regex()
|
||||
#
|
||||
# body RULENAME eval:image_name_regex(<regex>)
|
||||
# regex: full quoted regexp, see examples below
|
||||
#
|
||||
# image_name_regex() examples
|
||||
#
|
||||
# body CG_DOUBLEDOT_GIF eval:image_name_regex('/^\w{2,9}\.\.gif$/i') # catches double dot gifs abcd..gif
|
||||
#
|
||||
#
|
||||
#
|
||||
# -------------------------------------------------------
|
||||
|
||||
package Mail::SpamAssassin::Plugin::ImageInfo;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("image_count");
|
||||
$self->register_eval_rule ("pixel_coverage");
|
||||
$self->register_eval_rule ("image_size_exact");
|
||||
$self->register_eval_rule ("image_size_range");
|
||||
$self->register_eval_rule ("image_named");
|
||||
$self->register_eval_rule ("image_name_regex");
|
||||
$self->register_eval_rule ("image_to_text_ratio");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
my %get_details = (
|
||||
'gif' => sub {
|
||||
my ($pms, $part) = @_;
|
||||
my $header = $part->decode(13);
|
||||
|
||||
# make sure this is actually a valid gif..
|
||||
return unless $header =~ s/^GIF(8[79]a)//;
|
||||
my $version = $1;
|
||||
|
||||
my ($width, $height, $packed, $bgcolor, $aspect) = unpack("vvCCC", $header);
|
||||
my $color_table_size = 1 << (($packed & 0x07) + 1);
|
||||
|
||||
# for future enhancements
|
||||
#my $global_color_table = $packed & 0x80;
|
||||
#my $has_global_color_table = $global_color_table ? 1 : 0;
|
||||
#my $sorted_colors = ($packed & 0x08)?1:0;
|
||||
#my $resolution = ((($packed & 0x70) >> 4) + 1);
|
||||
|
||||
if ($height && $width) {
|
||||
my $area = $width * $height;
|
||||
$pms->{imageinfo}->{pc_gif} += $area;
|
||||
$pms->{imageinfo}->{dems_gif}->{"${height}x${width}"} = 1;
|
||||
$pms->{imageinfo}->{names_all}->{$part->{'name'}} = 1 if $part->{'name'};
|
||||
dbg("imageinfo: gif image ".($part->{'name'} ? $part->{'name'} : '')." is $height x $width pixels ($area pixels sq.), with $color_table_size color table");
|
||||
}
|
||||
},
|
||||
|
||||
'png' => sub {
|
||||
my ($pms, $part) = @_;
|
||||
my $data = $part->decode();
|
||||
|
||||
return unless (substr($data, 0, 8) eq "\x89PNG\x0d\x0a\x1a\x0a");
|
||||
|
||||
my $datalen = length $data;
|
||||
my $pos = 8;
|
||||
my $chunksize = 8;
|
||||
my ($width, $height) = ( 0, 0 );
|
||||
my ($depth, $ctype, $compression, $filter, $interlace);
|
||||
|
||||
while ($pos < $datalen) {
|
||||
my ($len, $type) = unpack("Na4", substr($data, $pos, $chunksize));
|
||||
$pos += $chunksize;
|
||||
|
||||
last if $type eq "IEND"; # end of png image.
|
||||
|
||||
next unless ( $type eq "IHDR" && $len == 13 );
|
||||
|
||||
my $bytes = substr($data, $pos, $len + 4);
|
||||
my $crc = unpack("N", substr($bytes, -4, 4, ""));
|
||||
|
||||
if ($type eq "IHDR" && $len == 13) {
|
||||
($width, $height, $depth, $ctype, $compression, $filter, $interlace) = unpack("NNCCCCC", $bytes);
|
||||
last;
|
||||
}
|
||||
}
|
||||
|
||||
if ($height && $width) {
|
||||
my $area = $width * $height;
|
||||
$pms->{imageinfo}->{pc_png} += $area;
|
||||
$pms->{imageinfo}->{dems_png}->{"${height}x${width}"} = 1;
|
||||
$pms->{imageinfo}->{names_all}->{$part->{'name'}} = 1 if $part->{'name'};
|
||||
dbg("imageinfo: png image ".($part->{'name'} ? $part->{'name'} : '')." is $height x $width pixels ($area pixels sq.)");
|
||||
}
|
||||
},
|
||||
|
||||
'jpeg' => sub {
|
||||
my ($pms, $part) = @_;
|
||||
|
||||
my $data = $part->decode();
|
||||
|
||||
my $index = substr($data, 0, 2);
|
||||
return unless $index eq "\xFF\xD8";
|
||||
|
||||
my $pos = 2;
|
||||
my $chunksize = 4;
|
||||
my ($prec, $height, $width, $comps) = (undef,0,0,undef);
|
||||
while (1) {
|
||||
my ($xx, $mark, $len) = unpack("CCn", substr($data, $pos, $chunksize));
|
||||
last if (!defined $xx || $xx != 0xFF);
|
||||
last if (!defined $mark || $mark == 0xDA || $mark == 0xD9);
|
||||
last if (!defined $len || $len < 2);
|
||||
$pos += $chunksize;
|
||||
my $block = substr($data, $pos, $len - 2);
|
||||
my $blocklen = length($block);
|
||||
if ( ($mark >= 0xC0 && $mark <= 0xC3) || ($mark >= 0xC5 && $mark <= 0xC7) ||
|
||||
($mark >= 0xC9 && $mark <= 0xCB) || ($mark >= 0xCD && $mark <= 0xCF) ) {
|
||||
($prec, $height, $width, $comps) = unpack("CnnC", substr($block, 0, 6, ""));
|
||||
last;
|
||||
}
|
||||
$pos += $blocklen;
|
||||
}
|
||||
|
||||
if ($height && $width) {
|
||||
my $area = $height * $width;
|
||||
$pms->{imageinfo}->{pc_jpeg} += $area;
|
||||
$pms->{imageinfo}->{dems_jpeg}->{"${height}x${width}"} = 1;
|
||||
$pms->{imageinfo}->{names_all}->{$part->{'name'}} = 1 if $part->{'name'};
|
||||
dbg("imageinfo: jpeg image ".($part->{'name'} ? $part->{'name'} : '')." is $height x $width pixels ($area pixels sq.)");
|
||||
}
|
||||
|
||||
},
|
||||
|
||||
);
|
||||
|
||||
sub _get_images {
|
||||
my ($self,$pms) = @_;
|
||||
my $result = 0;
|
||||
|
||||
foreach my $type ( 'all', keys %get_details ) {
|
||||
$pms->{'imageinfo'}->{"pc_$type"} = 0;
|
||||
$pms->{'imageinfo'}->{"count_$type"} = 0;
|
||||
}
|
||||
|
||||
foreach my $p ($pms->{msg}->find_parts(qr@^image/(?:gif|png|jpe?g)$@, 1)) {
|
||||
# make sure its base64 encoded
|
||||
my $cte = lc($p->get_header('content-transfer-encoding') || '');
|
||||
next if ($cte !~ /^base64$/);
|
||||
|
||||
my ($type) = $p->{'type'} =~ m@/(\w+)$@;
|
||||
$type = 'jpeg' if $type eq 'jpg';
|
||||
if ($type && exists $get_details{$type}) {
|
||||
$get_details{$type}->($pms,$p);
|
||||
$pms->{'imageinfo'}->{"count_$type"} ++;
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $name ( keys %{$pms->{'imageinfo'}->{"names_all"}} ) {
|
||||
dbg("imageinfo: image name $name found");
|
||||
}
|
||||
|
||||
foreach my $type ( keys %get_details ) {
|
||||
$pms->{'imageinfo'}->{'pc_all'} += $pms->{'imageinfo'}->{"pc_$type"};
|
||||
$pms->{'imageinfo'}->{'count_all'} += $pms->{'imageinfo'}->{"count_$type"};
|
||||
foreach my $dem ( keys %{$pms->{'imageinfo'}->{"dems_$type"}} ) {
|
||||
dbg("imageinfo: adding $dem to dems_all");
|
||||
$pms->{'imageinfo'}->{'dems_all'}->{$dem} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub image_named {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
return unless (defined $name);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
return 0 unless (exists $pms->{'imageinfo'}->{"names_all"});
|
||||
return 1 if (exists $pms->{'imageinfo'}->{"names_all"}->{$name});
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub image_name_regex {
|
||||
my ($self,$pms,$body,$re) = @_;
|
||||
return unless (defined $re);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
return 0 unless (exists $pms->{'imageinfo'}->{"names_all"});
|
||||
|
||||
my $hit = 0;
|
||||
foreach my $name (keys %{$pms->{'imageinfo'}->{"names_all"}}) {
|
||||
dbg("imageinfo: checking image named $name against regex $re");
|
||||
if (eval { $name =~ /$re/ }) { $hit = 1 }
|
||||
dbg("imageinfo: error in regex /$re/ - $@") if $@;
|
||||
if ($hit) {
|
||||
dbg("imageinfo: image_name_regex hit on $name");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub image_count {
|
||||
my ($self,$pms,$body,$type,$min,$max) = @_;
|
||||
|
||||
return unless defined $min;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
# dbg("imageinfo: count: $min, ".($max ? $max:'').", $type, ".$pms->{'imageinfo'}->{"count_$type"});
|
||||
return result_check($min, $max, $pms->{'imageinfo'}->{"count_$type"});
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pixel_coverage {
|
||||
my ($self,$pms,$body,$type,$min,$max) = @_;
|
||||
|
||||
return unless (defined $type && defined $min);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
# dbg("imageinfo: pc_$type: $min, ".($max ? $max:'').", $type, ".$pms->{'imageinfo'}->{"pc_$type"});
|
||||
return result_check($min, $max, $pms->{'imageinfo'}->{"pc_$type"});
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub image_to_text_ratio {
|
||||
my ($self,$pms,$body,$type,$min,$max) = @_;
|
||||
return unless (defined $type && defined $min && defined $max);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
# depending on how you call this eval (body vs rawbody),
|
||||
# the $textlen will differ.
|
||||
my $textlen = length(join('',@$body));
|
||||
|
||||
return 0 unless ( $textlen > 0 && exists $pms->{'imageinfo'}->{"pc_$type"} && $pms->{'imageinfo'}->{"pc_$type"} > 0);
|
||||
|
||||
my $ratio = $textlen / $pms->{'imageinfo'}->{"pc_$type"};
|
||||
dbg("imageinfo: image ratio=$ratio, min=$min max=$max");
|
||||
return result_check($min, $max, $ratio, 1);
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub image_size_exact {
|
||||
my ($self,$pms,$body,$type,$height,$width) = @_;
|
||||
return unless (defined $type && defined $height && defined $width);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
return 0 unless (exists $pms->{'imageinfo'}->{"dems_$type"});
|
||||
return 1 if (exists $pms->{'imageinfo'}->{"dems_$type"}->{"${height}x${width}"});
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub image_size_range {
|
||||
my ($self,$pms,$body,$type,$minh,$minw,$maxh,$maxw) = @_;
|
||||
return unless (defined $type && defined $minh && defined $minw);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
$self->_get_images($pms);
|
||||
}
|
||||
|
||||
my $name = 'dems_'.$type;
|
||||
return unless (exists $pms->{'imageinfo'}->{$name});
|
||||
|
||||
foreach my $dem ( keys %{$pms->{'imageinfo'}->{"dems_$type"}}) {
|
||||
my ($h,$w) = split(/x/,$dem);
|
||||
next if ($h < $minh); # height less than min height
|
||||
next if ($w < $minw); # width less than min width
|
||||
next if (defined $maxh && $h > $maxh); # height more than max height
|
||||
next if (defined $maxw && $w > $maxw); # width more than max width
|
||||
|
||||
# if we make it here, we have a match
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub result_check {
|
||||
my ($min, $max, $value, $nomaxequal) = @_;
|
||||
return 0 unless defined $value;
|
||||
return 0 if ($value < $min);
|
||||
return 0 if (defined $max && $value > $max);
|
||||
return 0 if (defined $nomaxequal && $nomaxequal && $value == $max);
|
||||
return 1;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
1;
|
683
upstream/lib/Mail/SpamAssassin/Plugin/MIMEEval.pm
Normal file
683
upstream/lib/Mail/SpamAssassin/Plugin/MIMEEval.pm
Normal file
@ -0,0 +1,683 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
MIMEEval - perform various tests against MIME structure and body
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::MIMEEval
|
||||
|
||||
body NAME_OF_RULE eval:check_for_mime
|
||||
body NAME_OF_RULE eval:check_for_mime_html
|
||||
body NAME_OF_RULE eval:check_for_mime_html_only
|
||||
body NAME_OF_RULE eval:check_mime_multipart_ratio
|
||||
body NAME_OF_RULE eval:check_msg_parse_flags
|
||||
body NAME_OF_RULE eval:check_for_ascii_text_illegal
|
||||
body NAME_OF_RULE eval:check_abundant_unicode_ratio
|
||||
body NAME_OF_RULE eval:check_for_faraway_charset
|
||||
body NAME_OF_RULE eval:check_for_uppercase
|
||||
body NAME_OF_RULE eval:check_ma_non_text
|
||||
body NAME_OF_RULE eval:check_base64_length
|
||||
body NAME_OF_RULE eval:check_qp_ratio
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Perform various tests against MIME structure and body.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::MIMEEval;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Locales;
|
||||
use Mail::SpamAssassin::Constants qw(:sa CHARSETS_LIKELY_TO_FP_AS_CAPS);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_for_mime");
|
||||
$self->register_eval_rule("check_for_mime_html");
|
||||
$self->register_eval_rule("check_for_mime_html_only");
|
||||
$self->register_eval_rule("check_mime_multipart_ratio");
|
||||
$self->register_eval_rule("check_msg_parse_flags");
|
||||
$self->register_eval_rule("check_for_ascii_text_illegal");
|
||||
$self->register_eval_rule("check_abundant_unicode_ratio");
|
||||
$self->register_eval_rule("check_for_faraway_charset");
|
||||
$self->register_eval_rule("check_for_uppercase");
|
||||
$self->register_eval_rule("check_ma_non_text");
|
||||
$self->register_eval_rule("check_base64_length");
|
||||
$self->register_eval_rule("check_qp_ratio");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub are_more_high_bits_set {
|
||||
my ($self, $str) = @_;
|
||||
|
||||
# TODO: I suspect a tr// trick may be faster here
|
||||
my $numhis = () = ($str =~ /[\200-\377]/g);
|
||||
my $numlos = length($str) - $numhis;
|
||||
|
||||
($numlos <= $numhis && $numhis > 3);
|
||||
}
|
||||
|
||||
=over 4
|
||||
|
||||
=item has_check_for_ascii_text_illegal
|
||||
|
||||
Adds capability check for "if can()" for check_for_ascii_text_illegal
|
||||
|
||||
=cut
|
||||
|
||||
sub has_check_for_ascii_text_illegal { 1 }
|
||||
|
||||
=item check_for_ascii_text_illegal
|
||||
|
||||
If a MIME part claims to be text/plain or text/plain;charset=us-ascii and the Content-Transfer-Encoding is 7bit (either explicitly or by default), then we should enforce the actual text being only TAB, NL, SPACE through TILDE, i.e. all 7bit characters excluding NO-WS-CTL (per RFC-2822).
|
||||
|
||||
All mainstream MTA's get this right.
|
||||
|
||||
=cut
|
||||
|
||||
sub check_for_ascii_text_illegal {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{mime_ascii_text_illegal};
|
||||
return ($pms->{mime_ascii_text_illegal} > 0);
|
||||
}
|
||||
|
||||
=item has_check_abundant_unicode_ratio
|
||||
|
||||
Adds capability check for "if can()" for check_abundant_unicode_ratio
|
||||
|
||||
=cut
|
||||
|
||||
sub has_check_abundant_unicode_ratio { 1 }
|
||||
|
||||
=item check_abundant_unicode_ratio
|
||||
|
||||
A MIME part claiming to be text/plain and containing Unicode characters must be encoded as quoted-printable or base64, or use UTF data coding (typically with 8bit encoding). Any message in 7bit or 8bit encoding containing (HTML) Unicode entities will not render them as Unicode, but literally.
|
||||
|
||||
Thus a few such sequences might occur on a mailing list of developers discussing such characters, but a message with a high density of such characters is likely spam.
|
||||
|
||||
=cut
|
||||
|
||||
sub check_abundant_unicode_ratio {
|
||||
my ($self, $pms, undef, $ratio) = @_;
|
||||
|
||||
# validate ratio?
|
||||
return 0 unless ($ratio =~ /^\d{0,3}\.\d{1,3}$/);
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{mime_text_unicode_ratio};
|
||||
return ($pms->{mime_text_unicode_ratio} >= $ratio);
|
||||
}
|
||||
|
||||
sub check_for_faraway_charset {
|
||||
my ($self, $pms, $body) = @_;
|
||||
|
||||
my $type = $pms->get('Content-Type',undef);
|
||||
|
||||
my @locales = Mail::SpamAssassin::Util::get_my_locales($self->{main}->{conf}->{ok_locales});
|
||||
|
||||
return 0 if grep { $_ eq "all" } @locales;
|
||||
|
||||
$type = get_charset_from_ct_line($type) if defined $type;
|
||||
|
||||
if (defined $type &&
|
||||
!Mail::SpamAssassin::Locales::is_charset_ok_for_locales
|
||||
($type, @locales))
|
||||
{
|
||||
# sanity check. Some charsets (e.g. koi8-r) include the ASCII
|
||||
# 7-bit charset as well, so make sure we actually have a high
|
||||
# number of 8-bit chars in the body text first.
|
||||
|
||||
$body = join("\n", @$body);
|
||||
if ($self->are_more_high_bits_set ($body)) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
0;
|
||||
}
|
||||
|
||||
sub check_for_mime {
|
||||
my ($self, $pms, undef, $test) = @_;
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{$test};
|
||||
return $pms->{$test};
|
||||
}
|
||||
|
||||
# any text/html MIME part
|
||||
sub check_for_mime_html {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $ctype = $pms->get('Content-Type');
|
||||
return 1 if $ctype =~ m{^text/html}i;
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{mime_body_html_count};
|
||||
return ($pms->{mime_body_html_count} > 0);
|
||||
}
|
||||
|
||||
# HTML without some other type of MIME text part
|
||||
sub check_for_mime_html_only {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $ctype = $pms->get('Content-Type');
|
||||
return 1 if $ctype =~ m{^text/html}i;
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{mime_body_html_count};
|
||||
return 0 unless exists $pms->{mime_body_text_count};
|
||||
return ($pms->{mime_body_html_count} > 0 &&
|
||||
$pms->{mime_body_text_count} == 0);
|
||||
}
|
||||
|
||||
sub check_mime_multipart_ratio {
|
||||
my ($self, $pms, undef, $min, $max) = @_;
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{mime_multipart_ratio};
|
||||
return ($pms->{mime_multipart_ratio} >= $min &&
|
||||
$pms->{mime_multipart_ratio} < $max);
|
||||
}
|
||||
|
||||
sub _check_mime_header {
|
||||
my ($self, $pms, $ctype, $cte, $cd, $charset, $name) = @_;
|
||||
|
||||
$charset ||= '';
|
||||
|
||||
if ($ctype eq 'text/html') {
|
||||
$pms->{mime_body_html_count}++;
|
||||
}
|
||||
elsif ($ctype =~ m@^text@i) {
|
||||
$pms->{mime_body_text_count}++;
|
||||
}
|
||||
|
||||
if ($cte =~ /base64/) {
|
||||
$pms->{mime_base64_count}++;
|
||||
}
|
||||
elsif ($cte =~ /quoted-printable/) {
|
||||
$pms->{mime_qp_count}++;
|
||||
}
|
||||
|
||||
if ($cd && $cd =~ /attachment/) {
|
||||
$pms->{mime_attachment}++;
|
||||
}
|
||||
|
||||
if ($ctype =~ /^text/ &&
|
||||
$cte =~ /base64/ &&
|
||||
(!$charset || $charset =~ /(?:us-ascii|ansi_x3\.4-1968|iso-ir-6|ansi_x3\.4-1986|iso_646\.irv:1991|ascii|iso646-us|us|ibm367|cp367|csascii)/) &&
|
||||
!($cd && $cd =~ /^(?:attachment|inline)/))
|
||||
{
|
||||
$pms->{mime_base64_encoded_text} = 1;
|
||||
}
|
||||
|
||||
if ($charset =~ /iso-\S+-\S+\b/i &&
|
||||
$charset !~ /iso-(?:8859-\d{1,2}|2022-(?:jp|kr))\b/)
|
||||
{
|
||||
$pms->{mime_bad_iso_charset} = 1;
|
||||
}
|
||||
|
||||
# MIME_BASE64_LATIN: now a zero-hitter
|
||||
# if (!$name &&
|
||||
# $cte =~ /base64/ &&
|
||||
# $charset =~ /\b(?:us-ascii|iso-8859-(?:[12349]|1[0345])|windows-(?:125[0247]))\b/)
|
||||
# {
|
||||
# $pms->{mime_base64_latin} = 1;
|
||||
# }
|
||||
|
||||
# MIME_QP_NO_CHARSET: now a zero-hitter
|
||||
# if ($cte =~ /quoted-printable/ && $cd =~ /inline/ && !$charset) {
|
||||
# $pms->{mime_qp_inline_no_charset} = 1;
|
||||
# }
|
||||
|
||||
# MIME_HTML_NO_CHARSET: now a zero-hitter
|
||||
# if ($ctype eq 'text/html' &&
|
||||
# !(defined($charset) && $charset) &&
|
||||
# !($cd && $cd =~ /^(?:attachment|inline)/))
|
||||
# {
|
||||
# $pms->{mime_html_no_charset} = 1;
|
||||
# }
|
||||
|
||||
if ($charset =~ /[a-z]/i) {
|
||||
if (defined $pms->{mime_html_charsets}) {
|
||||
$pms->{mime_html_charsets} .= " ".$charset;
|
||||
} else {
|
||||
$pms->{mime_html_charsets} = $charset;
|
||||
}
|
||||
|
||||
if (! $pms->{mime_faraway_charset}) {
|
||||
my @l = Mail::SpamAssassin::Util::get_my_locales($self->{main}->{conf}->{ok_locales});
|
||||
|
||||
if (!(grep { $_ eq "all" } @l) &&
|
||||
!Mail::SpamAssassin::Locales::is_charset_ok_for_locales($charset, @l))
|
||||
{
|
||||
$pms->{mime_faraway_charset} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_attachments {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# MIME status
|
||||
my $where = -1; # -1 = start, 0 = nowhere, 1 = header, 2 = body
|
||||
my $qp_bytes = 0; # total bytes in QP regions
|
||||
my $qp_count = 0; # QP-encoded bytes in QP regions
|
||||
my @part_bytes; # MIME part total bytes
|
||||
my @part_type; # MIME part types
|
||||
|
||||
my $normal_chars = 0; # MIME text bytes that aren't encoded
|
||||
my $unicode_chars = 0; # MIME text bytes that are unicode entities
|
||||
|
||||
# MIME header information
|
||||
my $part = -1; # MIME part index
|
||||
|
||||
# indicate the scan has taken place
|
||||
$pms->{mime_checked_attachments} = 1;
|
||||
|
||||
# results
|
||||
# $pms->{mime_base64_blanks} = 0; # expensive to determine, no longer avail
|
||||
$pms->{mime_base64_count} = 0;
|
||||
$pms->{mime_base64_encoded_text} = 0;
|
||||
# $pms->{mime_base64_illegal} = 0;
|
||||
# $pms->{mime_base64_latin} = 0;
|
||||
$pms->{mime_body_html_count} = 0;
|
||||
$pms->{mime_body_text_count} = 0;
|
||||
$pms->{mime_faraway_charset} = 0;
|
||||
# $pms->{mime_html_no_charset} = 0;
|
||||
$pms->{mime_missing_boundary} = 0;
|
||||
$pms->{mime_multipart_alternative} = 0;
|
||||
$pms->{mime_multipart_ratio} = 1.0;
|
||||
$pms->{mime_qp_count} = 0;
|
||||
# $pms->{mime_qp_illegal} = 0;
|
||||
# $pms->{mime_qp_inline_no_charset} = 0;
|
||||
$pms->{mime_qp_long_line} = 0;
|
||||
$pms->{mime_qp_ratio} = 0;
|
||||
$pms->{mime_ascii_text_illegal} = 0;
|
||||
$pms->{mime_text_unicode_ratio} = 0;
|
||||
|
||||
# Get all parts ...
|
||||
foreach my $p ($pms->{msg}->find_parts(qr/./)) {
|
||||
# message headers
|
||||
my ($ctype, $boundary, $charset, $name) = Mail::SpamAssassin::Util::parse_content_type($p->get_header("content-type"));
|
||||
|
||||
if ($ctype eq 'multipart/alternative') {
|
||||
$pms->{mime_multipart_alternative} = 1;
|
||||
}
|
||||
|
||||
my $cte = $p->get_header('Content-Transfer-Encoding') || '';
|
||||
chomp($cte = defined($cte) ? lc $cte : "");
|
||||
|
||||
my $cd = $p->get_header('Content-Disposition') || '';
|
||||
chomp($cd = defined($cd) ? lc $cd : "");
|
||||
|
||||
$charset = lc $charset if ($charset);
|
||||
$name = lc $name if ($name);
|
||||
|
||||
$self->_check_mime_header($pms, $ctype, $cte, $cd, $charset, $name);
|
||||
|
||||
# If we're not in a leaf node in the tree, there will be no raw
|
||||
# section, so skip it.
|
||||
if (! $p->is_leaf()) {
|
||||
next;
|
||||
}
|
||||
|
||||
$part++;
|
||||
$part_type[$part] = $ctype;
|
||||
$part_bytes[$part] = 0 if $cd !~ /attachment/;
|
||||
|
||||
my $cte_is_base64 = $cte =~ /base64/i;
|
||||
my $previous = '';
|
||||
foreach (@{$p->raw()}) {
|
||||
|
||||
# if ($cte_is_base64) {
|
||||
# if ($previous =~ /^\s*$/ && /^\s*$/) { # expensive, avoid!
|
||||
# $pms->{mime_base64_blanks} = 1; # never used, don't bother
|
||||
# }
|
||||
# # MIME_BASE64_ILLEGAL: now a zero-hitter
|
||||
# # if (m@[^A-Za-z0-9+/=\n]@ || /=[^=\s]/) {
|
||||
# # $pms->{mime_base64_illegal} = 1;
|
||||
# # }
|
||||
# }
|
||||
|
||||
# if ($pms->{mime_html_no_charset} && $ctype eq 'text/html' && defined $charset) {
|
||||
# $pms->{mime_html_no_charset} = 0;
|
||||
# }
|
||||
if ($pms->{mime_multipart_alternative} && $cd !~ /attachment/ &&
|
||||
($ctype eq 'text/plain' || $ctype eq 'text/html')) {
|
||||
$part_bytes[$part] += length;
|
||||
}
|
||||
|
||||
if ($where != 1 && $cte eq "quoted-printable" && ! /^SPAM: /) {
|
||||
# RFC 5322: Each line SHOULD be no more than 78 characters,
|
||||
# excluding the CRLF.
|
||||
# RFC 2045: The Quoted-Printable encoding REQUIRES that
|
||||
# encoded lines be no more than 76 characters long.
|
||||
# Bug 5491: 6% of email classified as HAM by SA triggered the
|
||||
# MIME_QP_LONG_LINE rule. Apple Mail can generate a QP-line
|
||||
# that is 2 chars too long. Same goes for Outlook Web Access.
|
||||
# lines include one trailing \n character
|
||||
# if (length > 76+1) { # conforms to RFC 5322 and RFC 2045
|
||||
if (length > 78+1) { # conforms to RFC 5322 only, not RFC 2045
|
||||
$pms->{mime_qp_long_line} = 1;
|
||||
}
|
||||
$qp_bytes += length;
|
||||
|
||||
# MIME_QP_DEFICIENT: zero-hitter now
|
||||
|
||||
# check for illegal substrings (RFC 2045), hexadecimal values 7F-FF and
|
||||
# control characters other than TAB, or CR and LF as parts of CRLF pairs
|
||||
# if (!$pms->{mime_qp_illegal} && /[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\xff]/)
|
||||
# {
|
||||
# $pms->{mime_qp_illegal} = 1;
|
||||
# }
|
||||
|
||||
# count excessive QP bytes
|
||||
if (index($_, '=') != -1) {
|
||||
# whoever wrote this next line is an evil hacker -- jm
|
||||
my $qp = () = m/=(?:09|3[0-9ABCEF]|[2456][0-9A-F]|7[0-9A-E])/g;
|
||||
if ($qp) {
|
||||
$qp_count += $qp;
|
||||
# tabs and spaces at end of encoded line are okay. Also, multiple
|
||||
# whitespace at the end of a line are OK, like ">=20=20=20=20=20=20".
|
||||
my ($trailing) = m/((?:=09|=20)+)\s*$/g;
|
||||
if ($trailing) {
|
||||
$qp_count -= (length($trailing) / 3);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# if our charset is ASCII, this should only contain 7-bit characters
|
||||
# except NUL or a free-standing CR. anything else is a violation of
|
||||
# the definition of charset="us-ascii".
|
||||
if ($ctype eq 'text/plain' && (!defined $charset || $charset eq 'us-ascii')) {
|
||||
# no re "strict"; # since perl 5.21.8: Ranges of ASCII printables...
|
||||
if (m/[\x00\x0d\x80-\xff]+/) {
|
||||
if (would_log('dbg', 'eval')) {
|
||||
my $str = $_;
|
||||
$str =~ s/([\x00\x0d\x80-\xff]+)/'<' . unpack('H*', $1) . '>'/eg;
|
||||
dbg("check: ascii_text_illegal: matches " . $str . "\n");
|
||||
}
|
||||
$pms->{mime_ascii_text_illegal}++;
|
||||
}
|
||||
}
|
||||
|
||||
# if we're text/plain, we should never see unicode escapes in this
|
||||
# format, especially not for 7bit or 8bit.
|
||||
if ($ctype eq 'text/plain' && ($cte eq '' || $cte eq '7bit' || $cte eq '8bit')) {
|
||||
my ($text, $subs) = $_;
|
||||
|
||||
$subs = $text =~ s/&#x[0-9A-F]{4};//g;
|
||||
$normal_chars += length($text);
|
||||
$unicode_chars += $subs;
|
||||
|
||||
if ($subs && would_log('dbg', 'eval')) {
|
||||
my $str = $_;
|
||||
$str = substr($str, 0, 512) . '...' if (length($str) > 512);
|
||||
dbg("check: abundant_unicode: " . $str . " (" . $subs . ")\n");
|
||||
}
|
||||
}
|
||||
|
||||
$previous = $_;
|
||||
}
|
||||
}
|
||||
|
||||
if ($qp_bytes) {
|
||||
$pms->{mime_qp_ratio} = $qp_count / $qp_bytes;
|
||||
$pms->{mime_qp_count} = $qp_count;
|
||||
$pms->{mime_qp_bytes} = $qp_bytes;
|
||||
}
|
||||
|
||||
if ($normal_chars) {
|
||||
$pms->{mime_text_unicode_ratio} = $unicode_chars / $normal_chars;
|
||||
}
|
||||
|
||||
if ($pms->{mime_multipart_alternative}) {
|
||||
my $text;
|
||||
my $html;
|
||||
# bug 4207: we want the size of the last parts
|
||||
for (my $i = $part; $i >= 0; $i--) {
|
||||
next if !defined $part_bytes[$i];
|
||||
if (!defined($html) && $part_type[$i] eq 'text/html') {
|
||||
$html = $part_bytes[$i];
|
||||
}
|
||||
elsif (!defined($text) && $part_type[$i] eq 'text/plain') {
|
||||
$text = $part_bytes[$i];
|
||||
}
|
||||
last if (defined($html) && defined($text));
|
||||
}
|
||||
if (defined($text) && defined($html) && $html > 0) {
|
||||
$pms->{mime_multipart_ratio} = ($text / $html);
|
||||
}
|
||||
}
|
||||
|
||||
# Look to see if any multipart boundaries are not "balanced"
|
||||
foreach my $val (values %{$pms->{msg}->{mime_boundary_state}}) {
|
||||
if ($val != 0) {
|
||||
$pms->{mime_missing_boundary} = 1;
|
||||
last;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
=item has_check_qp_ratio
|
||||
|
||||
Adds capability check for "if can()" for check_qp_ratio
|
||||
|
||||
=cut
|
||||
|
||||
sub has_check_qp_ratio { 1 }
|
||||
|
||||
=item check_qp_ratio
|
||||
|
||||
Takes a min ratio to use in eval to see if there is an spamminess to the ratio of
|
||||
quoted printable to total bytes in an email.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub check_qp_ratio {
|
||||
my ($self, $pms, undef, $min) = @_;
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{mime_qp_ratio};
|
||||
|
||||
my $qp_ratio = $pms->{mime_qp_ratio};
|
||||
|
||||
dbg("eval: qp_ratio - %s - check for min of %s", $qp_ratio, $min);
|
||||
|
||||
return (defined $qp_ratio && $qp_ratio >= $min) ? 1 : 0;
|
||||
}
|
||||
|
||||
|
||||
sub check_msg_parse_flags {
|
||||
my($self, $pms, $type, $type2) = @_;
|
||||
$type = $type2 if ref($type);
|
||||
return defined $pms->{msg}->{$type};
|
||||
}
|
||||
|
||||
sub check_for_uppercase {
|
||||
my ($self, $pms, $body, $min, $max) = @_;
|
||||
local ($_);
|
||||
|
||||
if (exists $pms->{uppercase}) {
|
||||
return ($pms->{uppercase} > $min && $pms->{uppercase} <= $max);
|
||||
}
|
||||
|
||||
if ($self->body_charset_is_likely_to_fp($pms)) {
|
||||
$pms->{uppercase} = 0; return 0;
|
||||
}
|
||||
|
||||
# Dec 20 2002 jm: trade off some speed for low memory footprint, by
|
||||
# iterating over the array computing sums, instead of joining the
|
||||
# array into a giant string and working from that.
|
||||
|
||||
my $len = 0;
|
||||
my $lower = 0;
|
||||
my $upper = 0;
|
||||
foreach (@{$body}) {
|
||||
# examine lines in the body that have an intermediate space
|
||||
next unless /\S\s+\S/;
|
||||
# strip out lingering base64 (currently possible for forwarded messages)
|
||||
next if /^(?:[A-Za-z0-9+\/=]{60,76} ){2}/;
|
||||
|
||||
my $line = $_; # copy so we don't muck up the original
|
||||
|
||||
# remove shift-JIS charset codes
|
||||
$line =~ s/\x1b\$B.*\x1b\(B//gs;
|
||||
|
||||
$len += length($line);
|
||||
|
||||
# count numerals as lower case, otherwise 'date|mail' is spam
|
||||
$lower += ($line =~ tr/a-z0-9//d);
|
||||
$upper += ($line =~ tr/A-Z//);
|
||||
}
|
||||
|
||||
# report only on mails above a minimum size; otherwise one
|
||||
# or two acronyms can throw it off
|
||||
if ($len < 200) {
|
||||
$pms->{uppercase} = 0;
|
||||
return 0;
|
||||
}
|
||||
if (($upper + $lower) == 0) {
|
||||
$pms->{uppercase} = 0;
|
||||
} else {
|
||||
$pms->{uppercase} = ($upper / ($upper + $lower)) * 100;
|
||||
}
|
||||
|
||||
return ($pms->{uppercase} > $min && $pms->{uppercase} <= $max);
|
||||
}
|
||||
|
||||
sub body_charset_is_likely_to_fp {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# check for charsets where this test will FP -- iso-2022-jp, gb2312,
|
||||
# koi8-r etc.
|
||||
#
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
my @charsets;
|
||||
my $type = $pms->get('Content-Type',undef);
|
||||
$type = get_charset_from_ct_line($type) if defined $type;
|
||||
push (@charsets, $type) if defined $type;
|
||||
if (defined $pms->{mime_html_charsets}) {
|
||||
push (@charsets, split(' ', $pms->{mime_html_charsets}));
|
||||
}
|
||||
|
||||
my $CHARSETS_LIKELY_TO_FP_AS_CAPS = CHARSETS_LIKELY_TO_FP_AS_CAPS;
|
||||
foreach my $charset (@charsets) {
|
||||
if ($charset =~ /^${CHARSETS_LIKELY_TO_FP_AS_CAPS}$/) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub get_charset_from_ct_line {
|
||||
my $type = shift;
|
||||
if (!defined $type) { return; }
|
||||
if ($type =~ /charset="([^"]+)"/i) { return $1; }
|
||||
if ($type =~ /charset='([^']+)'/i) { return $1; }
|
||||
if ($type =~ /charset=(\S+)/i) { return $1; }
|
||||
return;
|
||||
}
|
||||
|
||||
# came up on the users@ list, look for multipart/alternative parts which
|
||||
# include non-text parts -- skip certain types which occur normally in ham
|
||||
sub check_ma_non_text {
|
||||
my($self, $pms) = @_;
|
||||
|
||||
foreach my $map ($pms->{msg}->find_parts(qr@^multipart/alternative$@i)) {
|
||||
foreach my $p ($map->find_parts(qr/./, 1, 0)) {
|
||||
next if (lc $p->{'type'} eq 'multipart/related');
|
||||
next if (lc $p->{'type'} eq 'application/rtf');
|
||||
next if ($p->{'type'} =~ m@^text/@i);
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_base64_length {
|
||||
my $self = shift;
|
||||
my $pms = shift;
|
||||
shift; # body array, unnecessary
|
||||
my $min = shift;
|
||||
my $max = shift;
|
||||
|
||||
if (!defined $pms->{base64_length}) {
|
||||
$pms->{base64_length} = $self->_check_base64_length($pms->{msg});
|
||||
}
|
||||
|
||||
return 0 if (defined $max && $pms->{base64_length} > $max);
|
||||
return $pms->{base64_length} >= $min;
|
||||
}
|
||||
|
||||
sub _check_base64_length {
|
||||
my $self = shift;
|
||||
my $msg = shift;
|
||||
|
||||
my $result = 0;
|
||||
|
||||
foreach my $p ($msg->find_parts(qr@.@, 1)) {
|
||||
my $ctype=
|
||||
Mail::SpamAssassin::Util::parse_content_type($p->get_header('content-type'));
|
||||
|
||||
# FPs from Google Calendar invites, etc.
|
||||
# perhaps just limit to test, and image?
|
||||
next if ($ctype eq 'application/ics');
|
||||
|
||||
my $cte = lc($p->get_header('content-transfer-encoding') || '');
|
||||
next if ($cte !~ /^base64$/);
|
||||
foreach my $l ( @{$p->raw()} ) {
|
||||
$result = length $l if length $l > $result;
|
||||
}
|
||||
}
|
||||
|
||||
return $result;
|
||||
}
|
||||
|
||||
1;
|
227
upstream/lib/Mail/SpamAssassin/Plugin/MIMEHeader.pm
Normal file
227
upstream/lib/Mail/SpamAssassin/Plugin/MIMEHeader.pm
Normal file
@ -0,0 +1,227 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
MIMEHeader - perform regexp tests against MIME headers
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::MIMEHeader
|
||||
mimeheader NAME_OF_RULE Content-Id =~ /foo/
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin allows regexp rules to be written against MIME headers in the
|
||||
message.
|
||||
|
||||
=head1 RULE DEFINITIONS AND PRIVILEGED SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item mimeheader NAME_OF_RULE Header-Name =~ /pattern/modifiers
|
||||
|
||||
Specify a rule. C<NAME_OF_RULE> is the name of the rule to be used,
|
||||
C<Header-Name> is the name of the MIME header to check, and
|
||||
C</pattern/modifiers> is the Perl regular expression to match against this.
|
||||
|
||||
Note that in a message of multiple parts, each header will be checked
|
||||
against the pattern separately. In other words, if multiple parts
|
||||
have a 'Content-Type' header, each header's value will be tested
|
||||
individually as a separate string.
|
||||
|
||||
Header names are considered case-insensitive.
|
||||
|
||||
The header values are normally cleaned up a little; for example, whitespace
|
||||
around the newline character in "folded" headers will be replaced with a single
|
||||
space. Append C<:raw> to the header name to retrieve the raw, undecoded value,
|
||||
including pristine whitespace, instead.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::MIMEHeader;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Conf;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var compile_regexp);
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
our @TEMPORARY_METHODS;
|
||||
|
||||
my $RULENAME_RE = RULENAME_RE;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# constructor
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $samain = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($samain);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($samain->{conf});
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
my $pluginobj = $self; # allow use inside the closure below
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'mimeheader',
|
||||
is_priv => 1,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
local ($1,$2,$3);
|
||||
if ($value !~ s/^(${RULENAME_RE})\s+//) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
my $rulename = untaint_var($1);
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
# Take :raw to hdrname!
|
||||
if ($value !~ /^([^:\s]+(?:\:(?:raw)?)?)\s*([=!]~)\s*(.+)$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
my $hdrname = $1;
|
||||
my $negated = $2 eq '!~' ? 1 : 0;
|
||||
my $pattern = $3;
|
||||
$hdrname =~ s/:$//;
|
||||
my $if_unset = '';
|
||||
if ($pattern =~ s/\s+\[if-unset:\s+(.+)\]$//) {
|
||||
$if_unset = $1;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($pattern, 1);
|
||||
if (!$rec) {
|
||||
info("mimeheader: invalid regexp for $rulename '$pattern': $err");
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
|
||||
$self->{mimeheader_tests}->{$rulename} = {
|
||||
hdr => $hdrname,
|
||||
negated => $negated,
|
||||
if_unset => $if_unset,
|
||||
pattern => $rec
|
||||
};
|
||||
|
||||
# now here's a hack; generate a fake eval rule function to
|
||||
# call this rule's _real_ code!
|
||||
# TODO: we should have a more elegant way for new rule types to
|
||||
# be defined
|
||||
my $evalfn = "_mimeheader_eval_$rulename";
|
||||
|
||||
# don't redefine the subroutine if it already exists!
|
||||
# this causes lots of annoying warnings and such during things like
|
||||
# "make test".
|
||||
return if (defined &{'Mail::SpamAssassin::Plugin::MIMEHeader::'.$evalfn});
|
||||
|
||||
$self->{parser}->add_test($rulename, $evalfn."()",
|
||||
$Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
# evalfn/rulename safe, sanitized by $RULENAME_RE
|
||||
my $evalcode = '
|
||||
sub Mail::SpamAssassin::Plugin::MIMEHeader::'.$evalfn.' {
|
||||
$_[0]->eval_hook_called($_[1], q{'.$rulename.'});
|
||||
}
|
||||
';
|
||||
|
||||
eval
|
||||
$evalcode . '; 1'
|
||||
or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn "mimeheader: plugin error: $eval_stat\n";
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
};
|
||||
|
||||
$pluginobj->register_eval_rule($evalfn);
|
||||
|
||||
push @TEMPORARY_METHODS, "Mail::SpamAssassin::Plugin::MIMEHeader::${evalfn}";
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub eval_hook_called {
|
||||
my ($pobj, $scanner, $rulename) = @_;
|
||||
|
||||
my $rule = $scanner->{conf}->{mimeheader_tests}->{$rulename};
|
||||
my $hdr = $rule->{hdr};
|
||||
my $negated = $rule->{negated};
|
||||
my $if_unset = $rule->{if_unset};
|
||||
my $pattern = $rule->{pattern};
|
||||
|
||||
|
||||
my $getraw;
|
||||
if ($hdr =~ s/:raw$//) {
|
||||
$getraw = 1;
|
||||
} else {
|
||||
$getraw = 0;
|
||||
}
|
||||
|
||||
foreach my $p ($scanner->{msg}->find_parts(qr/./)) {
|
||||
my $val;
|
||||
if ($getraw) {
|
||||
$val = $p->raw_header($hdr);
|
||||
} else {
|
||||
$val = $p->get_header($hdr);
|
||||
}
|
||||
$val = $if_unset if !defined $val;
|
||||
|
||||
if ($val =~ $pattern) {
|
||||
return ($negated ? 0 : 1);
|
||||
}
|
||||
}
|
||||
|
||||
return ($negated ? 1 : 0);
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub finish_tests {
|
||||
my ($self, $params) = @_;
|
||||
|
||||
foreach my $method (@TEMPORARY_METHODS) {
|
||||
undef &{$method};
|
||||
}
|
||||
@TEMPORARY_METHODS = (); # clear for next time
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
1;
|
974
upstream/lib/Mail/SpamAssassin/Plugin/OLEVBMacro.pm
Normal file
974
upstream/lib/Mail/SpamAssassin/Plugin/OLEVBMacro.pm
Normal file
@ -0,0 +1,974 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::OLEVBMacro - search attached documents for evidence of containing an OLE Macro
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::OLEVBMacro
|
||||
|
||||
ifplugin Mail::SpamAssassin::Plugin::OLEVBMacro
|
||||
body OLEMACRO eval:check_olemacro()
|
||||
describe OLEMACRO Attachment has an Office Macro
|
||||
|
||||
body OLEMACRO_MALICE eval:check_olemacro_malice()
|
||||
describe OLEMACRO_MALICE Potentially malicious Office Macro
|
||||
|
||||
body OLEMACRO_ENCRYPTED eval:check_olemacro_encrypted()
|
||||
describe OLEMACRO_ENCRYPTED Has an Office doc that is encrypted
|
||||
|
||||
body OLEMACRO_RENAME eval:check_olemacro_renamed()
|
||||
describe OLEMACRO_RENAME Has an Office doc that has been renamed
|
||||
|
||||
body OLEMACRO_ZIP_PW eval:check_olemacro_zip_password()
|
||||
describe OLEMACRO_ZIP_PW Has an Office doc that is password protected in a zip
|
||||
|
||||
body OLEMACRO_CSV eval:check_olemacro_csv()
|
||||
describe OLEMACRO_CSV Malicious csv file that tries to exec cmd.exe detected
|
||||
endif
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin detects OLE Macro inside documents attached to emails.
|
||||
It can detect documents inside zip files as well as encrypted documents.
|
||||
|
||||
=head1 REQUIREMENT
|
||||
|
||||
This plugin requires Archive::Zip and IO::String perl modules.
|
||||
|
||||
=head1 USER PREFERENCES
|
||||
|
||||
The following options can be used in both site-wide (C<local.cf>) and
|
||||
user-specific (C<user_prefs>) configuration files to customize how
|
||||
the module handles attached documents
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::OLEVBMacro;
|
||||
use strict;
|
||||
use warnings;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
|
||||
use constant HAS_ARCHIVE_ZIP => eval { require Archive::Zip; };
|
||||
use constant HAS_IO_STRING => eval { require IO::String; };
|
||||
|
||||
BEGIN
|
||||
{
|
||||
eval{
|
||||
import Archive::Zip qw( :ERROR_CODES :CONSTANTS )
|
||||
};
|
||||
eval{
|
||||
import IO::String
|
||||
};
|
||||
}
|
||||
|
||||
use re 'taint';
|
||||
|
||||
use vars qw(@ISA);
|
||||
@ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
our $VERSION = '0.52';
|
||||
|
||||
# https://www.openoffice.org/sc/compdocfileformat.pdf
|
||||
# http://blog.rootshell.be/2015/01/08/searching-for-microsoft-office-files-containing-macro/
|
||||
my $marker1 = "\xd0\xcf\x11\xe0";
|
||||
my $marker2 = "\x00\x41\x74\x74\x72\x69\x62\x75\x74\x00";
|
||||
# embedded object in rtf files (https://www.biblioscape.com/rtf15_spec.htm)
|
||||
my $marker3 = "\x5c\x6f\x62\x6a\x65\x6d\x62";
|
||||
my $marker4 = "\x5c\x6f\x62\x6a\x64\x61\x74";
|
||||
my $marker5 = "\x5c\x20\x6f\x62\x6a\x64\x61\x74";
|
||||
# Excel .xlsx encrypted package, thanks to Dan Bagwell for the sample
|
||||
my $encrypted_marker = "\x45\x00\x6e\x00\x63\x00\x72\x00\x79\x00\x70\x00\x74\x00\x65\x00\x64\x00\x50\x00\x61\x00\x63\x00\x6b\x00\x61\x00\x67\x00\x65";
|
||||
|
||||
# this code burps an ugly message if it fails, but that's redirected elsewhere
|
||||
# AZ_OK is a constant exported by Archive::Zip
|
||||
my $az_ok;
|
||||
eval '$az_ok = AZ_OK';
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
$self->register_eval_rule("check_olemacro");
|
||||
$self->register_eval_rule("check_olemacro_csv");
|
||||
$self->register_eval_rule("check_olemacro_malice");
|
||||
$self->register_eval_rule("check_olemacro_renamed");
|
||||
$self->register_eval_rule("check_olemacro_encrypted");
|
||||
$self->register_eval_rule("check_olemacro_zip_password");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub dbg {
|
||||
Mail::SpamAssassin::Plugin::dbg ("OLEVBMacro: @_");
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds = ();
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_num_mime',
|
||||
default => 5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_num_mime (default: 5)
|
||||
|
||||
Configure the maximum number of matching MIME parts the plugin will scan
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_num_zip',
|
||||
default => 8,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_num_zip (default: 8)
|
||||
|
||||
Configure the maximum number of matching zip members the plugin will scan
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_zip_depth',
|
||||
default => 2,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_zip_depth (default: 2)
|
||||
|
||||
Depth to recurse within Zip files
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_extended_scan',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_extended_scan ( 0 | 1 ) (default: 0)
|
||||
|
||||
Scan more files for potential macros, the C<olemacro_skip_exts> parameter will still be honored.
|
||||
This parameter is off by default, this option is needed only to run
|
||||
C<eval:check_olemacro_renamed> rule.
|
||||
If this is turned on consider adjusting values for C<olemacro_num_mime> and C<olemacro_num_zip>
|
||||
and prepare for more CPU overhead
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_prefer_contentdisposition',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_prefer_contentdisposition ( 0 | 1 ) (default: 1)
|
||||
|
||||
Choose if the content-disposition header filename be preferred if ambiguity is encountered whilst trying to get filename
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_max_file',
|
||||
default => 1024000,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_max_file (default: 1024000)
|
||||
|
||||
Configure the largest file that the plugin will decode from the MIME objects
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
# https://blogs.msdn.microsoft.com/vsofficedeveloper/2008/05/08/office-2007-file-format-mime-types-for-http-content-streaming-2/
|
||||
# https://technet.microsoft.com/en-us/library/ee309278(office.12).aspx
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_exts',
|
||||
default => qr/(?:doc|docx|dot|pot|ppa|pps|ppt|rtf|sldm|xl|xla|xls|xlsx|xlt|xltx|xslb)$/,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($value, 0);
|
||||
if (!$rec) {
|
||||
dbg("config: invalid olemacro_exts '$value': $err");
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{olemacro_exts} = $rec;
|
||||
},
|
||||
}
|
||||
);
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_exts (default: (?:doc|docx|dot|pot|ppa|pps|ppt|rtf|sldm|xl|xla|xls|xlsx|xlt|xltx|xslb)$)
|
||||
|
||||
Set the case-insensitive regexp used to configure the extensions the plugin
|
||||
targets for macro scanning
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_macro_exts',
|
||||
default => qr/(?:docm|dotm|ppam|potm|ppst|ppsm|pptm|sldm|xlm|xlam|xlsb|xlsm|xltm|xltx|xps)$/,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($value, 0);
|
||||
if (!$rec) {
|
||||
dbg("config: invalid olemacro_macro_exts '$value': $err");
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{olemacro_macro_exts} = $rec;
|
||||
},
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_macro_exts (default: (?:docm|dotm|ppam|potm|ppst|ppsm|pptm|sldm|xlm|xlam|xlsb|xlsm|xltm|xltx|xps)$)
|
||||
|
||||
Set the case-insensitive regexp used to configure the extensions the plugin
|
||||
treats as containing a macro
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_skip_exts',
|
||||
default => qr/(?:dotx|potx|ppsx|pptx|sldx)$/,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($value, 0);
|
||||
if (!$rec) {
|
||||
dbg("config: invalid olemacro_skip_exts '$value': $err");
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
|
||||
$self->{olemacro_skip_exts} = $rec;
|
||||
},
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_skip_exts (default: (?:dotx|potx|ppsx|pptx|sldx|xltx)$)
|
||||
|
||||
Set the case-insensitive regexp used to configure extensions for the plugin
|
||||
to skip entirely, these should only be guaranteed macro free files
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_skip_ctypes',
|
||||
default => qr/^(?:text\/)/,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($value, 0);
|
||||
if (!$rec) {
|
||||
dbg("config: invalid olemacro_skip_ctypes '$value': $err");
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
|
||||
$self->{olemacro_skip_ctypes} = $rec;
|
||||
},
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_skip_ctypes (default: ^(?:text\/))
|
||||
|
||||
Set the case-insensitive regexp used to configure content types for the
|
||||
plugin to skip entirely, these should only be guaranteed macro free
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'olemacro_zips',
|
||||
default => qr/(?:zip)$/,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my ($rec, $err) = compile_regexp($value, 0);
|
||||
if (!$rec) {
|
||||
dbg("config: invalid olemacro_zips '$value': $err");
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
|
||||
$self->{olemacro_zips} = $rec;
|
||||
},
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item olemacro_zips (default: (?:zip)$)
|
||||
|
||||
Set the case-insensitive regexp used to configure extensions for the plugin
|
||||
to target as zip files, files listed in configs above are also tested for zip
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub check_olemacro {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{olemacro_exists};
|
||||
|
||||
return $pms->{olemacro_exists};
|
||||
}
|
||||
|
||||
sub check_olemacro_csv {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
my $chunk_size = $pms->{conf}->{olemacro_max_file};
|
||||
|
||||
foreach my $part ($pms->{msg}->find_parts(qr/./, 1)) {
|
||||
|
||||
next unless ($part->{type} eq "text/plain");
|
||||
|
||||
my ($ctt, $ctd, $cte, $name) = _get_part_details($pms, $part);
|
||||
next unless defined $ctt;
|
||||
|
||||
next if $name eq '';
|
||||
|
||||
# we skipped what we need/want to
|
||||
my $data = undef;
|
||||
|
||||
# if name extension is csv - return true
|
||||
if ($name =~ /\.csv/i) {
|
||||
dbg("Found csv file with name $name");
|
||||
$data = $part->decode($chunk_size) unless defined $data;
|
||||
if($data =~ /MSEXCEL\|.{1,20}Windows\\System32\\cmd\.exe/) {
|
||||
$pms->{olemacro_csv} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
return $pms->{olemacro_csv};
|
||||
}
|
||||
|
||||
sub check_olemacro_malice {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{olemacro_malice};
|
||||
|
||||
return $pms->{olemacro_malice};
|
||||
}
|
||||
|
||||
sub check_olemacro_renamed {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{olemacro_renamed};
|
||||
|
||||
if ( $pms->{olemacro_renamed} == 1 ) {
|
||||
dbg("Found Office document with a renamed macro");
|
||||
}
|
||||
|
||||
return $pms->{olemacro_renamed};
|
||||
}
|
||||
|
||||
sub check_olemacro_encrypted {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{olemacro_encrypted};
|
||||
|
||||
return $pms->{olemacro_encrypted};
|
||||
}
|
||||
|
||||
sub check_olemacro_zip_password {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
_check_attachments(@_) unless exists $pms->{olemacro_zip_password};
|
||||
|
||||
return $pms->{olemacro_zip_password};
|
||||
}
|
||||
|
||||
sub _check_attachments {
|
||||
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
|
||||
my $mimec = 0;
|
||||
my $chunk_size = $pms->{conf}->{olemacro_max_file};
|
||||
|
||||
$pms->{olemacro_exists} = 0;
|
||||
$pms->{olemacro_malice} = 0;
|
||||
$pms->{olemacro_renamed} = 0;
|
||||
$pms->{olemacro_encrypted} = 0;
|
||||
$pms->{olemacro_zip_password} = 0;
|
||||
$pms->{olemacro_office_xml} = 0;
|
||||
|
||||
foreach my $part ($pms->{msg}->find_parts(qr/./, 1)) {
|
||||
|
||||
next if ($part->{type} =~ /$pms->{conf}->{olemacro_skip_ctypes}/i);
|
||||
|
||||
my ($ctt, $ctd, $cte, $name) = _get_part_details($pms, $part);
|
||||
next unless defined $ctt;
|
||||
|
||||
next if $name eq '';
|
||||
next if ($name =~ /$pms->{conf}->{olemacro_skip_exts}/i);
|
||||
|
||||
# we skipped what we need/want to
|
||||
my $data = undef;
|
||||
|
||||
# if name is macrotype - return true
|
||||
if ($name =~ /$pms->{conf}->{olemacro_macro_exts}/i) {
|
||||
dbg("Found macrotype attachment with name $name");
|
||||
$pms->{olemacro_exists} = 1;
|
||||
|
||||
$data = $part->decode($chunk_size) unless defined $data;
|
||||
|
||||
_check_encrypted_doc($pms, $name, $data);
|
||||
_check_macrotype_doc($pms, $name, $data);
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
}
|
||||
|
||||
# if name is ext type - check and return true if needed
|
||||
if ($name =~ /$pms->{conf}->{olemacro_exts}/i) {
|
||||
dbg("Found attachment with name $name");
|
||||
$data = $part->decode($chunk_size) unless defined $data;
|
||||
|
||||
_check_encrypted_doc($pms, $name, $data);
|
||||
_check_oldtype_doc($pms, $name, $data);
|
||||
# zipped doc that matches olemacro_exts - strange
|
||||
if (_check_macrotype_doc($pms, $name, $data)) {
|
||||
$pms->{olemacro_renamed} = $pms->{olemacro_office_xml};
|
||||
}
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
}
|
||||
|
||||
if ($name =~ /$pms->{conf}->{olemacro_zips}/i) {
|
||||
dbg("Found zip attachment with name $name");
|
||||
$data = $part->decode($chunk_size) unless defined $data;
|
||||
|
||||
_check_zip($pms, $name, $data);
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
}
|
||||
|
||||
if ($pms->{conf}->{olemacro_extended_scan} == 1) {
|
||||
dbg("Extended scan attachment with name $name");
|
||||
$data = $part->decode($chunk_size) unless defined $data;
|
||||
|
||||
if (_is_office_doc($data)) {
|
||||
$pms->{olemacro_renamed} = 1;
|
||||
dbg("Found $name to be an Office Doc!");
|
||||
_check_encrypted_doc($pms, $name, $data);
|
||||
_check_oldtype_doc($pms, $name, $data);
|
||||
}
|
||||
|
||||
if (_check_macrotype_doc($pms, $name, $data)) {
|
||||
$pms->{olemacro_renamed} = $pms->{olemacro_office_xml};
|
||||
}
|
||||
|
||||
_check_zip($pms, $name, $data);
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
}
|
||||
|
||||
# if we get to here with data a part has been scanned nudge as reqd
|
||||
$mimec+=1 if defined $data;
|
||||
if ($mimec >= $pms->{conf}->{olemacro_num_mime}) {
|
||||
dbg('MIME limit reached');
|
||||
last;
|
||||
}
|
||||
dbg("No Marker of a Macro found in file $name");
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _check_zip {
|
||||
my ($pms, $name, $data, $depth) = @_;
|
||||
|
||||
if (!HAS_ARCHIVE_ZIP) {
|
||||
warn "check_zip not supported, required module Archive::Zip missing\n";
|
||||
return 0;
|
||||
}
|
||||
return 0 if $pms->{conf}->{olemacro_num_zip} == 0;
|
||||
|
||||
$depth = $depth || 1;
|
||||
return 0 if ($depth > $pms->{conf}->{olemacro_zip_depth});
|
||||
|
||||
return 0 unless _is_zip_file($name, $data);
|
||||
my $zip = _open_zip_handle($data);
|
||||
return 0 unless $zip;
|
||||
|
||||
dbg("Zip opened");
|
||||
|
||||
my $filec = 0;
|
||||
my @members = $zip->members();
|
||||
# foreach zip member
|
||||
# - skip if in skip exts
|
||||
# - return 1 if in macro types
|
||||
# - check for marker if doc type
|
||||
# - check if a zip
|
||||
foreach my $member (@members){
|
||||
my $mname = lc $member->fileName();
|
||||
next if ($mname =~ /$pms->{conf}->{olemacro_skip_exts}/i);
|
||||
|
||||
my $data = undef;
|
||||
my $status = undef;
|
||||
|
||||
# if name is macrotype - return true
|
||||
if ($mname =~ /$pms->{conf}->{olemacro_macro_exts}/i) {
|
||||
dbg("Found macrotype zip member $mname");
|
||||
$pms->{olemacro_exists} = 1;
|
||||
|
||||
if ($member->isEncrypted()) {
|
||||
dbg("Zip member $mname is encrypted (zip pw)");
|
||||
$pms->{olemacro_zip_password} = 1;
|
||||
return 1;
|
||||
}
|
||||
|
||||
( $data, $status ) = $member->contents() unless defined $data;
|
||||
return 1 unless $status == $az_ok;
|
||||
|
||||
_check_encrypted_doc($pms, $name, $data);
|
||||
_check_macrotype_doc($pms, $name, $data);
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
}
|
||||
|
||||
if ($mname =~ /$pms->{conf}->{olemacro_exts}/i) {
|
||||
dbg("Found zip member $mname");
|
||||
|
||||
if ($member->isEncrypted()) {
|
||||
dbg("Zip member $mname is encrypted (zip pw)");
|
||||
$pms->{olemacro_zip_password} = 1;
|
||||
next;
|
||||
}
|
||||
|
||||
( $data, $status ) = $member->contents() unless defined $data;
|
||||
next unless $status == $az_ok;
|
||||
|
||||
|
||||
_check_encrypted_doc($pms, $name, $data);
|
||||
_check_oldtype_doc($pms, $name, $data);
|
||||
# zipped doc that matches olemacro_exts - strange
|
||||
if (_check_macrotype_doc($pms, $name, $data)) {
|
||||
$pms->{olemacro_renamed} = $pms->{olemacro_office_xml};
|
||||
}
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
|
||||
}
|
||||
|
||||
if ($mname =~ /$pms->{conf}->{olemacro_zips}/i) {
|
||||
dbg("Found zippy zip member $mname");
|
||||
( $data, $status ) = $member->contents() unless defined $data;
|
||||
next unless $status == $az_ok;
|
||||
|
||||
_check_zip($pms, $name, $data, $depth);
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
|
||||
}
|
||||
|
||||
if ($pms->{conf}->{olemacro_extended_scan} == 1) {
|
||||
dbg("Extended scan attachment with member name $mname");
|
||||
( $data, $status ) = $member->contents() unless defined $data;
|
||||
next unless $status == $az_ok;
|
||||
|
||||
if (_is_office_doc($data)) {
|
||||
dbg("Found $name to be an Office Doc!");
|
||||
_check_encrypted_doc($pms, $name, $data);
|
||||
$pms->{olemacro_renamed} = 1;
|
||||
_check_oldtype_doc($pms, $name, $data);
|
||||
}
|
||||
|
||||
if (_check_macrotype_doc($pms, $name, $data)) {
|
||||
$pms->{olemacro_renamed} = $pms->{olemacro_office_xml};
|
||||
}
|
||||
|
||||
_check_zip($pms, $name, $data, $depth);
|
||||
|
||||
return 1 if $pms->{olemacro_exists} == 1;
|
||||
|
||||
}
|
||||
|
||||
# if we get to here with data a member has been scanned nudge as reqd
|
||||
$filec+=1 if defined $data;
|
||||
if ($filec >= $pms->{conf}->{olemacro_num_zip}) {
|
||||
dbg('Zip limit reached');
|
||||
last;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _get_part_details {
|
||||
my ($pms, $part) = @_;
|
||||
#https://en.wikipedia.org/wiki/MIME#Content-Disposition
|
||||
#https://github.com/mikel/mail/pull/464
|
||||
|
||||
my $ctt = $part->get_header('content-type');
|
||||
return undef unless defined $ctt; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
my $cte = lc($part->get_header('content-transfer-encoding') || '');
|
||||
return undef unless ($cte =~ /^(?:base64|quoted\-printable)$/); ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
$ctt = _decode_part_header($part, $ctt || '');
|
||||
|
||||
my $name = '';
|
||||
my $cttname = '';
|
||||
my $ctdname = '';
|
||||
|
||||
if($ctt =~ m/(?:file)?name\s*=\s*["']?([^"';]*)["']?/is){
|
||||
$cttname = $1;
|
||||
$cttname =~ s/\s+$//;
|
||||
}
|
||||
|
||||
my $ctd = $part->get_header('content-disposition');
|
||||
$ctd = _decode_part_header($part, $ctd || '');
|
||||
|
||||
if($ctd =~ m/filename\s*=\s*["']?([^"';]*)["']?/is){
|
||||
$ctdname = $1;
|
||||
$ctdname =~ s/\s+$//;
|
||||
}
|
||||
|
||||
if (lc $ctdname eq lc $cttname) {
|
||||
$name = $ctdname;
|
||||
} elsif ($ctdname eq '') {
|
||||
$name = $cttname;
|
||||
} elsif ($cttname eq '') {
|
||||
$name = $ctdname;
|
||||
} else {
|
||||
if ($pms->{conf}->{olemacro_prefer_contentdisposition}) {
|
||||
$name = $ctdname;
|
||||
} else {
|
||||
$name = $cttname;
|
||||
}
|
||||
}
|
||||
|
||||
return $ctt, $ctd, $cte, lc $name;
|
||||
}
|
||||
|
||||
sub _open_zip_handle {
|
||||
my ($data) = @_;
|
||||
# open our archive from raw data
|
||||
my $SH = IO::String->new($data);
|
||||
|
||||
Archive::Zip::setErrorHandler( \&_zip_error_handler );
|
||||
my $zip = Archive::Zip->new();
|
||||
if($zip->readFromFileHandle( $SH ) != $az_ok){
|
||||
dbg("cannot read zipfile");
|
||||
# as we cannot read it its not a zip (or too big/corrupted)
|
||||
# so skip processing.
|
||||
return 0;
|
||||
}
|
||||
return $zip;
|
||||
}
|
||||
|
||||
sub _check_macrotype_doc {
|
||||
my ($pms, $name, $data) = @_;
|
||||
|
||||
if (!HAS_IO_STRING) {
|
||||
warn "check_macrotype_doc not supported, required module IO::String missing\n";
|
||||
return 0;
|
||||
}
|
||||
return 0 unless _is_zip_file($name, $data);
|
||||
|
||||
my $zip = _open_zip_handle($data);
|
||||
return 0 unless $zip;
|
||||
|
||||
#https://www.decalage.info/vba_tools
|
||||
my %macrofiles = (
|
||||
'word/vbaproject.bin' => 'word2k7',
|
||||
'macros/vba/_vba_project' => 'word97',
|
||||
'xl/vbaproject.bin' => 'xl2k7',
|
||||
'xl/embeddings/oleObject1.bin' => 'xl2k13',
|
||||
'_vba_project_cur/vba/_vba_project' => 'xl97',
|
||||
'ppt/vbaproject.bin' => 'ppt2k7',
|
||||
);
|
||||
|
||||
my @members = $zip->members();
|
||||
foreach my $member (@members){
|
||||
my $mname = lc $member->fileName();
|
||||
if (exists($macrofiles{$mname})) {
|
||||
dbg("Found $macrofiles{$mname} vba file");
|
||||
$pms->{olemacro_exists} = 1;
|
||||
last;
|
||||
}
|
||||
}
|
||||
|
||||
# Look for a member named [Content_Types].xml and do checks
|
||||
if (my $ctypesxml = $zip->memberNamed('[Content_Types].xml')) {
|
||||
dbg('Found [Content_Types].xml file');
|
||||
$pms->{olemacro_office_xml} = 1;
|
||||
if (!$pms->{olemacro_exists}) {
|
||||
my ( $data, $status ) = $ctypesxml->contents();
|
||||
|
||||
if (($status == $az_ok) && (_check_ctype_xml($data))) {
|
||||
$pms->{olemacro_exists} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (($pms->{olemacro_exists}) && (_find_malice_bins($zip))) {
|
||||
$pms->{olemacro_malice} = 1;
|
||||
}
|
||||
|
||||
return $pms->{olemacro_exists};
|
||||
|
||||
}
|
||||
|
||||
# Office 2003
|
||||
|
||||
sub _check_oldtype_doc {
|
||||
my ($pms, $name, $data) = @_;
|
||||
|
||||
if (_check_markers($data)) {
|
||||
$pms->{olemacro_exists} = 1;
|
||||
if (_check_malice($data)) {
|
||||
$pms->{olemacro_malice} = 1;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
# Encrypted doc
|
||||
|
||||
sub _check_encrypted_doc {
|
||||
my ($pms, $name, $data) = @_;
|
||||
|
||||
if (_is_encrypted_doc($data)) {
|
||||
dbg("File $name is encrypted");
|
||||
$pms->{olemacro_encrypted} = 1;
|
||||
}
|
||||
|
||||
return $pms->{olemacro_encrypted};
|
||||
}
|
||||
|
||||
sub _is_encrypted_doc {
|
||||
my ($data) = @_;
|
||||
|
||||
#http://stackoverflow.com/questions/14347513/how-to-detect-if-a-word-document-is-password-protected-before-uploading-the-file/14347730#14347730
|
||||
if (_is_office_doc($data)) {
|
||||
if ($data =~ /(?:<encryption xmlns)/i) {
|
||||
return 1;
|
||||
}
|
||||
if (index($data, "\x13") == 523) {
|
||||
return 1;
|
||||
}
|
||||
if (index($data, "\x2f") == 532) {
|
||||
return 1;
|
||||
}
|
||||
if (index($data, "\xfe") == 520) {
|
||||
return 1;
|
||||
}
|
||||
my $tdata = substr $data, 2000;
|
||||
$tdata =~ s/\\0/ /g;
|
||||
if (index($tdata, "E n c r y p t e d P a c k a g e") > -1) {
|
||||
return 1;
|
||||
}
|
||||
if (index($tdata, $encrypted_marker) > -1) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
sub _is_office_doc {
|
||||
my ($data) = @_;
|
||||
if (index($data, $marker1) == 0) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
sub _is_zip_file {
|
||||
my ($name, $data) = @_;
|
||||
if (index($data, 'PK') == 0) {
|
||||
return 1;
|
||||
} else {
|
||||
return($name =~ /(?:zip)$/i);
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_markers {
|
||||
my ($data) = @_;
|
||||
|
||||
if (index($data, $marker1) == 0 && index($data, $marker2) > -1) {
|
||||
dbg('Marker 1 & 2 found');
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (index($data, $marker3) > -1) {
|
||||
dbg('Marker 3 found');
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (index($data, $marker4) > -1) {
|
||||
dbg('Marker 4 found');
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (index($data, $marker5) > -1) {
|
||||
dbg('Marker 5 found');
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (index($data, 'w:macrosPresent="yes"') > -1) {
|
||||
dbg('XML macros marker found');
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (index($data, 'vbaProject.bin.rels') > -1) {
|
||||
dbg('XML macros marker found');
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
sub _find_malice_bins {
|
||||
my ($zip) = @_;
|
||||
|
||||
my @binfiles = $zip->membersMatching( '.*\.bin' );
|
||||
|
||||
foreach my $member (@binfiles){
|
||||
my ( $data, $status ) = $member->contents();
|
||||
next unless $status == $az_ok;
|
||||
if (_check_malice($data)) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_malice {
|
||||
my ($data) = @_;
|
||||
|
||||
# https://www.greyhathacker.net/?p=872
|
||||
if ($data =~ /(?:document|auto|workbook)_?open/i) {
|
||||
dbg('Found potential malicious code');
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_ctype_xml {
|
||||
my ($data) = @_;
|
||||
|
||||
# http://download.microsoft.com/download/D/3/3/D334A189-E51B-47FF-B0E8-C0479AFB0E3C/[MS-OFFMACRO].pdf
|
||||
if ($data =~ /ContentType=["']application\/vnd\.ms-office\.vbaProject["']/i){
|
||||
dbg('Found VBA ref');
|
||||
return 1;
|
||||
}
|
||||
if ($data =~ /macroEnabled/i) {
|
||||
dbg('Found Macro Ref');
|
||||
return 1;
|
||||
}
|
||||
if ($data =~ /application\/vnd\.ms-excel\.(?:intl)?macrosheet/i) {
|
||||
dbg('Excel macrosheet found');
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
sub _zip_error_handler {
|
||||
1;
|
||||
}
|
||||
|
||||
sub _decode_part_header {
|
||||
my($part, $header_field_body) = @_;
|
||||
|
||||
return '' unless defined $header_field_body && $header_field_body ne '';
|
||||
|
||||
# deal with folding and cream the newlines and such
|
||||
$header_field_body =~ s/\n[ \t]+/\n /g;
|
||||
$header_field_body =~ s/\015?\012//gs;
|
||||
|
||||
local($1,$2,$3);
|
||||
|
||||
# Multiple encoded sections must ignore the interim whitespace.
|
||||
# To avoid possible FPs with (\s+(?==\?))?, look for the whole RE
|
||||
# separated by whitespace.
|
||||
1 while $header_field_body =~
|
||||
s{ ( = \? [A-Za-z0-9_-]+ \? [bqBQ] \? [^?]* \? = ) \s+
|
||||
( = \? [A-Za-z0-9_-]+ \? [bqBQ] \? [^?]* \? = ) }
|
||||
{$1$2}xsg;
|
||||
|
||||
# transcode properly encoded RFC 2047 substrings into UTF-8 octets,
|
||||
# leave everything else unchanged as it is supposed to be UTF-8 (RFC 6532)
|
||||
# or plain US-ASCII
|
||||
$header_field_body =~
|
||||
s{ (?: = \? ([A-Za-z0-9_-]+) \? ([bqBQ]) \? ([^?]*) \? = ) }
|
||||
{ $part->__decode_header($1, uc($2), $3) }xsge;
|
||||
|
||||
return $header_field_body;
|
||||
}
|
||||
|
||||
1;
|
154
upstream/lib/Mail/SpamAssassin/Plugin/OneLineBodyRuleType.pm
Normal file
154
upstream/lib/Mail/SpamAssassin/Plugin/OneLineBodyRuleType.pm
Normal file
@ -0,0 +1,154 @@
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::OneLineBodyRuleType - spamassassin body test plugin
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::OneLineBodyRuleType;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
# constructor
|
||||
sub new {
|
||||
my $class = shift;
|
||||
$class = ref($class) || $class;
|
||||
my $self = {};
|
||||
bless ($self, $class);
|
||||
return $self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub check_rules_at_priority {
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
my $checkobj = $params->{checkobj};
|
||||
my $priority = $params->{priority};
|
||||
Mail::SpamAssassin::Plugin::Check::do_one_line_body_tests($checkobj,
|
||||
$pms, $priority);
|
||||
}
|
||||
|
||||
sub check_start {
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
# this method runs before the body ruleset is compiled, but after
|
||||
# finish_tests(). perfect spot to remove rules from the body
|
||||
# set and add to another set...
|
||||
|
||||
my $test_set = $conf->{body_tests};
|
||||
foreach my $pri (keys %{$test_set})
|
||||
{
|
||||
foreach my $rulename (keys %{$test_set->{$pri}})
|
||||
{
|
||||
if ($conf->{generate_body_one_line_sub}->{$rulename}) {
|
||||
# add the rule to the one-liner set
|
||||
$conf->{one_line_body_tests}->{$pri} ||= { };
|
||||
$conf->{one_line_body_tests}->{$pri}->{$rulename} =
|
||||
$test_set->{$pri}->{$rulename};
|
||||
}
|
||||
|
||||
if ($conf->{skip_body_rules}->{$rulename}) {
|
||||
# remove from the body set
|
||||
delete $test_set->{$pri}->{$rulename};
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
||||
|
||||
# inject this method into the Check plugin's namespace
|
||||
# TODO: we need a better way to define new ruletypes via plugin
|
||||
package Mail::SpamAssassin::Plugin::Check;
|
||||
|
||||
sub do_one_line_body_tests {
|
||||
my ($self, $pms, $priority) = @_;
|
||||
|
||||
# TODO: should have a consttype for plugin-defined "alien" rule types,
|
||||
# probably something like TYPE_ALIEN_TESTS. it's only used as a key
|
||||
# for {user_rules_of_type}, so that should be fine
|
||||
|
||||
$self->run_generic_tests ($pms, $priority,
|
||||
consttype => $Mail::SpamAssassin::Conf::TYPE_BODY_TESTS,
|
||||
type => 'one_line_body',
|
||||
testhash => $pms->{conf}->{one_line_body_tests},
|
||||
args => [ ],
|
||||
loop_body => sub
|
||||
{
|
||||
my ($self, $pms, $conf, $rulename, $pat, %opts) = @_;
|
||||
my $sub = '
|
||||
my ($self, $line) = @_;
|
||||
my $qrptr = $self->{main}->{conf}->{test_qrs};
|
||||
';
|
||||
|
||||
if (($conf->{tflags}->{$rulename}||'') =~ /\bmultiple\b/)
|
||||
{
|
||||
# support multiple matches
|
||||
my ($max) = $conf->{tflags}->{$rulename} =~ /\bmaxhits=(\d+)\b/;
|
||||
$max = untaint_var($max);
|
||||
if ($max) {
|
||||
$sub .= '
|
||||
if (exists $self->{tests_already_hit}->{q{'.$rulename.'}}) {
|
||||
return 0 if $self->{tests_already_hit}->{q{'.$rulename.'}} >= '.$max.';
|
||||
}
|
||||
';
|
||||
}
|
||||
# avoid [perl #86784] bug (fixed in 5.13.x), access the arg through ref
|
||||
$sub .= '
|
||||
my $lref = \$line;
|
||||
pos $$lref = 0;
|
||||
'.$self->hash_line_for_rule($pms, $rulename).'
|
||||
while ($$lref =~ /$qrptr->{q{'.$rulename.'}}/go) {
|
||||
$self->got_hit(q{'.$rulename.'}, "BODY: ", ruletype => "one_line_body");
|
||||
'. $self->hit_rule_plugin_code($pms, $rulename, "one_line_body", "") . '
|
||||
'. ($max? 'last if $self->{tests_already_hit}->{q{'.$rulename.'}} >= '.$max.';' : '') . '
|
||||
}
|
||||
';
|
||||
|
||||
} else {
|
||||
$sub .= '
|
||||
'.$self->hash_line_for_rule($pms, $rulename).'
|
||||
if ($line =~ /$qrptr->{q{'.$rulename.'}}/o) {
|
||||
$self->got_hit(q{'.$rulename.'}, "BODY: ", ruletype => "one_line_body");
|
||||
'. $self->hit_rule_plugin_code($pms, $rulename, "one_line_body", "return 1") . '
|
||||
}
|
||||
';
|
||||
|
||||
}
|
||||
|
||||
return if ($opts{doing_user_rules} &&
|
||||
!$self->is_user_rule_sub($rulename.'_one_line_body_test'));
|
||||
|
||||
$self->add_temporary_method ($rulename.'_one_line_body_test', $sub);
|
||||
},
|
||||
pre_loop_body => sub
|
||||
{
|
||||
my ($self, $pms, $conf, %opts) = @_;
|
||||
$self->add_evalstr($pms, '
|
||||
|
||||
my $bodytext = $self->get_decoded_stripped_body_text_array();
|
||||
$self->{main}->call_plugins("run_body_fast_scan", {
|
||||
permsgstatus => $self, ruletype => "body",
|
||||
priority => '.$opts{priority}.', lines => $bodytext
|
||||
});
|
||||
|
||||
');
|
||||
});
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
751
upstream/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm
Normal file
751
upstream/lib/Mail/SpamAssassin/Plugin/PDFInfo.pm
Normal file
@ -0,0 +1,751 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::PDFInfo - PDFInfo Plugin for SpamAssassin
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::PDFInfo
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin helps detected spam using attached PDF files
|
||||
|
||||
=over 4
|
||||
|
||||
=item See "Usage:" below - more documentation see 20_pdfinfo.cf
|
||||
|
||||
Original info kept for history. For later changes see SVN repo
|
||||
-------------------------------------------------------
|
||||
PDFInfo Plugin for SpamAssassin
|
||||
Version: 0.8
|
||||
Info: $Id: PDFInfo.pm 904 2007-08-12 01:36:23Z root $
|
||||
Created: 2007-08-10
|
||||
Modified: 2007-08-10
|
||||
By: Dallas Engelken
|
||||
|
||||
Changes:
|
||||
0.8 - added .fdf detection (thanks John Lundin) [axb]
|
||||
0.7 - fixed empty body/pdf count buglet(thanks Jeremy) [axb]
|
||||
0.6 - added support for tags - PDFCOUNT, PDFVERSION, PDFPRODUCER, etc.
|
||||
- fixed issue on perl 5.6.1 where pdf_match_details() failed to call
|
||||
_find_pdf_mime_parts(), resulting in no detection of pdf mime parts.
|
||||
- quoted-printable support - requires MIME::QuotedPrint (which should be in everyones
|
||||
install as a part of the MIME-Base64 package which is a SA req)
|
||||
- added simple pdf_is_empty_body() function with counts the body bytes minus the
|
||||
subject line. can add optional <bytes> param if you need to allow for a few bytes.
|
||||
0.5 - fix warns for undef $pdf_tags
|
||||
- remove { } and \ before running eval in pdf_match_details to avoid eval error
|
||||
0.4 - added pdf_is_encrypted() function
|
||||
- added option to look for image HxW on same line
|
||||
0.3 - added 2nd fuzzy md5 which uses pdf tag layout as data
|
||||
- renamed pdf_image_named() to pdf_named()
|
||||
- PDF images are encapsulated and have no names. We are matching the PDF file name.
|
||||
- renamed pdf_image_name_regex() to pdf_name_regex()
|
||||
- PDF images are encapsulated and have no names. We are matching the PDF file name.
|
||||
- changed pdf_image_count() a bit and added pdf_count().
|
||||
- pdf_count() checks how many pdf attachments there are on the mail
|
||||
- pdf_image_count() checks how many images are found within all pdfs in the mail.
|
||||
- removed the restriction of the pdf containing an image in order to md5 it.
|
||||
- added pdf_match_details() function to check the following 'details'
|
||||
- author: Author of PDF if specified
|
||||
- producer: Software used to produce PDF
|
||||
- creator: Software used to produce PDF, usually similar to producer
|
||||
- title: Title of PDF
|
||||
- created: Creation Date
|
||||
- modified: Last Modified
|
||||
0.2 - support PDF octet-stream
|
||||
0.1 - just ported over the imageinfo code, and renamed to pdfinfo.
|
||||
- removed all support for png, gif, and jpg from the code.
|
||||
- prepended pdf_ to all function names to avoid conflicts with ImageInfo in SA 3.2.
|
||||
|
||||
Usage:
|
||||
|
||||
pdf_count()
|
||||
|
||||
body RULENAME eval:pdf_count(<min>,[max])
|
||||
min: required, message contains at least x pdf mime parts
|
||||
max: optional, if specified, must not contain more than x pdf mime parts
|
||||
|
||||
pdf_image_count()
|
||||
|
||||
body RULENAME eval:pdf_image_count(<min>,[max])
|
||||
min: required, message contains at least x images in pdf attachments.
|
||||
max: optional, if specified, must not contain more than x pdf images
|
||||
|
||||
pdf_pixel_coverage()
|
||||
|
||||
body RULENAME eval:pdf_pixel_coverage(<min>,[max])
|
||||
min: required, message contains at least this much pixel area
|
||||
max: optional, if specified, message must not contain more than this much pixel area
|
||||
|
||||
pdf_named()
|
||||
|
||||
body RULENAME eval:pdf_named(<string>)
|
||||
string: exact file name match, if you need partial match, see pdf_name_regex()
|
||||
|
||||
pdf_name_regex()
|
||||
|
||||
body RULENAME eval:pdf_name_regex(<regex>)
|
||||
regex: regular expression, see examples in ruleset
|
||||
|
||||
pdf_match_md5()
|
||||
|
||||
body RULENAME eval:pdf_match_md5(<string>)
|
||||
string: 32-byte md5 hex
|
||||
|
||||
pdf_match_fuzzy_md5()
|
||||
|
||||
body RULENAME eval:pdf_match_md5(<string>)
|
||||
string: 32-byte md5 hex - see ruleset for obtaining the fuzzy md5
|
||||
|
||||
pdf_match_details()
|
||||
|
||||
body RULENAME eval:pdf_match_details(<detail>,<regex>);
|
||||
detail: author, creator, created, modified, producer, title
|
||||
regex: regular expression, see examples in ruleset
|
||||
|
||||
pdf_is_encrypted()
|
||||
|
||||
body RULENAME eval:pdf_is_encrypted()
|
||||
|
||||
pdf_is_empty_body()
|
||||
|
||||
body RULENAME eval:pdf_is_empty_body(<bytes>)
|
||||
bytes: maximum byte count to allow and still consider it empty
|
||||
|
||||
NOTE: See the ruleset for more examples that are not documented here.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
# -------------------------------------------------------
|
||||
|
||||
package Mail::SpamAssassin::Plugin::PDFInfo;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use Digest::MD5 qw(md5_hex);
|
||||
use MIME::QuotedPrint;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("pdf_count");
|
||||
$self->register_eval_rule ("pdf_image_count");
|
||||
$self->register_eval_rule ("pdf_pixel_coverage");
|
||||
$self->register_eval_rule ("pdf_image_size_exact");
|
||||
$self->register_eval_rule ("pdf_image_size_range");
|
||||
$self->register_eval_rule ("pdf_named");
|
||||
$self->register_eval_rule ("pdf_name_regex");
|
||||
$self->register_eval_rule ("pdf_image_to_text_ratio");
|
||||
$self->register_eval_rule ("pdf_match_md5");
|
||||
$self->register_eval_rule ("pdf_match_fuzzy_md5");
|
||||
$self->register_eval_rule ("pdf_match_details");
|
||||
$self->register_eval_rule ("pdf_is_encrypted");
|
||||
$self->register_eval_rule ("pdf_is_empty_body");
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
my %get_details = (
|
||||
'pdf' => sub {
|
||||
my ($self, $pms, $part) = @_;
|
||||
|
||||
my $type = $part->{'type'} || 'base64';
|
||||
my $data = '';
|
||||
|
||||
if ($type eq 'quoted-printable') {
|
||||
$data = decode_qp($data); # use QuotedPrint->decode_qp
|
||||
}
|
||||
else {
|
||||
$data = $part->decode(); # just use built in base64 decoder
|
||||
}
|
||||
|
||||
my $index = substr($data, 0, 8);
|
||||
|
||||
return unless ($index =~ /.PDF\-(\d\.\d)/);
|
||||
my $version = $1;
|
||||
$self->_set_tag($pms, 'PDFVERSION', $version);
|
||||
# dbg("pdfinfo: pdf version = $version");
|
||||
|
||||
my ($height, $width, $fuzzy_data, $pdf_tags);
|
||||
my ($producer, $created, $modified, $title, $creator, $author) = ('unknown','0','0','untitled','unknown','unknown');
|
||||
my ($md5, $fuzzy_md5) = ('', '');
|
||||
my ($total_height, $total_width, $total_area, $line_count) = (0,0,0,0);
|
||||
|
||||
my $name = $part->{'name'} || '';
|
||||
$self->_set_tag($pms, 'PDFNAME', $name);
|
||||
|
||||
my $no_more_fuzzy = 0;
|
||||
my $got_image = 0;
|
||||
my $encrypted = 0;
|
||||
|
||||
while($data =~ /([^\n]+)/g) {
|
||||
# dbg("pdfinfo: line=$1");
|
||||
my $line = $1;
|
||||
|
||||
$line_count++;
|
||||
|
||||
# lines containing high bytes will have no data we need, so save some cycles
|
||||
next if ($line =~ /[\x80-\xff]/);
|
||||
|
||||
if (!$no_more_fuzzy && $line_count < 70) {
|
||||
if ($line !~ m/^\%/ && $line !~ m/^\/(?:Height|Width|(?:(?:Media|Crop)Box))/ && $line !~ m/^\d+\s+\d+\s+\d+\s+\d+\s+\d+\s+\d+\s+cm$/) {
|
||||
$line =~ s/\s+$//; # strip off whitespace at end.
|
||||
$fuzzy_data .= $line;
|
||||
}
|
||||
}
|
||||
|
||||
if ($line =~ m/^\/([A-Za-z]+)/) {
|
||||
$pdf_tags .= $1;
|
||||
}
|
||||
|
||||
$got_image=1 if ($line =~ m/\/Image/);
|
||||
$encrypted=1 if ($line =~ m/^\/Encrypt/);
|
||||
|
||||
# once we hit the first stream, we stop collecting data for fuzzy md5
|
||||
$no_more_fuzzy = 1 if ($line =~ m/stream/);
|
||||
|
||||
# From a v1.3 pdf
|
||||
# [12234] dbg: pdfinfo: line=630 0 0 149 0 0 cm
|
||||
# [12234] dbg: pdfinfo: line=/Width 630
|
||||
# [12234] dbg: pdfinfo: line=/Height 149
|
||||
if ($got_image) {
|
||||
if ($line =~ /^(\d+)\s+\d+\s+\d+\s+(\d+)\s+\d+\s+\d+\s+cm$/) {
|
||||
$width = $1;
|
||||
$height = $2;
|
||||
}
|
||||
elsif ($line =~ /^\/Width\s(\d+)/) {
|
||||
$width = $1;
|
||||
}
|
||||
elsif ($line =~ /^\/Height\s(\d+)/) {
|
||||
$height = $1;
|
||||
}
|
||||
elsif ($line =~ m/\/Width\s(\d+)\/Height\s(\d+)/) {
|
||||
$width = $1;
|
||||
$height = $2;
|
||||
}
|
||||
}
|
||||
|
||||
# did pdf contain image data?
|
||||
if ($got_image && $width && $height) {
|
||||
$no_more_fuzzy = 1;
|
||||
my $area = $width * $height;
|
||||
$total_height += $height;
|
||||
$total_width += $width;
|
||||
$total_area += $area;
|
||||
$pms->{pdfinfo}->{dems_pdf}->{"${height}x${width}"} = 1;
|
||||
$pms->{'pdfinfo'}->{"count_pdf_images"} ++;
|
||||
dbg("pdfinfo: Found image in PDF ".($name ? $name : '')." - $height x $width pixels ($area pixels sq.)");
|
||||
$self->_set_tag($pms, 'PDFIMGDIM', "${height}x${width}");
|
||||
$height=0; $width=0; # reset and check for next image
|
||||
$got_image = 0;
|
||||
}
|
||||
|
||||
# [5310] dbg: pdfinfo: line=<</Producer(GPL Ghostscript 8.15)
|
||||
# [5310] dbg: pdfinfo: line=/CreationDate(D:20070703144220)
|
||||
# [5310] dbg: pdfinfo: line=/ModDate(D:20070703144220)
|
||||
# [5310] dbg: pdfinfo: line=/Title(Microsoft Word - Document1)
|
||||
# [5310] dbg: pdfinfo: line=/Creator(PScript5.dll Version 5.2)
|
||||
# [5310] dbg: pdfinfo: line=/Author(colet)>>endobj
|
||||
# or all on same line inside xml - v1.6+
|
||||
# <</CreationDate(D:20070226165054-06'00')/Creator( Adobe Photoshop CS2 Windows)/Producer(Adobe Photoshop for Windows -- Image Conversion Plug-in)/ModDate(D:20070226165100-06'00')>>
|
||||
|
||||
if ($line =~ /\/Producer\s?\(([^\)\\]+)/) {
|
||||
$producer = $1;
|
||||
}
|
||||
if ($line =~ /\/CreationDate\s?\(D\:(\d+)/) {
|
||||
$created = $1;
|
||||
}
|
||||
if ($line =~ /\/ModDate\s?\(D\:(\d+)/) {
|
||||
$modified = $1;
|
||||
}
|
||||
if ($line =~ /\/Title\s?\(([^\)\\]+)/) {
|
||||
$title = $1;
|
||||
# Title=\376\377\000w\000w\000n\000g
|
||||
# Title=wwng
|
||||
$title =~ s/\\\d{3}//g;
|
||||
}
|
||||
if ($line =~ /\/Creator\s?\(([^\)\\]+)/) {
|
||||
$creator = $1;
|
||||
}
|
||||
if ($line =~ /\/Author\s?\(([^\)]+)/) {
|
||||
$author = $1;
|
||||
# Author=\376\377\000H\000P\000_\000A\000d\000m\000i\000n\000i\000s\000t\000r\000a\000t\000o\000r
|
||||
# Author=HP_Administrator
|
||||
$author =~ s/\\\d{3}//g;
|
||||
}
|
||||
}
|
||||
|
||||
# store the file name so we can check pdf_named() or pdf_name_match() later.
|
||||
$pms->{pdfinfo}->{names_pdf}->{$name} = 1 if $name;
|
||||
|
||||
# store encrypted flag.
|
||||
$pms->{pdfinfo}->{encrypted} = $encrypted;
|
||||
|
||||
# if we had multiple images in the pdf, we need to store the total HxW as well.
|
||||
# If it was a single Image PDF, then this value will already be in the hash.
|
||||
$pms->{pdfinfo}->{dems_pdf}->{"${total_height}x${total_width}"} = 1 if ($total_height && $total_width);;
|
||||
|
||||
if ($total_area) {
|
||||
$pms->{pdfinfo}->{pc_pdf} = $total_area;
|
||||
$self->_set_tag($pms, 'PDFIMGAREA', $total_area);
|
||||
dbg("pdfinfo: Filename=$name Total HxW: $total_height x $total_width ($total_area area)") if ($total_area);
|
||||
}
|
||||
|
||||
dbg("pdfinfo: Filename=$name Title=$title Author=$author Producer=$producer Created=$created Modified=$modified");
|
||||
|
||||
$md5 = uc(md5_hex($data)) if $data;
|
||||
$fuzzy_md5 = uc(md5_hex($fuzzy_data)) if $fuzzy_data;
|
||||
my $tags_md5;
|
||||
$tags_md5 = uc(md5_hex($pdf_tags)) if $pdf_tags;
|
||||
|
||||
dbg("pdfinfo: MD5 results for ".($name ? $name : '')." - md5=".($md5 ? $md5 : '')." fuzzy1=".($fuzzy_md5 ? $fuzzy_md5 : '')." fuzzy2=".($tags_md5 ? $tags_md5 : ''));
|
||||
|
||||
# we dont need tags for these.
|
||||
$pms->{pdfinfo}->{details}->{created} = $created if $created;
|
||||
$pms->{pdfinfo}->{details}->{modified} = $modified if $modified;
|
||||
|
||||
if ($producer) {
|
||||
$pms->{pdfinfo}->{details}->{producer} = $producer if $producer;
|
||||
$self->_set_tag($pms, 'PDFPRODUCER', $producer);
|
||||
}
|
||||
if ($title) {
|
||||
$pms->{pdfinfo}->{details}->{title} = $title;
|
||||
$self->_set_tag($pms, 'PDFTITLE', $title);
|
||||
}
|
||||
if ($creator) {
|
||||
$pms->{pdfinfo}->{details}->{creator} = $creator;
|
||||
$self->_set_tag($pms, 'PDFCREATOR', $creator);
|
||||
}
|
||||
if ($author) {
|
||||
$pms->{pdfinfo}->{details}->{author} = $author;
|
||||
$self->_set_tag($pms, 'PDFAUTHOR', $author);
|
||||
}
|
||||
if ($md5) {
|
||||
$pms->{pdfinfo}->{md5}->{$md5} = 1;
|
||||
$self->_set_tag($pms, 'PDFMD5', $fuzzy_md5);
|
||||
}
|
||||
if ($fuzzy_md5) {
|
||||
$pms->{pdfinfo}->{fuzzy_md5}->{$fuzzy_md5} = 1;
|
||||
$self->_set_tag($pms, 'PDFMD5FUZZY1', $fuzzy_md5);
|
||||
}
|
||||
if ($tags_md5) {
|
||||
$pms->{pdfinfo}->{fuzzy_md5}->{$tags_md5} = 1;
|
||||
$self->_set_tag($pms, 'PDFMD5FUZZY2', $tags_md5);
|
||||
}
|
||||
},
|
||||
|
||||
);
|
||||
|
||||
# ----------------------------------------
|
||||
|
||||
sub _set_tag {
|
||||
|
||||
my ($self, $pms, $tag, $value) = @_;
|
||||
|
||||
dbg("pdfinfo: set_tag called for $tag $value");
|
||||
return unless ($tag && $value);
|
||||
|
||||
if (exists $pms->{tag_data}->{$tag}) {
|
||||
$pms->{tag_data}->{$tag} .= " $value"; # append value
|
||||
}
|
||||
else {
|
||||
$pms->{tag_data}->{$tag} = $value;
|
||||
}
|
||||
}
|
||||
|
||||
# ----------------------------------------
|
||||
|
||||
sub _find_pdf_mime_parts {
|
||||
my ($self,$pms) = @_;
|
||||
|
||||
# bail early if message does not have pdf parts
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
|
||||
# initialize
|
||||
$pms->{'pdfinfo'}->{"pc_pdf"} = 0;
|
||||
$pms->{'pdfinfo'}->{"count_pdf"} = 0;
|
||||
$pms->{'pdfinfo'}->{"count_pdf_images"} = 0;
|
||||
|
||||
my @parts = $pms->{msg}->find_parts(qr@^(image|application)/(pdf|octet\-stream)$@, 1);
|
||||
my $part_count = scalar @parts;
|
||||
|
||||
dbg("pdfinfo: Identified $part_count possible mime parts that need checked for PDF content");
|
||||
|
||||
# cache this so we can easily bail
|
||||
$pms->{'pdfinfo'}->{'no_parts'} = 1 unless $part_count;
|
||||
|
||||
foreach my $p (@parts) {
|
||||
my $type = $p->{'type'} =~ m@/([\w\-]+)$@;
|
||||
my $name = $p->{'name'} || '';
|
||||
|
||||
my $cte = lc( $p->get_header('content-transfer-encoding') || '' );
|
||||
|
||||
dbg("pdfinfo: found part, type=".($type ? $type : '')." file=".($name ? $name : '')." cte=".($cte ? $cte : '')."");
|
||||
|
||||
# make sure its a cte we support
|
||||
next unless ($cte =~ /^(?:base64|quoted\-printable)$/);
|
||||
|
||||
# filename must end with .pdf, or application type can be pdf
|
||||
# sometimes windows muas will wrap a pdf up inside a .dat file
|
||||
# v0.8 - Added .fdf phoney PDF detection
|
||||
next unless ($name =~ /\.[fp]df$/ || $type eq 'pdf');
|
||||
|
||||
# if we get this far, make sure type is pdf for sure (not octet-stream or anything else)
|
||||
$type='pdf';
|
||||
|
||||
if ($type && exists $get_details{$type}) {
|
||||
$get_details{$type}->($self, $pms, $p);
|
||||
$pms->{'pdfinfo'}->{"count_$type"} ++;
|
||||
}
|
||||
}
|
||||
|
||||
$self->_set_tag($pms, 'PDFCOUNT', $pms->{'pdfinfo'}->{"count_pdf"});
|
||||
$self->_set_tag($pms, 'PDFIMGCOUNT', $pms->{'pdfinfo'}->{"count_pdf_images"});
|
||||
|
||||
}
|
||||
|
||||
# ----------------------------------------
|
||||
|
||||
sub pdf_named {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
return unless (defined $name);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"names_pdf"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"names_pdf"}->{$name});
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_name_regex {
|
||||
my ($self,$pms,$body,$re) = @_;
|
||||
return unless (defined $re);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"names_pdf"});
|
||||
|
||||
my ($rec, $err) = compile_regexp($re, 2);
|
||||
if (!$rec) {
|
||||
info("pdfinfo: invalid regexp '$re': $err");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $hit = 0;
|
||||
foreach my $name (keys %{$pms->{'pdfinfo'}->{"names_pdf"}}) {
|
||||
if ($name =~ $rec) {
|
||||
dbg("pdfinfo: pdf_name_regex hit on $name");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_is_encrypted {
|
||||
my ($self,$pms,$body) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return $pms->{'pdfinfo'}->{'encrypted'};
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_count {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless defined $min;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"count_pdf"});
|
||||
return result_check($min, $max, $pms->{'pdfinfo'}->{"count_pdf"});
|
||||
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_count {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless defined $min;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"count_pdf_images"});
|
||||
return result_check($min, $max, $pms->{'pdfinfo'}->{"count_pdf_images"});
|
||||
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_pixel_coverage {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless (defined $min);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
|
||||
# dbg("pdfinfo: pc_$type: $min, ".($max ? $max:'').", $type, ".$pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
return result_check($min, $max, $pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_to_text_ratio {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless (defined $min && defined $max);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
|
||||
# depending on how you call this eval (body vs rawbody),
|
||||
# the $textlen will differ.
|
||||
my $textlen = length(join('',@$body));
|
||||
|
||||
return 0 unless ( $textlen > 0 && exists $pms->{'pdfinfo'}->{"pc_pdf"} && $pms->{'pdfinfo'}->{"pc_pdf"} > 0);
|
||||
|
||||
my $ratio = $textlen / $pms->{'pdfinfo'}->{"pc_pdf"};
|
||||
dbg("pdfinfo: image ratio=$ratio, min=$min max=$max");
|
||||
return result_check($min, $max, $ratio, 1);
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_is_empty_body {
|
||||
my ($self,$pms,$body,$min) = @_;
|
||||
|
||||
$min ||= 0; # default to 0 bytes
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless $pms->{'pdfinfo'}->{"count_pdf"};
|
||||
|
||||
# check for cached result
|
||||
return 1 if $pms->{'pdfinfo'}->{"no_body_text"};
|
||||
|
||||
shift @$body; # shift body array removes line #1 -> subject line.
|
||||
|
||||
my $bytes = 0;
|
||||
my $textlen = length(join('',@$body));
|
||||
foreach my $line (@$body) {
|
||||
next unless ($line =~ m/\S/);
|
||||
next if ($line =~ m/^Subject/);
|
||||
$bytes += length($line);
|
||||
}
|
||||
|
||||
dbg("pdfinfo: is_empty_body = $bytes bytes");
|
||||
|
||||
if ($bytes == 0 || ($bytes <= $min)) {
|
||||
$pms->{'pdfinfo'}->{"no_body_text"} = 1;
|
||||
return 1;
|
||||
}
|
||||
|
||||
# cache it and return 0
|
||||
$pms->{'pdfinfo'}->{"no_body_text"} = 0;
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_size_exact {
|
||||
my ($self,$pms,$body,$height,$width) = @_;
|
||||
return unless (defined $height && defined $width);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"dems_pdf"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"dems_pdf"}->{"${height}x${width}"});
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_size_range {
|
||||
my ($self,$pms,$body,$minh,$minw,$maxh,$maxw) = @_;
|
||||
return unless (defined $minh && defined $minw);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"dems_pdf"});
|
||||
|
||||
foreach my $dem ( keys %{$pms->{'pdfinfo'}->{"dems_pdf"}}) {
|
||||
my ($h,$w) = split(/x/,$dem);
|
||||
next if ($h < $minh); # height less than min height
|
||||
next if ($w < $minw); # width less than min width
|
||||
next if (defined $maxh && $h > $maxh); # height more than max height
|
||||
next if (defined $maxw && $w > $maxw); # width more than max width
|
||||
|
||||
# if we make it here, we have a match
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_match_md5 {
|
||||
|
||||
my ($self,$pms,$body,$md5) = @_;
|
||||
return unless defined $md5;
|
||||
|
||||
my $uc_md5 = uc($md5); # uppercase matches only
|
||||
|
||||
# make sure we have pdf data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"md5"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"md5"}->{$uc_md5});
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_match_fuzzy_md5 {
|
||||
|
||||
my ($self,$pms,$body,$md5) = @_;
|
||||
return unless defined $md5;
|
||||
|
||||
my $uc_md5 = uc($md5); # uppercase matches only
|
||||
|
||||
# make sure we have pdf data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"fuzzy_md5"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"fuzzy_md5"}->{$uc_md5});
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_match_details {
|
||||
my ($self, $pms, $body, $detail, $regex) = @_;
|
||||
return unless ($detail && $regex);
|
||||
|
||||
# make sure we have pdf data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{'details'});
|
||||
|
||||
my $check_value = $pms->{pdfinfo}->{details}->{$detail};
|
||||
return unless $check_value;
|
||||
|
||||
my ($rec, $err) = compile_regexp($regex, 2);
|
||||
if (!$rec) {
|
||||
info("pdfinfo: invalid regexp '$regex': $err");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($check_value =~ $rec) {
|
||||
dbg("pdfinfo: pdf_match_details $detail $regex matches $check_value");
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub result_check {
|
||||
my ($min, $max, $value, $nomaxequal) = @_;
|
||||
return 0 unless defined $value;
|
||||
return 0 if ($value < $min);
|
||||
return 0 if (defined $max && $value > $max);
|
||||
return 0 if (defined $nomaxequal && $nomaxequal && $value == $max);
|
||||
return 1;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
1;
|
||||
|
273
upstream/lib/Mail/SpamAssassin/Plugin/PhishTag.pm
Normal file
273
upstream/lib/Mail/SpamAssassin/Plugin/PhishTag.pm
Normal file
@ -0,0 +1,273 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# # Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
#
|
||||
###########################################################################
|
||||
|
||||
package Mail::SpamAssassin::Plugin::PhishTag;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use Errno qw(EBADF);
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub new{
|
||||
my ($class, $mailsa)=@_;
|
||||
$class=ref($class) ||$class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless($self,$class);
|
||||
$self->set_config($mailsa->{conf});
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config{
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'trigger_target',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
is_admin => 1,
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'trigger_config',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
is_admin => 1,
|
||||
default => '',
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'trigger_ratio',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
is_admin => 1,
|
||||
default => 0,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
#prepare the plugin
|
||||
sub check_start{
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
|
||||
#initialize the PHISHTAG data structure for
|
||||
#saving configuration information
|
||||
$pms->{PHISHTAG} = {};
|
||||
$pms->{PHISHTAG}->{triggers}={};
|
||||
$pms->{PHISHTAG}->{targets}=[];
|
||||
|
||||
#read the configuration info
|
||||
$self->read_configfile($params);
|
||||
$self->read_settings($params);
|
||||
}
|
||||
|
||||
sub read_settings{
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
|
||||
my $triggers= $pms->{PHISHTAG}->{triggers};
|
||||
my $targets= $pms->{PHISHTAG}->{targets};
|
||||
while (my ($tname,$ttarget)=each %{$pms->{conf}->{trigger_target}}){
|
||||
push @$targets, [$ttarget, $tname];
|
||||
$$triggers{$tname}=0;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
sub read_configfile{
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
|
||||
#nothing interesting here if there is not a configuration file
|
||||
return if($pms->{conf}->{trigger_config} !~/\S/);
|
||||
|
||||
my $triggers= $pms->{PHISHTAG}->{triggers};
|
||||
my $targets= $pms->{PHISHTAG}->{targets};
|
||||
|
||||
my $target;
|
||||
local *F;
|
||||
open(F, '<', $pms->{conf}->{trigger_config});
|
||||
for ($!=0; <F>; $!=0) {
|
||||
#each entry is separated by blank lines
|
||||
undef($target) if(!/\S/);
|
||||
|
||||
#lines that start with pound are comments
|
||||
next if(/^\s*\#/);
|
||||
|
||||
#an entry starts with a URL line prefixed with the word "target"
|
||||
if(/^target\s+(\S+)/){
|
||||
$target=[$1];
|
||||
push @$targets,$target;
|
||||
}
|
||||
#add the test to the list of listened triggers
|
||||
#and to the triggers of the last target
|
||||
elsif(defined $target){
|
||||
s/\s+//g;
|
||||
$$triggers{$_}=0;
|
||||
push @$target, $_;
|
||||
}
|
||||
}
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("PHISHTAG: error reading config file: $!")
|
||||
: die "error reading config file: $!";
|
||||
close(F) or die "error closing config file: $!";
|
||||
}
|
||||
|
||||
sub hit_rule {
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
my $rulename = $params->{rulename};
|
||||
|
||||
#mark the rule as hit
|
||||
if(defined($pms->{PHISHTAG}->{triggers}->{$rulename})){
|
||||
$pms->{PHISHTAG}->{triggers}->{$rulename}=1;
|
||||
dbg("PHISHTAG: $rulename has been caught\n");
|
||||
}
|
||||
}
|
||||
|
||||
sub check_post_learn {
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
|
||||
#find out which targets have fulfilled their requirements
|
||||
my $triggers= $pms->{PHISHTAG}->{triggers};
|
||||
my $targets= $pms->{PHISHTAG}->{targets};
|
||||
my @filled;
|
||||
foreach my $target(@$targets){
|
||||
my $uri= $$target[0];
|
||||
my $fulfilled=1;
|
||||
#all the triggers of a target have to exist for it to be fulfilled
|
||||
foreach my $i(1..$#$target){
|
||||
if(! $triggers->{$$target[$i]}){
|
||||
$fulfilled=0;
|
||||
last;
|
||||
}
|
||||
}
|
||||
if($fulfilled){
|
||||
push @filled, $uri;
|
||||
dbg("PHISHTAG: Fulfilled $uri\n");
|
||||
}
|
||||
}
|
||||
|
||||
if(scalar(@filled) &&
|
||||
$pms->{conf}->{trigger_ratio} > rand(100)){
|
||||
$pms->{PHISHTAG}->{letgo}=0;
|
||||
$pms->{PHISHTAG}->{uri}=$filled[int(rand(scalar(@filled)))];
|
||||
|
||||
dbg("PHISHTAG: Decided to keep this email and point to ".
|
||||
$pms->{PHISHTAG}->{uri});
|
||||
#make sure that SpamAssassin does not remove this email
|
||||
$pms->got_hit("PHISHTAG_TOSS",
|
||||
"BODY: ",
|
||||
score => -100);
|
||||
}
|
||||
else{
|
||||
dbg("PHISHTAG: Will let this email to SpamAssassin's discretion\n");
|
||||
$pms->{PHISHTAG}->{letgo}=1;
|
||||
}
|
||||
|
||||
|
||||
#nothing interesting here, if we will not rewrite the email
|
||||
if($pms->{PHISHTAG}->{letgo}){
|
||||
return;
|
||||
}
|
||||
|
||||
my $pristine_body=\$pms->{msg}->{pristine_body};
|
||||
#dbg("PRISTINE>>\n".$$pristine_body);
|
||||
|
||||
my $uris = $pms->get_uri_detail_list();
|
||||
#rewrite the url
|
||||
while (my($uri, $info) = each %{$uris}) {
|
||||
if(defined ($info->{types}->{a})){
|
||||
$$pristine_body=~s/$uri/$pms->{PHISHTAG}->{uri}/mg;
|
||||
}
|
||||
}
|
||||
dbg("PRISTINE>>\n".$$pristine_body);
|
||||
}
|
||||
|
||||
1;
|
||||
__END__
|
||||
|
||||
=head1 NAME
|
||||
|
||||
PhishTag - SpamAssassin plugin for redirecting links in incoming emails.
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::PhishTag
|
||||
|
||||
trigger_ratio 0.1
|
||||
trigger_target RULE_NAME http://www.antiphishing.org/consumer_recs.html
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
PhishTag enables administrators to rewrite links in emails that trigger certain
|
||||
tests, preferably anti-phishing blacklist tests. The plugin will inhibit the
|
||||
blocking of a portion of the emails that trigger the test by SpamAssassin, and
|
||||
let them pass to the users' inbox after the rewrite. It is useful in providing
|
||||
training to email users about company policies and general email usage.
|
||||
|
||||
=head1 OPTIONS
|
||||
|
||||
The following options can be set by modifying the configuration file.
|
||||
|
||||
=over 4
|
||||
|
||||
=item * trigger_ratio percentage_value
|
||||
|
||||
Sets the probability in percentage that a positive test will trigger the
|
||||
email rewrite, e.g. 0.1 will rewrite on the average 1 in 1000 emails that
|
||||
match the trigger.
|
||||
|
||||
=item * trigger_target RULE_NAME http_url
|
||||
|
||||
The name of the test which would trigger the email rewrite; all the URLs
|
||||
will be replaced by http_url.
|
||||
|
||||
=back
|
||||
|
||||
=head1 DOWNLOAD
|
||||
|
||||
The source of this plugin is available at:
|
||||
http://umut.topkara.org/PhishTag/PhishTag.pm
|
||||
a sample configuration file is also available:
|
||||
http://umut.topkara.org/PhishTag/PhishTag.cf
|
||||
|
||||
=head1 SEE ALSO
|
||||
|
||||
Check the list of tests performed by SpamAssassin to modify the
|
||||
configuration file to match your needs from
|
||||
https://spamassassin.apache.org/tests.html
|
||||
|
||||
=head1 AUTHOR
|
||||
|
||||
Umut Topkara, 2008, E<lt>umut@topkara.orgE<gt>
|
||||
http://umut.topkara.org
|
||||
|
||||
=head1 COPYRIGHT AND LICENSE
|
||||
|
||||
This plugin is free software; you can redistribute it and/or modify
|
||||
it under the same terms as SpamAssassin itself, either version 3.2.4
|
||||
or, at your option, any later version of SpamAssassin you may have
|
||||
available.
|
||||
|
||||
|
||||
=cut
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue
Block a user