mirror of
https://git.proxmox.com/git/proxmox-spamassassin
synced 2025-04-28 13:40:10 +00:00
update SpamAssassin to 4.0.0
generated by make update-upstream Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
This commit is contained in:
parent
56cebc6b1a
commit
ae52237fd8
@ -1,4 +1,4 @@
|
||||
Copyright (C) 2021 The Apache Software Foundation
|
||||
Copyright (C) 2022 The Apache Software Foundation
|
||||
|
||||
Project Management Committee (PMC):
|
||||
|
||||
|
6947
upstream/Changes
6947
upstream/Changes
File diff suppressed because it is too large
Load Diff
359
upstream/INSTALL
359
upstream/INSTALL
@ -11,16 +11,16 @@ Installing or Upgrading SpamAssassin
|
||||
|
||||
Using CPAN via CPAN.pm:
|
||||
|
||||
perl -MCPAN -e shell [as root]
|
||||
perl -MCPAN -e shell # as root
|
||||
o conf prerequisites_policy ask
|
||||
install Mail::SpamAssassin
|
||||
quit
|
||||
|
||||
Using Linux:
|
||||
|
||||
Debian unstable: apt-get install spamassassin
|
||||
Debian/Ubuntu: apt-get install spamassassin
|
||||
Gentoo: emerge mail-filter/spamassassin
|
||||
Fedora: yum install spamassassin
|
||||
Fedora/CentOS/RedHat: yum install spamassassin
|
||||
|
||||
Alternatively download the tarfile, zipfile, and/or build your own RPM
|
||||
from https://spamassassin.apache.org/. Building from tar/zip file is
|
||||
@ -28,10 +28,9 @@ usually as simple as:
|
||||
|
||||
[unzip/untar the archive]
|
||||
cd Mail-SpamAssassin-*
|
||||
perl Makefile.PL
|
||||
[option: add -DSPAMC_SSL to $CFLAGS to build an SSL-enabled spamc]
|
||||
perl Makefile.PL # add ENABLE_SSL=yes for SSL support
|
||||
make
|
||||
make install [as root]
|
||||
make install # as root
|
||||
|
||||
After installing SpamAssassin, you need to download and install the
|
||||
SpamAssassin ruleset using "sa-update". See the "Installing Rules"
|
||||
@ -42,16 +41,10 @@ the prerequisite information further down.
|
||||
|
||||
To install as non-root, see the directions below.
|
||||
|
||||
If you are running AFS, you may also need to specify INSTALLSITELIB and
|
||||
SITELIBEXP.
|
||||
|
||||
Note that you can upgrade SpamAssassin using these instructions, as long
|
||||
as you take care to read the caveats in the file UPGRADE. Upgrading
|
||||
will not delete your learnt Bayes data or local rule modifications.
|
||||
|
||||
If you're using SunOS 4.1.x, see
|
||||
http://wiki.spamassassin.org/w/BuildingOnSunOS4 for build tips.
|
||||
|
||||
|
||||
Installing SpamAssassin for Personal Use (Not System-Wide)
|
||||
----------------------------------------------------------
|
||||
@ -62,7 +55,7 @@ These steps assume the following, so substitute as necessary:
|
||||
- The location of the procmail executable is /usr/bin/procmail
|
||||
|
||||
Many more details of this process are at
|
||||
http://wiki.apache.org/spamassassin/SingleUserUnixInstall
|
||||
https://wiki.apache.org/spamassassin/SingleUserUnixInstall
|
||||
|
||||
1. Uncompress and extract the SpamAssassin archive, using "unzip" or
|
||||
"tar xvfz", in a temporary directory.
|
||||
@ -119,7 +112,7 @@ http://wiki.apache.org/spamassassin/SingleUserUnixInstall
|
||||
caughtspam
|
||||
|
||||
Also, see the file procmailrc.example and
|
||||
http://wiki.apache.org/spamassassin/UsedViaProcmail
|
||||
https://wiki.apache.org/spamassassin/UsedViaProcmail
|
||||
|
||||
8. Now, you should be ready to send some test emails and ensure everything
|
||||
works as expected. First, send yourself a test email that doesn't
|
||||
@ -136,7 +129,7 @@ http://wiki.apache.org/spamassassin/UsedViaProcmail
|
||||
don't lose incoming email.
|
||||
|
||||
Note: one possible cause for this is the use of smrsh on the MTA system;
|
||||
see http://wiki.spamassassin.org/w/ProcmailVsSmrsh for details.
|
||||
see https://wiki.apache.org/spamassassin/ProcmailVsSmrsh for details.
|
||||
|
||||
9. You can now customize SpamAssassin. See README for more information.
|
||||
|
||||
@ -153,18 +146,25 @@ Installing rules from network is done with a single command:
|
||||
|
||||
sa-update
|
||||
|
||||
This is normally run as root.
|
||||
For security reasons, it should not be run as root, but as the user normally
|
||||
running SpamAssassin. You can run the initial setup once as root, to create
|
||||
necessary directories etc. Then you need to change ownership of
|
||||
LOCAL_STATE_DIR to that user (usually: chown -R user:user
|
||||
/var/lib/spamassassin), you can find out the default directory with
|
||||
sa-update --help (look for --updatedir). Same needs to be done for
|
||||
LOCAL_RULES_DIR/sa-update-keys (usually: chown -R user:user
|
||||
/etc/mail/spamassassin/sa-update-keys), the directory can be found with
|
||||
spamassassin --help (look for --siteconfigpath).
|
||||
|
||||
If you wish to install rules from downloaded files, rather than "live" from
|
||||
the latest online ruleset, here is how to do it.
|
||||
|
||||
Obtain all the following files:
|
||||
Obtain all the following files from https://spamassassin.apache.org/downloads.cgi:
|
||||
|
||||
Mail-SpamAssassin-rules-xxx.tgz
|
||||
Mail-SpamAssassin-rules-xxx.tgz.asc
|
||||
Mail-SpamAssassin-rules-xxx.tgz.md5
|
||||
Mail-SpamAssassin-rules-xxx.tgz.sha1
|
||||
(where xxx may look something like '3.3.0-rc1.r893295')
|
||||
Mail-SpamAssassin-rules-xxx.tgz.sha512
|
||||
(where xxx may look something like '4.0.0.r1900144')
|
||||
|
||||
Save them all to the current directory.
|
||||
Obtain a rules-signing public key:
|
||||
@ -180,7 +180,7 @@ Install rules from a compressed tar archive:
|
||||
|
||||
sa-update --install Mail-SpamAssassin-rules-xxx.tgz
|
||||
|
||||
Note that the ".tgz.asc", ".tgz.md5" and ".tgz.sha1" files all need to
|
||||
Note that the ".tgz", ".tgz.asc" and ".tgz.sha512" files all need to
|
||||
be in the same directory, otherwise sa-update will fail.
|
||||
|
||||
|
||||
@ -197,7 +197,7 @@ CPAN
|
||||
----
|
||||
|
||||
Most of the modules listed below are available via the Comprehensive Perl
|
||||
Archive Network (CPAN, see http://www.cpan.org/ for more information).
|
||||
Archive Network (CPAN, see https://www.cpan.org/ for more information).
|
||||
While each module is different, most can be installed via a few simple
|
||||
commands such as:
|
||||
|
||||
@ -222,11 +222,7 @@ through those mechanisms, too, if you prefer.
|
||||
Required Perl Interpreter
|
||||
-------------------------
|
||||
|
||||
Perl 5.8.1 or a later version is required.
|
||||
Preferred versions are 5.8.8, or 5.10.1 or later.
|
||||
|
||||
Most of the functions might still work with Perl 5.6.1 or 5.6.2,
|
||||
but 5.6.* is no longer a supported version.
|
||||
Perl 5.14.0 or a later version is required.
|
||||
|
||||
|
||||
Required Perl Modules
|
||||
@ -243,81 +239,41 @@ SpamAssassin build directory.
|
||||
The list of required modules that do not ship with Perl and must be
|
||||
installed:
|
||||
|
||||
- Digest::SHA1 (from CPAN),
|
||||
or the newer Digest::SHA which is a perl base module since Perl 5.10.0
|
||||
- Digest::SHA (from CPAN)
|
||||
|
||||
The Digest::SHA1 module is used as a cryptographic hash for some
|
||||
tests and the Bayes subsystem if Digest::SHA module is not available.
|
||||
|
||||
An external perl module razor-agents-2.84 as used by a Razor2 plugin
|
||||
seems to be the only remaining component depending on Digest::SHA1
|
||||
(note that a packager may ship a patched version of razor-agents which
|
||||
can use Digest::SHA instead)
|
||||
|
||||
Debian: apt-get install libdigest-sha1-perl
|
||||
Gentoo: emerge dev-perl/Digest-SHA1
|
||||
Fedora: yum install perl-Digest-SHA1
|
||||
Used as a cryptographic hash for some tests and the Bayes subsystem.
|
||||
It is also required by the DKIM plugin.
|
||||
|
||||
- HTML::Parser >= 3.43 (from CPAN)
|
||||
|
||||
HTML is used for an ever-increasing amount of email so this dependency
|
||||
is unavoidable. Run "perldoc -q html" for additional information.
|
||||
|
||||
Debian: apt-get install libhtml-parser-perl
|
||||
Gentoo: emerge dev-perl/HTML-Parser
|
||||
Fedora: yum install perl-HTML-Parser
|
||||
|
||||
- Net::DNS (from CPAN)
|
||||
- Net::DNS >= 0.69 (from CPAN)
|
||||
|
||||
Used for all DNS-based tests (SBL, XBL, SpamCop, DSBL, etc.),
|
||||
perform MX checks, used when manually reporting spam to SpamCop,
|
||||
and used by sa-update to gather version information.
|
||||
|
||||
You need to make sure the Net::DNS version is sufficiently up-to-date:
|
||||
- NetAddr::IP >= 4.010 (from CPAN)
|
||||
|
||||
- version 0.34 or higher on Unix systems
|
||||
- version 0.46 or higher on Windows systems
|
||||
Used to parse IP addresses and IP address ranges for "trusted_networks".
|
||||
Used in determining which DNS tests are to be done for each of the
|
||||
header's received fields. Used by AWL plugin for extracting network
|
||||
addresses. Used by DNSxL rules for assembling DNS queries.
|
||||
|
||||
Debian/Ubuntu: apt-get install libnet-dns-perl
|
||||
Fedora: yum install perl-Net-DNS
|
||||
Avoid buggy versions 4.034-4.035 and 4.045-4.054.
|
||||
|
||||
- NetAddr::IP (from CPAN)
|
||||
Examples of installing required modules on popular distributions:
|
||||
|
||||
Used to parse IP addresses and IP address ranges for
|
||||
"trusted_networks".
|
||||
Debian/Ubuntu:
|
||||
apt-get install libdigest-sha-perl libhtml-parser-perl libnet-dns-perl libnetaddr-ip-perl
|
||||
|
||||
Debian/Ubuntu: apt-get install libnetaddr-ip-perl
|
||||
Fedora: yum install perl-NetAddr-IP
|
||||
Gentoo:
|
||||
emerge dev-perl/Digest-SHA dev-perl/HTML-Parser dev-perl/Net-DNS dev-perl/NetAddr-IP
|
||||
|
||||
- Time::HiRes (from CPAN)
|
||||
|
||||
Used by asynchronous DNS lookups to operate timeouts with subsecond
|
||||
precision and to report processing times accurately.
|
||||
|
||||
- LWP (aka libwww-perl) (from CPAN)
|
||||
|
||||
This set of modules will include both the LWP::UserAgent and
|
||||
HTTP::Date modules, used by sa-update to retrieve update archives.
|
||||
|
||||
Fedora: yum install perl-libwww-perl
|
||||
|
||||
- HTTP::Date (from CPAN)
|
||||
|
||||
Used by sa-update to deal with certain Date requests.
|
||||
|
||||
- IO::Zlib (from CPAN)
|
||||
|
||||
Used by sa-update to uncompress update archives.
|
||||
Version 1.04 or later is required.
|
||||
|
||||
Fedora: yum install perl-IO-Zlib
|
||||
|
||||
- Archive::Tar (from CPAN)
|
||||
|
||||
Used by sa-update to expand update archives.
|
||||
Version 1.23 or later is required.
|
||||
|
||||
Fedora: yum install perl-Archive-Tar
|
||||
Fedora/CentOS/RedHat:
|
||||
yum install perl-Digest-SHA perl-HTML-Parser perl-Net-DNS perl-NetAddr-IP
|
||||
|
||||
|
||||
Optional Modules
|
||||
@ -329,140 +285,161 @@ their version is too low, SpamAssassin will still work, just not as
|
||||
effectively because some of the spam-detection tests will have to be
|
||||
skipped.
|
||||
|
||||
Note: SpamAssassin will not warn you if these are installed, but the
|
||||
version is too low for them to be used.
|
||||
Note: SpamAssassin may not warn you if these are installed, but the version
|
||||
is too low for them to be used.
|
||||
|
||||
- MIME::Base64
|
||||
- MIME::Base64 (from CPAN)
|
||||
|
||||
This module is highly recommended to increase the speed with which
|
||||
Base64 encoded messages/mail parts are decoded.
|
||||
|
||||
- Encode::Detect::Detector (from CPAN)
|
||||
|
||||
- DB_File (from CPAN, included in many distributions)
|
||||
For proper detection of charsets and converting them into Unicode, you
|
||||
will need to install this module.
|
||||
|
||||
Used to store data on-disk, for the Bayes-style logic, TxRep, and
|
||||
auto-whitelist. *Much* more efficient than the other standard Perl
|
||||
database packages. Strongly recommended.
|
||||
- Net::LibIDN2 (from CPAN)
|
||||
- Net::LibIDN (from CPAN)
|
||||
|
||||
There seems to be a bug in libdb 4.1.25, which is
|
||||
distributed by default on some versions of Linux. See
|
||||
http://wiki.apache.org/spamassassin/DbFileSleepBug for details.
|
||||
Provides mapping between Internationalized Domain Names (IDN) in Unicode
|
||||
and ASCII-compatible encoding (ACE) for use in DNS and comparisions.
|
||||
The module is optional, but without it Unicode IDN names found in mail
|
||||
will not be suitable for DNS queries and welcome/blocklisting.
|
||||
|
||||
Either module should work fine, but newer Net::LibIDN2 might not be
|
||||
available in all distributions.
|
||||
|
||||
- Net::SMTP (from CPAN)
|
||||
|
||||
Used when manually reporting spam to SpamCop.
|
||||
|
||||
|
||||
- Mail::SPF (from CPAN)
|
||||
|
||||
Used to check DNS Sender Policy Framework (SPF) records to fight email
|
||||
address forgery and make it easier to identify spams. This module
|
||||
makes Mail::SPF::Query obsolete.
|
||||
|
||||
Net::DNS version 0.58 or higher is required.
|
||||
|
||||
Note that NetAddr::IP (required by Mail::SPF) versions up to and
|
||||
including version 4.006 include a bug that will slow down the entire
|
||||
perl interpreter. NetAddr::IP version 4.007 or later fixes this.
|
||||
|
||||
|
||||
- Geo::IP (from CPAN)
|
||||
|
||||
Used by the RelayCountry plugin (not enabled by default) to determine
|
||||
the domain country codes of each relay in the path of an email.
|
||||
|
||||
IP::Country::Fast is used as alternative if Geo::IP is not installed.
|
||||
This is not recommended as it's obsolete.
|
||||
|
||||
|
||||
- Net::Ident (from CPAN)
|
||||
|
||||
If you plan to use the --auth-ident option to spamd, you will need
|
||||
to install this module.
|
||||
|
||||
|
||||
- IO::Socket::INET6 (from CPAN)
|
||||
|
||||
This is required if the first nameserver listed in your IP
|
||||
configuration or /etc/resolv.conf file is available only via an IPv6
|
||||
address.
|
||||
|
||||
Fedora: yum install perl-IO-Socket-INET6
|
||||
|
||||
|
||||
- IO::Socket::SSL (from CPAN)
|
||||
|
||||
If you wish to use SSL encryption to communicate between spamc and
|
||||
spamd (the --ssl option to spamd), you need to install this
|
||||
module. (You will need the OpenSSL libraries and use the
|
||||
ENABLE_SSL="yes" argument to Makefile.PL to build and run an SSL
|
||||
compatible spamc.)
|
||||
|
||||
Fedora: yum install perl-IO-Socket-SSL
|
||||
|
||||
|
||||
- Compress::Zlib (from CPAN)
|
||||
|
||||
If you wish to use the optional zlib compression for communication
|
||||
between spamc and spamd (the -z option to spamc), useful for
|
||||
long-distance use of spamc over the internet, you need to install
|
||||
this module.
|
||||
|
||||
Fedora: yum install perl-Compress-Zlib
|
||||
- Email::Address::XS
|
||||
|
||||
Used to parse email addresses from header fields like To/From/cc, per
|
||||
RFC 5322. If installed, it may additionally be used by internal parser
|
||||
to process complex lists.
|
||||
|
||||
- Mail::DKIM (from CPAN)
|
||||
|
||||
If this module is installed, and you enable the DKIM plugin,
|
||||
SpamAssassin will perform DKIM lookups when a DKIM-Signature header is
|
||||
present in the message headers. Current versions of Mail::DKIM (0.20
|
||||
or later) also perform Domain Key lookups on DomainKey-Signature headers,
|
||||
present in the message headers. Current versions of Mail::DKIM (0.20 or
|
||||
later) also perform Domain Key lookups on DomainKey-Signature headers,
|
||||
without requiring the Mail::DomainKeys module, which is now obsolete.
|
||||
Version 0.37 or later is preferred, the absolute minimal version is 0.31.
|
||||
Version 0.37 or later is preferred, the absolute minimal version is
|
||||
0.31.
|
||||
|
||||
Note that the Mail::DKIM module in turn requires the Digest::SHA module
|
||||
and OpenSSL libraries.
|
||||
- Mail::SPF (from CPAN)
|
||||
|
||||
Used to check DNS Sender Policy Framework (SPF) records to fight email
|
||||
address forgery and make it easier to identify spams.
|
||||
|
||||
- MaxMind::DB::Reader::XS (GeoIP2) (from CPAN)
|
||||
- MaxMind::DB::Reader (GeoIP2) (from CPAN)
|
||||
- IP::Country::DB_File (from CPAN)
|
||||
- Geo::IP (old deprecated GeoIP) (from CPAN)
|
||||
- IP::Country::Fast (deprecated) (from CPAN)
|
||||
|
||||
Geolocation modules, choose one from the list (in recommended order).
|
||||
|
||||
Used by the RelayCountry plugin (not enabled by default) to determine
|
||||
the domain country codes of each relay in the path of an email. Also
|
||||
used by the URILocalBL plugin (not enabled by default) to provide ISP
|
||||
and Country code based filtering.
|
||||
|
||||
See: https://wiki.apache.org/spamassassin/RelayCountryPlugin
|
||||
|
||||
- Mail::DMARC
|
||||
|
||||
Used by the optional DMARC check plugin, which itself requires DKIM and
|
||||
SPF features working.
|
||||
|
||||
- DB_File (from CPAN)
|
||||
|
||||
Used to store data on-disk, for the Bayes-style logic, TxRep, and
|
||||
auto-welcomelist. *Much* more efficient than the other standard Perl
|
||||
database packages. Strongly recommended.
|
||||
|
||||
There seems to be a bug in libdb 4.1.25, which is
|
||||
distributed by default on some versions of Linux. See
|
||||
https://wiki.apache.org/spamassassin/DbFileSleepBug for details.
|
||||
|
||||
- IO::Socket::IP (from CPAN)
|
||||
- IO::Socket::INET6 (from CPAN)
|
||||
|
||||
Installing IO::Socket::IP is recommended if spamd is to listen on IPv6
|
||||
sockets or if DNS queries should go to an IPv6 name server. If
|
||||
IO::Socket::IP is not available, using a deprecated module
|
||||
IO::Socket::INET6 will be attempted, and in its absence the support for
|
||||
IPv6 will not be available. Some plugins and underlying modules may
|
||||
also prefer IO::Socket::IP over IO::Socket::INET6.
|
||||
|
||||
- IO::Socket::SSL (from CPAN)
|
||||
|
||||
If you wish to use SSL encryption to communicate between spamc and spamd
|
||||
(the --ssl option to spamd), you need to install this module. (You will
|
||||
need the OpenSSL libraries and use the ENABLE_SSL=yes argument to
|
||||
Makefile.PL to build and run an SSL compatible spamc.)
|
||||
|
||||
- Net::Patricia
|
||||
|
||||
If this module is available, it will be used for IP address lookups in
|
||||
tables internal_networks, trusted_networks, and msa_networks.
|
||||
Recommended when a number of entries in these tables is hundred or more.
|
||||
However, in case of overlapping (or conflicting) networks in these
|
||||
tables, lookup results may differ as Net::Patricia finds a
|
||||
tightest-matching entry, while a sequential NetAddr::IP search finds a
|
||||
first-matching entry. So when overlapping network ranges are given,
|
||||
specifying more specific subnets (longest netmask) first, followed by
|
||||
wider subnets ensures predictable results.
|
||||
|
||||
- DBI *and* DBD driver/modules for your database (from CPAN)
|
||||
|
||||
If you intend to use SpamAssassin with an SQL database backend for
|
||||
user configuration data, Bayes storage, or other storage, you will need
|
||||
to have these installed; both the basic DBI module and the driver for
|
||||
your database.
|
||||
If you intend to use SpamAssassin with an SQL database backend for user
|
||||
configuration data, Bayes storage, or other storage, you will need to
|
||||
have these installed; both the basic DBI module and the driver for your
|
||||
database (for example DBD::MariaDB, DBD::mysql or DBD::Pg).
|
||||
|
||||
- Archive::Zip
|
||||
- IO::String
|
||||
|
||||
- Encode::Detect (from CPAN)
|
||||
|
||||
If you plan to use the normalize_charset config setting to detect
|
||||
charsets and convert them into Unicode, you will need to install
|
||||
this module.
|
||||
|
||||
|
||||
- Apache::Test (from CPAN)
|
||||
|
||||
If you plan to run the Apache2 version of spamd in the
|
||||
"spamd-apache2" directory, you will need to install this
|
||||
module.
|
||||
|
||||
|
||||
- Apache 2 and mod_perl
|
||||
|
||||
If you plan to run the Apache2 version of spamd in the "spamd-apache2"
|
||||
directory, you will need to ensure these are installed.
|
||||
|
||||
Ubuntu: sudo apt-get install apache2 libapache2-mod-perl2
|
||||
|
||||
Required by the optional OLEVBMacro plugin.
|
||||
|
||||
- Razor2
|
||||
|
||||
If you plan to use Vipul's Razor, note that versions up to and
|
||||
including version 2.82 include a bug that will slow down the entire
|
||||
perl interpreter. Version 2.83 or later fixes this.
|
||||
If you plan to use Vipul's Razor, note that versions up to and including
|
||||
version 2.82 include a bug that will slow down the entire perl
|
||||
interpreter. Version 2.83 or later fixes this.
|
||||
|
||||
If you do not plan to use this plugin, be sure to comment out
|
||||
its loadplugin line in "/etc/mail/spamassassin/v310.pre".
|
||||
If you do not plan to use this plugin, be sure to comment out its
|
||||
loadplugin line in "/etc/mail/spamassassin/v310.pre".
|
||||
|
||||
- Digest::SHA1 (from CPAN)
|
||||
|
||||
An external perl module razor-agents-2.84 as used by a Razor2 plugin
|
||||
seems to be the only remaining component depending on Digest::SHA1 (note
|
||||
that a packager may ship a patched version of razor-agents which can use
|
||||
Digest::SHA instead)
|
||||
|
||||
- LWP::UserAgent (aka libwww-perl) (from CPAN)
|
||||
|
||||
Can be used by sa-update to retrieve update archives, as alternative to
|
||||
curl/wget/fetch.
|
||||
|
||||
- Net::SMTP (from CPAN)
|
||||
|
||||
Used when manually reporting spam to SpamCop.
|
||||
|
||||
Examples of installing most recommended modules on popular distributions:
|
||||
|
||||
Debian/Ubuntu:
|
||||
apt-get install libencode-detect-perl libnet-libidn-perl \
|
||||
libemail-address-xs-perl libmail-dkim-perl libmail-spf-perl \
|
||||
libio-socket-ip-perl
|
||||
|
||||
Gentoo:
|
||||
emerge dev-perl/Encode-Detect dev-perl/Net-LibIDN \
|
||||
dev-perl/Email-Address-XS dev-perl/Mail-DKIM dev-perl/Mail-SPF
|
||||
|
||||
Fedora/CentOS/RedHat:
|
||||
yum install perl-MIME-Base64 perl-Encode-Detect perl-Net-LibIDN \
|
||||
perl-Email-Address-XS perl-Mail-DKIM perl-Mail-SPF perl-IO-Socket-IP
|
||||
|
||||
|
||||
What Next?
|
||||
@ -471,7 +448,7 @@ What Next?
|
||||
Take a look at the USAGE document for more information on how to use
|
||||
SpamAssassin.
|
||||
|
||||
The SpamAssassin Wiki <http://wiki.spamassassin.org/> contains
|
||||
The SpamAssassin Wiki <https://wiki.apache.org/spamassassin/> contains
|
||||
information on custom plugins, extensions, and other optional modules
|
||||
included with SpamAssassin.
|
||||
|
||||
|
@ -30,6 +30,6 @@ Notes on building SpamAssassin on VMS
|
||||
|
||||
- bug 1099 in the SA Bugzilla is being used to track progress.
|
||||
|
||||
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=1099
|
||||
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=1099
|
||||
|
||||
|
||||
|
@ -27,7 +27,7 @@ lib/Mail/SpamAssassin.pm
|
||||
lib/Mail/SpamAssassin/AICache.pm
|
||||
lib/Mail/SpamAssassin/ArchiveIterator.pm
|
||||
lib/Mail/SpamAssassin/AsyncLoop.pm
|
||||
lib/Mail/SpamAssassin/AutoWhitelist.pm
|
||||
lib/Mail/SpamAssassin/AutoWelcomelist.pm
|
||||
lib/Mail/SpamAssassin/Bayes.pm
|
||||
lib/Mail/SpamAssassin/Bayes/CombineChi.pm
|
||||
lib/Mail/SpamAssassin/Bayes/CombineNaiveBayes.pm
|
||||
@ -49,6 +49,7 @@ lib/Mail/SpamAssassin/Constants.pm
|
||||
lib/Mail/SpamAssassin/DBBasedAddrList.pm
|
||||
lib/Mail/SpamAssassin/Dns.pm
|
||||
lib/Mail/SpamAssassin/DnsResolver.pm
|
||||
lib/Mail/SpamAssassin/GeoDB.pm
|
||||
lib/Mail/SpamAssassin/HTML.pm
|
||||
lib/Mail/SpamAssassin/Locales.pm
|
||||
lib/Mail/SpamAssassin/Locker.pm
|
||||
@ -74,20 +75,23 @@ lib/Mail/SpamAssassin/Plugin/ASN.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AWL.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AccessDB.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AntiVirus.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AuthRes.pm
|
||||
lib/Mail/SpamAssassin/Plugin/AutoLearnThreshold.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Bayes.pm
|
||||
lib/Mail/SpamAssassin/Plugin/BodyEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/BodyRuleBaseExtractor.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Check.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DCC.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DKIM.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DMARC.pm
|
||||
lib/Mail/SpamAssassin/Plugin/DNSEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/ExtractText.pm
|
||||
lib/Mail/SpamAssassin/Plugin/FreeMail.pm
|
||||
lib/Mail/SpamAssassin/Plugin/FromNameSpoof.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HashBL.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HTMLEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HTTPSMismatch.pm
|
||||
lib/Mail/SpamAssassin/Plugin/Hashcash.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HashBL.pm
|
||||
lib/Mail/SpamAssassin/Plugin/HeaderEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/ImageInfo.pm
|
||||
lib/Mail/SpamAssassin/Plugin/MIMEEval.pm
|
||||
@ -116,7 +120,7 @@ lib/Mail/SpamAssassin/Plugin/URIDetail.pm
|
||||
lib/Mail/SpamAssassin/Plugin/URIEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/VBounce.pm
|
||||
lib/Mail/SpamAssassin/Plugin/WLBLEval.pm
|
||||
lib/Mail/SpamAssassin/Plugin/WhiteListSubject.pm
|
||||
lib/Mail/SpamAssassin/Plugin/WelcomeListSubject.pm
|
||||
lib/Mail/SpamAssassin/PluginHandler.pm
|
||||
lib/Mail/SpamAssassin/Plugin/URILocalBL.pm
|
||||
lib/Mail/SpamAssassin/RegistryBoundaries.pm
|
||||
@ -132,7 +136,6 @@ lib/Mail/SpamAssassin/Util/ScopedTimer.pm
|
||||
lib/Mail/SpamAssassin/Util/TieOneStringHash.pm
|
||||
lib/spamassassin-run.pod
|
||||
procmailrc.example
|
||||
rules.README
|
||||
rules/active.list
|
||||
rules/init.pre
|
||||
rules/languages
|
||||
@ -148,7 +151,10 @@ rules/v340.pre
|
||||
rules/v341.pre
|
||||
rules/v342.pre
|
||||
rules/v343.pre
|
||||
rules/v400.pre
|
||||
rules/20_aux_tlds.cf
|
||||
rules-extras/README.txt
|
||||
rules-extras/10_uridnsbl_skip_financial.cf
|
||||
sa-awl.raw
|
||||
sa-check_spamd.raw
|
||||
sa-compile.raw
|
||||
@ -222,6 +228,9 @@ sql/awl_mysql.sql
|
||||
sql/awl_pg.sql
|
||||
sql/bayes_mysql.sql
|
||||
sql/bayes_pg.sql
|
||||
sql/decodeshorturl_mysql.sql
|
||||
sql/decodeshorturl_pg.sql
|
||||
sql/decodeshorturl_sqlite.sql
|
||||
sql/userpref_mysql.sql
|
||||
sql/userpref_pg.sql
|
||||
sql/txrep_mysql.sql
|
||||
@ -231,10 +240,13 @@ t/README
|
||||
t/SATest.pl
|
||||
t/SATest.pm
|
||||
t/all_modules.t
|
||||
t/askdns.t
|
||||
t/authres.t
|
||||
t/autolearn.t
|
||||
t/autolearn_force.t
|
||||
t/autolearn_force_fail.t
|
||||
t/basic_lint.t
|
||||
t/basic_lint_net.t
|
||||
t/basic_lint_without_sandbox.t
|
||||
t/basic_meta.t
|
||||
t/basic_meta2.t
|
||||
@ -246,6 +258,7 @@ t/bayessdbm.t
|
||||
t/bayessdbm_seen_delete.t
|
||||
t/bayessql.t
|
||||
t/blacklist_autolearn.t
|
||||
t/blocklist_autolearn.t
|
||||
t/body_mod.t
|
||||
t/body_str.t
|
||||
t/check_implemented.t
|
||||
@ -256,10 +269,13 @@ t/config_text.t
|
||||
t/config_tree_recurse.t
|
||||
t/cpp_comments_in_spamc.t
|
||||
t/cross_user_config_leak.t
|
||||
t/olevbmacro.t
|
||||
t/dmarc.t
|
||||
t/arc.t
|
||||
t/data/01_test_rules.cf
|
||||
t/data/01_test_rules.pre
|
||||
t/data/Dumpheaders.pm
|
||||
t/data/dkim/arc/ok01.eml
|
||||
t/data/dkim/arc/ko01.eml
|
||||
t/data/dkim/test-adsp-11.msg
|
||||
t/data/dkim/test-adsp-12.msg
|
||||
t/data/dkim/test-adsp-13.msg
|
||||
@ -307,6 +323,15 @@ t/data/dkim/test-pass-23.msg
|
||||
t/data/etc/hello.txt
|
||||
t/data/etc/testhost.cert
|
||||
t/data/etc/testhost.key
|
||||
t/data/geodb/GeoIP2-City.mmdb
|
||||
t/data/geodb/GeoIP2-Country.mmdb
|
||||
t/data/geodb/GeoIP2-ISP.mmdb
|
||||
t/data/geodb/GeoIPCity.dat
|
||||
t/data/geodb/GeoIPISP.dat
|
||||
t/data/geodb/create_GeoIPCity.README
|
||||
t/data/geodb/create_GeoIPISP.README
|
||||
t/data/geodb/create_ipcc.sh
|
||||
t/data/geodb/ipcc.db
|
||||
t/data/mime-subject.txt
|
||||
t/data/nice/001
|
||||
t/data/nice/002
|
||||
@ -324,6 +349,7 @@ t/data/nice/013
|
||||
t/data/nice/014
|
||||
t/data/nice/015
|
||||
t/data/nice/016
|
||||
t/data/nice/authres
|
||||
t/data/nice/base64.txt
|
||||
t/data/nice/crlf-endings
|
||||
t/data/nice/dkim/AddedVtag_07
|
||||
@ -336,6 +362,10 @@ t/data/nice/dkim/MultipleSig_06
|
||||
t/data/nice/dkim/NonExistingHeader_09
|
||||
t/data/nice/dkim/Nowsp_03
|
||||
t/data/nice/dkim/Simple_02
|
||||
t/data/nice/dmarc/noneok.eml
|
||||
t/data/nice/dmarc/quarok.eml
|
||||
t/data/nice/dmarc/rejectok.eml
|
||||
t/data/nice/dmarc/strictrejectok.eml
|
||||
t/data/nice/mailman_message.txt
|
||||
t/data/nice/mime1
|
||||
t/data/nice/mime2
|
||||
@ -353,6 +383,14 @@ t/data/nice/spf1
|
||||
t/data/nice/spf2
|
||||
t/data/nice/spf3
|
||||
t/data/nice/spf3-received-spf
|
||||
t/data/nice/spf4-received-spf-nofold
|
||||
t/data/nice/spf5-received-spf-crlf
|
||||
t/data/nice/spf6-received-spf-crlf2
|
||||
t/data/nice/unicode1
|
||||
t/data/nice/unicode2
|
||||
t/data/nice.mbox
|
||||
t/data/phishing/openphish-feed.txt
|
||||
t/data/phishing/phishtank-feed.csv
|
||||
t/data/reporterplugin.pm
|
||||
t/data/spam/001
|
||||
t/data/spam/002
|
||||
@ -380,107 +418,149 @@ t/data/spam/badmime3.txt
|
||||
t/data/spam/base64.txt
|
||||
t/data/spam/bsmtp
|
||||
t/data/spam/bsmtpnull
|
||||
t/data/spam/decodeshorturl/base.eml
|
||||
t/data/spam/decodeshorturl/base2.eml
|
||||
t/data/spam/decodeshorturl/chain.eml
|
||||
t/data/spam/dmarc/nodmarc.eml
|
||||
t/data/spam/dmarc/noneko.eml
|
||||
t/data/spam/dmarc/quarko.eml
|
||||
t/data/spam/dmarc/rejectko.eml
|
||||
t/data/spam/dmarc/strictrejectko.eml
|
||||
t/data/spam/dnsbl.eml
|
||||
t/data/spam/dnsbl_domsonly.eml
|
||||
t/data/spam/dnsbl_ipsonly.eml
|
||||
t/data/spam/extracttext/gtube_png.eml
|
||||
t/data/spam/extracttext/gtube_pdf.eml
|
||||
t/data/spam/extracttext/gtube_b64_oct.eml
|
||||
t/data/spam/freemail1
|
||||
t/data/spam/freemail2
|
||||
t/data/spam/freemail3
|
||||
t/data/spam/fromnamespoof/spoof1
|
||||
t/data/spam/gtube.eml
|
||||
t/data/spam/gtubedcc.eml
|
||||
t/data/spam/gtubedcc_crlf.eml
|
||||
t/data/spam/hashbl
|
||||
t/data/spam/olevbmacro/encrypted.eml
|
||||
t/data/spam/olevbmacro/goodcsv.eml
|
||||
t/data/spam/olevbmacro/macro.eml
|
||||
t/data/spam/olevbmacro/malicemacro.eml
|
||||
t/data/spam/olevbmacro/nomacro.eml
|
||||
t/data/spam/olevbmacro/renamedmacro.eml
|
||||
t/data/spam/olevbmacro/target_uri.eml
|
||||
t/data/spam/olevbmacro/zippwmacro.eml
|
||||
t/data/spam/phishing_openphish.eml
|
||||
t/data/spam/phishing_phishtank.eml
|
||||
t/data/spam/pyzor
|
||||
t/data/spam/razor2
|
||||
t/data/spam/relayUS.eml
|
||||
t/data/spam/spf1
|
||||
t/data/spam/spf2
|
||||
t/data/spam/spf3
|
||||
t/data/spam/unicode1
|
||||
t/data/spam/urilocalbl_net.eml
|
||||
t/data/spamc_blank.cf
|
||||
t/data/taintcheckplugin.pm
|
||||
t/data/testplugin.pm
|
||||
t/data/testplugin2.pm
|
||||
t/data/validuserplugin.pm
|
||||
t/data/whitelists/action.eff.org
|
||||
t/data/whitelists/amazon_co_uk_ship
|
||||
t/data/whitelists/amazon_com_ship
|
||||
t/data/whitelists/cert.org
|
||||
t/data/whitelists/debian_bts_reassign
|
||||
t/data/whitelists/ibm_enews_de
|
||||
t/data/whitelists/infoworld
|
||||
t/data/whitelists/linuxplanet
|
||||
t/data/whitelists/lp.org
|
||||
t/data/whitelists/media_unspun
|
||||
t/data/whitelists/mlist_mailman_message
|
||||
t/data/whitelists/mlist_yahoo_groups_message
|
||||
t/data/whitelists/mypoints
|
||||
t/data/whitelists/neat_net_tricks
|
||||
t/data/whitelists/netcenter-direct_de
|
||||
t/data/whitelists/netsol_renewal
|
||||
t/data/whitelists/networkworld
|
||||
t/data/whitelists/oracle_net_techblast
|
||||
t/data/whitelists/orbitz.com
|
||||
t/data/whitelists/paypal.com
|
||||
t/data/whitelists/register.com_password
|
||||
t/data/whitelists/ryanairmail.com
|
||||
t/data/whitelists/sf.net
|
||||
t/data/whitelists/winxpnews.com
|
||||
t/data/whitelists/yahoo-inc.com
|
||||
t/data/phishing/openphish-feed.txt
|
||||
t/data/phishing/phishtank-feed.csv
|
||||
t/data/spam/phishing_openphish.eml
|
||||
t/data/spam/phishing_phishtank.eml
|
||||
t/phishing.t
|
||||
t/data/welcomelists/action.eff.org
|
||||
t/data/welcomelists/amazon_co_uk_ship
|
||||
t/data/welcomelists/amazon_com_ship
|
||||
t/data/welcomelists/cert.org
|
||||
t/data/welcomelists/debian_bts_reassign
|
||||
t/data/welcomelists/ibm_enews_de
|
||||
t/data/welcomelists/infoworld
|
||||
t/data/welcomelists/linuxplanet
|
||||
t/data/welcomelists/lp.org
|
||||
t/data/welcomelists/media_unspun
|
||||
t/data/welcomelists/mlist_mailman_message
|
||||
t/data/welcomelists/mlist_yahoo_groups_message
|
||||
t/data/welcomelists/mypoints
|
||||
t/data/welcomelists/neat_net_tricks
|
||||
t/data/welcomelists/netcenter-direct_de
|
||||
t/data/welcomelists/netsol_renewal
|
||||
t/data/welcomelists/networkworld
|
||||
t/data/welcomelists/oracle_net_techblast
|
||||
t/data/welcomelists/orbitz.com
|
||||
t/data/welcomelists/paypal.com
|
||||
t/data/welcomelists/register.com_password
|
||||
t/data/welcomelists/ryanairmail.com
|
||||
t/data/welcomelists/sf.net
|
||||
t/data/welcomelists/winxpnews.com
|
||||
t/data/welcomelists/yahoo-inc.com
|
||||
t/date.t
|
||||
t/db_awl_path.t
|
||||
t/db_awl_path_welcome_block.t
|
||||
t/db_awl_perms.t
|
||||
t/db_awl_perms_welcome_block.t
|
||||
t/db_based_welcomelist.t
|
||||
t/db_based_welcomelist_ips.t
|
||||
t/db_based_whitelist.t
|
||||
t/db_based_whitelist_ips.t
|
||||
t/dcc.t
|
||||
t/debug.t
|
||||
t/decodeshorturl.t
|
||||
t/desc_wrap.t
|
||||
t/dkim.t
|
||||
t/dnsbl.t
|
||||
t/dnsbl_sc_meta.t
|
||||
t/duplicates.t
|
||||
t/dnsbl_subtests.t
|
||||
t/enable_compat.t
|
||||
t/extracttext.t
|
||||
t/freemail.t
|
||||
t/freemail_welcome_block.t
|
||||
t/fromnamespoof.t
|
||||
t/get_all_headers.t
|
||||
t/get_headers.t
|
||||
t/gtube.t
|
||||
t/hashcash.t
|
||||
t/header.t
|
||||
t/header_utf8.t
|
||||
t/hashbl.t
|
||||
t/html_colors.t
|
||||
t/html_obfu.t
|
||||
t/html_utf8.t
|
||||
t/idn_dots.t
|
||||
t/if_can.t
|
||||
t/if_else.t
|
||||
t/ifversion.t
|
||||
t/ip_addrs.t
|
||||
t/lang_lint.t
|
||||
t/lang_pl_tests.t
|
||||
t/line_endings.t
|
||||
t/lint_nocreate_prefs.t
|
||||
t/local_tests_only.t
|
||||
t/memory_cycles.t
|
||||
t/metadata.t
|
||||
t/mimeheader.t
|
||||
t/mimeparse.t
|
||||
t/missing_hb_separator.t
|
||||
t/mkrules.t
|
||||
t/mkrules_else.t
|
||||
t/nonspam.t
|
||||
t/olevbmacro.t
|
||||
t/originating_ip_hdr.t
|
||||
t/pdfinfo.t
|
||||
t/phishing.t
|
||||
t/plugin.t
|
||||
t/plugin_file.t
|
||||
t/plugin_priorities.t
|
||||
t/prefs_include.t
|
||||
t/priorities.t
|
||||
t/priorities_welcome_block.t
|
||||
t/perlcritic.t
|
||||
t/perlcritic.pl
|
||||
t/podchecker.t
|
||||
t/pyzor.t
|
||||
t/razor2.t
|
||||
t/rcvd_parser.t
|
||||
t/re_base_extraction.t
|
||||
t/recips.t
|
||||
t/recreate.t
|
||||
t/recursion.t
|
||||
t/regexp_named_capture.t
|
||||
t/regexp_valid.t
|
||||
t/relaycountry_fast.t
|
||||
t/relaycountry_geoip.t
|
||||
t/relaycountry_geoip2.t
|
||||
t/relative_scores.t
|
||||
t/relaycountry.t
|
||||
t/report_safe.t
|
||||
t/reportheader.t
|
||||
t/reportheader_8bit.t
|
||||
@ -500,14 +580,18 @@ t/rule_multiple.t
|
||||
t/rule_names.t
|
||||
t/rule_types.t
|
||||
t/sa_awl.t
|
||||
t/sa_awl_welcome_block.t
|
||||
t/sa_check_spamd.t
|
||||
t/sa_compile.t
|
||||
t/sha1.t
|
||||
t/shortcircuit.t
|
||||
t/shortcircuit_before_dns.t
|
||||
t/spam.t
|
||||
t/spamc.t
|
||||
t/spamc_B.t
|
||||
t/spamc_E.t
|
||||
t/spamc_H.t
|
||||
t/spamc_bug6176.t
|
||||
t/spamc_c.t
|
||||
t/spamc_c_stdout_closed.t
|
||||
t/spamc_cf.t
|
||||
@ -540,6 +624,7 @@ t/spamd_report.t
|
||||
t/spamd_report_ifspam.t
|
||||
t/spamd_sql_prefs.t
|
||||
t/spamd_ssl.t
|
||||
t/spamd_ssl_z.t
|
||||
t/spamd_ssl_accept_fail.t
|
||||
t/spamd_stop.t
|
||||
t/spamd_symbols.t
|
||||
@ -548,8 +633,11 @@ t/spamd_unix.t
|
||||
t/spamd_unix_and_tcp.t
|
||||
t/spamd_user_rules_leak.t
|
||||
t/spamd_utf8.t
|
||||
t/spamd_welcomelist_leak.t
|
||||
t/spamd_whitelist_leak.t
|
||||
t/spf.t
|
||||
t/spf_welcome_block.t
|
||||
t/sql_based_welcomelist.t
|
||||
t/sql_based_whitelist.t
|
||||
t/stop_always_matching_regexps.t
|
||||
t/strip2.t
|
||||
@ -557,30 +645,31 @@ t/strip_no_subject.t
|
||||
t/stripmarkup.t
|
||||
t/tainted_msg.t
|
||||
t/test_dir
|
||||
t/testrules.yml
|
||||
t/text_bad_ctype.t
|
||||
t/timeout.t
|
||||
t/trust_path.t
|
||||
t/uri.t
|
||||
t/uri_html.t
|
||||
t/uri_list.t
|
||||
t/uri_saferedirect.t
|
||||
t/uri_text.t
|
||||
t/uribl.t
|
||||
t/urilocalbl_geoip.t
|
||||
t/uri_saferedirect.t
|
||||
t/uribl_all_types.t
|
||||
t/uribl_domains_only.t
|
||||
t/uribl_ips_only.t
|
||||
t/urilocalbl.t
|
||||
t/utf8.t
|
||||
t/util_wrap.t
|
||||
t/welcomelist_addrs.t
|
||||
t/welcomelist_from.t
|
||||
t/welcomelist_subject.t
|
||||
t/welcomelist_to.t
|
||||
t/wlbl_uri.t
|
||||
t/whitelist_addrs.t
|
||||
t/whitelist_from.t
|
||||
t/whitelist_subject.t
|
||||
t/whitelist_to.t
|
||||
t/zz_cleanup.t
|
||||
t/spamc_bug6176.t
|
||||
t/data/spam/dnsbl_domsonly.eml
|
||||
t/uribl_domains_only.t
|
||||
t/data/spam/dnsbl_ipsonly.eml
|
||||
t/uribl_all_types.t
|
||||
t/uribl_ips_only.t
|
||||
t/uri_list.t
|
||||
t/dnsbl_subtests.t
|
||||
powered_by/128-powered-by-spamassassin.png
|
||||
powered_by/256-powered-by-spamassassin.png
|
||||
powered_by/512-powered-by-spamassassin.png
|
||||
|
@ -10,6 +10,7 @@
|
||||
\.pid$
|
||||
\.so$
|
||||
\.svn/
|
||||
\.gitattributes$
|
||||
\.gitignore$
|
||||
\.swp$
|
||||
\.tmp$
|
||||
@ -22,6 +23,7 @@
|
||||
\btmon\.out$
|
||||
\b[Oo][Ll][Dd]$
|
||||
\b[Oo][Uu][Tt]$
|
||||
^.github/
|
||||
^blib/
|
||||
^blibdirs$
|
||||
^build/3\.\d\.\d_change_summary$
|
||||
@ -63,7 +65,7 @@
|
||||
^t/bayessql\.cf$
|
||||
^t/config$
|
||||
^t/data/nice/cjk/
|
||||
^t/data/whitelists/
|
||||
^t/data/welcomelists/
|
||||
^t/do_net$
|
||||
^t/log/
|
||||
^t/rule_tests\.t$
|
||||
@ -118,7 +120,4 @@
|
||||
^build/rebuild_xt$
|
||||
^build/repackage_latest_update_rules$
|
||||
^MYMETA.(json|yml)$
|
||||
^trunk-only.*
|
||||
^t/mkrules\.t$
|
||||
^t/mkrules_else\.t$
|
||||
^t/spamc_H\.t$
|
||||
^textcat/
|
||||
|
@ -4,13 +4,14 @@
|
||||
"The Apache SpamAssassin Project <dev@spamassassin.apache.org>"
|
||||
],
|
||||
"dynamic_config" : 1,
|
||||
"generated_by" : "ExtUtils::MakeMaker version 6.68, CPAN::Meta::Converter version 2.120921",
|
||||
"generated_by" : "ExtUtils::MakeMaker version 7.62, CPAN::Meta::Converter version 2.150010",
|
||||
"license" : [
|
||||
"unknown",
|
||||
"apache_2_0"
|
||||
],
|
||||
"meta-spec" : {
|
||||
"url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec",
|
||||
"version" : "2"
|
||||
"version" : 2
|
||||
},
|
||||
"name" : "Mail-SpamAssassin",
|
||||
"no_index" : {
|
||||
@ -22,26 +23,72 @@
|
||||
"prereqs" : {
|
||||
"build" : {
|
||||
"requires" : {
|
||||
"ExtUtils::MakeMaker" : "0"
|
||||
"ExtUtils::MakeMaker" : "6.64"
|
||||
}
|
||||
},
|
||||
"configure" : {
|
||||
"requires" : {
|
||||
"ExtUtils::MakeMaker" : "6.64"
|
||||
}
|
||||
},
|
||||
"runtime" : {
|
||||
"recommends" : {
|
||||
"Archive::Zip" : "0",
|
||||
"BSD::Resource" : "0",
|
||||
"Compress::Zlib" : "0",
|
||||
"DBD::SQLite" : "1.5901",
|
||||
"DBI" : "0",
|
||||
"DB_File" : "0",
|
||||
"Email::Address::XS" : "0",
|
||||
"Encode::Detect::Detector" : "0",
|
||||
"Geo::IP" : "0",
|
||||
"IO::Socket::INET6" : "0",
|
||||
"IO::Socket::IP" : "0.09",
|
||||
"IO::Socket::SSL" : "1.76",
|
||||
"IO::String" : "0",
|
||||
"IP::Country::DB_File" : "0",
|
||||
"IP::Country::Fast" : "0",
|
||||
"LWP::UserAgent" : "0",
|
||||
"MIME::Base64" : "0",
|
||||
"Mail::DKIM" : "0.37",
|
||||
"Mail::DMARC" : "0",
|
||||
"Mail::SPF" : "0",
|
||||
"MaxMind::DB::Reader" : "0",
|
||||
"MaxMind::DB::Reader::XS" : "0",
|
||||
"Net::CIDR::Lite" : "0",
|
||||
"Net::LibIDN" : "0",
|
||||
"Net::LibIDN2" : "0",
|
||||
"Net::Patricia" : "1.16",
|
||||
"Net::SMTP" : "0",
|
||||
"Razor2::Client::Agent" : "2.61"
|
||||
},
|
||||
"requires" : {
|
||||
"Archive::Tar" : "1.23",
|
||||
"Digest::SHA1" : "0",
|
||||
"Errno" : "0",
|
||||
"File::Copy" : "2.02",
|
||||
"File::Spec" : "0.8",
|
||||
"HTML::Parser" : "3.43",
|
||||
"IO::Zlib" : "1.04",
|
||||
"Mail::DKIM" : "0.31",
|
||||
"Net::DNS" : "0.34",
|
||||
"Net::DNS" : "0.69",
|
||||
"NetAddr::IP" : "4.01",
|
||||
"Pod::Usage" : "1.1",
|
||||
"Sys::Hostname" : "0",
|
||||
"Test::More" : "0",
|
||||
"Time::HiRes" : "0",
|
||||
"Time::Local" : "0"
|
||||
"Time::Local" : "0",
|
||||
"perl" : "5.014"
|
||||
}
|
||||
},
|
||||
"test" : {
|
||||
"recommends" : {
|
||||
"Net::DNS::Nameserver" : "0"
|
||||
},
|
||||
"requires" : {
|
||||
"Devel::Cycle" : "0",
|
||||
"Perl::Critic::Policy::Perlsecret" : "0",
|
||||
"Perl::Critic::Policy::TestingAndDebugging::ProhibitNoStrict" : "0",
|
||||
"Test::More" : "0",
|
||||
"Text::Diff" : "0"
|
||||
}
|
||||
}
|
||||
},
|
||||
@ -57,5 +104,6 @@
|
||||
},
|
||||
"x_MailingList" : "http://wiki.apache.org/spamassassin/MailingLists"
|
||||
},
|
||||
"version" : "3.004006"
|
||||
"version" : "4.000000",
|
||||
"x_serialization_backend" : "JSON::PP version 4.06"
|
||||
}
|
||||
|
@ -3,37 +3,73 @@ abstract: 'Apache SpamAssassin is an extensible email filter which is used to id
|
||||
author:
|
||||
- 'The Apache SpamAssassin Project <dev@spamassassin.apache.org>'
|
||||
build_requires:
|
||||
ExtUtils::MakeMaker: 0
|
||||
Devel::Cycle: '0'
|
||||
ExtUtils::MakeMaker: '6.64'
|
||||
Perl::Critic::Policy::Perlsecret: '0'
|
||||
Perl::Critic::Policy::TestingAndDebugging::ProhibitNoStrict: '0'
|
||||
Test::More: '0'
|
||||
Text::Diff: '0'
|
||||
configure_requires:
|
||||
ExtUtils::MakeMaker: '6.64'
|
||||
dynamic_config: 1
|
||||
generated_by: 'ExtUtils::MakeMaker version 6.68, CPAN::Meta::Converter version 2.120921'
|
||||
license: apache
|
||||
generated_by: 'ExtUtils::MakeMaker version 7.62, CPAN::Meta::Converter version 2.150010'
|
||||
license: unknown
|
||||
meta-spec:
|
||||
url: http://module-build.sourceforge.net/META-spec-v1.4.html
|
||||
version: 1.4
|
||||
version: '1.4'
|
||||
name: Mail-SpamAssassin
|
||||
no_index:
|
||||
directory:
|
||||
- t
|
||||
- inc
|
||||
recommends:
|
||||
Archive::Zip: '0'
|
||||
BSD::Resource: '0'
|
||||
Compress::Zlib: '0'
|
||||
DBD::SQLite: '1.5901'
|
||||
DBI: '0'
|
||||
DB_File: '0'
|
||||
Email::Address::XS: '0'
|
||||
Encode::Detect::Detector: '0'
|
||||
Geo::IP: '0'
|
||||
IO::Socket::INET6: '0'
|
||||
IO::Socket::IP: '0.09'
|
||||
IO::Socket::SSL: '1.76'
|
||||
IO::String: '0'
|
||||
IP::Country::DB_File: '0'
|
||||
IP::Country::Fast: '0'
|
||||
LWP::UserAgent: '0'
|
||||
MIME::Base64: '0'
|
||||
Mail::DKIM: '0.37'
|
||||
Mail::DMARC: '0'
|
||||
Mail::SPF: '0'
|
||||
MaxMind::DB::Reader: '0'
|
||||
MaxMind::DB::Reader::XS: '0'
|
||||
Net::CIDR::Lite: '0'
|
||||
Net::LibIDN: '0'
|
||||
Net::LibIDN2: '0'
|
||||
Net::Patricia: '1.16'
|
||||
Net::SMTP: '0'
|
||||
Razor2::Client::Agent: '2.61'
|
||||
requires:
|
||||
Archive::Tar: 1.23
|
||||
Digest::SHA1: 0
|
||||
Errno: 0
|
||||
File::Copy: 2.02
|
||||
File::Spec: 0.8
|
||||
HTML::Parser: 3.43
|
||||
IO::Zlib: 1.04
|
||||
Mail::DKIM: 0.31
|
||||
Net::DNS: 0.34
|
||||
NetAddr::IP: 4.01
|
||||
Pod::Usage: 1.1
|
||||
Sys::Hostname: 0
|
||||
Test::More: 0
|
||||
Time::HiRes: 0
|
||||
Time::Local: 0
|
||||
Archive::Tar: '1.23'
|
||||
Errno: '0'
|
||||
File::Copy: '2.02'
|
||||
File::Spec: '0.8'
|
||||
HTML::Parser: '3.43'
|
||||
IO::Zlib: '1.04'
|
||||
Mail::DKIM: '0.31'
|
||||
Net::DNS: '0.69'
|
||||
NetAddr::IP: '4.01'
|
||||
Pod::Usage: '1.1'
|
||||
Sys::Hostname: '0'
|
||||
Time::HiRes: '0'
|
||||
Time::Local: '0'
|
||||
perl: '5.014'
|
||||
resources:
|
||||
MailingList: http://wiki.apache.org/spamassassin/MailingLists
|
||||
homepage: https://spamassassin.apache.org/
|
||||
license: http://www.apache.org/licenses/LICENSE-2.0.html
|
||||
repository: http://svn.apache.org/repos/asf/spamassassin/
|
||||
x_MailingList: http://wiki.apache.org/spamassassin/MailingLists
|
||||
version: 3.004006
|
||||
version: '4.000000'
|
||||
x_serialization_backend: 'CPAN::Meta::YAML version 0.018'
|
||||
|
@ -1,19 +1,14 @@
|
||||
#!/usr/bin/perl
|
||||
require 5.6.1;
|
||||
require v5.14.0;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use Config;
|
||||
|
||||
use ExtUtils::MakeMaker;
|
||||
use ExtUtils::MakeMaker 6.64;
|
||||
|
||||
# avoid stupid 'Argument "6.30_01" isn't numeric in numeric ge (>=)' warnings;
|
||||
my $mm_version = eval $ExtUtils::MakeMaker::VERSION;
|
||||
|
||||
# raising the version of makemaker to 6.46 per bug 6598 & 6131
|
||||
if ($mm_version < 6.46) {
|
||||
die "Apache SpamAssassin Makefile.PL requires at least ExtUtils::MakeMaker v6.46";
|
||||
}
|
||||
# raising the version of makemaker to 6.64 to use TEST_REQUIRES
|
||||
use constant MIN_MAKEMAKER_VERSION => 6.64;
|
||||
|
||||
use constant RUNNING_ON_WINDOWS => ($^O =~ /^(mswin|dos|os2)/oi);
|
||||
use constant HAS_DBI => eval { require DBI; };
|
||||
@ -177,9 +172,7 @@ my %makefile = (
|
||||
},
|
||||
|
||||
# be quite explicit about this; afaik CPAN.pm is sensible using this
|
||||
# also see CURRENT_PM below
|
||||
'PREREQ_PM' => {
|
||||
'Digest::SHA1' => 0, # 2.0 is oldest tested version
|
||||
'File::Spec' => 0.8, # older versions lack some routines we need
|
||||
'File::Copy' => 2.02, # this version is shipped with 5.005_03, the oldest version known to work
|
||||
'Pod::Usage' => 1.10, # all versions prior to this do seem to be buggy
|
||||
@ -187,18 +180,37 @@ my %makefile = (
|
||||
'Archive::Tar' => 1.23, # for sa-update
|
||||
'IO::Zlib' => 1.04, # for sa-update
|
||||
'Mail::DKIM' => 0.31,
|
||||
'Net::DNS' => (RUNNING_ON_WINDOWS ? 0.46 : 0.34), # bugs in older revs
|
||||
'Net::DNS' => 0.69,
|
||||
'NetAddr::IP' => 4.010,
|
||||
'Sys::Hostname' => 0,
|
||||
'Test::More' => 0,
|
||||
'Time::HiRes' => 0,
|
||||
'Time::Local' => 0,
|
||||
'Errno' => 0,
|
||||
},
|
||||
|
||||
# In case MIN_MAKEMAKER_VERSION is greater than the version bundled in the core of MIN_PERL_VERSION
|
||||
# use this to ensure CPAN will automatically upgrade MakeMaker if needed
|
||||
'BUILD_REQUIRES' => {
|
||||
'ExtUtils::MakeMaker' => MIN_MAKEMAKER_VERSION,
|
||||
},
|
||||
|
||||
'CONFIGURE_REQUIRES' => {
|
||||
'ExtUtils::MakeMaker' => MIN_MAKEMAKER_VERSION,
|
||||
},
|
||||
|
||||
# The modules that are not core that are used in default tests
|
||||
'TEST_REQUIRES' => {
|
||||
'Devel::Cycle' => 0,
|
||||
'Test::More' => 0,
|
||||
'Text::Diff' => 0,
|
||||
'Perl::Critic::Policy::TestingAndDebugging::ProhibitNoStrict' => 0,
|
||||
'Perl::Critic::Policy::Perlsecret' => 0,
|
||||
},
|
||||
|
||||
'dist' => {
|
||||
COMPRESS => 'gzip -9f',
|
||||
SUFFIX => 'gz',
|
||||
SUFFIX => '.gz',
|
||||
TARFLAGS => 'cf',
|
||||
DIST_DEFAULT => 'tardist',
|
||||
|
||||
CI => 'svn commit',
|
||||
@ -228,12 +240,8 @@ my %makefile = (
|
||||
|
||||
'rules/*.pm',
|
||||
|
||||
# don't remove these. they are built from 'rulesrc' in SVN, but
|
||||
# in a distribution tarball, they're not
|
||||
# 'rules/70_sandbox.cf',
|
||||
# 'rules/72_active.cf',
|
||||
|
||||
# this file is no longer built, or used
|
||||
'rules/70_sandbox.cf',
|
||||
'rules/72_active.cf',
|
||||
'rules/70_inactive.cf',
|
||||
|
||||
)
|
||||
@ -246,6 +254,7 @@ my %makefile = (
|
||||
# We have only this Makefile.PL and this option keeps MakeMaker from
|
||||
# asking all questions twice after a 'make dist*'.
|
||||
'NORECURS' => 1,
|
||||
'MIN_PERL_VERSION'=> 5.014000,
|
||||
);
|
||||
|
||||
# rules/72_active.cf is built from "rulesrc", but *must* exist before
|
||||
@ -254,14 +263,6 @@ my @FILES_THAT_MUST_EXIST = qw(
|
||||
rules/72_active.cf
|
||||
);
|
||||
|
||||
# make sure certain optional modules are up-to-date if they are installed
|
||||
# also see PREREQ_PM above
|
||||
my %CURRENT_PM = (
|
||||
'Net::DNS' => (RUNNING_ON_WINDOWS ? 0.46 : 0.34),
|
||||
'Razor2::Client::Agent' => 2.40,
|
||||
);
|
||||
|
||||
|
||||
# All the $(*MAN1*) stuff is empty/zero if Perl was Configured with -Dman1dir=none;
|
||||
# however, support site/vendor man1 dirs (bug 5338)
|
||||
unless($Config{installman1dir}
|
||||
@ -324,11 +325,13 @@ See that file if you wish to customize what tests are run, and how.
|
||||
|
||||
# check optional module versions
|
||||
use lib 'lib';
|
||||
use Mail::SpamAssassin::Util::DependencyInfo;
|
||||
if (Mail::SpamAssassin::Util::DependencyInfo::long_diagnostics() != 0) {
|
||||
# missing required module? die!
|
||||
# bug 5908: http://cpantest.grango.org/wiki/CPANAuthorNotes says
|
||||
# we should exit with a status of 0, but without creating Makefile
|
||||
require Mail::SpamAssassin::Util::DependencyInfo;
|
||||
if (Mail::SpamAssassin::Util::DependencyInfo::long_diagnostics(1) != 0) {
|
||||
# This prints a full report of missing required and optional modules and binaries
|
||||
# but only exit 0 without creating Makefile if there are missing required binaries.
|
||||
# See http://cpantest.grango.org/wiki/CPANAuthorNotes
|
||||
# Continuing when there are missing required CPAN modules allows cpan to install them
|
||||
# before it runs make on the Makefile
|
||||
exit 0;
|
||||
}
|
||||
|
||||
@ -362,26 +365,45 @@ $makefile{META_MERGE} = {
|
||||
MailingList => 'http://wiki.apache.org/spamassassin/MailingLists',
|
||||
},
|
||||
|
||||
prereqs => {
|
||||
runtime => {
|
||||
recommends => {
|
||||
'MIME::Base64' => 0,
|
||||
'DB_File' => 0,
|
||||
'Net::SMTP' => 0,
|
||||
'Net::LibIDN2' => 0,
|
||||
'Net::LibIDN' => 0,
|
||||
'Mail::SPF' => 0,
|
||||
'MaxMind::DB::Reader' => 0,
|
||||
'MaxMind::DB::Reader::XS' => 0,
|
||||
'Geo::IP' => 0,
|
||||
'IP::Country::DB_File' => 0,
|
||||
'IP::Country::Fast' => 0,
|
||||
'Razor2::Client::Agent' => 2.61,
|
||||
'Net::Ident' => 0,
|
||||
'IO::Socket::IP' => 0.09,
|
||||
'IO::Socket::INET6' => 0,
|
||||
'IO::Socket::SSL' => 1.76,
|
||||
'Compress::Zlib' => 0,
|
||||
'Mail::DKIM' => 0.37,
|
||||
'DBI' => 0,
|
||||
'Getopt::Long' => 2.32,
|
||||
'DBD::SQLite' => 1.59_01,
|
||||
'LWP::UserAgent' => 0,
|
||||
'HTTP::Date' => 0,
|
||||
'Archive::Tar' => 1.23,
|
||||
'IO::Zlib' => 1.04,
|
||||
'Encode::Detect' => 0
|
||||
}
|
||||
'Encode::Detect::Detector' => 0,
|
||||
'Net::Patricia' => 1.16,
|
||||
'Net::CIDR::Lite' => 0,
|
||||
'BSD::Resource' => 0,
|
||||
'Archive::Zip' => 0,
|
||||
'IO::String' => 0,
|
||||
'Email::Address::XS' => 0,
|
||||
'Mail::DMARC' => 0,
|
||||
},
|
||||
},
|
||||
test => {
|
||||
recommends => {
|
||||
'Net::DNS::Nameserver' => 0,
|
||||
},
|
||||
},
|
||||
},
|
||||
};
|
||||
|
||||
#######################################################################
|
||||
@ -390,7 +412,7 @@ $makefile{META_MERGE} = {
|
||||
$makefile{EXE_FILES} = [ values %{$makefile{EXE_FILES}} ];
|
||||
$makefile{AUTHOR} =~ s/(<.+) at (.+>)/$1\@$2/;
|
||||
WriteMakefile(%makefile);
|
||||
print "Makefile written by ExtUtils::MakeMaker $mm_version\n";
|
||||
print "Makefile written by ExtUtils::MakeMaker $ExtUtils::MakeMaker::VERSION\n";
|
||||
|
||||
#######################################################################
|
||||
|
||||
@ -795,7 +817,14 @@ sub _set_macro_PERL_TAINT {
|
||||
sub _set_macro_PREPROCESS {
|
||||
|
||||
return if get_macro('PREPROCESS');
|
||||
set_macro('PREPROCESS', join(' ', macro_ref('PERL_BIN'), qq{build/preprocessor}));
|
||||
# Bug 8038 - work around quirk of newer Extutils::MakeMaker on Windows with dmake
|
||||
my $perl_bin = get_expanded_macro('FULLPERL');
|
||||
if ($RUNNING_ON_WINDOWS and ($::Config{make} eq 'dmake') and ($perl_bin =~ /^([a-zA-Z]:)?\\"(.*)$/)) {
|
||||
$perl_bin = "\"$1\\$2";
|
||||
} else {
|
||||
$perl_bin = macro_ref('PERL_BIN');
|
||||
}
|
||||
set_macro('PREPROCESS', join(' ', $perl_bin, qq{build/preprocessor}));
|
||||
}
|
||||
|
||||
# This routine sets the value for CONFIGURE (spamc only).
|
||||
@ -1131,6 +1160,7 @@ conf__install:
|
||||
$(PERL) -MFile::Copy -e "copy(q[rules/v341.pre], q[$(B_CONFDIR)/v341.pre]) unless -f q[$(B_CONFDIR)/v341.pre]"
|
||||
$(PERL) -MFile::Copy -e "copy(q[rules/v342.pre], q[$(B_CONFDIR)/v342.pre]) unless -f q[$(B_CONFDIR)/v342.pre]"
|
||||
$(PERL) -MFile::Copy -e "copy(q[rules/v343.pre], q[$(B_CONFDIR)/v343.pre]) unless -f q[$(B_CONFDIR)/v343.pre]"
|
||||
$(PERL) -MFile::Copy -e "copy(q[rules/v400.pre], q[$(B_CONFDIR)/v400.pre]) unless -f q[$(B_CONFDIR)/v400.pre]"
|
||||
|
||||
data__install:
|
||||
-$(MKPATH) $(B_DATADIR)
|
||||
|
@ -289,25 +289,25 @@ Resources
|
||||
---------
|
||||
|
||||
[BUGZILLA] SpamAssassin bug database:
|
||||
<http://issues.apache.org/SpamAssassin/>
|
||||
<https://issues.apache.org/SpamAssassin/>
|
||||
|
||||
[DEBPERL] Debian Perl Policy, Chapter 3: Packaged Modules:
|
||||
<http://www.debian.org/doc/packaging-manuals/perl-policy/ch-module_packages.html>
|
||||
<https://www.debian.org/doc/packaging-manuals/perl-policy/ch-module_packages.html>
|
||||
|
||||
[GNUMAKECMD] GNU make manual: Make Conventions: Variables for Specifying
|
||||
Commands
|
||||
<http://www.gnu.org/manual/make-3.79.1/html_chapter/make_14.html#SEC119>
|
||||
<https://www.gnu.org/software/make/manual/html_node/Command-Variables.html#Command-Variables>
|
||||
|
||||
[MANEUMM616] The man page for ExtUtils::MakeMaker 6.16:
|
||||
<http://search.cpan.org/author/MSCHWERN/ExtUtils-MakeMaker-6.16/lib/ExtUtils/MakeMaker.pm#Default_Makefile_Behaviour>
|
||||
<https://search.cpan.org/author/MSCHWERN/ExtUtils-MakeMaker-6.16/lib/ExtUtils/MakeMaker.pm#Default_Makefile_Behaviour>
|
||||
|
||||
[MM00779] makemaker-at-perl-dot-org: Michael G Schwern: "Re: MakeMaker
|
||||
problems with relocation" (PREFIX was broken):
|
||||
<http://www.mail-archive.com/makemaker@perl.org/msg00779.html>
|
||||
<https://www.mail-archive.com/makemaker@perl.org/msg00779.html>
|
||||
|
||||
[P5P94113] perl5-porters: Michael G Schwern: "Re: OS X's vendorlib default
|
||||
seems wrong" (description of different repositoreis):
|
||||
<http://archive.develooper.com/perl5-porters@perl.org/msg94113.html>
|
||||
<https://archive.develooper.com/perl5-porters@perl.org/msg94113.html>
|
||||
|
||||
[RHBUG78053] Red Hat bug 78053: "incompatible changes in behavior of
|
||||
MakeMaker; affects rpm build process" (introduction of DESTDIR):
|
||||
|
@ -14,7 +14,7 @@ filtering, DNS blocklists, and collaborative filtering databases.
|
||||
Apache SpamAssassin is a project of the Apache Software Foundation (ASF).
|
||||
|
||||
|
||||
What Apache SpamAssassin is Not
|
||||
What Apache SpamAssassin Is Not
|
||||
-------------------------------
|
||||
|
||||
Apache SpamAssassin is not a program to delete spam, route spam and ham to
|
||||
@ -71,9 +71,9 @@ spamassassin-users mailing list[2]. If you've found a bug (and you're
|
||||
sure it's a bug after checking the Wiki), please file a report in our
|
||||
Bugzilla[3].
|
||||
|
||||
[1]: http://wiki.apache.org/spamassassin/
|
||||
[2]: http://wiki.apache.org/spamassassin/MailingLists
|
||||
[3]: http://issues.apache.org/SpamAssassin/
|
||||
[1]: https://wiki.apache.org/spamassassin/
|
||||
[2]: https://wiki.apache.org/spamassassin/MailingLists
|
||||
[3]: https://issues.apache.org/SpamAssassin/
|
||||
|
||||
Please also be sure to read the man pages.
|
||||
|
||||
@ -151,7 +151,7 @@ default locations that Apache SpamAssassin will look at the end.
|
||||
- $USER_HOME/.spamassassin:
|
||||
|
||||
User state directory. Used to hold spamassassin state, such
|
||||
as a per-user automatic whitelist, and the user's preferences
|
||||
as a per-user automatic welcomelist, and the user's preferences
|
||||
file.
|
||||
|
||||
- $USER_HOME/.spamassassin/user_prefs:
|
||||
@ -298,7 +298,7 @@ these translations, so that they can be added to the
|
||||
distribution. Please file a bug in our Bugzilla[4], and attach your
|
||||
translations. You will, of course, be credited for this work!
|
||||
|
||||
[4]: http://issues.apache.org/SpamAssassin/
|
||||
[4]: https://issues.apache.org/SpamAssassin/
|
||||
|
||||
|
||||
Disabled code
|
||||
|
848
upstream/UPGRADE
848
upstream/UPGRADE
@ -1,607 +1,259 @@
|
||||
Note for Users Upgrading to SpamAssassin 3.4.5
|
||||
Note for Users Upgrading to SpamAssassin 4.0.0
|
||||
----------------------------------------------
|
||||
|
||||
- Spamassassin test suite can now run against the installed
|
||||
SpamAssassin files (rather than those in the source directory)
|
||||
Apache SpamAssassin 4.0.0 represents years of work by the project with
|
||||
numerous improvements, new rule types, and internal native handling
|
||||
of messages in international languages. We highly recommend looking
|
||||
through this file and all of the .pre files to evaluate your
|
||||
configuration thoroughly. Plugins have been added, removed, and
|
||||
improved throughout.
|
||||
|
||||
- All rules, functions, command line options and modules that contain
|
||||
"whitelist" or "blacklist" have been renamed to contain more
|
||||
racially neutral "welcomelist" and "blocklist" terms. This allows
|
||||
acronyms like WL and BL to remain the same. Previous options will
|
||||
continue work at least until version 4.1.0 is released. If you have
|
||||
local settings including scores or meta rules referring to old rule
|
||||
names, these should be changed and "enable_compat
|
||||
welcomelist_blocklist" added in init.pre. See:
|
||||
https://wiki.apache.org/spamassassin/WelcomelistBlocklist (Bug 7826)
|
||||
|
||||
- Meta rules no longer use priority values, they are evaluated
|
||||
dynamically when the rules they depend on are finished. (Bug 7735)
|
||||
|
||||
- API: New $pms->rule_ready() function. Any asynchronous eval-function
|
||||
must now return undef (instead of 0 or 1), if rule result is not
|
||||
ready when exiting the function. $pms->rule_ready($rulename) or
|
||||
$pms->got_hit(...) must be called when the result has arrived. If
|
||||
these are not used, it can break depending meta rule evaluation.
|
||||
|
||||
- Setting normalize_charset is now enabled by default. Note that rules
|
||||
should not expect specific non-UTF8 or UTF8 encoding in
|
||||
body. Matching is done against the raw data which may vary depending
|
||||
on normalize_charset setting and whether decoding to UTF8 was
|
||||
successful. See:
|
||||
https://wiki.apache.org/spamassassin/WritingRulesAdvanced
|
||||
|
||||
- DKIM plugin has added support for ARC signature verification
|
||||
|
||||
- The DecodeShortURL plugin has been added and decodes URIs from URL
|
||||
shorteners that may be used to evade scanning
|
||||
|
||||
- Strings can now be captured from rules and later reused using the
|
||||
special %{TAGNAME} syntax
|
||||
|
||||
- The Bayes stopwords, or noise words, are now configurable in order
|
||||
to optimize Bayes usage for non-English languages. Stopwords for 16
|
||||
foreign languages have been included. See 60_bayes_stopwords.cf in
|
||||
the rules files. See Mail::SpamAssassin::Plugin::Bayes and the
|
||||
bayes_stopword_languages option if you wish to use a different
|
||||
stopword list. This is highly recommended if you are using Bayes and
|
||||
you are processing messages in languages other than English.
|
||||
|
||||
- The OLEVBMacro plugin has been improved to identify more macros
|
||||
while also extracting uris from the attachments for automatic
|
||||
inclusion in RBL lookups
|
||||
|
||||
- Internationalized domain name (IDN) support has been added and
|
||||
requires Net::LibIDN2 or Net::LibIDN module with a new
|
||||
Util::idn_to_ascii() function. (Bug 7215)
|
||||
|
||||
- Improved internal header address (From/To/Cc) parser, now also
|
||||
handles multiple addresses and includes optional support for
|
||||
external Email::Address::XS parser, which can handle nested comments
|
||||
and other oddities.
|
||||
|
||||
- Header :addr :name modifiers now return all addresses. Options of
|
||||
:first :last select only first (topmost) or last header to process
|
||||
when there are multiple headers with the same name. :addr and :name
|
||||
may still return multiple values from a single header.
|
||||
|
||||
- API: $pms->get() can and should now be called in list
|
||||
context. Scalar context continues to return multiple values newline
|
||||
separated, but this should be considered deprecated.
|
||||
|
||||
- New ExtractText plugin that extracts text from documents or images
|
||||
to feed the data into SpamAssassin for standard processing with
|
||||
existing rules, URIs extracted from documents will fall into normal
|
||||
RBL lookups.
|
||||
|
||||
- New "nolog" tflag added to hide info coming from rules in
|
||||
SpamAssassin reports
|
||||
|
||||
- All log output (stderr, file, syslog) is now escaped properly for \r
|
||||
\n \t \\, control chars, DEL, and UTF-8 sequences presented as
|
||||
\x{XX}. Whitespace is not normalized anymore like in versions prior
|
||||
to 4.0.0.
|
||||
|
||||
- API: Logger::add() has new optional 'escape' parameter. New
|
||||
Logger::escape_str() function.
|
||||
|
||||
- API: New $pms->add_uri_detail_list() function. Also new
|
||||
uri_detail_list types: unlinked, schemeless
|
||||
|
||||
- Util::split_domain, trim_domain, and is_domain_valid functions have
|
||||
a new optional argument ($is_ascii)
|
||||
|
||||
- Header names support new :host :domain :ip :revip modifiers
|
||||
|
||||
- AskDNS: tag HEADER(hdrname) supported to query any header content
|
||||
similarly to header rules
|
||||
|
||||
- The HashCash module and support has been removed completely, as it
|
||||
has been long since deprecated
|
||||
|
||||
- URILocalBL: uri_block_cc/uri_block_cont now support negation (Bug
|
||||
7528)
|
||||
|
||||
- URILocalBL: IPv6 lookups for hosts is now support, if provided by
|
||||
your database
|
||||
|
||||
- DNS and other asynchronous lookups such as Pyzor and DCC are now
|
||||
only launched when priority -100 is reached. This allows short
|
||||
circuiting at a lower priority without sending unneeded DNS queries
|
||||
and starting process forms. (Bug 5930)
|
||||
|
||||
- API: New plugin method callback method check_dnsbl added to launch
|
||||
network lookups at priority -100 and check_post_dnsbl to harvest own
|
||||
network lookups
|
||||
|
||||
- API: New plugin callback method check_cleanup for cleaning up
|
||||
things...
|
||||
|
||||
- FreeMail: new options freemail_import_welcomelist_auth and
|
||||
freemail_import_def_welcomelist_auth added (Bug 6451)
|
||||
|
||||
- New internal Mail::SpamAssassin::GeoDB module that provides a
|
||||
unified interface to modules MaxMind::DB::Reader (GeoIP2), Geo::IP,
|
||||
IP::Country::DB_File, and IP::Country::Fast.
|
||||
|
||||
This is utilized by RelayCountry and URILocalBL with settings
|
||||
geodb_module, geodb_options, and geodb_search_path.
|
||||
|
||||
Deprecated settings still work such as country_db_type,
|
||||
country_db_path, uri_country_db_path, and uri_country_db_isp_path
|
||||
but will print a warning to migrate to geodb_module/options.
|
||||
|
||||
- Razor2 razor_fork option added to create separate Razor2 processes
|
||||
and read in the results later asynchronously, increasing throughput,
|
||||
and automatically adjusting rule priorities to -100.
|
||||
|
||||
- DCC checks are now done asynchronously if using dccifd, improving
|
||||
throughput. With dccifd, rule priorities are automatically adjusted
|
||||
to -100. Commercial reputation rules can be ignored with the option
|
||||
"use_dcc_rep 0" to save a few CPU cycles.
|
||||
|
||||
- Pyzor pyzor_fork option added to create separate Pyzor processes and
|
||||
read in the results later asynchronously, increasing throughput, and
|
||||
automatically adjusting rule priorities to -100. Renamed pyzor_max
|
||||
setting to pyzor_count_min. Added pyzor_welcomelist_min and
|
||||
pyzor_welcomelist_factor setting. Also try to improve false
|
||||
positives by ignoring "empty body" messages.
|
||||
|
||||
- API: deprecated $pms->register_async_rule_start() and
|
||||
$pms->register_async_rule_finish() calls though left in for
|
||||
backwards compatibility. Plugins should only use
|
||||
$pms->bgsend_and_start_lookup(), which handles required things
|
||||
Automatically. Direct calls to bgsend or start_lookup should not be
|
||||
used. $pms->bgsend_and_start_lookup() should always contain
|
||||
$ent->{rulename} for correct meta dependency handling. Deprecated
|
||||
start_lookup, get_lookup, lookup_ns, harvest_until_rule_completes,
|
||||
and is_rule_complete.
|
||||
|
||||
- SPF: Mail::SPF is now the only supported perl module and
|
||||
Mail::SPF::Query is deprecated along with the settings
|
||||
do_not_use_mail_spf, and do_not_use_mail_spf_query. SPF lookups are
|
||||
not done asynchronously so using an MTA filter such as pypolicyd-spf
|
||||
or spf-engine can generate Received-SPF for SpamAssassin to parse.
|
||||
|
||||
- unwhitelist_auth now also removes def_whitelist_auth entries
|
||||
- "ALL" pseudo-header now returns decoded headers, so it's usage is
|
||||
consistent with single header matching. Using the :raw option mimics
|
||||
the previous behavior of with undecoded and folded headers.
|
||||
|
||||
- SPF: add unwhitelist_from_spf to remove both whitelist_from_spf and
|
||||
def_whitelist_from_spf entries
|
||||
- New dns_block_rule option handles blocked DNSBLs (Bug 6728)
|
||||
|
||||
- Default SQL schema for userpref.value changed from varchar(100) to
|
||||
varchar(255), no need to modify unless you hit the limit. (Bug 7803)
|
||||
- ASN: Support GeoDB for ASN lookups (asn_use_geodb, asn_prefer_geodb,
|
||||
asn_use_dns).
|
||||
|
||||
- URIDetail can now match full hostname with "host" key
|
||||
- ASN: Default sa-update ruleset doesn't make ASN lookups or add
|
||||
headers anymore. Configure desired methods, asn_use_geodb or
|
||||
asn_use_dns, and add_header clauses manually as described in the
|
||||
plugin documentation. Usage of asn_use_geodb without DNS is
|
||||
recommended unless ASNCIDR is needed. Do not use rules that check
|
||||
metadata X-ASN header! Only the new eval function check_asn()
|
||||
described in plugin manual works reliably.
|
||||
|
||||
- BodyEval: plaintext_body_sig_ratio: eval rules for the (first text/plain
|
||||
MIME part's) body and signature lengths and ratio
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.4
|
||||
----------------------------------------------
|
||||
|
||||
- FromNameSpoof: fns_extrachars parameter default value has been increased to 50
|
||||
|
||||
- nosubject and maxhits tflags now work correctly with sa-compile
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.3
|
||||
----------------------------------------------
|
||||
|
||||
- New subjprefix keyword added, this can be used to add a prefix to
|
||||
email Subject if the original email matches a particular rule
|
||||
|
||||
- New Util::is_fqdn_valid() function to validate hostname (DNS name) format
|
||||
(Bug 7736). To check if a name contains valid TLD, it's still needed to
|
||||
additionally use RegistryBoundaries::is_domain_valid()
|
||||
|
||||
- New OLEVBMacro plugin to detect OLE Macro inside documents attached to emails,
|
||||
this plugin requires Archive::Zip and IO::String Perl modules to work.
|
||||
|
||||
- Due to the dangerous nature of sa-update --allowplugins option, it
|
||||
now prints a warning that --reallyallowplugins is required to use it.
|
||||
This is to make sure all the legacy installations and wiki guides etc
|
||||
still using it needlessly get fixed.
|
||||
|
||||
- TxRep and Awl plugins has been modified to be compatible
|
||||
with latest Postgresql versions.
|
||||
You should upgrade your sql database running the following command:
|
||||
MySQL:
|
||||
"ALTER TABLE `txrep` CHANGE `count` `msgcount` INT(11) NOT NULL DEFAULT '0';"
|
||||
"ALTER TABLE `awl` CHANGE `count` `msgcount` INT(11) NOT NULL DEFAULT '0';"
|
||||
"ALTER TABLE `awl` ADD last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;"
|
||||
PostgreSQL:
|
||||
"ALTER TABLE txrep RENAME COLUMN count TO msgcount;"
|
||||
"ALTER TABLE awl RENAME COLUMN count TO msgcount;"
|
||||
"ALTER TABLE awl ADD last_hit timestamp NOT NULL default CURRENT_TIMESTAMP;"
|
||||
|
||||
- body_part_scan_size 50000, rawbody_part_scan_size 500000 defaults added (Bug 6582)
|
||||
These enable safer and faster scanning of large emails.
|
||||
|
||||
- ALL pseudo-header now returns decoded headers, so it's usage is consistent
|
||||
with single header matching (:raw returns undecoded and folded like before).
|
||||
|
||||
- RegistryBoundaries did not load 20_aux_tlds.cf properly in older versions.
|
||||
Old hardcoded list is now removed and RB will print "no tlds defined, need
|
||||
to run sa-update" unless it can find list from config files.
|
||||
|
||||
- Deprecated functions: Parser::is_delimited_regexp_valid(),
|
||||
Parser::is_regexp_valid(), Util::regexp_remove_delimiters(),
|
||||
Util::make_qr(). These all are combined into new Util::compile_regexp().
|
||||
|
||||
- DNSEval: add check_rbl_headers to check specific headers in rbl
|
||||
|
||||
- DNSEval: add check_rbl_ns_from to check against an rbl for dns servers
|
||||
|
||||
- HashBL: Add check_hashbl_bodyre, check_hashbl_emails, check_hashbl_uris,
|
||||
hashbl_ignore
|
||||
|
||||
- ASN: Support IPv6 with asn_lookup_ipv6 (Bug 7211)
|
||||
|
||||
- sa-update: New option --httputil to force used download utility
|
||||
|
||||
- Add rules_matching() expression to meta rules
|
||||
|
||||
- Add tflags domains_only/ips_only to DNSEval.pm functions
|
||||
|
||||
- RelayCountry: Added new metadata: X-Spam-Countries-External (_RELAYCOUNTRYEXT_),
|
||||
X-Spam-Countries-Auth (_RELAYCOUNTRYAUTH_), X-Spam-Countries-All (_RELAYCOUNTRYALL_)
|
||||
|
||||
- New tflag "nosubject" for 'body' rules, to stop matching the Subject
|
||||
header which is part of the body text.
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.2
|
||||
----------------------------------------------
|
||||
|
||||
- We now support SHA-512 and SHA-256 signatures for our rules updates.
|
||||
|
||||
- We may stop producing SHA-1 signatures in the near future so upgrading
|
||||
to 3.4.2 is important. sa-update no longer uses these signatures.
|
||||
|
||||
See https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7614
|
||||
|
||||
- freemail_import_whitelist_auth, freemail_import_def_whitelist_auth added (Bug 6451)
|
||||
|
||||
|
||||
New plugins
|
||||
-----------
|
||||
|
||||
There are four new plugins added with this release:
|
||||
|
||||
Mail::SpamAssassin::Plugin::HashBL
|
||||
|
||||
The HashBL plugin is the interface to The Email Blocklist (EBL).
|
||||
The EBL is intended to filter spam that is sent from IP addresses
|
||||
and domains that cannot be blocked without causing significant
|
||||
numbers of false positives.
|
||||
|
||||
Mail::SpamAssassin::Plugin::ResourceLimits
|
||||
|
||||
This plugin leverages BSD::Resource to assure your spamd child processes
|
||||
do not exceed specified CPU or memory limit. If this happens, the child
|
||||
process will die. See the BSD::Resource for more details.
|
||||
|
||||
Mail::SpamAssassin::Plugin::FromNameSpoof
|
||||
|
||||
This plugin allows for detection of the From:name field being used to mislead
|
||||
recipients into thinking an email is from another address. The man page
|
||||
includes examples and we expect to put test rules for this plugin into
|
||||
rulesrc soon!
|
||||
|
||||
Mail::SpamAssassin::Plugin::Phishing
|
||||
|
||||
This plugin finds uris used in phishing campaigns detected by
|
||||
OpenPhish (https://openphish.com) or PhishTank (https://phishtank.com) feeds.
|
||||
|
||||
These plugins are disabled by default. To enable, uncomment
|
||||
the loadplugin configuration options in file v342.pre, or add it to
|
||||
some local .pre file such as local.pre .
|
||||
|
||||
Notable changes
|
||||
---------------
|
||||
|
||||
For security reasons SSLv3 support has been removed from spamc(1).
|
||||
|
||||
GeoIP2 support has been added to RelayCountry and URILocalBL plugins due
|
||||
to GeoIP legacy api deprecations.
|
||||
|
||||
New configuration options
|
||||
-------------------------
|
||||
|
||||
A new template tag _DKIMSELECTOR_ that maps to the DKIM selector (the 's' tag)
|
||||
from valid signatures has been added.
|
||||
|
||||
A 'uri_block_cont' option to URILocalBL plugin to score uris per continent has been added.
|
||||
Possible continent codes are:
|
||||
af, as, eu, na, oc, sa for Africa, Asia, Europe, North America,
|
||||
Oceania and South America.
|
||||
|
||||
The 'country_db_type' and 'country_db_path' options has been added to be able
|
||||
to choose in RelayCountry plugin between GeoIP legacy
|
||||
(discontinued from 04/01/2018), GeoIP2, IP::Country::Fast
|
||||
and IP::Country::DB_File.
|
||||
GeoIP legacy is still the default option but it will be deprecated
|
||||
in future releases.
|
||||
|
||||
A config option 'uri_country_db_path' has been added to be able to choose
|
||||
in URILocalBL plugin between GeoIP legacy and new GeoIP2 api.
|
||||
|
||||
A config option 'resource_limit_cpu' (default: 0 or no limit) has been added
|
||||
to configure how many cpu cycles are allowed on a child process before it dies.
|
||||
|
||||
A config option 'resource_limit_mem' (default: 0 or no limit) has been added
|
||||
to configure the maximum number of bytes of memory allowed both for
|
||||
(virtual) address space bytes and resident set size.
|
||||
|
||||
A new config option 'report_wrap_width' (default: 70) has been added
|
||||
to set the wrap width for description lines in the X-Spam-Report header.
|
||||
|
||||
Notable Internal changes
|
||||
------------------------
|
||||
|
||||
SpamAssassin can cope with new Net::DNS module versions.
|
||||
|
||||
The "bytes" pragma has been remove from both core modules and plugins for
|
||||
better utf-8 compatibility, there has been also some other utf-8 related fixes.
|
||||
|
||||
The spamc(1) client can now be build against OpenSSL 1.1.0.
|
||||
|
||||
The test framework has been switched to Test::More module.
|
||||
|
||||
Other updates
|
||||
-------------
|
||||
|
||||
A list of top-level domains in registrar boundaries was updated.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.1
|
||||
----------------------------------------------
|
||||
|
||||
- The TxRep plugin is now included and disabled by default for new installs.
|
||||
To replace an existing AWL configuration with TxRep, follow the steps below:
|
||||
- Disable AWL
|
||||
- Enable TxRep
|
||||
- Set txrep_factor equal to your previous AWL factor
|
||||
- Set use_txrep to 1
|
||||
|
||||
For more detailed information and more configuration options, consult the
|
||||
documentation in Mail::SpamAssassin::Plugin::TxRep.
|
||||
|
||||
- The $VALID_TLDS_RE global in registrar boundaries is deprecated but kept for
|
||||
third-party plugin compatibility. It will become increasingly out of date
|
||||
and may be removed in a future release.
|
||||
|
||||
See lib/Mail/SpamAssassin/Plugin/FreeMail.pm for an example of the new way
|
||||
to obtain a valid list of TLDs, i.e.
|
||||
|
||||
$self->{main}->{registryboundaries}->{valid_tlds_re}
|
||||
|
||||
- Mail::SpamAssassin::Util::RegistrarBoundaries is being replaced by
|
||||
Mail::SpamAssassin::RegistryBoundaries so that new TLDs can be updated via
|
||||
20_aux_tlds.cf delivered via sa-update.
|
||||
|
||||
***3rd Party Plugin Authors, Please Note***
|
||||
|
||||
The following functions will be removed in the next release after 3.4.1
|
||||
excepting any emergency break/fix releases immediately after 3.4.1:
|
||||
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::is_domain_valid
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::trim_domain
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::split_domain
|
||||
Mail::SpamAssassin::Util::uri_to_domain
|
||||
|
||||
And the following variables will also be removed:
|
||||
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::US_STATES
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::THREE_LEVEL_DOMAINS
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::TWO_LEVEL_DOMAINS
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::VALID_TLDS_RE
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries::VALID_TLDS
|
||||
|
||||
|
||||
This change should only affect 3rd party plugin authors who will need to
|
||||
update their code to utilize Mail::SpamAssassin::RegistryBoundaries
|
||||
instead of the functions and variables in
|
||||
Mail::SpamAssassin::Util::RegistrarBoundaries and the function
|
||||
Mail::SpamAssassin::Util::uri_to_domain which are deprecated and will be
|
||||
removed.
|
||||
|
||||
For example, the $VALID_TLDS_RE global in registrar boundaries is
|
||||
deprecated but kept for third-party plugin compatibility. It will become
|
||||
increasingly out of date and may be removed in a future release.
|
||||
|
||||
See lib/Mail/SpamAssassin/Plugin/FreeMail.pm for an example of the new way
|
||||
to obtain a valid list of TLDs, i.e.
|
||||
|
||||
$self->{main}->{registryboundaries}->{valid_tlds_re}
|
||||
|
||||
- It is now recommended that users uncomment "normalize_charset 1" in
|
||||
local.cf. It will break rules that depend on messages being in non-UTF8
|
||||
encodings, but going forward this will enable more robust unicode rules and
|
||||
will eventually become the default.
|
||||
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.4.0
|
||||
----------------------------------------------
|
||||
|
||||
- When Bayes classification is in use and messages are 'learned' as spam
|
||||
or ham and stored in a database, the Bayes plugin generates internal
|
||||
message IDs of learned messages and stores them in a 'seen' database to
|
||||
avoid re-learning duplicates and accidental un-learning messages that
|
||||
were not previously learned. With changes in bug 5185, the calculation
|
||||
of message IDs in a bayes 'seen' database has changed, so new code can
|
||||
no longer associate new messages with those learned before the change.
|
||||
|
||||
- Note that this change does not affect recognition of old tokens and the
|
||||
classification algorithm, only duplicate detection and unlearning of old
|
||||
messages is affected.
|
||||
|
||||
- Because of this change, if you use Bayes and you are upgrading from a
|
||||
version prior to 3.4.0, you may consider wiping your Bayes database
|
||||
and starting fresh.
|
||||
|
||||
- There is a new optional dependency on Net::Patricia to speed up lookups
|
||||
on internal_networks, trusted_networks or msa_networks when these lists
|
||||
contain a larger number of entries. Consider installing this module to
|
||||
speed up classification.
|
||||
|
||||
- The minimal required version of NetAddr::IP was bumped to 4.010
|
||||
|
||||
- In addition to existing backends, the 3.4.0 introduces support for keeping
|
||||
a Bayes database on a Redis server, either running locally, or accessed
|
||||
over network. Similar to SQL backends, the database may be concurrently
|
||||
used by several hosts running SpamAssassin.
|
||||
|
||||
- For more detail on these and other changes, please see the Announcement
|
||||
file at:
|
||||
http://svn.apache.org/repos/asf/spamassassin/branches/3.4/build/announcements/3.4.0.txt
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.3.0
|
||||
-----------------------------------------------
|
||||
|
||||
- Rules are no longer included with SpamAssassin "out of the box". You will
|
||||
need to immediately run "sa-update", or download the additional rules .tgz
|
||||
package and run "sa-update --install" with it, to get a ruleset.
|
||||
|
||||
- The BETA label has been taken off of the SpamAssassin SQL support. Please
|
||||
be aware that occasional changes may still be made to this area of the
|
||||
code. You should be sure to read this upgrade document each time you
|
||||
install a new version to determine if any SQL updates need to be made to
|
||||
your local installation.
|
||||
|
||||
- The DKIM plugin is now enabled by default for new installs, if the perl
|
||||
module Mail::DKIM is installed. However, installation of SpamAssassin
|
||||
will not overwrite existing .pre configuration files, so to use DKIM when
|
||||
upgrading from a previous release that did not use DKIM, a directive:
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::DKIM
|
||||
|
||||
will need to be uncommented in file "v312.pre", or added to some
|
||||
other .pre file, such as local.pre.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.2.0
|
||||
-----------------------------------------------
|
||||
|
||||
- The "127/8" network, including 127.0.0.1, is now always implicitly part of
|
||||
"trusted_networks" and "internal_networks". It's impossible to remove these
|
||||
from the trusted/internal sets, since if you cannot trust the host where
|
||||
SpamAssassin is running, you cannot trust SpamAssassin itself. If you
|
||||
previously had "trusted_networks" and "internal_networks" lines listing those
|
||||
hosts, you should now remove them, otherwise a minor (non-lint-error) warning
|
||||
will be output.
|
||||
|
||||
- For ISPs -- version 3.2.0 now includes a new way to specify Mail Submission
|
||||
Agents, relay hosts which accept mail from your own users and authenticates
|
||||
them appropriately. See the Mail::SpamAssassin::Conf manual page for the
|
||||
"msa_networks" setting.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.1.0
|
||||
-----------------------------------------------
|
||||
|
||||
- A significant amount of core functionality has been moved into
|
||||
plugins. These include, AWL (auto-whitelist), DCC, Pyzor, Razor2,
|
||||
SpamCop reporting and TextCat. For information on configuring these
|
||||
plugins please refer to their individual documentation:
|
||||
perldoc Mail::SpamAssassin::Plugin::* (ie AWL, DCC, etc)
|
||||
|
||||
- There are now multiple files read to enable plugins in the
|
||||
/etc/mail/spamassassin directory; previously only one, "init.pre" was
|
||||
read. Now both "init.pre", "v310.pre", and any other files ending
|
||||
in ".pre" will be read. As future releases are made, new plugins
|
||||
will be added to new files named according to the release they're
|
||||
added in.
|
||||
|
||||
- Due to license restrictions the DCC and Razor2 plugins are disabled
|
||||
by default. We encourage you to read the appropriate license
|
||||
yourself and decide if you are able to re-enable the plugins for
|
||||
your site.
|
||||
|
||||
- The use_auto_whitelist config option has been moved to a user config
|
||||
option, this allows individual users to turn on or off whitelisting
|
||||
regardless of what is set in the system config. If you would like to
|
||||
disable AWL (auto-whitelist) on a site-wide basis, then you can comment
|
||||
out the plugin in "v310.pre".
|
||||
|
||||
- The bayes_auto_learn_threshold_* config options for bayes have moved
|
||||
to a plugin. In general the default should work just fine however
|
||||
if you are interested in changing these values you should see:
|
||||
perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold
|
||||
|
||||
- The AWL support for NDBM_File databases has been dropped, due to a
|
||||
bug in that package which renders it useless (bug 4353). It appears
|
||||
that SDBM_File, the package which will be used instead, is fully
|
||||
compatible with NDBM however, so this should be unnoticeable.
|
||||
|
||||
- The prefork algorithm for spamd has been changed. In this version spamd
|
||||
will attempt to keep a small number of "hot" child processes as busy as
|
||||
possible, and keep any others as idle as possible, using something
|
||||
similar to the Apache httpd server scaling algorithm. This reduces
|
||||
memory usage and swapping. You can use the --round-robin switch for
|
||||
spamd to disable this scaling algorithm, and the behaviour seen in the
|
||||
3.0.x versions will be used instead, where all processes receive an
|
||||
equal load and no scaling takes place.
|
||||
|
||||
- As of 3.1.0, in addition to the generic BayesSQL support (via
|
||||
Mail::SpamAssassin::BayesStore::SQL) usable by multiple database
|
||||
drivers there is now specific support for MySQL 4.1+ and
|
||||
PostgreSQL. This support is based on non-standard features present
|
||||
in both database servers that allow for various performance boosts.
|
||||
|
||||
If you were using the previous BayesSQL support with MySQL, and
|
||||
already have MySQL 4.1+ installed you can begin using the new module
|
||||
immediately by replacing the bayes_store_module line in your
|
||||
configuration with: Mail::SpamAssassin::BayesStore::MySQL
|
||||
|
||||
We do however recommend that you switch your MySQL tables over to
|
||||
InnoDB for better data integrity and multi user support. You can
|
||||
most often do this via a simple ALTER TABLE command, refer to the
|
||||
MySQL documentation for more information.
|
||||
|
||||
If you were previously using PostgreSQL for your bayes database then
|
||||
we STRONGLY recommend switching to the PostgreSQL specific backend:
|
||||
Mail::SpamAssassin::BayesStore::PgSQL
|
||||
To switch to this backend you should first run sa-learn --backup to
|
||||
make a backup of your existing data and then drop and recreate the
|
||||
database following the instructions in sql/README.bayes. Then you
|
||||
can restore the database with sa-learn --restore. If you have
|
||||
multiple users then you will have to run --backup and --restore for
|
||||
each user to fully restore the database.
|
||||
|
||||
- See http://wiki.apache.org/spamassassin/UpgradeTo310 for a
|
||||
supplementary list of upgrade notes. It will be updated with any
|
||||
upgrade notes not captured in this document.
|
||||
|
||||
- Finally, this document is likely not complete. Other configuration
|
||||
options/arguments may have changed from older versions, etc. It would
|
||||
be good to double-check any custom configuration options to make sure
|
||||
they're still valid. This could be as simple as running "spamassassin
|
||||
--lint", or more complex, as required by the environment.
|
||||
|
||||
|
||||
Note for Users Upgrading to SpamAssassin 3.0.x
|
||||
----------------------------------------------
|
||||
|
||||
- The SpamAssassin 2.6x release series was the last set of releases to
|
||||
officially support perl versions earlier than perl 5.6.1. If you are
|
||||
using an earlier version of perl, you will need to upgrade before you
|
||||
can use the 3.0.0 version of SpamAssassin. You will also want to make
|
||||
sure that you have the appropriate versions of required and optional
|
||||
modules as they may have changed from old versions. The INSTALL
|
||||
document has the modules and version requirements listed.
|
||||
|
||||
- See http://wiki.apache.org/spamassassin/UpgradeTo300 for a
|
||||
supplementary list of upgrade notes. It will be updated with any
|
||||
upgrade notes not captured in this document.
|
||||
|
||||
- SpamAssassin 3.0.0 has a significantly different API (Application Program
|
||||
Interface) from the 2.x series of code. This means that if you use
|
||||
SpamAssassin through a third-party utility (milter, etc,) you need to make
|
||||
sure you have an updated version which supports 3.0.0. See
|
||||
http://wiki.apache.org/spamassassin/UpgradeTo300 for information about
|
||||
third-party software.
|
||||
|
||||
- The --auto-whitelist, --whitelist and -a options for "spamd" and
|
||||
"spamassassin" to turn on the auto-whitelist have been removed and
|
||||
replaced by the "use_auto_whitelist" configuration option which is
|
||||
also now turned on by default.
|
||||
|
||||
- The --virtual-config switch for spamd had to be dropped, due to licensing
|
||||
issues. It is replaced by the --virtual-config-dir switch.
|
||||
|
||||
- The "rewrite_subject" and "subject_tag" configuration options were
|
||||
deprecated and are now removed. Instead, using "rewrite_header Subject
|
||||
[your desired setting]". e.g.
|
||||
|
||||
rewrite_subject 1
|
||||
subject_tag ****SPAM(_SCORE_)****
|
||||
|
||||
becomes
|
||||
|
||||
rewrite_header Subject ****SPAM(_SCORE_)****
|
||||
|
||||
- The "sa-learn --rebuild" command has been deprecated; please use
|
||||
"sa-learn --sync" instead. The --rebuild option will remain temporarily
|
||||
for backward compatibility.
|
||||
|
||||
- The Bayesian storage modules have been completely re-written and now
|
||||
include Berkeley DB (DBM) storage as well as SQL based storage (see
|
||||
sql/README.bayes for more information). In addition, a new format
|
||||
has been introduced for the bayes database that stores tokens in fixed
|
||||
length hashes (Bayes v3). All DBM databases should be automatically
|
||||
converted to this new format the first time they are opened for write.
|
||||
You can manually perform the upgrade by running "sa-learn --sync"
|
||||
from the command line.
|
||||
|
||||
Due to the database format change, you will want to do something like
|
||||
this when upgrading:
|
||||
|
||||
- stop running spamassassin/spamd (ie: you don't want it to be running
|
||||
during the upgrade)
|
||||
- run "sa-learn --rebuild", this will sync your journal. if you skip
|
||||
this step, any data from the journal will be lost when the DB is
|
||||
upgraded.
|
||||
- upgrade SA to 3.0.0
|
||||
- run "sa-learn --sync", which will cause the db format to be upgraded.
|
||||
if you want to see what is going on, you can add the "-D" option.
|
||||
- test the new database by running some sample mails through
|
||||
SpamAssassin, and/or at least running "sa-learn --dump" to make sure
|
||||
the data looks valid.
|
||||
- start running spamassassin/spamd again
|
||||
|
||||
If, instead of uprading your Bayes database, you want to wipe it and
|
||||
start fresh, you can run "sa-learn --clear" to safely remove your
|
||||
Bayes database files. If the --clear command issues an error then
|
||||
you can simply delete the Bayes database files ("bayes_*") while SA
|
||||
is not running; SpamAssassin will recreate them in the current
|
||||
format when it runs.
|
||||
|
||||
- "spamd" now has a default max-children setting of 5; no more than 5
|
||||
child scanner processes will be run in parallel. Previously, there was
|
||||
no default limit unless you specified the "-m" switch when starting
|
||||
spamd.
|
||||
|
||||
- If you are using a UNIX machine with all database files on local disks,
|
||||
and no sharing of those databases across NFS filesystems, you can use a
|
||||
more efficient, but non-NFS-safe, locking mechanism. Do this by adding
|
||||
the line "lock_method flock" to the /etc/mail/spamassassin/local.cf
|
||||
file. This is strongly recommended if you're not using NFS, as it is
|
||||
much faster than the NFS-safe locker.
|
||||
|
||||
- Please note that the use of the following commandline parameters for
|
||||
spamassassin and spamd have been deprecated and may be removed in
|
||||
upcoming versions of SpamAssassin. Please discontinue usage of these
|
||||
options:
|
||||
|
||||
in the 2.6x series: --add-from, --pipe, -F, --stop-at-threshold,
|
||||
-S, -P (spamassassin only)
|
||||
in the 3.0.x series: --auto-whitelist, -a, --whitelist-factory, -M,
|
||||
--warning-from, -w, --log-to-mbox, -l
|
||||
|
||||
- user_scores_sql_table is no longer supported. If you need to use a table
|
||||
name, other than the default, create a custom query using the
|
||||
user_scores_sql_custom_query config option.
|
||||
|
||||
- SpamAssassin runs in "taint mode" by default for improved security.
|
||||
Certain third-party modules may be incompatible with taint mode.
|
||||
|
||||
- 2.6x deprecated the use of the "check_bayes_db" script, and it
|
||||
has been removed in 3.0.0. Please see the sa-learn man/pod
|
||||
documentation for more info.
|
||||
|
||||
- Finally, this document is likely not complete. Other configuration
|
||||
options/arguments may have changed from older versions, etc. It would
|
||||
be good to double-check any custom configuration options to make sure
|
||||
they're still valid. This could be as simple as running "spamassassin
|
||||
--lint", or more complex, as required by the environment.
|
||||
|
||||
An example: "require_version <version>" hasn't changed itself, but the
|
||||
internal version representation is now "x.yyyzzz" instead of "x.yz"
|
||||
which could cause issues if "require_version 3.00" is expected to work
|
||||
(it won't, it needs to be "require_version 3.000000").
|
||||
|
||||
|
||||
Note for Users Upgrading from SpamAssassin 2.5x
|
||||
-----------------------------------------------
|
||||
|
||||
- Due to major reliability shortcomings in the database support libraries
|
||||
other than DB_File, we now require that the DB_File module be installed
|
||||
to use SpamAssassin's Bayes rules.
|
||||
|
||||
SpamAssassin will still work without DB_File installed, but the Bayes
|
||||
support will be disabled.
|
||||
|
||||
If you install DB_File and wish to import old Bayes database data, each
|
||||
user with a Bayes db should run "sa-learn --import" to copy old entries
|
||||
from the other formats into a new DB_File file.
|
||||
|
||||
Due to the database library change, and the change to the database
|
||||
format itself, you will want to do something like this when upgrading:
|
||||
|
||||
- stop running spamassassin/spamd (ie: you don't want it to be running
|
||||
during the upgrade)
|
||||
- run "sa-learn --rebuild", this will sync your journal. if you skip
|
||||
this step, any data from the journal will be lost when the DB is
|
||||
upgraded.
|
||||
- install DB_File module if necessary
|
||||
- upgrade SA to 3.0.0
|
||||
- if you were using another database module previously, run "sa-learn
|
||||
--import" to migrate the data into new DB_File files
|
||||
- run "sa-learn --sync", which will cause the db format to be upgraded.
|
||||
if you want to see what is going on, you can add the "-D" option.
|
||||
- test the new database by running some sample mails through
|
||||
SpamAssassin, and/or at least running "sa-learn --dump" to make sure
|
||||
the data looks valid.
|
||||
- start running spamassassin/spamd again
|
||||
|
||||
Obviously the steps will be different depending on your environment, but
|
||||
you get the idea. :)
|
||||
|
||||
|
||||
Note For Users Upgrading From SpamAssassin 2.3x or 2.4x
|
||||
-------------------------------------------------------
|
||||
|
||||
- SpamAssassin no longer includes code to handle local mail delivery, as
|
||||
it was not reliable enough, compared to procmail. So now, if you relied
|
||||
on spamassassin to write the mail into your mail folder, you'll have to
|
||||
change your setup to use procmail as detailed below. If you used
|
||||
spamassassin to filter your mail and then something else wrote it into a
|
||||
folder for you, then you should be fine.
|
||||
|
||||
- Support for versions of the optional Mail::Audit module is no longer
|
||||
included.
|
||||
|
||||
- The default mode of tagging (which used to be ***SPAM*** in the subject
|
||||
line) no longer takes place. Instead the message is rewritten. If an
|
||||
incoming message is tagged as spam, instead of modifying the original
|
||||
message, SpamAssassin will create a new report message and attach the
|
||||
original message as a message/rfc822 MIME part (ensuring the original
|
||||
message is completely preserved and easier to recover). If you do not
|
||||
want to modify the body of incoming spam, use the "report_safe" option.
|
||||
The "report_header" and "defang_mime" options have been removed as a
|
||||
result.
|
||||
- sa-update: New --score-multiplier, --score-limit, and --forcemirror
|
||||
options added.
|
||||
#1 forcemirror: forces sa-update to use a specific mirror server,
|
||||
#2 score-multiplier: adjust all scores from update channel by a
|
||||
given multiplier to quickly level set scores to match your
|
||||
preferred threshold
|
||||
#3 score-limit adjusts all scores from update channel over a
|
||||
specified limit to a new limit
|
||||
|
||||
- New dns_options "nov4" and "nov6" added. IMPORTANT:; You must set
|
||||
nov6 if your DNS resolver is filtering IPv6 AAAA replies.
|
||||
|
||||
- API: Added Message::get_pristine_body_digest(),
|
||||
Message::get_msgid(), and Message::generate_msgid()
|
||||
functions. Removed deprecated private Plugin::Bayes::get_msgid()
|
||||
function.
|
||||
|
||||
- Bayes and TxRep seen Message-ID tracking hashing method changed. No
|
||||
actions are required. If re-learning some old messages, they might
|
||||
be learned twice but old IDs should expire automatically.
|
||||
|
||||
- report_charset defaults now to UTF-8.
|
||||
|
||||
- Meta rules inherit net tflag setting from dependencies (Bug 7735)
|
||||
|
||||
- BodyEval: Added plaintext_body_sig_ratio eval rules for the first
|
||||
text/plain MIME part's body and signature length ratio.
|
||||
|
||||
- API: Now supports multiple calls of $pms->test_log() for
|
||||
rules. Added $pms->check_cleanup() to finalize tags, reports,
|
||||
etc. Deprecated internal $pms->{test_log_msgs}, renamed to
|
||||
$pms->{test_logs}. Deprecated $pms->clear_test_state() as it is not
|
||||
needed anymore. $pms->test_log() now accepts $rulename as second
|
||||
argument.
|
||||
|
||||
- URIDNSBL: urirhsbl/urirhssub rules support "notrim" tflag to force
|
||||
querying the full hostname instead of just the domain. This works
|
||||
best if the specific uribl supports this mode. (Bug 7835)
|
||||
|
||||
- Removed deprecated --auth-ident and --ident-timeout options from
|
||||
spamd
|
||||
|
||||
- MIMEHeader: support matching ALL header, tflags range, and tflags
|
||||
concat
|
||||
|
||||
- Autolearn: add new tflags autolearn_header/autolearn_body. These can
|
||||
force a rule to count as header or body points accordingly. (Bug
|
||||
7907)
|
||||
|
||||
- SSL client certificate support for spamc/spamd is now easier. New
|
||||
spamc options --ssl-cert, --ssl-key, --ssl-ca-file, and
|
||||
--ssl-ca-path. New spamd options --ssl-verify, --ssl-ca-file, and
|
||||
--ssl-ca-path (Bug 7267)
|
||||
|
||||
- ArchiveIterator now automatically uncompressed all gzip, bzip2, xz,
|
||||
lz4, lzip, and lzo-compressed files (Bug 7598). These apply to
|
||||
spamassassin and sa-learn commands also.
|
||||
|
||||
- New DMARC policy check plugin.
|
||||
|
||||
- New project maintained DecodeShortURLs plugin which may not be
|
||||
directly compatible with rules from other third party plugins. See
|
||||
The plugin documentation for configuration and rule format.
|
||||
|
||||
- Installing module Net::CIDR::Lite allows the use of dash-separated
|
||||
IP range format (e.g. 192.168.1.1-192.168.255.255) for NetSet tables
|
||||
including internal_networks, trusted_networks, msa_networks, and
|
||||
uri_local_cidr.
|
||||
|
||||
- The HashBL plugin in 342.pre is now enabled by default.
|
||||
|
||||
- HeaderEval check_for_unique_subject_id() function is deprecated.
|
||||
|
||||
(end of UPGRADE)
|
||||
|
||||
//vim:tw=74:
|
||||
|
@ -84,7 +84,7 @@ If you use procmail, or haven't decided on any of the above examples:
|
||||
If you want to use SpamAssassin site-wide:
|
||||
|
||||
- take a look at the notes on the Wiki website, currently at
|
||||
<http://wiki.apache.org/spamassassin/UsingSiteWide>. You will probably
|
||||
<https://wiki.apache.org/spamassassin/UsingSiteWide>. You will probably
|
||||
want to use 'spamd' (see below). You may want to investigate the
|
||||
new Apache mod_perl module, in the 'spamd-apache2' directory, too.
|
||||
|
||||
@ -101,11 +101,11 @@ If you want to use SpamAssassin site-wide:
|
||||
add the line 'DROPPRIVS=yes' at the top of the file.
|
||||
|
||||
|
||||
The Auto-Whitelist
|
||||
The Auto-Welcomelist
|
||||
------------------
|
||||
|
||||
The auto-whitelist is enabled using the 'use_auto_whitelist' option.
|
||||
(See http://wiki.apache.org/spamassassin/AutoWhitelist for details on
|
||||
The auto-welcomelist is enabled using the 'use_auto_welcomelist' option.
|
||||
(See https://wiki.apache.org/spamassassin/AutoWelcomelist for details on
|
||||
how it works, if you're curious.)
|
||||
|
||||
|
||||
@ -113,15 +113,6 @@ Other Installation Notes
|
||||
------------------------
|
||||
|
||||
|
||||
- Hashcash is a useful system; it requires that senders exercise a
|
||||
CPU-intensive task before they can send mail to you, so we give that
|
||||
some bonus points. However, it requires that you list what addresses
|
||||
you expect to receive mail for, by adding 'hashcash_accept' lines to
|
||||
your ~/.spamassassin/user_prefs or /etc/mail/spamassassin/local.cf
|
||||
files. See the Mail::SpamAssassin::Plugin::Hashcash manual page for
|
||||
details on how to specify these.
|
||||
|
||||
|
||||
- SpamAssassin now uses a temporary file in /tmp (or $TMPDIR, if that's
|
||||
set in the environment) for Pyzor and DCC checks. Make sure that this
|
||||
directory is either (a) not writable by other users, or (b) not shared
|
||||
@ -161,7 +152,7 @@ Other Installation Notes
|
||||
- A very handy new feature is SPF support, which allows you to check
|
||||
that the message sender is permitted by their domain to send from the
|
||||
IP address used. This has the potential to greatly cut down on mail
|
||||
forgery. (see http://spf.pobox.com/ for more details.)
|
||||
forgery.
|
||||
|
||||
|
||||
- MDaemon users should add this line to their "local.cf" file:
|
||||
@ -219,7 +210,7 @@ Other Installation Notes
|
||||
to do this, take a look here [1] for a simple forwarding-based
|
||||
alternative.
|
||||
|
||||
[1]: http://wiki.apache.org/spamassassin/SpamTrapping
|
||||
[1]: https://wiki.apache.org/spamassassin/SpamTrapping
|
||||
|
||||
|
||||
- Scores and other user preferences can now be loaded from, and Bayes
|
||||
@ -242,7 +233,7 @@ Other Installation Notes
|
||||
|
||||
|
||||
- Lots more ways to integrate SpamAssassin can be read at
|
||||
http://wiki.SpamAssassin.org/ .
|
||||
https://wiki.apache.org/spamassassin/ .
|
||||
|
||||
|
||||
(end of USAGE)
|
||||
|
@ -239,6 +239,7 @@ sub lint_rule_text {
|
||||
my $pretext = q{
|
||||
loadplugin Mail::SpamAssassin::Plugin::Check
|
||||
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
|
||||
util_rb_tld com # skip "need to run sa-update" warn
|
||||
use_bayes 0
|
||||
};
|
||||
|
||||
|
@ -1,7 +1,7 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
BEGIN {
|
||||
require Digest::SHA; import Digest::SHA qw(sha256_hex sha512_hex);
|
||||
require Digest::SHA; Digest::SHA->import(qw(sha256_hex sha512_hex));
|
||||
}
|
||||
|
||||
$/=undef;
|
||||
|
@ -1,7 +1,7 @@
|
||||
#!/usr/bin/perl
|
||||
|
||||
BEGIN {
|
||||
require Digest::SHA; import Digest::SHA qw(sha256_hex sha512_hex);
|
||||
require Digest::SHA; Digest::SHA->import(qw(sha256_hex sha512_hex));
|
||||
}
|
||||
|
||||
$/=undef;
|
||||
|
@ -6,9 +6,9 @@ SpamAssassin can now load users' score files from an LDAP server. The concept
|
||||
here is to have a web application (PHP/perl/ASP/etc.) that will allow users to
|
||||
be able to update their local preferences on how SpamAssassin will filter their
|
||||
e-mail. The most common use for a system like this would be for users to be
|
||||
able to update the white list of addresses (whitelist_from) without the need
|
||||
for them to update their $HOME/.spamassassin/user_prefs file. It is also quite
|
||||
common for users listed in /etc/passwd to not have a home directory,
|
||||
able to update the white list of addresses (welcomelist_from, previously whitelist_from)
|
||||
without the need for them to update their $HOME/.spamassassin/user_prefs file.
|
||||
It is also quite common for users listed in /etc/passwd to not have a home directory,
|
||||
therefore, the only way to have their own local settings would be through a
|
||||
database or LDAP server.
|
||||
|
||||
@ -38,8 +38,8 @@ Examples:
|
||||
|
||||
If the user_scores_dsn option does not exist, SpamAssassin will not attempt
|
||||
to use an LDAP server for retrieving users' preferences. Note that this will
|
||||
NOT look for test rules, only local scores, whitelist_from(s), and
|
||||
required_score.
|
||||
NOT look for test rules, only local scores, welcomelist_from(s) (previously whitelist_from),
|
||||
and required_score.
|
||||
|
||||
Requirements
|
||||
------------
|
||||
@ -68,7 +68,7 @@ our inetOrgPerson subclass.
|
||||
Here's an example for openldap's /etc/openldap/schema/inetorgperson.schema :
|
||||
|
||||
# SpamAssassin
|
||||
# see http://SpamAssassin.org/ .
|
||||
# see https://SpamAssassin.org/ .
|
||||
attributetype ( 2.16.840.1.113730.3.1.217
|
||||
NAME 'spamassassin'
|
||||
DESC 'SpamAssassin user preferences settings'
|
||||
@ -97,9 +97,6 @@ If you do not see the above text, then the LDAP query was not successful, and
|
||||
you should see any error messages reported. <username> should be the user
|
||||
that was passed to spamd and is usually the user executing spamc.
|
||||
|
||||
If you need to set up LDAP, a good guide is here:
|
||||
http://yolinux.com/TUTORIALS/LinuxTutorialLDAP.html
|
||||
|
||||
To test LDAP support using the SpamAssassin test suite, you need to
|
||||
perform a little bit of manual configuration first. See the file
|
||||
"ldap/README.testing" for details.
|
||||
@ -111,6 +108,6 @@ operation of LDAP support may change at any time with future releases of SA.
|
||||
******
|
||||
|
||||
Please send any comments to <kris at koehntopp.de> and file bugs via
|
||||
<http://issues.apache.org/SpamAssassin/>.
|
||||
<https://issues.apache.org/SpamAssassin/>.
|
||||
|
||||
Kristian Köhntopp
|
||||
|
@ -40,7 +40,7 @@ Mail::SpamAssassin - Spam detector and markup engine
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using several methods
|
||||
including text analysis, internet-based realtime blacklists, statistical
|
||||
including text analysis, internet-based realtime blocklists, statistical
|
||||
analysis, and internet-based hashing algorithms.
|
||||
|
||||
Using its rule base, it uses a wide range of heuristic tests on mail
|
||||
@ -64,7 +64,7 @@ use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
require 5.006_001;
|
||||
require v5.14.0;
|
||||
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants;
|
||||
@ -87,11 +87,10 @@ use Time::HiRes qw(time);
|
||||
use Cwd;
|
||||
use Config;
|
||||
|
||||
our $VERSION = "3.004006"; # update after release (same format as perl $])
|
||||
our $VERSION = "4.000000"; # update after release (same format as perl $])
|
||||
#our $IS_DEVEL_BUILD = 1; # 1 for devel build
|
||||
our $IS_DEVEL_BUILD = 0; # 0 for release versions including rc & pre releases
|
||||
|
||||
|
||||
# Used during the prerelease/release-candidate part of the official release
|
||||
# process. If you hacked up your SA, you should add a version_tag to your .cf
|
||||
# files; this variable should not be modified.
|
||||
@ -101,18 +100,18 @@ our @ISA = qw();
|
||||
|
||||
# SUB_VERSION is now just <yyyy>-<mm>-<dd>
|
||||
our $SUB_VERSION = 'svnunknown';
|
||||
if ('$LastChangedDate: 2021-04-09 19:54:52 +1200 (Fri, 09 Apr 2021) $' =~ ':') {
|
||||
# Subversion keyword "$LastChangedDate: 2021-04-09 19:54:52 +1200 (Fri, 09 Apr 2021) $" has been successfully expanded.
|
||||
if ('$LastChangedDate: 2022-12-14 02:29:30 +0000 (Wed, 14 Dec 2022) $' =~ ':') {
|
||||
# Subversion keyword "$LastChangedDate: 2022-12-14 02:29:30 +0000 (Wed, 14 Dec 2022) $" has been successfully expanded.
|
||||
# Doesn't happen with automated launchpad builds:
|
||||
# https://bugs.launchpad.net/launchpad/+bug/780916
|
||||
$SUB_VERSION = (split(/\s+/,'$LastChangedDate: 2021-04-09 19:54:52 +1200 (Fri, 09 Apr 2021) $ updated by SVN'))[1];
|
||||
$SUB_VERSION = (split(/\s+/,'$LastChangedDate: 2022-12-14 02:29:30 +0000 (Wed, 14 Dec 2022) $ updated by SVN'))[1];
|
||||
}
|
||||
|
||||
|
||||
if (defined $IS_DEVEL_BUILD && $IS_DEVEL_BUILD) {
|
||||
if ('$LastChangedRevision: 1888548 $' =~ ':') {
|
||||
# Subversion keyword "$LastChangedRevision: 1888548 $" has been successfully expanded.
|
||||
push(@EXTRA_VERSION, ('r' . qw{$LastChangedRevision: 1888548 $ updated by SVN}[1]));
|
||||
if ('$LastChangedRevision: 1905971 $' =~ ':') {
|
||||
# Subversion keyword "$LastChangedRevision: 1905971 $" has been successfully expanded.
|
||||
push(@EXTRA_VERSION, ('r' . qw{$LastChangedRevision: 1905971 $ updated by SVN}[1]));
|
||||
} else {
|
||||
push(@EXTRA_VERSION, ('r' . 'svnunknown'));
|
||||
}
|
||||
@ -428,8 +427,8 @@ sub new {
|
||||
$self->timer_enable();
|
||||
}
|
||||
|
||||
$self->{conf} ||= new Mail::SpamAssassin::Conf ($self);
|
||||
$self->{plugins} = Mail::SpamAssassin::PluginHandler->new ($self);
|
||||
$self->{conf} ||= Mail::SpamAssassin::Conf->new($self);
|
||||
$self->{plugins} = Mail::SpamAssassin::PluginHandler->new($self);
|
||||
|
||||
$self->{save_pattern_hits} ||= 0;
|
||||
|
||||
@ -469,7 +468,7 @@ sub create_locker {
|
||||
# for slow but safe, by keeping in quotes
|
||||
eval '
|
||||
use Mail::SpamAssassin::Locker::'.$class.';
|
||||
$self->{locker} = new Mail::SpamAssassin::Locker::'.$class.' ($self);
|
||||
$self->{locker} = Mail::SpamAssassin::Locker::'.$class.'->new($self);
|
||||
1;
|
||||
' or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
@ -728,6 +727,9 @@ sub init_learner {
|
||||
if (exists $opts->{$k}) { $self->{$v} = $opts->{$k}; }
|
||||
}
|
||||
|
||||
# Set flag which can be checked from plugins etc
|
||||
$self->{learning} = 1;
|
||||
|
||||
return \%ret;
|
||||
}
|
||||
|
||||
@ -757,6 +759,7 @@ Finish learning.
|
||||
sub finish_learner {
|
||||
my $self = shift;
|
||||
$self->{bayes_scanner}->force_close(1) if $self->{bayes_scanner};
|
||||
delete $self->{learning};
|
||||
1;
|
||||
}
|
||||
|
||||
@ -832,7 +835,7 @@ sub signal_user_changed {
|
||||
$self->{bayes_scanner}->finish() if $self->{bayes_scanner};
|
||||
if ($self->{conf}->{use_bayes}) {
|
||||
require Mail::SpamAssassin::Bayes;
|
||||
$self->{bayes_scanner} = new Mail::SpamAssassin::Bayes ($self);
|
||||
$self->{bayes_scanner} = Mail::SpamAssassin::Bayes->new($self);
|
||||
} else {
|
||||
delete $self->{bayes_scanner} if $self->{bayes_scanner};
|
||||
}
|
||||
@ -949,73 +952,84 @@ sub revoke_as_spam {
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->add_address_to_whitelist ($addr, $cli_p)
|
||||
=item $f->add_address_to_welcomelist ($addr, $cli_p)
|
||||
|
||||
Previously add_address_to_whitelist which will work interchangeably until 4.1.
|
||||
|
||||
Given a string containing an email address, add it to the automatic
|
||||
whitelist database.
|
||||
welcomelist database.
|
||||
|
||||
If $cli_p is set then underlying plugin may give visual feedback on additions/failures.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_address_to_whitelist {
|
||||
sub add_address_to_welcomelist {
|
||||
my ($self, $addr, $cli_p) = @_;
|
||||
|
||||
$self->call_plugins("whitelist_address", { address => $addr,
|
||||
$self->call_plugins("welcomelist_address", { address => $addr,
|
||||
cli_p => $cli_p });
|
||||
}
|
||||
*add_address_to_whitelist = \&add_address_to_welcomelist; # removed in 4.1
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->add_all_addresses_to_whitelist ($mail, $cli_p)
|
||||
=item $f->add_all_addresses_to_welcomelist ($mail, $cli_p)
|
||||
|
||||
Previously add_all_addresses_to_whitelist which will work interchangeably until 4.1.
|
||||
|
||||
Given a mail message, find as many addresses in the usual headers (To, Cc, From
|
||||
etc.), and the message body, and add them to the automatic whitelist database.
|
||||
etc.), and the message body, and add them to the automatic welcomelist database.
|
||||
|
||||
If $cli_p is set then underlying plugin may give visual feedback on additions/failures.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_all_addresses_to_whitelist {
|
||||
sub add_all_addresses_to_welcomelist {
|
||||
my ($self, $mail_obj, $cli_p) = @_;
|
||||
|
||||
foreach my $addr ($self->find_all_addrs_in_mail ($mail_obj)) {
|
||||
$self->call_plugins("whitelist_address", { address => $addr,
|
||||
$self->call_plugins("welcomelist_address", { address => $addr,
|
||||
cli_p => $cli_p });
|
||||
}
|
||||
}
|
||||
*add_all_addresses_to_whitelist = \&add_all_addresses_to_welcomelist; # removed in 4.1
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->remove_address_from_whitelist ($addr, $cli_p)
|
||||
=item $f->remove_address_from_welcomelist ($addr, $cli_p)
|
||||
|
||||
Previously remove_address_from_whitelist which will work interchangeably until 4.1.
|
||||
|
||||
Given a string containing an email address, remove it from the automatic
|
||||
whitelist database.
|
||||
welcomelist database.
|
||||
|
||||
If $cli_p is set then underlying plugin may give visual feedback on additions/failures.
|
||||
|
||||
=cut
|
||||
|
||||
sub remove_address_from_whitelist {
|
||||
sub remove_address_from_welcomelist {
|
||||
my ($self, $addr, $cli_p) = @_;
|
||||
|
||||
$self->call_plugins("remove_address", { address => $addr,
|
||||
cli_p => $cli_p });
|
||||
}
|
||||
*remove_address_from_whitelist = \&remove_address_from_welcomelist; # removed in 4.1
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->remove_all_addresses_from_whitelist ($mail, $cli_p)
|
||||
=item $f->remove_all_addresses_from_welcomelist ($mail, $cli_p)
|
||||
|
||||
Previously remove_all_addresses_from_whitelist which will work interchangeably until 4.1.
|
||||
|
||||
Given a mail message, find as many addresses in the usual headers (To, Cc, From
|
||||
etc.), and the message body, and remove them from the automatic whitelist
|
||||
etc.), and the message body, and remove them from the automatic welcomelist
|
||||
database.
|
||||
|
||||
If $cli_p is set then underlying plugin may give visual feedback on additions/failures.
|
||||
|
||||
=cut
|
||||
|
||||
sub remove_all_addresses_from_whitelist {
|
||||
sub remove_all_addresses_from_welcomelist {
|
||||
my ($self, $mail_obj, $cli_p) = @_;
|
||||
|
||||
foreach my $addr ($self->find_all_addrs_in_mail ($mail_obj)) {
|
||||
@ -1023,30 +1037,36 @@ sub remove_all_addresses_from_whitelist {
|
||||
cli_p => $cli_p });
|
||||
}
|
||||
}
|
||||
*remove_all_addresses_from_whitelist = \&remove_all_addresses_from_welcomelist; # removed in 4.1
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->add_address_to_blacklist ($addr, $cli_p)
|
||||
=item $f->add_address_to_blocklist ($addr, $cli_p)
|
||||
|
||||
Previously add_address_to_blacklist which will work interchangeably until 4.1.
|
||||
|
||||
Given a string containing an email address, add it to the automatic
|
||||
whitelist database with a high score, effectively blacklisting them.
|
||||
welcomelist database with a high score, effectively blocklisting them.
|
||||
|
||||
If $cli_p is set then underlying plugin may give visual feedback on additions/failures.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_address_to_blacklist {
|
||||
sub add_address_to_blocklist {
|
||||
my ($self, $addr, $cli_p) = @_;
|
||||
$self->call_plugins("blacklist_address", { address => $addr,
|
||||
$self->call_plugins("blocklist_address", { address => $addr,
|
||||
cli_p => $cli_p });
|
||||
}
|
||||
*add_address_to_blacklist = \&add_address_to_blocklist; # removed in 4.1
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $f->add_all_addresses_to_blacklist ($mail, $cli_p)
|
||||
=item $f->add_all_addresses_to_blocklist ($mail, $cli_p)
|
||||
|
||||
Previously add_all_addresses_to_blacklist which will work interchangeably until 4.1.
|
||||
|
||||
Given a mail message, find addresses in the From headers and add them to the
|
||||
automatic whitelist database with a high score, effectively blacklisting them.
|
||||
automatic welcomelist database with a high score, effectively blocklisting them.
|
||||
|
||||
Note that To and Cc addresses are not used.
|
||||
|
||||
@ -1054,23 +1074,26 @@ If $cli_p is set then underlying plugin may give visual feedback on additions/fa
|
||||
|
||||
=cut
|
||||
|
||||
sub add_all_addresses_to_blacklist {
|
||||
sub add_all_addresses_to_blocklist {
|
||||
my ($self, $mail_obj, $cli_p) = @_;
|
||||
|
||||
$self->init(1);
|
||||
|
||||
my @addrlist;
|
||||
my @hdrs = $mail_obj->get_header('From');
|
||||
if ($#hdrs >= 0) {
|
||||
push (@addrlist, $self->find_all_addrs_in_line (join (" ", @hdrs)));
|
||||
foreach my $hdr (@hdrs) {
|
||||
my @addrs = Mail::SpamAssassin::Util::parse_header_addresses($hdr);
|
||||
foreach my $addr (@addrs) {
|
||||
push @addrlist, $addr->{address} if defined $addr->{address};
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $addr (@addrlist) {
|
||||
$self->call_plugins("blacklist_address", { address => $addr,
|
||||
$self->call_plugins("blocklist_address", { address => $addr,
|
||||
cli_p => $cli_p });
|
||||
}
|
||||
|
||||
}
|
||||
*add_all_addresses_to_blacklist = \&add_all_addresses_to_blocklist; # removed in 4.1
|
||||
|
||||
###########################################################################
|
||||
|
||||
@ -1158,13 +1181,12 @@ sub remove_spamassassin_markup {
|
||||
my $hdrs = $mail_obj->get_pristine_header();
|
||||
my $body = $mail_obj->get_pristine_body();
|
||||
|
||||
# remove DOS line endings
|
||||
$hdrs =~ s/\r//gs;
|
||||
# force \n for line-ending processing temporarily
|
||||
$hdrs =~ s/\015?\012/\n/gs;
|
||||
$body =~ s/\015?\012/\n/gs;
|
||||
|
||||
# unfold SA added headers, but not X-Spam-Prev headers ...
|
||||
$hdrs = "\n".$hdrs; # simplifies regexp below
|
||||
1 while $hdrs =~ s/(\nX-Spam-(?!Prev).+?)\n[ \t]+(\S.*\n)/$1 $2/g;
|
||||
$hdrs =~ s/^\n//;
|
||||
1 while $hdrs =~ s/((?:^|\n)X-Spam-(?!Prev).+?)\n[ \t]+(\S.*\n)/$1 $2/g;
|
||||
|
||||
###########################################################################
|
||||
# Backward Compatibility, pre 3.0.x.
|
||||
@ -1215,14 +1237,11 @@ sub remove_spamassassin_markup {
|
||||
}
|
||||
|
||||
# remove any other X-Spam headers we added, will be unfolded
|
||||
$hdrs = "\n".$hdrs; # simplifies regexp below
|
||||
1 while $hdrs =~ s/\nX-Spam-.*\n/\n/g;
|
||||
$hdrs =~ s/^\n//;
|
||||
1 while $hdrs =~ s/(^|\n)X-Spam-.*\n/$1/g;
|
||||
|
||||
# re-add DOS line endings
|
||||
if ($mail_obj->{line_ending} ne "\n") {
|
||||
$hdrs =~ s/\r?\n/$mail_obj->{line_ending}/gs;
|
||||
}
|
||||
# force original message line endings
|
||||
$hdrs =~ s/\n/$mail_obj->{line_ending}/gs;
|
||||
$body =~ s/\n/$mail_obj->{line_ending}/gs;
|
||||
|
||||
# Put the whole thing back together ...
|
||||
return join ('', $mbox, $hdrs, $body);
|
||||
@ -1235,8 +1254,8 @@ sub remove_spamassassin_markup {
|
||||
Read a configuration file and parse user preferences from it.
|
||||
|
||||
User preferences are as defined in the C<Mail::SpamAssassin::Conf> manual page.
|
||||
In other words, they include scoring options, scores, whitelists and
|
||||
blacklists, and so on, but do not include rule definitions, privileged
|
||||
In other words, they include scoring options, scores, welcomelists and
|
||||
blocklists, and so on, but do not include rule definitions, privileged
|
||||
settings, etc. unless C<allow_user_rules> is enabled; and they never include
|
||||
the administrator settings.
|
||||
|
||||
@ -1261,7 +1280,8 @@ sub read_scoreonly_config {
|
||||
|
||||
$text = "file start $filename\n" . $text;
|
||||
# add an extra \n in case file did not end in one.
|
||||
$text .= "\nfile end $filename\n";
|
||||
$text .= "\n" unless $text =~ /\n\z/;
|
||||
$text .= "file end $filename\n";
|
||||
|
||||
$self->{conf}->{main} = $self;
|
||||
$self->{conf}->parse_scores_only ($text);
|
||||
@ -1326,7 +1346,7 @@ sub load_scoreonly_ldap {
|
||||
=item $f->set_persistent_address_list_factory ($factoryobj)
|
||||
|
||||
Set the persistent address list factory, used to create objects for the
|
||||
automatic whitelist algorithm's persistent-storage back-end. See
|
||||
automatic welcomelist algorithm's persistent-storage back-end. See
|
||||
C<Mail::SpamAssassin::PersistentAddrList> for the API these factory objects
|
||||
must implement, and the API the objects they produce must implement.
|
||||
|
||||
@ -1499,7 +1519,7 @@ sub lint_rules {
|
||||
$self->{dont_copy_prefs} = $olddcp; # revert back to previous
|
||||
|
||||
# bug 5048: override settings to ensure a faster lint
|
||||
$self->{'conf'}->{'use_auto_whitelist'} = 0;
|
||||
$self->{'conf'}->{'use_auto_welcomelist'} = 0;
|
||||
$self->{'conf'}->{'bayes_auto_learn'} = 0;
|
||||
|
||||
my $mail = $self->parse(\@testmsg, 1, { master_deadline => undef });
|
||||
@ -1531,7 +1551,6 @@ sub finish {
|
||||
$self->call_plugins("finish_tests", { conf => $self->{conf},
|
||||
main => $self });
|
||||
|
||||
$self->{conf}->finish(); delete $self->{conf};
|
||||
$self->{plugins}->finish(); delete $self->{plugins};
|
||||
|
||||
if ($self->{bayes_scanner}) {
|
||||
@ -1541,6 +1560,8 @@ sub finish {
|
||||
|
||||
$self->{resolver}->finish() if $self->{resolver};
|
||||
|
||||
$self->{conf}->finish(); delete $self->{conf};
|
||||
|
||||
$self->timer_end("finish");
|
||||
%{$self} = ();
|
||||
}
|
||||
@ -1662,6 +1683,11 @@ sub init {
|
||||
# Note that this PID has run init()
|
||||
$self->{_initted} = $$;
|
||||
|
||||
# if spamd or other forking, wait for spamd_child_init
|
||||
if (!$self->{skip_prng_reseeding}) {
|
||||
$self->set_global_state_dir();
|
||||
}
|
||||
|
||||
#fix spamd reading root prefs file
|
||||
if (!defined $use_user_pref) {
|
||||
$use_user_pref = 1;
|
||||
@ -1737,10 +1763,18 @@ sub init {
|
||||
}
|
||||
|
||||
if ($self->{pre_config_text}) {
|
||||
$self->{config_text} = $self->{pre_config_text} . $self->{config_text};
|
||||
$self->{pre_config_text} .= "\n" unless $self->{pre_config_text} =~ /\n\z/;
|
||||
$self->{config_text} = "file start (pre_config_text)\n".
|
||||
$self->{pre_config_text}.
|
||||
"file end (pre_config_text)\n".
|
||||
$self->{config_text};
|
||||
}
|
||||
if ($self->{post_config_text}) {
|
||||
$self->{config_text} .= $self->{post_config_text};
|
||||
$self->{post_config_text} .= "\n" unless $self->{post_config_text} =~ /\n\z/;
|
||||
$self->{config_text} .= "\n" unless $self->{config_text} =~ /\n\z/;
|
||||
$self->{config_text} .= "file start (post_config_text)\n".
|
||||
$self->{post_config_text}.
|
||||
"file end (post_config_text)\n";
|
||||
}
|
||||
|
||||
if ($self->{config_text} !~ /\S/) {
|
||||
@ -1771,7 +1805,7 @@ sub init {
|
||||
# Initialize the Bayes subsystem
|
||||
if ($self->{conf}->{use_bayes}) {
|
||||
require Mail::SpamAssassin::Bayes;
|
||||
$self->{bayes_scanner} = new Mail::SpamAssassin::Bayes ($self);
|
||||
$self->{bayes_scanner} = Mail::SpamAssassin::Bayes->new($self);
|
||||
}
|
||||
$self->{'learn_to_journal'} = $self->{conf}->{bayes_learn_to_journal};
|
||||
|
||||
@ -1796,6 +1830,21 @@ sub init {
|
||||
# should be called only after configuration has been parsed
|
||||
$self->{resolver} = Mail::SpamAssassin::DnsResolver->new($self);
|
||||
|
||||
# load GeoDB if some plugin wants it
|
||||
if ($self->{geodb_wanted}) {
|
||||
eval '
|
||||
use Mail::SpamAssassin::GeoDB;
|
||||
$self->{geodb} = Mail::SpamAssassin::GeoDB->new({
|
||||
conf => $self->{conf}->{geodb},
|
||||
wanted => $self->{geodb_wanted},
|
||||
});
|
||||
1;
|
||||
';
|
||||
if ($@ || !$self->{geodb}) {
|
||||
dbg("config: GeoDB disabled: $@");
|
||||
}
|
||||
}
|
||||
|
||||
# TODO -- open DNS cache etc. if necessary
|
||||
}
|
||||
|
||||
@ -1826,10 +1875,10 @@ sub _read_cf_pre {
|
||||
dbg("config: file or directory $path not accessible: $!");
|
||||
} elsif (-d _) {
|
||||
foreach my $file ($self->$filelistmethod($path)) {
|
||||
$txt .= read_cf_file($file);
|
||||
$txt .= $self->read_cf_file($file);
|
||||
}
|
||||
} elsif (-f _ && -s _ && -r _) {
|
||||
$txt .= read_cf_file($path);
|
||||
$txt .= $self->read_cf_file($path);
|
||||
}
|
||||
}
|
||||
|
||||
@ -1838,9 +1887,14 @@ sub _read_cf_pre {
|
||||
|
||||
|
||||
sub read_cf_file {
|
||||
my($path) = @_;
|
||||
my($self, $path) = @_;
|
||||
my $txt = '';
|
||||
|
||||
if ($self->{cf_files_read}->{$path}++) {
|
||||
dbg("config: skipping already read file: $path");
|
||||
return $txt;
|
||||
}
|
||||
|
||||
local *IN;
|
||||
if (open (IN, "<".$path)) {
|
||||
|
||||
@ -1852,7 +1906,8 @@ sub read_cf_file {
|
||||
|
||||
$txt = "file start $path\n" . $txt;
|
||||
# add an extra \n in case file did not end in one.
|
||||
$txt .= "\nfile end $path\n";
|
||||
$txt .= "\n" unless $txt =~ /\n\z/;
|
||||
$txt .= "file end $path\n";
|
||||
|
||||
dbg("config: read file $path");
|
||||
}
|
||||
@ -1900,7 +1955,7 @@ sub get_and_create_userstate_dir {
|
||||
dbg("config: error accessing $fname: $!");
|
||||
} else { # does not exist, create it
|
||||
eval {
|
||||
mkpath($fname, 0, 0700); 1;
|
||||
mkpath(Mail::SpamAssassin::Util::untaint_file_path($fname), 0, 0700); 1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("config: mkdir $fname failed: $eval_stat");
|
||||
@ -1910,6 +1965,58 @@ sub get_and_create_userstate_dir {
|
||||
$fname;
|
||||
}
|
||||
|
||||
# find the most global writable state dir
|
||||
# used by dns_block_rule state files etc
|
||||
sub set_global_state_dir {
|
||||
my ($self) = @_;
|
||||
# try home_dir_for_helpers
|
||||
my $helper_dir = $self->{home_dir_for_helpers} || '';
|
||||
if ($helper_dir) {
|
||||
my $dir = File::Spec->catdir($helper_dir, ".spamassassin");
|
||||
return if $self->test_global_state_dir($dir);
|
||||
}
|
||||
# try user home (if different from helper home)
|
||||
my $home;
|
||||
if (am_running_on_windows()) {
|
||||
# Windows has a special folder for common appdata (Bug 8050)
|
||||
$home = Mail::SpamAssassin::Util::common_application_data_directory();
|
||||
} else {
|
||||
$home = (Mail::SpamAssassin::Util::portable_getpwuid ($>))[7];
|
||||
}
|
||||
if ($home && $home ne $helper_dir) {
|
||||
my $dir = File::Spec->catdir($home, ".spamassassin");
|
||||
return if $self->test_global_state_dir($dir);
|
||||
}
|
||||
# try LOCAL_STATE_DIR
|
||||
return if $self->test_global_state_dir($self->{LOCAL_STATE_DIR});
|
||||
# fallback to userstate
|
||||
$self->{global_state_dir} = $self->get_and_create_userstate_dir();
|
||||
dbg("config: global_state_dir set to userstate_dir: $self->{global_state_dir}");
|
||||
}
|
||||
|
||||
sub test_global_state_dir {
|
||||
my ($self, $dir) = @_;
|
||||
eval { mkpath($dir, 0, 0700); }; # just a single stat if exists already
|
||||
# Purge stale test files (enough to do only some times randomly)
|
||||
if (rand() < 0.2 && opendir(WT_DIR, $dir)) {
|
||||
foreach (grep {index($_, '.sawritetest') == 0 &&
|
||||
(-M File::Spec->catfile($dir, $_)||0) > 0.0001} readdir(WT_DIR)) {
|
||||
unlink(Mail::SpamAssassin::Util::untaint_file_path(File::Spec->catfile($dir, $_)));
|
||||
}
|
||||
closedir WT_DIR;
|
||||
}
|
||||
my $n = ".sawritetest$$".Mail::SpamAssassin::Util::pseudo_random_string(6);
|
||||
my $file = File::Spec->catfile($dir, $n);
|
||||
if (Mail::SpamAssassin::Util::touch_file($file, { create_exclusive => 1 })) {
|
||||
dbg("config: global_state_dir set to $dir");
|
||||
$self->{global_state_dir} = $dir;
|
||||
unlink($file);
|
||||
return 1;
|
||||
}
|
||||
unlink($file); # just in case?
|
||||
return 0;
|
||||
}
|
||||
|
||||
=item $fullpath = $f->find_rule_support_file ($filename)
|
||||
|
||||
Find a rule-support file, such as C<languages> or C<triplets.txt>,
|
||||
@ -1923,8 +2030,22 @@ it exists, or undef if it doesn't exist.
|
||||
sub find_rule_support_file {
|
||||
my ($self, $filename) = @_;
|
||||
|
||||
my @paths;
|
||||
# search custom directories first
|
||||
if ($self->{site_rules_filename}) {
|
||||
foreach my $path (split("\000", $self->{site_rules_filename})) {
|
||||
push @paths, $path if -d $path;
|
||||
}
|
||||
}
|
||||
if ($self->{rules_filename} && -d $self->{rules_filename}) {
|
||||
push @paths, $self->{rules_filename}
|
||||
}
|
||||
# updates sub-directory missing from @default_rules_path
|
||||
push @paths, '__local_state_dir__/__version__/updates_spamassassin_org';
|
||||
push @paths, @default_rules_path;
|
||||
|
||||
return $self->first_existing_path(
|
||||
map { my $p = $_; $p =~ s{$}{/$filename}; $p } @default_rules_path );
|
||||
map { my $p = $_; $p =~ s{$}{/$filename}; $p } @paths );
|
||||
}
|
||||
|
||||
=item $f->create_default_prefs ($filename, $username [ , $userdir ] )
|
||||
@ -2006,15 +2127,15 @@ sub expand_name {
|
||||
if (am_running_on_windows()) {
|
||||
my $userprofile = $ENV{USERPROFILE} || '';
|
||||
|
||||
return $userprofile if ($userprofile && $userprofile =~ m/^[a-z]\:[\/\\]/oi);
|
||||
return $userprofile if ($userprofile =~ m/^\\\\/o);
|
||||
return $userprofile if ($userprofile && $userprofile =~ m/^[a-z]\:[\/\\]/i);
|
||||
return $userprofile if ($userprofile =~ m/^\\\\/);
|
||||
|
||||
return $home if ($home && $home =~ m/^[a-z]\:[\/\\]/oi);
|
||||
return $home if ($home =~ m/^\\\\/o);
|
||||
return $home if ($home && $home =~ m/^[a-z]\:[\/\\]/i);
|
||||
return $home if ($home =~ m/^\\\\/);
|
||||
|
||||
return '';
|
||||
} else {
|
||||
return $home if ($home && $home =~ /\//o);
|
||||
return $home if ($home && index($home, '/') != -1);
|
||||
return (getpwnam($name))[7] if ($name ne '');
|
||||
return (getpwuid($>))[7];
|
||||
}
|
||||
@ -2028,6 +2149,9 @@ sub sed_path {
|
||||
return $self->{conf}->{sed_path_cache}->{$path};
|
||||
}
|
||||
|
||||
# <4.0 compatibility check, to be removed in 4.1
|
||||
my $check_compat = $path eq '__userstate__/auto-welcomelist';
|
||||
|
||||
my $orig_path = $path;
|
||||
|
||||
$path =~ s/__local_rules_dir__/$self->{LOCAL_RULES_DIR} || ''/ges;
|
||||
@ -2035,10 +2159,21 @@ sub sed_path {
|
||||
$path =~ s/__def_rules_dir__/$self->{DEF_RULES_DIR} || ''/ges;
|
||||
$path =~ s{__prefix__}{$self->{PREFIX} || $Config{prefix} || '/usr'}ges;
|
||||
$path =~ s{__userstate__}{$self->get_and_create_userstate_dir() || ''}ges;
|
||||
$path =~ s/__global_state_dir__/$self->{global_state_dir} || ''/ges;
|
||||
$path =~ s{__perl_major_ver__}{$self->get_perl_major_version()}ges;
|
||||
$path =~ s/__version__/${VERSION}/gs;
|
||||
$path =~ s/^\~([^\/]*)/$self->expand_name($1)/es;
|
||||
|
||||
# <4.0 compatibility check, to be removed in 4.1
|
||||
if ($check_compat) {
|
||||
if ($path =~ m{^(.+)/(.+)$}) {
|
||||
# Use auto-whitelist if found
|
||||
if (!-e $path && -e "$1/auto-whitelist") {
|
||||
$path = "$1/auto-whitelist";
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
$path = Mail::SpamAssassin::Util::untaint_file_path ($path);
|
||||
$self->{conf}->{sed_path_cache}->{$orig_path} = $path;
|
||||
return $path;
|
||||
@ -2077,16 +2212,32 @@ sub get_pre_files_in_dir {
|
||||
return $self->_get_cf_pre_files_in_dir($dir, 'pre');
|
||||
}
|
||||
|
||||
sub _reorder_dir {
|
||||
# Official ASF channel should be loaded first in
|
||||
# order to be able to override scores by using custom channels
|
||||
# bz 7991
|
||||
if($a eq 'updates_spamassassin_org.cf') {
|
||||
return -1;
|
||||
} elsif ($b eq 'updates_spamassassin_org.cf') {
|
||||
return 1;
|
||||
}
|
||||
return $a cmp $b;
|
||||
}
|
||||
|
||||
sub _get_cf_pre_files_in_dir {
|
||||
my ($self, $dir, $type) = @_;
|
||||
|
||||
if ($self->{config_tree_recurse}) {
|
||||
my @cfs;
|
||||
|
||||
# copied from Mail::SpamAssassin::Util::untaint_file_path
|
||||
# fix bugs 8010 and 8025 by using an untaint pattern that is better on Windows than File::Find's default
|
||||
my $chars = '-_A-Za-z0-9.#%=+,/:()\\@\\xA0-\\xFF\\\\';
|
||||
my $re = qr{^\s*([$chars][${chars}~ ]*)\z};
|
||||
# use "eval" to avoid loading File::Find unless this is specified
|
||||
eval ' use File::Find qw();
|
||||
File::Find::find(
|
||||
{ untaint => 1,
|
||||
am_running_on_windows() ? (untaint_pattern => $re) : (),
|
||||
follow => 1,
|
||||
wanted =>
|
||||
sub { push(@cfs, $File::Find::name) if /\.\Q$type\E$/i && -f $_ }
|
||||
@ -2095,7 +2246,7 @@ sub _get_cf_pre_files_in_dir {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
die "_get_cf_pre_files_in_dir error: $eval_stat";
|
||||
};
|
||||
@cfs = sort { $a cmp $b } @cfs;
|
||||
@cfs = sort { _reorder_dir($a, $b) } @cfs;
|
||||
return @cfs;
|
||||
}
|
||||
else {
|
||||
@ -2104,7 +2255,7 @@ sub _get_cf_pre_files_in_dir {
|
||||
/\.${type}$/i && -f "$dir/$_" } readdir(SA_CF_DIR);
|
||||
closedir SA_CF_DIR;
|
||||
|
||||
return map { "$dir/$_" } sort { $a cmp $b } @cfs;
|
||||
return map { "$dir/$_" } sort { _reorder_dir($a, $b) } @cfs;
|
||||
}
|
||||
}
|
||||
|
||||
@ -2127,11 +2278,18 @@ sub call_plugins {
|
||||
return unless $self->{plugins};
|
||||
|
||||
# Use some calls ourself too
|
||||
if ($subname eq 'finish_parsing_end') {
|
||||
if ($subname eq 'spamd_child_init') {
|
||||
# set global dir now if spamd
|
||||
$self->set_global_state_dir();
|
||||
} elsif ($subname eq 'finish_parsing_end') {
|
||||
# Initialize RegistryBoundaries, now that util_rb_tld etc from config is
|
||||
# read. Plugins can also now use {valid_tlds_re} to one time compile
|
||||
# regexes in finish_parsing_end.
|
||||
$self->{registryboundaries} = Mail::SpamAssassin::RegistryBoundaries->new ($self);
|
||||
} elsif ($subname eq 'whitelist_address' || $subname eq 'blacklist_address') {
|
||||
# Warn about backwards compatibility, removed in 4.1
|
||||
# Third party usage should be rare event, so do not translate function names
|
||||
warn "config: Deprecated $subname called from call_plugins, use welcomelist_address or blocklist_address\n";
|
||||
}
|
||||
|
||||
# safety net in case some plugin changes global settings, Bug 6218
|
||||
@ -2152,8 +2310,12 @@ sub find_all_addrs_in_mail {
|
||||
Errors-To Mail-Followup-To))
|
||||
{
|
||||
my @hdrs = $mail_obj->get_header($header);
|
||||
if ($#hdrs < 0) { next; }
|
||||
push (@addrlist, $self->find_all_addrs_in_line(join (" ", @hdrs)));
|
||||
foreach my $hdr (@hdrs) {
|
||||
my @addrs = Mail::SpamAssassin::Util::parse_header_addresses($hdr);
|
||||
foreach my $addr (@addrs) {
|
||||
push @addrlist, $addr->{address} if defined $addr->{address};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# find addrs in body, too
|
||||
@ -2176,17 +2338,22 @@ sub find_all_addrs_in_mail {
|
||||
sub find_all_addrs_in_line {
|
||||
my ($self, $line) = @_;
|
||||
|
||||
return () unless defined $line;
|
||||
|
||||
# a more permissive pattern based on "dot-atom" as per RFC2822
|
||||
my $ID_PATTERN = '[-a-z0-9_\+\:\=\!\#\$\%\&\*\^\?\{\}\|\~\/\.]+';
|
||||
my $HOST_PATTERN = '[-a-z0-9_\+\:\/]+';
|
||||
my $ID_PATTERN = qr/[-a-zA-Z0-9_\+\:\=\!\#\$\%\&\*\^\?\{\}\|\~\/\.]+/;
|
||||
my $HOST_PATTERN = qr/[-a-zA-Z0-9_\+\:\/]+/;
|
||||
|
||||
my @addrs;
|
||||
my %seen;
|
||||
while ($line =~ s/(?:mailto:)?\s*
|
||||
($ID_PATTERN \@
|
||||
$HOST_PATTERN(?:\.$HOST_PATTERN)+)//oix)
|
||||
($HOST_PATTERN(?:\.$HOST_PATTERN)+))//oix)
|
||||
{
|
||||
my $addr = $1;
|
||||
my $host = $2;
|
||||
next unless Mail::SpamAssassin::Util::is_fqdn_valid($host);
|
||||
next unless $self->{registryboundaries}->is_domain_valid($host);
|
||||
$addr =~ s/^mailto://;
|
||||
next if (defined ($seen{$addr})); $seen{$addr} = 1;
|
||||
push (@addrs, $addr);
|
||||
|
@ -25,13 +25,13 @@ use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Errno qw(ENOENT EACCES EBADF);
|
||||
use Mail::SpamAssassin::Util;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::AICache;
|
||||
|
||||
# 256 KiB is a big email, unless stated otherwise
|
||||
use constant BIG_BYTES => 256*1024;
|
||||
# 500 KiB is a big email, unless stated otherwise
|
||||
use constant BIG_BYTES => 500*1024;
|
||||
|
||||
our ( $MESSAGES, $AICache, %class_opts );
|
||||
|
||||
@ -43,9 +43,9 @@ Mail::SpamAssassin::ArchiveIterator - find and process messages one at a time
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
my $iter = new Mail::SpamAssassin::ArchiveIterator(
|
||||
my $iter = Mail::SpamAssassin::ArchiveIterator->new(
|
||||
{
|
||||
'opt_max_size' => 256 * 1024, # 0 implies no limit
|
||||
'opt_max_size' => 500 * 1024, # 0 implies no limit
|
||||
'opt_cache' => 1,
|
||||
}
|
||||
);
|
||||
@ -77,7 +77,7 @@ and C<result_sub> functions appropriately per message.
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $item = new Mail::SpamAssassin::ArchiveIterator( [ { opt => val, ... } ] )
|
||||
=item $item = Mail::SpamAssassin::ArchiveIterator->new( [ { opt => val, ... } ] )
|
||||
|
||||
Constructs a new C<Mail::SpamAssassin::ArchiveIterator> object. You may
|
||||
pass the following attribute-value pairs to the constructor. The pairs are
|
||||
@ -91,7 +91,7 @@ A value of option I<opt_max_size> determines a limit (number of bytes)
|
||||
beyond which a message is considered large and is skipped by ArchiveIterator.
|
||||
|
||||
A value 0 implies no size limit, all messages are examined. An undefined
|
||||
value implies a default limit of 256 KiB.
|
||||
value implies a default limit of 500 KiB.
|
||||
|
||||
=item opt_all
|
||||
|
||||
@ -205,8 +205,6 @@ sub new {
|
||||
$self->{s} = [ ]; # spam, of course
|
||||
$self->{h} = [ ]; # ham, as if you couldn't guess
|
||||
|
||||
$self->{access_problem} = 0;
|
||||
|
||||
if ($self->{opt_all}) {
|
||||
$self->{opt_max_size} = 0;
|
||||
} elsif (!defined $self->{opt_max_size}) {
|
||||
@ -235,10 +233,14 @@ sub set_functions {
|
||||
|
||||
=item run ( @target_paths )
|
||||
|
||||
Generates the list of messages to process, then runs each message through the
|
||||
configured wanted subroutine. Files which have a name ending in C<.gz> or
|
||||
C<.bz2> will be properly uncompressed via call to C<gzip -dc> and C<bzip2 -dc>
|
||||
respectively.
|
||||
Generates the list of messages to process, then runs each message through
|
||||
the configured wanted subroutine.
|
||||
|
||||
Compressed files are detected and uncompressed automatically regardless of
|
||||
file extension. Supported formats are C<gzip>, C<bzip2>, C<xz>, C<lz4>,
|
||||
C<lzip>, C<lzo>. Gzip is uncompressed via IO::Zlib module, others use their
|
||||
specific command line tool (bzip2/xz/lz4/lzip/lzop). Compressed
|
||||
mailbox/mbox files are not supported.
|
||||
|
||||
The target_paths array is expected to be either one element per path in the
|
||||
following format: C<class:format:raw_location>, or a hash reference containing
|
||||
@ -292,6 +294,11 @@ sub run {
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Find some uncompressors (gzip is handled with IO::Zlib)
|
||||
foreach ('bzip2','xz','lz4','lzip','lzop') {
|
||||
$self->{$_.'_path'} = Mail::SpamAssassin::Util::find_executable_in_env_path($_);
|
||||
}
|
||||
|
||||
# scan the targets and get the number and list of messages
|
||||
$self->_scan_targets(\@targets,
|
||||
sub {
|
||||
@ -316,11 +323,16 @@ sub run {
|
||||
sub _run {
|
||||
my ($self, $messages) = @_;
|
||||
|
||||
my $messages_run = 0;
|
||||
while (my $message = shift @{$messages}) {
|
||||
my($class, undef, $date, undef, $result) = $self->_run_message($message);
|
||||
&{$self->{result_sub}}($class, $result, $date) if $result;
|
||||
if ($result) {
|
||||
$messages_run++;
|
||||
&{$self->{result_sub}}($class, $result, $date);
|
||||
}
|
||||
return ! $self->{access_problem};
|
||||
}
|
||||
# Return success if atleast some files were processed through
|
||||
return $messages_run > 0;
|
||||
}
|
||||
|
||||
############################################################################
|
||||
@ -346,24 +358,8 @@ sub _run_message {
|
||||
sub _run_file {
|
||||
my ($self, $class, $format, $where, $date) = @_;
|
||||
|
||||
if (!_mail_open($where)) {
|
||||
$self->{access_problem} = 1;
|
||||
return;
|
||||
}
|
||||
|
||||
my $stat_errn = stat(INPUT) ? 0 : 0+$!;
|
||||
if ($stat_errn == ENOENT) {
|
||||
dbg("archive-iterator: no such input ($where)");
|
||||
return;
|
||||
}
|
||||
elsif ($stat_errn != 0) {
|
||||
warn "archive-iterator: no access to input ($where): $!";
|
||||
return;
|
||||
}
|
||||
elsif (!-f _ && !-c _ && !-p _) {
|
||||
warn "archive-iterator: not a plain file (or char.spec. or pipe) ($where)";
|
||||
return;
|
||||
}
|
||||
my $fh = $self->_mail_open($where, 1);
|
||||
return unless $fh;
|
||||
|
||||
my $opt_max_size = $self->{opt_max_size};
|
||||
if (!$opt_max_size) {
|
||||
@ -375,7 +371,7 @@ sub _run_file {
|
||||
# note that -s can only deal with files, it returns 0 on char.spec. STDIN
|
||||
info("archive-iterator: skipping large message: ".
|
||||
"file size %d, limit %d bytes", -s _, $opt_max_size);
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
return;
|
||||
}
|
||||
|
||||
@ -384,12 +380,12 @@ sub _run_file {
|
||||
my $len = 0;
|
||||
my $str = '';
|
||||
my($inbuf,$nread);
|
||||
while ( $nread=read(INPUT,$inbuf,16384) ) {
|
||||
while ( $nread=read($fh,$inbuf,16384) ) {
|
||||
$len += $nread;
|
||||
if ($opt_max_size && $len > $opt_max_size) {
|
||||
info("archive-iterator: skipping large message: read %d, limit %d bytes",
|
||||
$len, $opt_max_size);
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
return;
|
||||
}
|
||||
$str .= $inbuf;
|
||||
@ -400,7 +396,7 @@ sub _run_file {
|
||||
for my $j (0..$#msg) {
|
||||
if ($msg[$j] =~ /^\015?$/) { $header = $j; last }
|
||||
}
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
|
||||
if ($date == AI_TIME_UNKNOWN && $self->{determine_receive_date}) {
|
||||
$date = Mail::SpamAssassin::Util::receive_date(join('', splice(@msg, 0, $header)));
|
||||
@ -418,21 +414,20 @@ sub _run_mailbox {
|
||||
}
|
||||
my @msg;
|
||||
my $header;
|
||||
if (!_mail_open($file)) {
|
||||
$self->{access_problem} = 1;
|
||||
return;
|
||||
}
|
||||
|
||||
my $fh = $self->_mail_open($file, 1);
|
||||
return unless $fh;
|
||||
|
||||
my $opt_max_size = $self->{opt_max_size};
|
||||
dbg("archive-iterator: _run_mailbox %s, ofs %d, limit %d",
|
||||
$file, $offset, $opt_max_size||0);
|
||||
|
||||
seek(INPUT,$offset,0) or die "cannot reposition file to $offset: $!";
|
||||
seek($fh,$offset,0) or die "cannot reposition file to $offset: $!";
|
||||
|
||||
my $size = 0;
|
||||
for ($!=0; <INPUT>; $!=0) {
|
||||
for ($!=0; <$fh>; $!=0) {
|
||||
#Changed Regex to use option Per bug 6703
|
||||
last if (substr($_,0,5) eq "From " && @msg && /$self->{opt_from_regex}/o);
|
||||
last if (/^From / && @msg && $_ =~ $self->{opt_from_regex});
|
||||
$size += length($_);
|
||||
push (@msg, $_);
|
||||
|
||||
@ -441,7 +436,7 @@ sub _run_mailbox {
|
||||
info("archive-iterator: skipping large message: ".
|
||||
"%d lines, %d bytes, limit %d bytes",
|
||||
scalar @msg, $size, $opt_max_size);
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
return;
|
||||
}
|
||||
|
||||
@ -452,7 +447,7 @@ sub _run_mailbox {
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("archive-iterator: error reading: $!")
|
||||
: die "error reading: $!";
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
|
||||
if ($date == AI_TIME_UNKNOWN && $self->{determine_receive_date}) {
|
||||
$date = Mail::SpamAssassin::Util::receive_date(join('', splice(@msg, 0, $header)));
|
||||
@ -464,23 +459,24 @@ sub _run_mailbox {
|
||||
sub _run_mbx {
|
||||
my ($self, $class, $format, $where, $date) = @_;
|
||||
|
||||
my ($file, $offset) = ($where =~ m/(.*)\.(\d+)$/);
|
||||
my ($file, $offset);
|
||||
{ local($1,$2); # Bug 7140 (avoids perl bug [perl #123880])
|
||||
($file, $offset) = ($where =~ m/(.*)\.(\d+)$/);
|
||||
}
|
||||
my @msg;
|
||||
my $header;
|
||||
|
||||
if (!_mail_open($file)) {
|
||||
$self->{access_problem} = 1;
|
||||
return;
|
||||
}
|
||||
my $fh = $self->_mail_open($file, 1);
|
||||
return unless $fh;
|
||||
|
||||
my $opt_max_size = $self->{opt_max_size};
|
||||
dbg("archive-iterator: _run_mbx %s, ofs %d, limit %d",
|
||||
$file, $offset, $opt_max_size||0);
|
||||
|
||||
seek(INPUT,$offset,0) or die "cannot reposition file to $offset: $!";
|
||||
seek($fh,$offset,0) or die "cannot reposition file to $offset: $!";
|
||||
|
||||
my $size = 0;
|
||||
for ($!=0; <INPUT>; $!=0) {
|
||||
for ($!=0; <$fh>; $!=0) {
|
||||
last if ($_ =~ MBX_SEPARATOR);
|
||||
$size += length($_);
|
||||
push (@msg, $_);
|
||||
@ -490,7 +486,7 @@ sub _run_mbx {
|
||||
info("archive-iterator: skipping large message: ".
|
||||
"%d lines, %d bytes, limit %d bytes",
|
||||
scalar @msg, $size, $opt_max_size);
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
return;
|
||||
}
|
||||
|
||||
@ -501,7 +497,7 @@ sub _run_mbx {
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("archive-iterator: error reading: $!")
|
||||
: die "error reading: $!";
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
|
||||
if ($date == AI_TIME_UNKNOWN && $self->{determine_receive_date}) {
|
||||
$date = Mail::SpamAssassin::Util::receive_date(join('', splice(@msg, 0, $header)));
|
||||
@ -579,12 +575,9 @@ sub _scan_targets {
|
||||
if ($format eq 'detect') {
|
||||
# detect the format
|
||||
my $stat_errn = stat($location) ? 0 : 0+$!;
|
||||
if ($stat_errn == ENOENT) {
|
||||
$thisformat = 'file'; # actually, no file - to be detected later
|
||||
}
|
||||
elsif ($stat_errn != 0) {
|
||||
warn "archive-iterator: no access to $location: $!";
|
||||
$thisformat = 'file';
|
||||
if ($stat_errn != 0) {
|
||||
warn "archive-iterator: no access to $location: $!\n";
|
||||
next;
|
||||
}
|
||||
elsif (-d _) {
|
||||
# it's a directory
|
||||
@ -623,37 +616,121 @@ sub _scan_targets {
|
||||
}
|
||||
|
||||
sub _mail_open {
|
||||
my ($file) = @_;
|
||||
my ($self, $file, $ignore_missing) = @_;
|
||||
my $fh;
|
||||
|
||||
# Go ahead and try to open the file
|
||||
# bug 5288: the "magic" version of open will strip leading and trailing
|
||||
# whitespace from the expression. switch to the three-argument version
|
||||
# of open which does not strip whitespace. see "perldoc -f open" and
|
||||
# "perldoc perlipc" for more information.
|
||||
|
||||
# Assume that the file by default is just a plain file
|
||||
my @expr = ( $file );
|
||||
my $mode = '<';
|
||||
|
||||
# Handle different types of compressed files
|
||||
if ($file =~ /\.gz$/) {
|
||||
$mode = '-|';
|
||||
unshift @expr, 'gunzip', '-cd';
|
||||
if (!open($fh, '<', $file)) {
|
||||
# Don't warn about disappeared files
|
||||
if ($ignore_missing && $! == ENOENT) {
|
||||
dbg("archive-iterator: no access to $file: $!");
|
||||
} else {
|
||||
warn "archive-iterator: no access to $file: $!\n"
|
||||
}
|
||||
elsif ($file =~ /\.bz2$/) {
|
||||
$mode = '-|';
|
||||
unshift @expr, 'bzip2', '-cd';
|
||||
}
|
||||
|
||||
# Go ahead and try to open the file
|
||||
if (!open (INPUT, $mode, @expr)) {
|
||||
warn "archive-iterator: unable to open $file: $!\n";
|
||||
return 0;
|
||||
return;
|
||||
}
|
||||
|
||||
# bug 5249: mail could have 8-bit data, need this on some platforms
|
||||
binmode INPUT or die "cannot set input file to binmode: $!";
|
||||
binmode $fh or die "cannot set input file to binmode: $!";
|
||||
|
||||
return 1;
|
||||
# Detect compressed data (only from files, can't reopen pipe)
|
||||
if (-f $file && read($fh, my $magic, 6)) {
|
||||
# GZIP
|
||||
if ($magic =~ /^\x1F\x8B/) {
|
||||
dbg("archive-iterator: detected gzip file $file, reopening with IO::Zlib");
|
||||
close $fh or die "error closing input file: $!";
|
||||
eval { require IO::Zlib; };
|
||||
if ($@) { warn "archive-iterator: IO::Zlib required for $file: $@\n"; return; }
|
||||
$fh = IO::Zlib->new($file, "rb");
|
||||
if (!$fh) {
|
||||
if ($ignore_missing && $! == ENOENT) {
|
||||
dbg("archive-iterator: no access to $file: $!");
|
||||
} else {
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
}
|
||||
return;
|
||||
}
|
||||
}
|
||||
# BZIP2
|
||||
elsif ($magic =~ /^\x42\x5A(?:\x68|\x30)/) {
|
||||
dbg("archive-iterator: detected bzip2 file $file, reopening with bzip2");
|
||||
close $fh or die "error closing input file: $!";
|
||||
if (!$self->{bzip2_path}) {
|
||||
warn "archive-iterator: bzip2 executable required for $file\n";
|
||||
return;
|
||||
}
|
||||
if (!open($fh, '-|', $self->{bzip2_path}, '-cd', $file)) {
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
return;
|
||||
}
|
||||
binmode $fh or die "cannot set input file to binmode: $!";
|
||||
}
|
||||
# XZ
|
||||
elsif ($magic =~ /^\xFD\x37\x7A\x58\x5A\x00/) {
|
||||
dbg("archive-iterator: detected xz file $file, reopening with xz");
|
||||
close $fh or die "error closing input file: $!";
|
||||
if (!$self->{xz_path}) {
|
||||
warn "archive-iterator: xz executable required for $file\n";
|
||||
return;
|
||||
}
|
||||
if (!open($fh, '-|', $self->{xz_path}, '-cd', $file)) {
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
return;
|
||||
}
|
||||
binmode $fh or die "cannot set input file to binmode: $!";
|
||||
}
|
||||
# LZ4
|
||||
elsif ($magic =~ /^\x04\x22\x4D\x18/) {
|
||||
dbg("archive-iterator: detected lz4 file $file, reopening with lz4");
|
||||
close $fh or die "error closing input file: $!";
|
||||
if (!$self->{lz4_path}) {
|
||||
warn "archive-iterator: lz4 executable required for $file\n";
|
||||
return;
|
||||
}
|
||||
if (!open($fh, '-|', $self->{lz4_path}, '-cd', $file)) {
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
return;
|
||||
}
|
||||
binmode $fh or die "cannot set input file to binmode: $!";
|
||||
}
|
||||
# LZIP
|
||||
elsif ($magic =~ /^\x4C\x5A\x49\x50/) {
|
||||
dbg("archive-iterator: detected lzip file $file, reopening with lzip");
|
||||
close $fh or die "error closing input file: $!";
|
||||
if (!$self->{lzip_path}) {
|
||||
warn "archive-iterator: lzip executable required for $file\n";
|
||||
return;
|
||||
}
|
||||
if (!open($fh, '-|', $self->{lzip_path}, '-cd', $file)) {
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
return;
|
||||
}
|
||||
binmode $fh or die "cannot set input file to binmode: $!";
|
||||
}
|
||||
# LZO
|
||||
elsif ($magic =~ /^\x89\x4C\x5A\x4F\x00\x0D/) {
|
||||
dbg("archive-iterator: detected lzo file $file, reopening with lzop");
|
||||
close $fh or die "error closing input file: $!";
|
||||
if (!$self->{lzop_path}) {
|
||||
warn "archive-iterator: lzop executable required for $file\n";
|
||||
return;
|
||||
}
|
||||
if (!open($fh, '-|', $self->{lzop_path}, '-cd', $file)) {
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
return;
|
||||
}
|
||||
binmode $fh or die "cannot set input file to binmode: $!";
|
||||
} else {
|
||||
# Reset position
|
||||
seek($fh,0,0);
|
||||
}
|
||||
}
|
||||
|
||||
return $fh;
|
||||
}
|
||||
|
||||
sub _set_default_message_selection_opts {
|
||||
@ -663,11 +740,15 @@ sub _set_default_message_selection_opts {
|
||||
$self->{opt_want_date} = 1 unless (defined $self->{opt_want_date});
|
||||
$self->{opt_cache} = 0 unless (defined $self->{opt_cache});
|
||||
#Changed Regex to include boundaries for Communigate Pro versions (5.2.x and later). per Bug 6413
|
||||
$self->{opt_from_regex} = '^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4}|.?\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)' unless (defined $self->{opt_from_regex});
|
||||
|
||||
#STRIP LEADING AND TRAILING / FROM REGEX FOR OPTION
|
||||
$self->{opt_from_regex} =~ s/^\///;
|
||||
$self->{opt_from_regex} =~ s/\/$//;
|
||||
if (!defined $self->{opt_from_regex}) {
|
||||
$self->{opt_from_regex} = qr/^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4}|.?\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)/;
|
||||
} elsif (ref($self->{opt_from_regex}) ne 'Regexp') {
|
||||
my ($rec, $err) = compile_regexp($self->{opt_from_regex}, 1);
|
||||
if (!$rec) {
|
||||
die "fatal: invalid mbox_format_from_regex '$self->{opt_from_regex}': $err\n";
|
||||
}
|
||||
$self->{opt_from_regex} = $rec;
|
||||
}
|
||||
|
||||
dbg("archive-iterator: _set_default_message_selection_opts After: Scanprob[$self->{opt_scanprob}], want_date[$self->{opt_want_date}], cache[$self->{opt_cache}], from_regex[$self->{opt_from_regex}]");
|
||||
|
||||
@ -750,7 +831,7 @@ sub _scan_directory {
|
||||
# Maildir format: bug 3003
|
||||
for my $sub ("new", "cur") {
|
||||
opendir (DIR, "$folder/$sub")
|
||||
or die "Can't open '$folder/$sub' dir: $!\n";
|
||||
or die "archive-iterator: can't open '$folder/$sub' dir: $!\n";
|
||||
# Don't learn from messages marked as deleted
|
||||
# Or files starting with a leading dot
|
||||
push @files, map { "$sub/$_" } grep { !/^\.|:2,.*T/ } readdir(DIR);
|
||||
@ -790,9 +871,10 @@ sub _scan_directory {
|
||||
my $stat_errn = stat($file) ? 0 : 0+$!;
|
||||
if ($stat_errn == ENOENT) {
|
||||
# no longer there?
|
||||
dbg("archive-iterator: no access to $file: $!");
|
||||
}
|
||||
elsif ($stat_errn != 0) {
|
||||
warn "archive-iterator: no access to $file: $!";
|
||||
warn "archive-iterator: no access to $file: $!\n";
|
||||
}
|
||||
elsif (-f _ || -c _ || -p _) {
|
||||
$self->_scan_file($class, $file, $bkfunc);
|
||||
@ -801,7 +883,7 @@ sub _scan_directory {
|
||||
push(@subdirs, $file);
|
||||
}
|
||||
else {
|
||||
warn "archive-iterator: $file is not a plain file or directory: $!";
|
||||
warn "archive-iterator: $file is not a plain file or directory\n";
|
||||
}
|
||||
}
|
||||
undef @files; # release storage
|
||||
@ -844,18 +926,17 @@ sub _scan_file {
|
||||
}
|
||||
|
||||
my $header = '';
|
||||
if (!_mail_open($mail)) {
|
||||
$self->{access_problem} = 1;
|
||||
return;
|
||||
}
|
||||
for ($!=0; <INPUT>; $!=0) {
|
||||
my $fh = $self->_mail_open($mail);
|
||||
return unless $fh;
|
||||
|
||||
for ($!=0; <$fh>; $!=0) {
|
||||
last if /^\015?$/s;
|
||||
$header .= $_;
|
||||
}
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("archive-iterator: error reading: $!")
|
||||
: die "error reading: $!";
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
|
||||
return if ($self->{opt_skip_empty_messages} && $header eq '');
|
||||
|
||||
@ -896,7 +977,6 @@ sub _scan_mailbox {
|
||||
$folder =~ s/\/\s*$//; #Remove trailing slash, if there
|
||||
if (!opendir(DIR, $folder)) {
|
||||
warn "archive-iterator: can't open '$folder' dir: $!\n";
|
||||
$self->{access_problem} = 1;
|
||||
return;
|
||||
}
|
||||
while ($_ = readdir(DIR)) {
|
||||
@ -921,9 +1001,8 @@ sub _scan_mailbox {
|
||||
|
||||
foreach my $file (@files) {
|
||||
$self->_bump_scan_progress();
|
||||
if ($file =~ /\.(?:gz|bz2)$/) {
|
||||
if ($file =~ /\.(?:gz|bz2|xz|lz[o4]?)$/i) {
|
||||
warn "archive-iterator: compressed mbox folders are not supported at this time\n";
|
||||
$self->{access_problem} = 1;
|
||||
next;
|
||||
}
|
||||
|
||||
@ -943,20 +1022,18 @@ sub _scan_mailbox {
|
||||
}
|
||||
|
||||
unless ($count) {
|
||||
if (!_mail_open($file)) {
|
||||
$self->{access_problem} = 1;
|
||||
next;
|
||||
}
|
||||
my $fh = $self->_mail_open($file);
|
||||
next unless $fh;
|
||||
|
||||
my $start = 0; # start of a message
|
||||
my $where = 0; # current byte offset
|
||||
my $first = ''; # first line of message
|
||||
my $header = ''; # header text
|
||||
my $in_header = 0; # are in we a header?
|
||||
while (!eof INPUT) {
|
||||
while (!eof $fh) {
|
||||
my $offset = $start; # byte offset of this message
|
||||
my $header = $first; # remember first line
|
||||
for ($!=0; <INPUT>; $!=0) {
|
||||
for ($!=0; <$fh>; $!=0) {
|
||||
if ($in_header) {
|
||||
if (/^\015?$/s) {
|
||||
$in_header = 0;
|
||||
@ -966,15 +1043,15 @@ sub _scan_mailbox {
|
||||
}
|
||||
}
|
||||
#Changed Regex to use option Per bug 6703
|
||||
if (substr($_,0,5) eq "From " && /$self->{opt_from_regex}/o) {
|
||||
if (/^From / && $_ =~ $self->{opt_from_regex}) {
|
||||
$in_header = 1;
|
||||
$first = $_;
|
||||
$start = $where;
|
||||
$where = tell INPUT;
|
||||
$where = tell $fh;
|
||||
$where >= 0 or die "cannot obtain file position: $!";
|
||||
last;
|
||||
}
|
||||
$where = tell INPUT;
|
||||
$where = tell $fh;
|
||||
$where >= 0 or die "cannot obtain file position: $!";
|
||||
}
|
||||
defined $_ || $!==0 or
|
||||
@ -986,7 +1063,7 @@ sub _scan_mailbox {
|
||||
$info->{$offset} = Mail::SpamAssassin::Util::receive_date($header);
|
||||
}
|
||||
}
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
}
|
||||
|
||||
while(my($k,$v) = each %{$info}) {
|
||||
@ -1027,7 +1104,6 @@ sub _scan_mbx {
|
||||
$folder =~ s/\/\s*$//; # remove trailing slash, if there is one
|
||||
if (!opendir(DIR, $folder)) {
|
||||
warn "archive-iterator: can't open '$folder' dir: $!\n";
|
||||
$self->{access_problem} = 1;
|
||||
return;
|
||||
}
|
||||
while ($_ = readdir(DIR)) {
|
||||
@ -1053,9 +1129,8 @@ sub _scan_mbx {
|
||||
foreach my $file (@files) {
|
||||
$self->_bump_scan_progress();
|
||||
|
||||
if ($folder =~ /\.(?:gz|bz2)$/) {
|
||||
if ($folder =~ /\.(?:gz|bz2|xz|lz[o4]?)$/i) {
|
||||
warn "archive-iterator: compressed mbx folders are not supported at this time\n";
|
||||
$self->{access_problem} = 1;
|
||||
next;
|
||||
}
|
||||
|
||||
@ -1075,13 +1150,11 @@ sub _scan_mbx {
|
||||
}
|
||||
|
||||
unless ($count) {
|
||||
if (!_mail_open($file)) {
|
||||
$self->{access_problem} = 1;
|
||||
next;
|
||||
}
|
||||
my $fh = $self->_mail_open($file);
|
||||
next unless $fh;
|
||||
|
||||
# check the mailbox is in mbx format
|
||||
$! = 0; $fp = <INPUT>;
|
||||
$! = 0; $fp = <$fh>;
|
||||
defined $fp || $!==0 or
|
||||
$!==EBADF ? dbg("archive-iterator: error reading: $!")
|
||||
: die "error reading: $!";
|
||||
@ -1092,18 +1165,17 @@ sub _scan_mbx {
|
||||
}
|
||||
|
||||
# skip mbx headers to the first email...
|
||||
seek(INPUT,2048,0) or die "cannot reposition file to 2048: $!";
|
||||
my $sep = MBX_SEPARATOR;
|
||||
seek($fh,2048,0) or die "cannot reposition file to 2048: $!";
|
||||
|
||||
for ($!=0; <INPUT>; $!=0) {
|
||||
if ($_ =~ /$sep/) {
|
||||
my $offset = tell INPUT;
|
||||
for ($!=0; <$fh>; $!=0) {
|
||||
if ($_ =~ MBX_SEPARATOR) {
|
||||
my $offset = tell $fh;
|
||||
$offset >= 0 or die "cannot obtain file position: $!";
|
||||
my $size = $2;
|
||||
|
||||
# gather up the headers...
|
||||
my $header = '';
|
||||
for ($!=0; <INPUT>; $!=0) {
|
||||
for ($!=0; <$fh>; $!=0) {
|
||||
last if (/^\015?$/s);
|
||||
$header .= $_;
|
||||
}
|
||||
@ -1116,7 +1188,7 @@ sub _scan_mbx {
|
||||
}
|
||||
|
||||
# go onto the next message
|
||||
seek(INPUT, $offset + $size, 0)
|
||||
seek($fh, $offset + $size, 0)
|
||||
or die "cannot reposition file to $offset + $size: $!";
|
||||
}
|
||||
else {
|
||||
@ -1126,7 +1198,7 @@ sub _scan_mbx {
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("archive-iterator: error reading: $!")
|
||||
: die "error reading: $!";
|
||||
close INPUT or die "error closing input file: $!";
|
||||
close $fh or die "error closing input file: $!";
|
||||
}
|
||||
|
||||
while(my($k,$v) = each %{$info}) {
|
||||
|
@ -42,6 +42,7 @@ use Time::HiRes qw(time);
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(idn_to_ascii domain_to_search_list);
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
@ -71,9 +72,9 @@ sub new {
|
||||
main => $main,
|
||||
queries_started => 0,
|
||||
queries_completed => 0,
|
||||
total_queries_started => 0,
|
||||
total_queries_completed => 0,
|
||||
pending_lookups => { },
|
||||
pending_rules => { }, # maintain pending rules list for meta evaluation
|
||||
rules_for_key => { }, # record all rules used by a key for logging
|
||||
timing_by_query => { },
|
||||
all_lookups => { }, # keyed by "rr_type/domain"
|
||||
};
|
||||
@ -82,66 +83,54 @@ sub new {
|
||||
$self;
|
||||
}
|
||||
|
||||
# Given a domain name, produces a listref of successively stripped down
|
||||
# parent domains, e.g. a domain '2.10.Example.COM' would produce a list:
|
||||
# '2.10.example.com', '10.example.com', 'example.com', 'com', ''
|
||||
#
|
||||
sub domain_to_search_list {
|
||||
my ($domain) = @_;
|
||||
$domain =~ s/^\.+//; $domain =~ s/\.+\z//; # strip leading and trailing dots
|
||||
my @search_keys;
|
||||
if ($domain =~ /\[/) { # don't split address literals
|
||||
@search_keys = ( $domain, '' ); # presumably an address literal
|
||||
} else {
|
||||
local $1;
|
||||
$domain = lc $domain;
|
||||
for (;;) {
|
||||
push(@search_keys, $domain);
|
||||
last if $domain eq '';
|
||||
# strip one level
|
||||
$domain = ($domain =~ /^ (?: [^.]* ) \. (.*) \z/xs) ? $1 : '';
|
||||
}
|
||||
if (@search_keys > 20) { # enforce some sanity limit
|
||||
@search_keys = @search_keys[$#search_keys-19 .. $#search_keys];
|
||||
}
|
||||
}
|
||||
return \@search_keys;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->start_lookup($ent, $master_deadline)
|
||||
=item $ent = $async->bgsend_and_start_lookup($name, $type, $class, $ent, $cb, %options)
|
||||
|
||||
Register the start of a long-running asynchronous lookup operation.
|
||||
C<$ent> is a hash reference containing the following items:
|
||||
Launch async DNS lookups. This is the only official method supported for
|
||||
plugins since version 4.0.0. Do not use bgsend and start_lookup separately.
|
||||
|
||||
Merges duplicate queries automatically, only launches one and calls all
|
||||
related callbacks on answer.
|
||||
|
||||
=over 4
|
||||
|
||||
=item key (required)
|
||||
=item $name (required)
|
||||
|
||||
A key string, unique to this lookup. This is what is reported in
|
||||
debug messages, used as the key for C<get_lookup()>, etc.
|
||||
Name to query.
|
||||
|
||||
=item id (required)
|
||||
=item $type (required)
|
||||
|
||||
An ID string, also unique to this lookup. Typically, this is the DNS packet ID
|
||||
as returned by DnsResolver's C<bgsend> method. Sadly, the Net::DNS
|
||||
architecture forces us to keep a separate ID string for this task instead of
|
||||
reusing C<key> -- if you are not using DNS lookups through DnsResolver, it
|
||||
should be OK to just reuse C<key>.
|
||||
Type to query, A, TXT, NS, etc.
|
||||
|
||||
=item type (required)
|
||||
=item $class (required/deprecated)
|
||||
|
||||
Deprecated, ignored, set as undef.
|
||||
|
||||
=item C<$ent> is a required hash reference containing the following items:
|
||||
|
||||
=over 4
|
||||
|
||||
=item $ent->{rulename} (required)
|
||||
|
||||
The rulename that started and/or depends on this query. Required for rule
|
||||
dependencies to work correctly. Can be a single rulename, or array of
|
||||
multiple rulenames.
|
||||
|
||||
=item $ent->{type} (optional)
|
||||
|
||||
A string, typically one word, used to describe the type of lookup in log
|
||||
messages, such as C<DNSBL>, C<MX>, C<TXT>.
|
||||
messages, such as C<DNSBL>, C<URIBL-A>. If not defined, default is value of
|
||||
$type.
|
||||
|
||||
=item zone (optional)
|
||||
=item $ent->{zone} (optional)
|
||||
|
||||
A zone specification (typically a DNS zone name - e.g. host, domain, or RBL)
|
||||
which may be used as a key to look up per-zone settings. No semantics on this
|
||||
parameter is imposed by this module. Currently used to fetch by-zone timeouts.
|
||||
A zone specification (typically a DNS zone name - e.g. host, domain, or
|
||||
RBL) which may be used as a key to look up per-zone settings. No semantics
|
||||
on this parameter is imposed by this module. Currently used to fetch
|
||||
by-zone timeouts (from rbl_timeout setting). Defaults to $name.
|
||||
|
||||
=item timeout_initial (optional)
|
||||
=item $ent->{timeout_initial} (optional)
|
||||
|
||||
An initial value of elapsed time for which we are willing to wait for a
|
||||
response (time in seconds, floating point value is allowed). When elapsed
|
||||
@ -158,20 +147,267 @@ variable rbl_timeout.
|
||||
If a value of the timeout_initial parameter is below timeout_min, the initial
|
||||
timeout is set to timeout_min.
|
||||
|
||||
=item timeout_min (optional)
|
||||
=item $ent->{timeout_min} (optional)
|
||||
|
||||
A lower bound (in seconds) to which the actual timeout approaches as the
|
||||
number of queries completed approaches the number of all queries started.
|
||||
Defaults to 0.2 * timeout_initial.
|
||||
|
||||
=item $ent->{key}, $ent->{id} (deprecated)
|
||||
|
||||
Deprecated, ignored, automatically generated since 4.0.0.
|
||||
|
||||
=item $ent->{YOUR_OWN_ITEM}
|
||||
|
||||
Any other custom values/objects that you want to pass on to the answer
|
||||
callback.
|
||||
|
||||
=back
|
||||
|
||||
C<$ent> is returned by this method, with its contents augmented by additional
|
||||
information.
|
||||
=item $cb (required)
|
||||
|
||||
Callback function for answer, called as $cb->($ent, $pkt). C<$ent> is the
|
||||
same object that bgsend_and_start_lookup was called with. C<$pkt> is the
|
||||
packet object for the response, Net::DNS:RR objects can be found from
|
||||
$pkt->answer.
|
||||
|
||||
=item %options (required)
|
||||
|
||||
Hash of options. Only supported and required option is master_deadline:
|
||||
|
||||
master_deadline => $pms->{master_deadline}
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub start_queue {
|
||||
my($self) = @_;
|
||||
|
||||
$self->{wait_queue} = 1;
|
||||
}
|
||||
|
||||
sub launch_queue {
|
||||
my($self) = @_;
|
||||
|
||||
delete $self->{wait_queue};
|
||||
|
||||
if ($self->{bgsend_queue}) {
|
||||
dbg("async: launching queued lookups");
|
||||
foreach (@{$self->{bgsend_queue}}) {
|
||||
$self->bgsend_and_start_lookup(@$_);
|
||||
}
|
||||
delete $self->{bgsend_queue};
|
||||
}
|
||||
}
|
||||
|
||||
sub bgsend_and_start_lookup {
|
||||
my $self = shift;
|
||||
my($domain, $type, $class, $ent, $cb, %options) = @_;
|
||||
|
||||
return if $self->{main}->{resolver}->{no_resolver};
|
||||
|
||||
# Waiting for priority -100 to launch?
|
||||
if ($self->{wait_queue}) {
|
||||
push @{$self->{bgsend_queue}}, [@_];
|
||||
dbg("async: DNS priority not reached, queueing lookup: $domain/$type");
|
||||
return $ent;
|
||||
}
|
||||
|
||||
if (!defined $ent->{rulename} && !$self->{rulename_warned}++) {
|
||||
my($package, $filename, $line) = caller;
|
||||
warn "async: bgsend_and_start_lookup called without rulename, ".
|
||||
"from $package ($filename) line $line. You are likely using ".
|
||||
"a plugin that is not compatible with SpamAssasin 4.0.0.";
|
||||
}
|
||||
|
||||
$domain =~ s/\.+\z//s; # strip trailing dots, these sometimes still sneak in
|
||||
$domain = idn_to_ascii($domain);
|
||||
|
||||
# At this point the $domain should already be encoded to UTF-8 and
|
||||
# IDN converted to ASCII-compatible encoding (ACE). Make sure this is
|
||||
# really the case in order to be able to catch any leftover omissions.
|
||||
if (utf8::is_utf8($domain)) {
|
||||
utf8::encode($domain);
|
||||
my($package, $filename, $line) = caller;
|
||||
info("bgsend_and_start_lookup: Unicode domain name, expected octets: %s, ".
|
||||
"called from %s line %d", $domain, $package, $line);
|
||||
} elsif ($domain =~ tr/\x00-\x7F//c) { # is not all-ASCII
|
||||
my($package, $filename, $line) = caller;
|
||||
info("bgsend_and_start_lookup: non-ASCII domain name: %s, ".
|
||||
"called from %s line %d", $domain, $package, $line);
|
||||
}
|
||||
|
||||
my $dnskey = uc($type).'/'.lc($domain);
|
||||
my $dns_query_info = $self->{all_lookups}{$dnskey};
|
||||
|
||||
$ent = {} if !$ent;
|
||||
$ent->{id} = undef;
|
||||
my $key = $ent->{key} = $dnskey;
|
||||
$ent->{query_type} = $type;
|
||||
$ent->{query_domain} = $domain;
|
||||
$ent->{type} = $type if !exists $ent->{type};
|
||||
$ent->{zone} = $domain if !exists $ent->{zone};
|
||||
$cb = $ent->{completed_callback} if !$cb; # compatibility with SA < 3.4
|
||||
|
||||
my @rulenames = grep { defined } (ref $ent->{rulename} ?
|
||||
@{$ent->{rulename}} : $ent->{rulename});
|
||||
|
||||
$self->{rules_for_key}->{$key}{$_} = 1 foreach (@rulenames);
|
||||
|
||||
if ($dns_query_info) { # DNS query already underway or completed
|
||||
if ($dns_query_info->{blocked}) {
|
||||
dbg("async: blocked by %s: %s, rules: %s", $dns_query_info->{blocked},
|
||||
$dnskey, join(", ", @rulenames));
|
||||
return;
|
||||
}
|
||||
my $id = $ent->{id} = $dns_query_info->{id}; # re-use existing query
|
||||
return if !defined $id; # presumably some fatal failure
|
||||
my $id_tail = $id; $id_tail =~ s{^\d+/IN/}{};
|
||||
lc($id_tail) eq lc($dnskey)
|
||||
or info("async: unmatched id %s, key=%s", $id, $dnskey);
|
||||
|
||||
my $pkt = $dns_query_info->{pkt};
|
||||
if (!$pkt) { # DNS query underway, still waiting for results
|
||||
# just add our query to the existing one
|
||||
push(@{$dns_query_info->{applicants}}, [$ent,$cb]);
|
||||
$self->{pending_rules}->{$_}{$key} = 1 foreach (@rulenames);
|
||||
dbg("async: query %s already underway, adding no.%d, rules: %s",
|
||||
$id, scalar @{$dns_query_info->{applicants}},
|
||||
join(", ", @rulenames));
|
||||
|
||||
} else { # DNS query already completed, re-use results
|
||||
# answer already known, just do the callback and be done with it
|
||||
delete $self->{pending_rules}->{$_}{$key} foreach (@rulenames);
|
||||
if (!$cb) {
|
||||
dbg("async: query %s already done, re-using for %s, rules: %s",
|
||||
$id, $key, join(", ", @rulenames));
|
||||
} else {
|
||||
dbg("async: query %s already done, re-using for %s, callback, rules: %s",
|
||||
$id, $key, join(", ", @rulenames));
|
||||
eval {
|
||||
$cb->($ent, $pkt); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (1) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("async: query %s completed, callback %s failed: %s\n",
|
||||
$id, $key, $@);
|
||||
};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
else { # no existing query, open a new DNS query
|
||||
$dns_query_info = $self->{all_lookups}{$dnskey} = {}; # new query needed
|
||||
my($id, $blocked, $check_dbrdom);
|
||||
# dns_query_restriction
|
||||
my $blocked_by = 'dns_query_restriction';
|
||||
my $dns_query_blockages = $self->{main}->{conf}->{dns_query_blocked};
|
||||
# dns_block_rule
|
||||
my $dns_block_domains = $self->{main}->{conf}->{dns_block_rule_domains};
|
||||
if ($dns_query_blockages || $dns_block_domains) {
|
||||
my $search_list = domain_to_search_list($domain);
|
||||
foreach my $parent_domain ((@$search_list, '*')) {
|
||||
if ($dns_query_blockages) {
|
||||
$blocked = $dns_query_blockages->{$parent_domain};
|
||||
last if defined $blocked; # stop at first defined, can be true or false
|
||||
}
|
||||
if ($parent_domain ne '*' && exists $dns_block_domains->{$parent_domain}) {
|
||||
# save for later check.. ps. untainted already
|
||||
$check_dbrdom = $dns_block_domains->{$parent_domain};
|
||||
}
|
||||
}
|
||||
}
|
||||
if (!$blocked && $check_dbrdom) {
|
||||
my $blockfile =
|
||||
$self->{main}->sed_path("__global_state_dir__/dnsblock_${check_dbrdom}");
|
||||
if (my $mtime = (stat($blockfile))[9]) {
|
||||
if (time - $mtime <= $self->{main}->{conf}->{dns_block_time}) {
|
||||
$blocked = 1;
|
||||
$blocked_by = 'dns_block_rule';
|
||||
} else {
|
||||
dbg("async: dns_block_rule removing expired $blockfile");
|
||||
unlink($blockfile);
|
||||
}
|
||||
}
|
||||
}
|
||||
if ($blocked) {
|
||||
dbg("async: blocked by %s: %s, rules: %s", $blocked_by, $dnskey,
|
||||
join(", ", @rulenames));
|
||||
$dns_query_info->{blocked} = $blocked_by;
|
||||
} else {
|
||||
dbg("async: launching %s, rules: %s", $dnskey, join(", ", @rulenames));
|
||||
$id = $self->{main}->{resolver}->bgsend($domain, $type, $class, sub {
|
||||
my($pkt, $pkt_id, $timestamp) = @_;
|
||||
# this callback sub is called from DnsResolver::poll_responses()
|
||||
# dbg("async: in a bgsend_and_start_lookup callback, id %s", $pkt_id);
|
||||
if ($pkt_id ne $id) {
|
||||
warn "async: mismatched dns id: got $pkt_id, expected $id\n";
|
||||
return;
|
||||
}
|
||||
$self->set_response_packet($pkt_id, $pkt, $ent->{key}, $timestamp);
|
||||
$dns_query_info->{pkt} = $pkt;
|
||||
my $cb_count = 0;
|
||||
foreach my $tuple (@{$dns_query_info->{applicants}}) {
|
||||
my($appl_ent, $appl_cb) = @$tuple;
|
||||
my @rulenames = grep { defined } (ref $appl_ent->{rulename} ?
|
||||
@{$appl_ent->{rulename}} : $appl_ent->{rulename});
|
||||
foreach (@rulenames) {
|
||||
delete $self->{pending_rules}->{$_}{$appl_ent->{key}};
|
||||
}
|
||||
if ($appl_cb) {
|
||||
dbg("async: calling callback on key %s, rules: %s",
|
||||
$key, join(", ", @rulenames));
|
||||
$cb_count++;
|
||||
eval {
|
||||
$appl_cb->($appl_ent, $pkt); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (2) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("async: query %s completed, callback %s failed: %s\n",
|
||||
$id, $appl_ent->{key}, $@);
|
||||
};
|
||||
}
|
||||
}
|
||||
delete $dns_query_info->{applicants};
|
||||
dbg("async: query $id completed, no callbacks run") if !$cb_count;
|
||||
});
|
||||
}
|
||||
return if !defined $id;
|
||||
$dns_query_info->{id} = $ent->{id} = $id;
|
||||
push(@{$dns_query_info->{applicants}}, [$ent,$cb]);
|
||||
$self->{pending_rules}->{$_}{$key} = 1 foreach (@rulenames);
|
||||
$self->_start_lookup($ent, $options{master_deadline});
|
||||
}
|
||||
return $ent;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->start_lookup($ent, $master_deadline)
|
||||
|
||||
DIRECT USE DEPRECATED since 4.0.0, please use bgsend_and_start_lookup.
|
||||
|
||||
=cut
|
||||
|
||||
sub start_lookup {
|
||||
my $self = shift;
|
||||
|
||||
if (!$self->{start_lookup_warned}++) {
|
||||
my($package, $filename, $line) = caller;
|
||||
warn "async: deprecated start_lookup called, ".
|
||||
"from $package ($filename) line $line. You are likely using ".
|
||||
"a plugin that is not compatible with SpamAssasin 4.0.0.";
|
||||
}
|
||||
|
||||
return if $self->{main}->{resolver}->{no_resolver};
|
||||
$self->_start_lookup(@_);
|
||||
}
|
||||
|
||||
# Internal use not deprecated. :-)
|
||||
sub _start_lookup {
|
||||
my ($self, $ent, $master_deadline) = @_;
|
||||
|
||||
my $id = $ent->{id};
|
||||
@ -229,16 +465,16 @@ sub start_lookup {
|
||||
$ent->{timeout_initial} = $t_init;
|
||||
$ent->{timeout_min} = $t_end;
|
||||
|
||||
my @rulenames = grep { defined } (ref $ent->{rulename} ?
|
||||
@{$ent->{rulename}} : $ent->{rulename});
|
||||
$ent->{display_id} = # identifies entry in debug logging and similar
|
||||
join(", ", grep { defined }
|
||||
map { ref $ent->{$_} ? @{$ent->{$_}} : $ent->{$_} }
|
||||
qw(sets rules rulename type key) );
|
||||
join(", ", grep { defined } map { $ent->{$_} } qw(type key));
|
||||
|
||||
$self->{pending_lookups}->{$key} = $ent;
|
||||
|
||||
$self->{queries_started}++;
|
||||
$self->{total_queries_started}++;
|
||||
dbg("async: starting: %s (timeout %.1fs, min %.1fs)%s",
|
||||
dbg("async: starting: %s%s (timeout %.1fs, min %.1fs)%s",
|
||||
@rulenames ? join(", ", @rulenames).", " : '',
|
||||
$ent->{display_id}, $ent->{timeout_initial}, $ent->{timeout_min},
|
||||
!$clipped_by_master_deadline ? '' : ', capped by time limit');
|
||||
|
||||
@ -247,135 +483,15 @@ sub start_lookup {
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->bgsend_and_start_lookup($domain, $type, $class, $ent, $cb, %options)
|
||||
|
||||
A common idiom: calls C<bgsend>, followed by a call to C<start_lookup>,
|
||||
returning the argument $ent object as modified by C<start_lookup> and
|
||||
filled-in with a query ID.
|
||||
|
||||
=cut
|
||||
|
||||
sub bgsend_and_start_lookup {
|
||||
my($self, $domain, $type, $class, $ent, $cb, %options) = @_;
|
||||
$ent = {} if !$ent;
|
||||
$domain =~ s/\.+\z//s; # strip trailing dots, these sometimes still sneak in
|
||||
$ent->{id} = undef;
|
||||
$ent->{query_type} = $type;
|
||||
$ent->{query_domain} = $domain;
|
||||
$ent->{type} = $type if !exists $ent->{type};
|
||||
$cb = $ent->{completed_callback} if !$cb; # compatibility with SA < 3.4
|
||||
|
||||
my $key = $ent->{key} || '';
|
||||
|
||||
my $dnskey = uc($type) . '/' . lc($domain);
|
||||
my $dns_query_info = $self->{all_lookups}{$dnskey};
|
||||
|
||||
if ($dns_query_info) { # DNS query already underway or completed
|
||||
my $id = $ent->{id} = $dns_query_info->{id}; # re-use existing query
|
||||
return if !defined $id; # presumably blocked, or other fatal failure
|
||||
my $id_tail = $id; $id_tail =~ s{^\d+/IN/}{};
|
||||
lc($id_tail) eq lc($dnskey)
|
||||
or info("async: unmatched id %s, key=%s", $id, $dnskey);
|
||||
|
||||
my $pkt = $dns_query_info->{pkt};
|
||||
if (!$pkt) { # DNS query underway, still waiting for results
|
||||
# just add our query to the existing one
|
||||
push(@{$dns_query_info->{applicants}}, [$ent,$cb]);
|
||||
dbg("async: query %s already underway, adding no.%d %s",
|
||||
$id, scalar @{$dns_query_info->{applicants}},
|
||||
$ent->{rulename} || $key);
|
||||
|
||||
} else { # DNS query already completed, re-use results
|
||||
# answer already known, just do the callback and be done with it
|
||||
if (!$cb) {
|
||||
dbg("async: query %s already done, re-using for %s", $id, $key);
|
||||
} else {
|
||||
dbg("async: query %s already done, re-using for %s, callback",
|
||||
$id, $key);
|
||||
eval {
|
||||
$cb->($ent, $pkt); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (1) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("query %s completed, callback %s failed: %s\n",
|
||||
$id, $key, $@);
|
||||
};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
else { # no existing query, open a new DNS query
|
||||
$dns_query_info = $self->{all_lookups}{$dnskey} = {}; # new query needed
|
||||
my($id, $blocked);
|
||||
my $dns_query_blockages = $self->{main}->{conf}->{dns_query_blocked};
|
||||
if ($dns_query_blockages) {
|
||||
my $search_list = domain_to_search_list($domain);
|
||||
foreach my $parent_domain (@$search_list) {
|
||||
$blocked = $dns_query_blockages->{$parent_domain};
|
||||
last if defined $blocked; # stop at first defined, can be true or false
|
||||
}
|
||||
}
|
||||
if ($blocked) {
|
||||
dbg("async: blocked by dns_query_restriction: %s", $dnskey);
|
||||
} else {
|
||||
dbg("async: launching %s for %s", $dnskey, $key);
|
||||
$id = $self->{main}->{resolver}->bgsend($domain, $type, $class, sub {
|
||||
my($pkt, $pkt_id, $timestamp) = @_;
|
||||
# this callback sub is called from DnsResolver::poll_responses()
|
||||
# dbg("async: in a bgsend_and_start_lookup callback, id %s", $pkt_id);
|
||||
if ($pkt_id ne $id) {
|
||||
warn "async: mismatched dns id: got $pkt_id, expected $id\n";
|
||||
return;
|
||||
}
|
||||
$self->set_response_packet($pkt_id, $pkt, $ent->{key}, $timestamp);
|
||||
$dns_query_info->{pkt} = $pkt;
|
||||
my $cb_count = 0;
|
||||
foreach my $tuple (@{$dns_query_info->{applicants}}) {
|
||||
my($appl_ent, $appl_cb) = @$tuple;
|
||||
if ($appl_cb) {
|
||||
dbg("async: calling callback on key %s%s", $key,
|
||||
!defined $appl_ent->{rulename} ? ''
|
||||
: ", rule ".$appl_ent->{rulename});
|
||||
$cb_count++;
|
||||
eval {
|
||||
$appl_cb->($appl_ent, $pkt); 1;
|
||||
} or do {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (2) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("query %s completed, callback %s failed: %s\n",
|
||||
$id, $appl_ent->{key}, $@);
|
||||
};
|
||||
}
|
||||
}
|
||||
delete $dns_query_info->{applicants};
|
||||
dbg("async: query $id completed, no callbacks run") if !$cb_count;
|
||||
});
|
||||
}
|
||||
return if !defined $id;
|
||||
$dns_query_info->{id} = $ent->{id} = $id;
|
||||
push(@{$dns_query_info->{applicants}}, [$ent,$cb]);
|
||||
$self->start_lookup($ent, $options{master_deadline});
|
||||
}
|
||||
return $ent;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item $ent = $async->get_lookup($key)
|
||||
|
||||
Retrieve the pending-lookup object for the given key C<$key>.
|
||||
|
||||
If the lookup is complete, this will return C<undef>.
|
||||
|
||||
Note that a lookup is still considered "pending" until C<complete_lookups()> is
|
||||
called, even if it has been reported as complete via C<set_response_packet()>.
|
||||
DEPRECATED since 4.0.0. Do not use.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_lookup {
|
||||
my ($self, $key) = @_;
|
||||
warn("async: deprecated get_lookup function used\n");
|
||||
return $self->{pending_lookups}->{$key};
|
||||
}
|
||||
|
||||
@ -415,18 +531,16 @@ sub complete_lookups {
|
||||
my %typecount;
|
||||
|
||||
my $pending = $self->{pending_lookups};
|
||||
$self->{queries_started} = 0;
|
||||
$self->{queries_completed} = 0;
|
||||
|
||||
my $now = time;
|
||||
|
||||
if (defined $timeout && $timeout > 0 &&
|
||||
%$pending && $self->{total_queries_started} > 0)
|
||||
%$pending && $self->{queries_started} > 0)
|
||||
{
|
||||
# shrink a 'select' timeout if a caller specified unnecessarily long
|
||||
# value beyond the latest deadline of any outstanding request;
|
||||
# can save needless wait time (up to 1 second in harvest_dnsbl_queries)
|
||||
my $r = $self->{total_queries_completed} / $self->{total_queries_started};
|
||||
my $r = $self->{queries_completed} / $self->{queries_started};
|
||||
my $r2 = $r * $r; # 0..1
|
||||
my $max_deadline;
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
@ -457,9 +571,9 @@ sub complete_lookups {
|
||||
|
||||
if (%$pending) { # any outstanding requests still?
|
||||
$self->{last_poll_responses_time} = $now;
|
||||
my $nfound = $self->{main}->{resolver}->poll_responses($timeout);
|
||||
dbg("async: select found %s responses ready (t.o.=%.1f)",
|
||||
!$nfound ? 'no' : $nfound, $timeout);
|
||||
my ($nfound, $ncb) = $self->{main}->{resolver}->poll_responses($timeout);
|
||||
dbg("async: select found %d responses ready (t.o.=%.1f), did %d callbacks",
|
||||
$nfound, $timeout, $ncb);
|
||||
}
|
||||
$now = time; # capture new timestamp, after possible sleep in 'select'
|
||||
|
||||
@ -475,18 +589,19 @@ sub complete_lookups {
|
||||
$anydone = 1;
|
||||
$ent->{finish_time} = $now if !defined $ent->{finish_time};
|
||||
my $elapsed = $ent->{finish_time} - $ent->{start_time};
|
||||
dbg("async: completed in %.3f s: %s", $elapsed, $ent->{display_id});
|
||||
$self->{timing_by_query}->{". $key"} += $elapsed;
|
||||
my @rulenames = keys %{$self->{rules_for_key}->{$key}};
|
||||
dbg("async: completed in %.3f s: %s, rules: %s",
|
||||
$elapsed, $ent->{display_id}, join(", ", @rulenames));
|
||||
$self->{timing_by_query}->{". $key ($ent->{type})"} += $elapsed;
|
||||
$self->{queries_completed}++;
|
||||
$self->{total_queries_completed}++;
|
||||
delete $pending->{$key};
|
||||
}
|
||||
}
|
||||
|
||||
if (%$pending) { # still any requests outstanding? are they expired?
|
||||
my $r =
|
||||
!$allow_aborting_of_expired || !$self->{total_queries_started} ? 1.0
|
||||
: $self->{total_queries_completed} / $self->{total_queries_started};
|
||||
!$allow_aborting_of_expired || !$self->{queries_started} ? 1.0
|
||||
: $self->{queries_completed} / $self->{queries_started};
|
||||
my $r2 = $r * $r; # 0..1
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
$typecount{$ent->{type}}++;
|
||||
@ -496,8 +611,6 @@ sub complete_lookups {
|
||||
$dt = 1 + int $dt if $timer_resolution == 1 && $dt > int $dt;
|
||||
$allexpired = 0 if $now <= $ent->{start_time} + $dt;
|
||||
}
|
||||
dbg("async: queries completed: %d, started: %d",
|
||||
$self->{queries_completed}, $self->{queries_started});
|
||||
}
|
||||
|
||||
# ensure we don't get stuck if a request gets lost in the ether.
|
||||
@ -511,9 +624,9 @@ sub complete_lookups {
|
||||
$alldone = 1;
|
||||
}
|
||||
else {
|
||||
dbg("async: queries active: %s%s at %s",
|
||||
dbg("async: queries still pending: %s%s",
|
||||
join (' ', map { "$_=$typecount{$_}" } sort keys %typecount),
|
||||
$allexpired ? ', all expired' : '', scalar(localtime(time)));
|
||||
$allexpired ? ', all expired' : '');
|
||||
$alldone = 0;
|
||||
}
|
||||
1;
|
||||
@ -544,15 +657,20 @@ sub abort_remaining_lookups {
|
||||
my $foundcnt = 0;
|
||||
my $now = time;
|
||||
|
||||
$self->{pending_rules} = {};
|
||||
|
||||
while (my($key,$ent) = each %$pending) {
|
||||
dbg("async: aborting after %.3f s, %s: %s",
|
||||
$now - $ent->{start_time},
|
||||
my $dur = $now - $ent->{start_time};
|
||||
my @rulenames = keys %{$self->{rules_for_key}->{$key}};
|
||||
my $msg = sprintf( "async: aborting after %.3f s, %s: %s, rules: %s",
|
||||
$dur,
|
||||
(defined $ent->{timeout_initial} &&
|
||||
$now > $ent->{start_time} + $ent->{timeout_initial}
|
||||
? 'past original deadline' : 'deadline shrunk'),
|
||||
$ent->{display_id} );
|
||||
$ent->{display_id}, join(", ", @rulenames) );
|
||||
$dur > 1 ? info($msg) : dbg($msg);
|
||||
$foundcnt++;
|
||||
$self->{timing_by_query}->{"X $key"} = $now - $ent->{start_time};
|
||||
$self->{timing_by_query}->{"X $key"} = $dur;
|
||||
$ent->{finish_time} = $now if !defined $ent->{finish_time};
|
||||
delete $pending->{$key};
|
||||
}
|
||||
@ -566,8 +684,10 @@ sub abort_remaining_lookups {
|
||||
foreach my $tuple (@{$dns_query_info->{applicants}}) {
|
||||
my($ent, $cb) = @$tuple;
|
||||
if ($cb) {
|
||||
dbg("async: calling callback/abort on key %s%s", $dnskey,
|
||||
!defined $ent->{rulename} ? '' : ", rule ".$ent->{rulename});
|
||||
my @rulenames = grep { defined } (ref $ent->{rulename} ?
|
||||
@{$ent->{rulename}} : $ent->{rulename});
|
||||
dbg("async: calling callback/abort on key %s, rules: %s", $dnskey,
|
||||
join(", ", @rulenames));
|
||||
$cb_count++;
|
||||
eval {
|
||||
$cb->($ent, undef); 1;
|
||||
@ -575,7 +695,7 @@ sub abort_remaining_lookups {
|
||||
chomp $@;
|
||||
# resignal if alarm went off
|
||||
die "async: (2) $@\n" if $@ =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn sprintf("query %s aborted, callback %s failed: %s\n",
|
||||
warn sprintf("async: query %s aborted, callback %s failed: %s\n",
|
||||
$dnskey, $ent->{key}, $@);
|
||||
};
|
||||
}
|
||||
@ -594,6 +714,8 @@ sub abort_remaining_lookups {
|
||||
|
||||
=item $async->set_response_packet($id, $pkt, $key, $timestamp)
|
||||
|
||||
For internal use, do not call from plugins.
|
||||
|
||||
Register a "response packet" for a given query. C<$id> is the ID for the
|
||||
query, and must match the C<id> supplied in C<start_lookup()>. C<$pkt> is the
|
||||
packet object for the response. A parameter C<$key> identifies an entry in a
|
||||
@ -645,6 +767,8 @@ sub set_response_packet {
|
||||
|
||||
=item $async->report_id_complete($id,$key,$key,$timestamp)
|
||||
|
||||
DEPRECATED since 4.0.0. Do not use.
|
||||
|
||||
Legacy. Equivalent to $self->set_response_packet($id,undef,$key,$timestamp),
|
||||
i.e. providing undef as a response packet. Register that a query has
|
||||
completed and is no longer "pending". C<$id> is the ID for the query,
|
||||
|
@ -17,7 +17,7 @@
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::AutoWhitelist - auto-whitelist handler for SpamAssassin
|
||||
Mail::SpamAssassin::AutoWelcomelist - auto-welcomelist handler for SpamAssassin
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
@ -27,10 +27,10 @@ Mail::SpamAssassin::AutoWhitelist - auto-whitelist handler for SpamAssassin
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using text analysis and
|
||||
several internet-based realtime blacklists.
|
||||
several internet-based realtime blocklists.
|
||||
|
||||
This class is used internally by SpamAssassin to manage the automatic
|
||||
whitelisting functionality. Please refer to the C<Mail::SpamAssassin>
|
||||
welcomelisting functionality. Please refer to the C<Mail::SpamAssassin>
|
||||
documentation for public interfaces.
|
||||
|
||||
=head1 METHODS
|
||||
@ -39,7 +39,7 @@ documentation for public interfaces.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::AutoWhitelist;
|
||||
package Mail::SpamAssassin::AutoWelcomelist;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
@ -64,9 +64,9 @@ sub new {
|
||||
my $conf = $main->{conf};
|
||||
my $self = {
|
||||
main => $main,
|
||||
factor => $conf->{auto_whitelist_factor},
|
||||
ipv4_mask_len => $conf->{auto_whitelist_ipv4_mask_len},
|
||||
ipv6_mask_len => $conf->{auto_whitelist_ipv6_mask_len},
|
||||
factor => $conf->{auto_welcomelist_factor},
|
||||
ipv4_mask_len => $conf->{auto_welcomelist_ipv4_mask_len},
|
||||
ipv6_mask_len => $conf->{auto_welcomelist_ipv6_mask_len},
|
||||
};
|
||||
|
||||
my $factory;
|
||||
@ -74,23 +74,22 @@ sub new {
|
||||
$factory = $main->{pers_addr_list_factory};
|
||||
}
|
||||
else {
|
||||
my $type = $conf->{auto_whitelist_factory};
|
||||
my $type = $conf->{auto_welcomelist_factory};
|
||||
if ($type =~ /^([_A-Za-z0-9:]+)$/) {
|
||||
$type = untaint_var($type);
|
||||
eval '
|
||||
require '.$type.';
|
||||
$factory = '.$type.'->new();
|
||||
1;
|
||||
'
|
||||
or do {
|
||||
' or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn "auto-whitelist: $eval_stat\n";
|
||||
warn "auto-welcomelist: $eval_stat\n";
|
||||
undef $factory;
|
||||
};
|
||||
$main->set_persistent_address_list_factory($factory) if $factory;
|
||||
}
|
||||
else {
|
||||
warn "auto-whitelist: illegal auto_whitelist_factory setting\n";
|
||||
warn "auto-welcomelist: illegal auto_welcomelist_factory setting\n";
|
||||
}
|
||||
}
|
||||
|
||||
@ -136,7 +135,7 @@ sub check_address {
|
||||
my $noipent = $self->{checker}->get_addr_entry ($noipaddr, undef);
|
||||
|
||||
if (defined $noipent->{msgcount} && $noipent->{msgcount} > 0) {
|
||||
dbg("auto-whitelist: found entry w/o IP address for $addr: replacing with $origip");
|
||||
dbg("auto-welcomelist: found entry w/o IP address for $addr: replacing with $origip");
|
||||
$self->{checker}->remove_entry($noipent);
|
||||
# Now assign proper entry the count and totscore values of the
|
||||
# no-IP entry instead of assigning the whole value to avoid
|
||||
@ -151,7 +150,7 @@ sub check_address {
|
||||
$entry->{msgcount} != $entry->{msgcount} || # test for NaN
|
||||
$entry->{totscore} != $entry->{totscore})
|
||||
{
|
||||
warn "auto-whitelist: resetting bad data for ($addr, $origip), ".
|
||||
warn "auto-welcomelist: resetting bad data for ($addr, $origip), ".
|
||||
"count: $entry->{msgcount}, totscore: $entry->{totscore}\n";
|
||||
$entry->{msgcount} = $entry->{totscore} = 0;
|
||||
}
|
||||
@ -164,7 +163,7 @@ sub check_address {
|
||||
=item awl->count();
|
||||
|
||||
This method will return the count of messages used in determining the
|
||||
whitelist correction.
|
||||
welcomelist correction.
|
||||
|
||||
=cut
|
||||
|
||||
@ -191,7 +190,7 @@ sub add_score {
|
||||
return; # no factory defined; we can't check
|
||||
}
|
||||
if ($score != $score) {
|
||||
warn "auto-whitelist: attempt to add a $score to AWL entry ignored\n";
|
||||
warn "auto-welcomelist: attempt to add a $score to AWL entry ignored\n";
|
||||
return; # don't try to add a NaN
|
||||
}
|
||||
|
||||
@ -204,7 +203,7 @@ sub add_score {
|
||||
=item awl->add_known_good_address($addr);
|
||||
|
||||
This method will add a score of -100 to the given address -- effectively
|
||||
"bootstrapping" the address as being one that should be whitelisted.
|
||||
"bootstrapping" the address as being one that should be welcomelisted.
|
||||
|
||||
=cut
|
||||
|
||||
@ -220,7 +219,7 @@ sub add_known_good_address {
|
||||
=item awl->add_known_bad_address($addr);
|
||||
|
||||
This method will add a score of 100 to the given address -- effectively
|
||||
"bootstrapping" the address as being one that should be blacklisted.
|
||||
"bootstrapping" the address as being one that should be blocklisted.
|
||||
|
||||
=cut
|
||||
|
||||
@ -295,13 +294,13 @@ sub ip_to_awl_key {
|
||||
} else {
|
||||
my $origip_obj = NetAddr::IP->new($origip . '/' . $mask_len);
|
||||
if (!defined $origip_obj) { # invalid IPv4 address
|
||||
dbg("auto-whitelist: bad IPv4 address $origip");
|
||||
dbg("auto-welcomelist: bad IPv4 address $origip");
|
||||
} else {
|
||||
$result = $origip_obj->network->addr;
|
||||
$result =~s/(\.0){1,3}\z//; # truncate zero tail
|
||||
}
|
||||
}
|
||||
} elsif ($origip =~ /:/ && # triage
|
||||
} elsif (index($origip, ':') >= 0 && # triage
|
||||
$origip =~
|
||||
/^ [0-9a-f]{0,4} (?: : [0-9a-f]{0,4} | \. [0-9]{1,3} ){2,9} $/xsi) {
|
||||
# looks like an IPv6 address
|
||||
@ -309,19 +308,19 @@ sub ip_to_awl_key {
|
||||
$mask_len = 48 if !defined $mask_len;
|
||||
my $origip_obj = NetAddr::IP->new6($origip . '/' . $mask_len);
|
||||
if (!defined $origip_obj) { # invalid IPv6 address
|
||||
dbg("auto-whitelist: bad IPv6 address $origip");
|
||||
dbg("auto-welcomelist: bad IPv6 address $origip");
|
||||
} elsif (NetAddr::IP->can('full6')) { # since NetAddr::IP 4.010
|
||||
$result = $origip_obj->network->full6; # string in a canonical form
|
||||
$result =~ s/(:0000){1,7}\z/::/; # compress zero tail
|
||||
}
|
||||
} else {
|
||||
dbg("auto-whitelist: bad IP address $origip");
|
||||
dbg("auto-welcomelist: bad IP address $origip");
|
||||
}
|
||||
if (defined $result && length($result) > 39) { # just in case, keep under
|
||||
$result = substr($result,0,39); # the awl.ip field size
|
||||
}
|
||||
if (defined $result) {
|
||||
dbg("auto-whitelist: IP masking %s -> %s", $origip,$result);
|
||||
dbg("auto-welcomelist: IP masking %s -> %s", $origip,$result);
|
||||
}
|
||||
return $result;
|
||||
}
|
@ -111,7 +111,7 @@ sub learn {
|
||||
{
|
||||
# DMK, koppel@ece.lsu.edu: Hoping that the ultimate fix to bug 2263 will
|
||||
# make it unnecessary to construct a PerMsgStatus here.
|
||||
my $PMS = new Mail::SpamAssassin::PerMsgStatus $self->{main}, $msg;
|
||||
my $PMS = Mail::SpamAssassin::PerMsgStatus->new($self->{main}, $msg);
|
||||
my $ignore = $self->ignore_message($PMS);
|
||||
$PMS->finish();
|
||||
return 0 if $ignore;
|
||||
|
@ -35,14 +35,11 @@ use Errno qw(EBADF);
|
||||
#use Data::Dumper;
|
||||
use File::Basename;
|
||||
use File::Path;
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1) }
|
||||
}
|
||||
use Digest::SHA qw(sha1);
|
||||
|
||||
use Mail::SpamAssassin::BayesStore;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
|
||||
our @ISA = qw( Mail::SpamAssassin::BayesStore );
|
||||
|
||||
@ -77,6 +74,7 @@ sub new {
|
||||
|
||||
sub DESTROY {
|
||||
my $self = shift;
|
||||
|
||||
$self->_close_db;
|
||||
}
|
||||
|
||||
@ -650,6 +648,14 @@ sub dump_tokens {
|
||||
my($self, $template, $regex, @vars) = @_;
|
||||
dbg("bayes: dump_tokens starting");
|
||||
|
||||
if (defined $regex) {
|
||||
my ($rec, $err) = compile_regexp($regex, 2);
|
||||
if (!$rec) {
|
||||
die "Invalid dump_tokens regex '$regex': $err\n";
|
||||
}
|
||||
$regex = $rec;
|
||||
}
|
||||
|
||||
my $cursor = $self->{handles}->{tokens}->db_cursor;
|
||||
$cursor or die "Couldn't get cursor: $BerkeleyDB::Error";
|
||||
my ($token, $value) = ("", "");
|
||||
|
@ -27,14 +27,10 @@ use Errno qw(EBADF);
|
||||
use File::Basename;
|
||||
use File::Spec;
|
||||
use File::Path;
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1) }
|
||||
}
|
||||
use Digest::SHA qw(sha1);
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var am_running_on_windows);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var am_running_on_windows compile_regexp);
|
||||
use Mail::SpamAssassin::BayesStore;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
@ -992,9 +988,17 @@ sub get_storage_variables {
|
||||
sub dump_db_toks {
|
||||
my ($self, $template, $regex, @vars) = @_;
|
||||
|
||||
if (defined $regex) {
|
||||
my ($rec, $err) = compile_regexp($regex, 2);
|
||||
if (!$rec) {
|
||||
die "Invalid dump_tokens regex '$regex': $err\n";
|
||||
}
|
||||
$regex = $rec;
|
||||
}
|
||||
|
||||
while (my ($tok, $tokvalue) = each %{$self->{db_toks}}) {
|
||||
next if ($tok =~ MAGIC_RE); # skip magic tokens
|
||||
next if (defined $regex && ($tok !~ /$regex/o));
|
||||
next if (defined $regex && $tok !~ /$regex/o);
|
||||
|
||||
# We have the value already, so just unpack it.
|
||||
my ($ts, $th, $atime) = $self->tok_unpack ($tokvalue);
|
||||
|
@ -30,6 +30,9 @@ In addition, this module will support rollback on error, if you are
|
||||
using the InnoDB database table type in MySQL. For more information
|
||||
please review the instructions in sql/README.bayes.
|
||||
|
||||
This module is also compatible with MariaDB and DBD::MariaDB can be used
|
||||
instead of DBD::mysql driver.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::BayesStore::MySQL;
|
||||
@ -292,7 +295,8 @@ sub set_running_expire_tok {
|
||||
|
||||
return 0 unless (defined($self->{_dbh}));
|
||||
|
||||
my $sql = "INSERT INTO bayes_expire (id,runtime) VALUES (?,?)";
|
||||
my $sql = "INSERT INTO bayes_expire (id,runtime) VALUES (?,?)
|
||||
ON DUPLICATE KEY UPDATE runtime=VALUES(runtime)";
|
||||
|
||||
my $time = time();
|
||||
|
||||
@ -339,6 +343,147 @@ sub remove_running_expire_tok {
|
||||
return 1;
|
||||
}
|
||||
|
||||
=head2 tok_get
|
||||
|
||||
public instance (Integer, Integer, Integer) tok_get (String $token)
|
||||
|
||||
Description:
|
||||
This method retrieves a specified token (C<$token>) from the database
|
||||
and returns it's spam_count, ham_count and last access time.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_get {
|
||||
my ($self, $token) = @_;
|
||||
|
||||
return (0,0,0) unless (defined($self->{_dbh}));
|
||||
|
||||
my $sql = "SELECT spam_count, ham_count, atime
|
||||
FROM bayes_token
|
||||
WHERE id = ?
|
||||
AND token = ?";
|
||||
|
||||
my $sth = $self->{_dbh}->prepare_cached($sql);
|
||||
|
||||
unless (defined($sth)) {
|
||||
dbg("bayes: tok_get: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return (0,0,0);
|
||||
}
|
||||
|
||||
$sth->bind_param(1, $self->{_userid});
|
||||
$sth->bind_param(2, $token, DBI::SQL_BINARY);
|
||||
|
||||
my $rc = $sth->execute();
|
||||
|
||||
unless ($rc) {
|
||||
dbg("bayes: tok_get: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return (0,0,0);
|
||||
}
|
||||
|
||||
my ($spam_count, $ham_count, $atime) = $sth->fetchrow_array();
|
||||
|
||||
$sth->finish();
|
||||
|
||||
$spam_count = 0 if (!$spam_count || $spam_count < 0);
|
||||
$ham_count = 0 if (!$ham_count || $ham_count < 0);
|
||||
$atime = 0 if (!$atime);
|
||||
|
||||
return ($spam_count, $ham_count, $atime)
|
||||
}
|
||||
|
||||
=head2 tok_get_all
|
||||
|
||||
public instance (\@) tok_get (@ $tokens)
|
||||
|
||||
Description:
|
||||
This method retrieves the specified tokens (C<$tokens>) from storage and returns
|
||||
an array ref of arrays spam count, ham count and last access time.
|
||||
|
||||
=cut
|
||||
|
||||
sub tok_get_all {
|
||||
my ($self, @tokens) = @_;
|
||||
|
||||
return [] unless (defined($self->{_dbh}));
|
||||
|
||||
my $token_list_size = scalar(@tokens);
|
||||
dbg("bayes: tok_get_all: token count: $token_list_size");
|
||||
my @tok_results;
|
||||
|
||||
my $search_index = 0;
|
||||
my $results_index = 0;
|
||||
my $bunch_end;
|
||||
|
||||
my $token_select = $self->_token_select_string();
|
||||
|
||||
my $multi_sql = "SELECT $token_select, spam_count, ham_count, atime
|
||||
FROM bayes_token
|
||||
WHERE id = ?
|
||||
AND token IN ";
|
||||
|
||||
# fetch tokens in bunches of 100 until there are <= 100 left, then just fetch the rest
|
||||
while ($token_list_size > $search_index) {
|
||||
my $bunch_size;
|
||||
if ($token_list_size - $search_index > 100) {
|
||||
$bunch_size = 100;
|
||||
}
|
||||
else {
|
||||
$bunch_size = $token_list_size - $search_index;
|
||||
}
|
||||
while ($token_list_size - $search_index >= $bunch_size) {
|
||||
my @tok;
|
||||
my $in_str = '(';
|
||||
|
||||
$bunch_end = $search_index + $bunch_size;
|
||||
for ( ; $search_index < $bunch_end; $search_index++) {
|
||||
$in_str .= '?,';
|
||||
push(@tok, $tokens[$search_index]);
|
||||
}
|
||||
chop $in_str;
|
||||
$in_str .= ')';
|
||||
|
||||
my $dynamic_sql = $multi_sql . $in_str;
|
||||
|
||||
my $sth = $self->{_dbh}->prepare($dynamic_sql);
|
||||
|
||||
unless (defined($sth)) {
|
||||
dbg("bayes: tok_get_all: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return [];
|
||||
}
|
||||
|
||||
my $idx = 0;
|
||||
$sth->bind_param(++$idx, $self->{_userid});
|
||||
$sth->bind_param(++$idx, $_, DBI::SQL_BINARY) foreach (@tok);
|
||||
|
||||
my $rc = $sth->execute();
|
||||
|
||||
unless ($rc) {
|
||||
dbg("bayes: tok_get_all: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return [];
|
||||
}
|
||||
|
||||
my $results = $sth->fetchall_arrayref();
|
||||
|
||||
$sth->finish();
|
||||
|
||||
foreach my $result (@{$results}) {
|
||||
# Make sure that spam_count and ham_count are not negative
|
||||
$result->[1] = 0 if (!$result->[1] || $result->[1] < 0);
|
||||
$result->[2] = 0 if (!$result->[2] || $result->[2] < 0);
|
||||
# Make sure that atime has a value
|
||||
$result->[3] = 0 if (!$result->[3]);
|
||||
$tok_results[$results_index++] = $result;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return \@tok_results;
|
||||
}
|
||||
|
||||
=head2 nspam_nham_change
|
||||
|
||||
public instance (Boolean) nspam_nham_change (Integer $num_spam,
|
||||
@ -421,10 +566,22 @@ sub tok_touch {
|
||||
AND token = ?
|
||||
AND atime < ?";
|
||||
|
||||
my $rows = $self->{_dbh}->do($sql, undef, $atime, $self->{_userid},
|
||||
$token, $atime);
|
||||
my $sth = $self->{_dbh}->prepare_cached($sql);
|
||||
|
||||
unless (defined($rows)) {
|
||||
unless (defined($sth)) {
|
||||
dbg("bayes: tok_touch: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return 0;
|
||||
}
|
||||
|
||||
$sth->bind_param(1, $atime);
|
||||
$sth->bind_param(2, $self->{_userid});
|
||||
$sth->bind_param(3, $token, DBI::SQL_BINARY);
|
||||
$sth->bind_param(4, $atime);
|
||||
|
||||
my $rows = $sth->execute();
|
||||
|
||||
unless ($rows) {
|
||||
dbg("bayes: tok_touch: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return 0;
|
||||
@ -478,20 +635,29 @@ sub tok_touch_all {
|
||||
return 1 unless (scalar(@{$tokens}));
|
||||
|
||||
my $sql = "UPDATE bayes_token SET atime = ? WHERE id = ? AND token IN (";
|
||||
|
||||
my @bindings = ($atime, $self->{_userid});
|
||||
foreach my $token (@{$tokens}) {
|
||||
foreach (@{$tokens}) {
|
||||
$sql .= "?,";
|
||||
push(@bindings, $token);
|
||||
}
|
||||
chop($sql); # get rid of trailing ,
|
||||
|
||||
$sql .= ") AND atime < ?";
|
||||
push(@bindings, $atime);
|
||||
|
||||
my $rows = $self->{_dbh}->do($sql, undef, @bindings);
|
||||
my $sth = $self->{_dbh}->prepare($sql);
|
||||
|
||||
unless (defined($rows)) {
|
||||
unless (defined($sth)) {
|
||||
dbg("bayes: tok_touch_all: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return [];
|
||||
}
|
||||
|
||||
my $idx = 0;
|
||||
$sth->bind_param(++$idx, $atime);
|
||||
$sth->bind_param(++$idx, $self->{_userid});
|
||||
$sth->bind_param(++$idx, $_, DBI::SQL_BINARY) foreach (@{$tokens});
|
||||
$sth->bind_param(++$idx, $atime);
|
||||
|
||||
my $rows = $sth->execute();
|
||||
|
||||
unless ($rows) {
|
||||
dbg("bayes: tok_touch_all: SQL error: ".$self->{_dbh}->errstr());
|
||||
$self->{_dbh}->rollback();
|
||||
return 0;
|
||||
@ -735,7 +901,8 @@ sub _initialize_db {
|
||||
return 0;
|
||||
}
|
||||
|
||||
$id = $self->{_dbh}->{'mysql_insertid'};
|
||||
$id = $self->{_dsn} =~ /^DBI:MariaDB/i ?
|
||||
$self->{_dbh}->{'mariadb_insertid'} : $self->{_dbh}->{'mysql_insertid'};
|
||||
|
||||
$self->{_dbh}->commit();
|
||||
|
||||
@ -797,10 +964,12 @@ sub _put_token {
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $rc = $sth->execute($spam_count,
|
||||
$ham_count,
|
||||
$self->{_userid},
|
||||
$token);
|
||||
$sth->bind_param(1, $spam_count);
|
||||
$sth->bind_param(2, $ham_count);
|
||||
$sth->bind_param(3, $self->{_userid});
|
||||
$sth->bind_param(4, $token, DBI::SQL_BINARY);
|
||||
|
||||
my $rc = $sth->execute();
|
||||
|
||||
unless ($rc) {
|
||||
dbg("bayes: _put_token: SQL error: ".$self->{_dbh}->errstr());
|
||||
@ -824,14 +993,16 @@ sub _put_token {
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $rc = $sth->execute($self->{_userid},
|
||||
$token,
|
||||
$spam_count,
|
||||
$ham_count,
|
||||
$atime,
|
||||
$spam_count,
|
||||
$ham_count,
|
||||
$atime);
|
||||
$sth->bind_param(1, $self->{_userid});
|
||||
$sth->bind_param(2, $token, DBI::SQL_BINARY);
|
||||
$sth->bind_param(3, $spam_count);
|
||||
$sth->bind_param(4, $ham_count);
|
||||
$sth->bind_param(5, $atime);
|
||||
$sth->bind_param(6, $spam_count);
|
||||
$sth->bind_param(7, $ham_count);
|
||||
$sth->bind_param(8, $atime);
|
||||
|
||||
my $rc = $sth->execute();
|
||||
|
||||
unless ($rc) {
|
||||
dbg("bayes: _put_token: SQL error: ".$self->{_dbh}->errstr());
|
||||
@ -948,12 +1119,15 @@ sub _put_tokens {
|
||||
return 0;
|
||||
}
|
||||
|
||||
$sth->bind_param(1, $spam_count);
|
||||
$sth->bind_param(2, $ham_count);
|
||||
$sth->bind_param(3, $self->{_userid});
|
||||
# 4, update token in foreach loop
|
||||
|
||||
my $error_p = 0;
|
||||
foreach my $token (keys %{$tokens}) {
|
||||
my $rc = $sth->execute($spam_count,
|
||||
$ham_count,
|
||||
$self->{_userid},
|
||||
$token);
|
||||
$sth->bind_param(4, $token, DBI::SQL_BINARY);
|
||||
my $rc = $sth->execute();
|
||||
|
||||
unless ($rc) {
|
||||
dbg("bayes: _put_tokens: SQL error: ".$self->{_dbh}->errstr());
|
||||
@ -984,18 +1158,21 @@ sub _put_tokens {
|
||||
return 0;
|
||||
}
|
||||
|
||||
$sth->bind_param(1, $self->{_userid});
|
||||
# 2, update token in foreach loop
|
||||
$sth->bind_param(3, $spam_count);
|
||||
$sth->bind_param(4, $ham_count);
|
||||
$sth->bind_param(5, $atime);
|
||||
$sth->bind_param(6, $spam_count);
|
||||
$sth->bind_param(7, $ham_count);
|
||||
$sth->bind_param(8, $atime);
|
||||
|
||||
my $error_p = 0;
|
||||
my $new_tokens = 0;
|
||||
my $need_atime_update_p = 0;
|
||||
foreach my $token (keys %{$tokens}) {
|
||||
my $rc = $sth->execute($self->{_userid},
|
||||
$token,
|
||||
$spam_count,
|
||||
$ham_count,
|
||||
$atime,
|
||||
$spam_count,
|
||||
$ham_count,
|
||||
$atime);
|
||||
$sth->bind_param(2, $token, DBI::SQL_BINARY);
|
||||
my $rc = $sth->execute();
|
||||
|
||||
if (!$rc) {
|
||||
dbg("bayes: _put_tokens: SQL error: ".$self->{_dbh}->errstr());
|
||||
@ -1070,6 +1247,22 @@ sub _put_tokens {
|
||||
return 1;
|
||||
}
|
||||
|
||||
=head2 _token_select_string
|
||||
|
||||
private instance (String) _token_select_string
|
||||
|
||||
Description:
|
||||
This method returns the string to be used in SELECT statements to represent
|
||||
the token column.
|
||||
|
||||
The default is to use the RPAD function to pad the token out to 5 characters.
|
||||
|
||||
=cut
|
||||
|
||||
sub _token_select_string {
|
||||
return "RPAD(token, 5, ' ')";
|
||||
}
|
||||
|
||||
sub sa_die { Mail::SpamAssassin::sa_die(@_); }
|
||||
|
||||
1;
|
||||
|
@ -51,7 +51,7 @@ use constant HAS_DBI => eval { require DBI; };
|
||||
|
||||
# We need this so we can import the pg_types, since this is a DBD::Pg specific module it should be ok
|
||||
# YUCK! This little require/import trick is required for the rpm stuff
|
||||
BEGIN { require DBD::Pg; import DBD::Pg qw(:pg_types); }
|
||||
BEGIN { require DBD::Pg; DBD::Pg->import(qw(:pg_types)); }
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
|
@ -119,16 +119,12 @@ use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Errno qw(EBADF);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Timeout;
|
||||
use Digest::SHA qw(sha1);
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1) }
|
||||
}
|
||||
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::BayesStore;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Timeout;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Util::TinyRedis;
|
||||
|
||||
our $VERSION = 0.09;
|
||||
|
@ -21,7 +21,11 @@ Mail::SpamAssassin::BayesStore::SQL - SQL Bayesian Storage Module Implementation
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This module implements a SQL based bayesian storage module.
|
||||
This module implements a SQL based bayesian storage module. It's compatible
|
||||
with SQLite and possibly other standard SQL servers.
|
||||
|
||||
Do not use this for MySQL/MariaDB or PgSQL, they have their own specific
|
||||
modules.
|
||||
|
||||
=cut
|
||||
|
||||
@ -32,11 +36,7 @@ use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Errno qw(EBADF);
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1) }
|
||||
}
|
||||
use Digest::SHA qw(sha1);
|
||||
|
||||
use Mail::SpamAssassin::BayesStore;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
@ -1676,6 +1676,13 @@ sub _connect_db {
|
||||
dbg("bayes: database connection established");
|
||||
}
|
||||
|
||||
# SQLite PRAGMA attributes - here for tests, see bug 8033
|
||||
if ($self->{_dsn} =~ /^dbi:SQLite:.*?;(.+)/i) {
|
||||
foreach my $attr (split(/;/, $1)) {
|
||||
$dbh->do("PRAGMA $attr");
|
||||
}
|
||||
}
|
||||
|
||||
$self->{_dbh} = $dbh;
|
||||
|
||||
return 1;
|
||||
@ -2348,12 +2355,13 @@ Description:
|
||||
This method returns the string to be used in SELECT statements to represent
|
||||
the token column.
|
||||
|
||||
The default is to use the RPAD function to pad the token out to 5 characters.
|
||||
The default is to use the SUBSTR function to pad the token out to 5 characters.
|
||||
|
||||
=cut
|
||||
|
||||
sub _token_select_string {
|
||||
return "RPAD(token, 5, ' ')";
|
||||
# Use SQLite compatible RPAD alternative
|
||||
return "SUBSTR(token || ' ', 1, 5)";
|
||||
}
|
||||
|
||||
sub sa_die { Mail::SpamAssassin::sa_die(@_); }
|
||||
|
@ -106,6 +106,10 @@ sub new {
|
||||
$self->{username} = $args->{username};
|
||||
}
|
||||
|
||||
if ($args->{max_size}) {
|
||||
$self->{max_size} = $args->{max_size};
|
||||
}
|
||||
|
||||
if ($args->{timeout}) {
|
||||
$self->{timeout} = $args->{timeout} || 30;
|
||||
}
|
||||
@ -135,6 +139,8 @@ threshold
|
||||
|
||||
message
|
||||
|
||||
report
|
||||
|
||||
=cut
|
||||
|
||||
sub process {
|
||||
@ -150,6 +156,41 @@ sub process {
|
||||
return $self->_filter($msg, $command);
|
||||
}
|
||||
|
||||
=head2 spam_report
|
||||
|
||||
public instance (\%) spam_report (String $msg)
|
||||
|
||||
Description:
|
||||
The method implements the report call.
|
||||
|
||||
See the process method for the return value.
|
||||
|
||||
=cut
|
||||
|
||||
sub spam_report {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
return $self->_filter($msg, 'REPORT');
|
||||
}
|
||||
|
||||
=head2 spam_report_ifspam
|
||||
|
||||
public instance (\%) spam_report_ifspam (String $msg)
|
||||
|
||||
Description:
|
||||
The method implements the report_ifspam call.
|
||||
A report will be returned only if the message is spam.
|
||||
|
||||
See the process method for the return value.
|
||||
|
||||
=cut
|
||||
|
||||
sub spam_report_ifspam {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
return $self->_filter($msg, 'REPORT_IFSPAM');
|
||||
}
|
||||
|
||||
=head2 check
|
||||
|
||||
public instance (\%) check (String $msg)
|
||||
@ -270,10 +311,10 @@ sub learn {
|
||||
close $remote or die "error closing socket: $!";
|
||||
|
||||
if ($learntype == 0 || $learntype == 1) {
|
||||
return $did_set =~ /local/;
|
||||
return index($did_set, 'local') >= 0;
|
||||
}
|
||||
else { #safe since we've already checked the $learntype values
|
||||
return $did_remove =~ /local/;
|
||||
return index($did_remove, 'local') >= 0;
|
||||
}
|
||||
}
|
||||
|
||||
@ -534,12 +575,15 @@ threshold
|
||||
|
||||
message (if available)
|
||||
|
||||
report (if available)
|
||||
|
||||
=cut
|
||||
|
||||
sub _filter {
|
||||
my ($self, $msg, $command) = @_;
|
||||
|
||||
my %data;
|
||||
my $msgsize;
|
||||
|
||||
$self->_clear_errors();
|
||||
|
||||
@ -547,7 +591,10 @@ sub _filter {
|
||||
|
||||
return 0 unless ($remote);
|
||||
|
||||
my $msgsize = length($msg.$EOL);
|
||||
if(defined $self->{max_size}) {
|
||||
$msg = substr($msg,0,$self->{max_size});
|
||||
}
|
||||
$msgsize = length($msg.$EOL);
|
||||
|
||||
print $remote "$command $PROTOVERSION$EOL";
|
||||
print $remote "Content-length: $msgsize$EOL";
|
||||
@ -595,7 +642,11 @@ sub _filter {
|
||||
$!==EBADF ? dbg("error reading from spamd (10): $!")
|
||||
: die "error reading from spamd (10): $!";
|
||||
|
||||
if($command =~ /^REPORT/) {
|
||||
$data{report} = $return_msg if ($return_msg);
|
||||
} else {
|
||||
$data{message} = $return_msg if ($return_msg);
|
||||
}
|
||||
|
||||
close $remote or die "error closing socket: $!";
|
||||
|
||||
|
File diff suppressed because it is too large
Load Diff
@ -27,7 +27,7 @@ Mail::SpamAssassin::Conf::LDAP - load SpamAssassin scores from LDAP database
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using text analysis and
|
||||
several internet-based realtime blacklists.
|
||||
several internet-based realtime blocklists.
|
||||
|
||||
This class is used internally by SpamAssassin to load scores from an LDAP
|
||||
database. Please refer to the C<Mail::SpamAssassin> documentation for public
|
||||
@ -191,6 +191,9 @@ sub load_with_ldap {
|
||||
}
|
||||
if ($config_text ne '') {
|
||||
$conf->{main} = $main;
|
||||
$config_text = "file start (ldap config)\n".
|
||||
$config_text.
|
||||
"file end (ldap config)\n";
|
||||
$conf->parse_scores_only($config_text);
|
||||
delete $conf->{main};
|
||||
}
|
||||
|
File diff suppressed because it is too large
Load Diff
@ -27,7 +27,7 @@ Mail::SpamAssassin::Conf::SQL - load SpamAssassin scores from SQL database
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Mail::SpamAssassin is a module to identify spam using text analysis and
|
||||
several internet-based realtime blacklists.
|
||||
several internet-based realtime blocklists.
|
||||
|
||||
This class is used internally by SpamAssassin to load scores from an SQL
|
||||
database. Please refer to the C<Mail::SpamAssassin> documentation for public
|
||||
@ -166,6 +166,9 @@ sub load_with_dbi {
|
||||
}
|
||||
if ($config_text ne '') {
|
||||
$conf->{main} = $main;
|
||||
$config_text = "file start (sql config)\n".
|
||||
$config_text.
|
||||
"file end (sql config)\n";
|
||||
$conf->parse_scores_only($config_text);
|
||||
delete $conf->{main};
|
||||
}
|
||||
|
@ -34,13 +34,14 @@ our(@BAYES_VARS, @IP_VARS, @SA_VARS, %EXPORT_TAGS, @EXPORT_OK);
|
||||
BEGIN {
|
||||
@IP_VARS = qw(
|
||||
IP_IN_RESERVED_RANGE IP_PRIVATE LOCALHOST IPV4_ADDRESS IP_ADDRESS
|
||||
IS_IP_PRIVATE IS_LOCALHOST IS_IPV4_ADDRESS IS_IP_ADDRESS
|
||||
);
|
||||
@BAYES_VARS = qw(
|
||||
DUMP_MAGIC DUMP_TOKEN DUMP_BACKUP
|
||||
);
|
||||
# These are generic constants that may be used across several modules
|
||||
@SA_VARS = qw(
|
||||
HARVEST_DNSBL_PRIORITY MBX_SEPARATOR
|
||||
MBX_SEPARATOR
|
||||
MAX_BODY_LINE_LENGTH MAX_HEADER_KEY_LENGTH MAX_HEADER_VALUE_LENGTH
|
||||
MAX_HEADER_LENGTH ARITH_EXPRESSION_LEXER AI_TIME_UNKNOWN
|
||||
CHARSETS_LIKELY_TO_FP_AS_CAPS MAX_URI_LENGTH RULENAME_RE IS_RULENAME
|
||||
@ -168,7 +169,10 @@ use constant IP_PRIVATE => qr{^(?:
|
||||
)
|
||||
(?![a-f0-9:])
|
||||
)
|
||||
)}oxi;
|
||||
)}xi;
|
||||
|
||||
# exact match
|
||||
use constant IS_IP_PRIVATE => qr/^${\(IP_PRIVATE)}$/;
|
||||
|
||||
# backward compatibility
|
||||
use constant IP_IN_RESERVED_RANGE => IP_PRIVATE;
|
||||
@ -246,7 +250,10 @@ use constant LOCALHOST => qr/
|
||||
)
|
||||
(?![a-f0-9:])
|
||||
)
|
||||
/oxi;
|
||||
/xi;
|
||||
|
||||
# exact match
|
||||
use constant IS_LOCALHOST => qr/^${\(LOCALHOST)}$/;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# an IP address, in IPv4 format only.
|
||||
@ -256,11 +263,13 @@ use constant IPV4_ADDRESS => qr/\b
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.
|
||||
(?:1\d\d|2[0-4]\d|25[0-5]|[1-9]\d|\d)
|
||||
\b/ox;
|
||||
\b/x;
|
||||
|
||||
# exact match
|
||||
use constant IS_IPV4_ADDRESS => qr/^${\(IPV4_ADDRESS)}$/;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# an IP address, in IPv4, IPv4-mapped-in-IPv6, or IPv6 format. NOTE: cannot
|
||||
# just refer to $IPV4_ADDRESS, due to perl bug reported in nesting qr//s. :(
|
||||
# an IP address, in IPv4, IPv4-mapped-in-IPv6, or IPv6 format.
|
||||
#
|
||||
use constant IP_ADDRESS => qr/
|
||||
(?:
|
||||
@ -344,12 +353,13 @@ use constant IP_ADDRESS => qr/
|
||||
)
|
||||
(?![a-f0-9:])
|
||||
)
|
||||
/oxi;
|
||||
/xi;
|
||||
|
||||
# exact match
|
||||
use constant IS_IP_ADDRESS => qr/^${\(IP_ADDRESS)}$/;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
use constant HARVEST_DNSBL_PRIORITY => 500;
|
||||
|
||||
# regular expression that matches message separators in The University of
|
||||
# Washington's MBX mailbox format
|
||||
use constant MBX_SEPARATOR => qr/^([\s\d]\d-[a-zA-Z]{3}-\d{4}\s\d{2}:\d{2}:\d{2}.*),(\d+);([\da-f]{12})-(\w{8})\r?$/;
|
||||
@ -388,7 +398,7 @@ use constant ARITH_EXPRESSION_LEXER => qr/(?:
|
||||
!=| # NEQ
|
||||
[\+\-\*\/]| # Mathematical Operator
|
||||
[\?:] # ? : Operator
|
||||
)/ox;
|
||||
)/x;
|
||||
|
||||
# ArchiveIterator
|
||||
|
||||
|
@ -53,23 +53,23 @@ sub new_checker {
|
||||
'locked_file' => ''
|
||||
};
|
||||
|
||||
my @order = split(/\s+/, $main->{conf}->{auto_whitelist_db_modules});
|
||||
my @order = split(/\s+/, $main->{conf}->{auto_welcomelist_db_modules});
|
||||
untaint_var(\@order);
|
||||
my $dbm_module = Mail::SpamAssassin::Util::first_available_module (@order);
|
||||
if (!$dbm_module) {
|
||||
die "auto-whitelist: cannot find a usable DB package from auto_whitelist_db_modules: " .
|
||||
$main->{conf}->{auto_whitelist_db_modules}."\n";
|
||||
die "auto-welcomelist: cannot find a usable DB package from auto_welcomelist_db_modules: " .
|
||||
$main->{conf}->{auto_welcomelist_db_modules}."\n";
|
||||
}
|
||||
|
||||
my $umask = umask ~ (oct($main->{conf}->{auto_whitelist_file_mode}));
|
||||
my $umask = umask ~ (oct($main->{conf}->{auto_welcomelist_file_mode}));
|
||||
|
||||
# if undef then don't worry -- empty hash!
|
||||
if (defined($main->{conf}->{auto_whitelist_path})) {
|
||||
my $path = $main->sed_path($main->{conf}->{auto_whitelist_path});
|
||||
if (defined($main->{conf}->{auto_welcomelist_path})) {
|
||||
my $path = $main->sed_path($main->{conf}->{auto_welcomelist_path});
|
||||
my ($mod1, $mod2);
|
||||
|
||||
if ($main->{locker}->safe_lock
|
||||
($path, 30, $main->{conf}->{auto_whitelist_file_mode}))
|
||||
($path, 30, $main->{conf}->{auto_welcomelist_file_mode}))
|
||||
{
|
||||
$self->{locked_file} = $path;
|
||||
$self->{is_locked} = 1;
|
||||
@ -80,20 +80,20 @@ sub new_checker {
|
||||
($mod1, $mod2) = ('R/O', O_RDONLY);
|
||||
}
|
||||
|
||||
dbg("auto-whitelist: tie-ing to DB file of type $dbm_module $mod1 in $path");
|
||||
dbg("auto-welcomelist: tie-ing to DB file of type $dbm_module $mod1 in $path");
|
||||
|
||||
($self->{is_locked} && $dbm_module eq 'DB_File') and
|
||||
Mail::SpamAssassin::Util::avoid_db_file_locking_bug($path);
|
||||
|
||||
if (! tie %{ $self->{accum} }, $dbm_module, $path, $mod2,
|
||||
oct($main->{conf}->{auto_whitelist_file_mode}) & 0666)
|
||||
oct($main->{conf}->{auto_welcomelist_file_mode}) & 0666)
|
||||
{
|
||||
my $err = $!; # might get overwritten later
|
||||
if ($self->{is_locked}) {
|
||||
$self->{main}->{locker}->safe_unlock($self->{locked_file});
|
||||
$self->{is_locked} = 0;
|
||||
}
|
||||
die "auto-whitelist: cannot open auto_whitelist_path $path: $err\n";
|
||||
die "auto-welcomelist: cannot open auto_welcomelist_path $path: $err\n";
|
||||
}
|
||||
}
|
||||
umask $umask;
|
||||
@ -106,10 +106,10 @@ sub new_checker {
|
||||
|
||||
sub finish {
|
||||
my $self = shift;
|
||||
dbg("auto-whitelist: DB addr list: untie-ing and unlocking");
|
||||
dbg("auto-welcomelist: DB addr list: untie-ing and unlocking");
|
||||
untie %{$self->{accum}};
|
||||
if ($self->{is_locked}) {
|
||||
dbg("auto-whitelist: DB addr list: file locked, breaking lock");
|
||||
dbg("auto-welcomelist: DB addr list: file locked, breaking lock");
|
||||
$self->{main}->{locker}->safe_unlock ($self->{locked_file});
|
||||
$self->{is_locked} = 0;
|
||||
}
|
||||
@ -128,7 +128,7 @@ sub get_addr_entry {
|
||||
$entry->{msgcount} = $self->{accum}->{$addr} || 0;
|
||||
$entry->{totscore} = $self->{accum}->{$addr.'|totscore'} || 0;
|
||||
|
||||
dbg("auto-whitelist: db-based $addr scores ".$entry->{msgcount}.'/'.$entry->{totscore});
|
||||
dbg("auto-welcomelist: db-based $addr scores ".$entry->{msgcount}.'/'.$entry->{totscore});
|
||||
return $entry;
|
||||
}
|
||||
|
||||
@ -143,7 +143,7 @@ sub add_score {
|
||||
$entry->{msgcount}++;
|
||||
$entry->{totscore} += $score;
|
||||
|
||||
dbg("auto-whitelist: add_score: new count: ".$entry->{msgcount}.", new totscore: ".$entry->{totscore});
|
||||
dbg("auto-welcomelist: add_score: new count: ".$entry->{msgcount}.", new totscore: ".$entry->{totscore});
|
||||
|
||||
$self->{accum}->{$entry->{addr}} = $entry->{msgcount};
|
||||
$self->{accum}->{$entry->{addr}.'|totscore'} = $entry->{totscore};
|
||||
|
@ -29,7 +29,7 @@ use Mail::SpamAssassin::Conf;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::AsyncLoop;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var am_running_on_windows);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var am_running_on_windows compile_regexp);
|
||||
|
||||
use File::Spec;
|
||||
use IO::Socket;
|
||||
@ -37,30 +37,24 @@ use POSIX ":sys_wait_h";
|
||||
|
||||
|
||||
our $KNOWN_BAD_DIALUP_RANGES; # Nothing uses this var???
|
||||
our $LAST_DNS_CHECK;
|
||||
our $LAST_DNS_CHECK = 0;
|
||||
|
||||
# use very well-connected domains (fast DNS response, many DNS servers,
|
||||
# geographical distribution is a plus, TTL of at least 3600s)
|
||||
# these MUST contain both A/AAAA records so we can test dns_options v6
|
||||
# Updated 8/2019 from https://ip6.nl/#!list?db=alexa500
|
||||
#
|
||||
our @EXISTING_DOMAINS = qw{
|
||||
adelphia.net
|
||||
akamai.com
|
||||
apache.org
|
||||
cingular.com
|
||||
colorado.edu
|
||||
comcast.net
|
||||
doubleclick.com
|
||||
ebay.com
|
||||
gmx.net
|
||||
bing.com
|
||||
cloudflare.com
|
||||
digitalpoint.com
|
||||
facebook.com
|
||||
google.com
|
||||
intel.com
|
||||
kernel.org
|
||||
linux.org
|
||||
mit.edu
|
||||
motorola.com
|
||||
msn.com
|
||||
sourceforge.net
|
||||
sun.com
|
||||
w3.org
|
||||
linkedin.com
|
||||
netflix.com
|
||||
php.net
|
||||
wikipedia.org
|
||||
yahoo.com
|
||||
};
|
||||
|
||||
@ -84,10 +78,6 @@ BEGIN {
|
||||
# local ($^W) = 0;
|
||||
|
||||
no warnings;
|
||||
eval {
|
||||
require Net::DNS;
|
||||
require Net::DNS::Resolver;
|
||||
};
|
||||
eval {
|
||||
require MIME::Base64;
|
||||
};
|
||||
@ -101,71 +91,52 @@ BEGIN {
|
||||
sub do_rbl_lookup {
|
||||
my ($self, $rule, $set, $type, $host, $subtest) = @_;
|
||||
|
||||
$host =~ s/\.\z//s; # strip a redundant trailing dot
|
||||
my $key = "dns:$type:$host";
|
||||
my $existing_ent = $self->{async}->get_lookup($key);
|
||||
|
||||
# only make a specific query once
|
||||
if (!$existing_ent) {
|
||||
my $ent = {
|
||||
key => $key,
|
||||
zone => $host, # serves to fetch other per-zone settings
|
||||
type => "DNSBL-".$type,
|
||||
sets => [ ], # filled in below
|
||||
rules => [ ], # filled in below
|
||||
# id is filled in after we send the query below
|
||||
};
|
||||
$existing_ent = $self->{async}->bgsend_and_start_lookup(
|
||||
$host, $type, undef, $ent,
|
||||
sub { my($ent, $pkt) = @_; $self->process_dnsbl_result($ent, $pkt) },
|
||||
master_deadline => $self->{master_deadline} );
|
||||
}
|
||||
|
||||
if ($existing_ent) {
|
||||
# always add set
|
||||
push @{$existing_ent->{sets}}, $set;
|
||||
|
||||
# sometimes match or always match
|
||||
if (defined $subtest) {
|
||||
$self->{dnspost}->{$set}->{$subtest} = $rule;
|
||||
} else {
|
||||
push @{$existing_ent->{rules}}, $rule;
|
||||
}
|
||||
|
||||
$self->{rule_to_rblkey}->{$rule} = $key;
|
||||
}
|
||||
}
|
||||
|
||||
# TODO: these are constant so they should only be added once at startup
|
||||
sub register_rbl_subtest {
|
||||
my ($self, $rule, $set, $subtest) = @_;
|
||||
|
||||
if ($subtest =~ /^sb:/) {
|
||||
warn("dns: ignored $rule, SenderBase rules are deprecated\n");
|
||||
info("dns: ignored $rule, SenderBase rules are deprecated");
|
||||
return 0;
|
||||
}
|
||||
# Compile as regex if not pure ip/bitmask (same check in process_dnsbl_result)
|
||||
if ($subtest !~ /^\d+(?:\.\d+\.\d+\.\d+)?$/) {
|
||||
my ($rec, $err) = compile_regexp($subtest, 0);
|
||||
if (!$rec) {
|
||||
warn("dns: invalid rule $rule subtest regexp '$subtest': $err\n");
|
||||
return 0;
|
||||
}
|
||||
$subtest = $rec;
|
||||
}
|
||||
}
|
||||
|
||||
$self->{dnspost}->{$set}->{$subtest} = $rule;
|
||||
dbg("dns: launching rule %s, set %s, type %s, %s", $rule, $set, $type,
|
||||
defined $subtest ? "subtest $subtest" : 'no subtest');
|
||||
|
||||
my $ent = {
|
||||
rulename => $rule,
|
||||
type => "DNSBL",
|
||||
set => $set,
|
||||
subtest => $subtest,
|
||||
};
|
||||
my $ret = $self->{async}->bgsend_and_start_lookup($host, $type, undef, $ent,
|
||||
sub { my($ent, $pkt) = @_; $self->process_dnsbl_result($ent, $pkt) },
|
||||
master_deadline => $self->{master_deadline}
|
||||
);
|
||||
|
||||
return 0 if defined $ret; # no query started
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
# Deprecated, was only used from DNSEval.pm?
|
||||
sub do_dns_lookup {
|
||||
my ($self, $rule, $type, $host) = @_;
|
||||
|
||||
$host =~ s/\.\z//s; # strip a redundant trailing dot
|
||||
my $key = "dns:$type:$host";
|
||||
|
||||
my $ent = {
|
||||
key => $key,
|
||||
zone => $host, # serves to fetch other per-zone settings
|
||||
type => "DNSBL-".$type,
|
||||
rules => [ $rule ],
|
||||
# id is filled in after we send the query below
|
||||
rulename => $rule,
|
||||
type => "DNSBL",
|
||||
};
|
||||
$ent = $self->{async}->bgsend_and_start_lookup(
|
||||
$host, $type, undef, $ent,
|
||||
$self->{async}->bgsend_and_start_lookup($host, $type, undef, $ent,
|
||||
sub { my($ent, $pkt) = @_; $self->process_dnsbl_result($ent, $pkt) },
|
||||
master_deadline => $self->{master_deadline} );
|
||||
$ent;
|
||||
master_deadline => $self->{master_deadline}
|
||||
);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
@ -180,38 +151,31 @@ sub dnsbl_hit {
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
$log = join('',$answer->txtdata);
|
||||
$log = join('', $answer->txtdata);
|
||||
utf8::encode($log) if utf8::is_utf8($log);
|
||||
local $1;
|
||||
$log =~ s{ (?<! [<(\[] ) (https? : // \S+)}{<$1>}xgi;
|
||||
} else { # assuming $answer->type eq 'A'
|
||||
local($1,$2,$3,$4,$5);
|
||||
if ($question->string =~ m/^((?:[0-9a-fA-F]\.){32})(\S+\w)/) {
|
||||
if ($question->string =~ /^((?:[0-9a-fA-F]\.){32})(\S+\w)/) {
|
||||
$log = ' listed in ' . lc($2);
|
||||
my $ipv6addr = join('', reverse split(/\./, lc $1));
|
||||
$ipv6addr =~ s/\G(....)/$1:/g; chop $ipv6addr;
|
||||
$ipv6addr =~ s/:0{1,3}/:/g;
|
||||
$log = $ipv6addr . $log;
|
||||
} elsif ($question->string =~ m/^(\d+)\.(\d+)\.(\d+)\.(\d+)\.(\S+\w)/) {
|
||||
} elsif ($question->string =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)\.(\S+\w)/) {
|
||||
$log = "$4.$3.$2.$1 listed in " . lc($5);
|
||||
} else {
|
||||
$log = 'listed in ' . $question->string;
|
||||
} elsif ($question->string =~ /^(\S+)(?<!\.)/) {
|
||||
$log = "listed in ".lc($1);
|
||||
}
|
||||
}
|
||||
|
||||
# TODO: this may result in some log messages appearing under the
|
||||
# wrong rules, since we could see this sequence: { test one hits,
|
||||
# test one's message is logged, test two hits, test one fires again
|
||||
# on another IP, test one's message is logged for that other IP --
|
||||
# but under test two's heading }. Right now though it's better
|
||||
# than just not logging at all.
|
||||
|
||||
$self->{already_logged} ||= { };
|
||||
if ($log && !$self->{already_logged}->{$log}) {
|
||||
$self->test_log($log);
|
||||
$self->{already_logged}->{$log} = 1;
|
||||
if ($log) {
|
||||
$self->test_log($log, $rule);
|
||||
}
|
||||
|
||||
if (!$self->{tests_already_hit}->{$rule}) {
|
||||
dbg("dns: rbl rule $rule hit");
|
||||
$self->got_hit($rule, "RBL: ", ruletype => "dnsbl");
|
||||
}
|
||||
}
|
||||
@ -219,16 +183,26 @@ sub dnsbl_hit {
|
||||
sub dnsbl_uri {
|
||||
my ($self, $question, $answer) = @_;
|
||||
|
||||
my $qname = $question->qname;
|
||||
|
||||
my $rdatastr;
|
||||
if ($answer->UNIVERSAL::can('txtdata')) {
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
#
|
||||
# rdatastr() is historical/undocumented, use rdstring() since Net::DNS 0.69
|
||||
my $rdatastr = $answer->UNIVERSAL::can('txtdata') ? join('',$answer->txtdata)
|
||||
: $answer->UNIVERSAL::can('rdstring') ? $answer->rdstring
|
||||
: $answer->rdatastr;
|
||||
$rdatastr = join('', $answer->txtdata);
|
||||
} else {
|
||||
$rdatastr = $answer->rdstring;
|
||||
# encoded in a RFC 1035 zone file format (escaped), decode it
|
||||
$rdatastr =~ s{ \\ ( [0-9]{3} | (?![0-9]{3}) . ) }
|
||||
{ length($1)==3 && $1 <= 255 ? chr($1) : $1 }xgse;
|
||||
}
|
||||
|
||||
# Bug 7236: Net::DNS attempts to decode text strings in a TXT record as
|
||||
# UTF-8 since version 0.69, which is undesired: octets failing the UTF-8
|
||||
# decoding are converted to a Unicode "replacement character" U+FFFD, and
|
||||
# ASCII text is unnecessarily flagged as perl native characters.
|
||||
utf8::encode($rdatastr) if utf8::is_utf8($rdatastr);
|
||||
|
||||
my $qname = $question->qname;
|
||||
if (defined $qname && defined $rdatastr) {
|
||||
my $qclass = $question->qclass;
|
||||
my $qtype = $question->qtype;
|
||||
@ -236,8 +210,8 @@ sub dnsbl_uri {
|
||||
push(@vals, "class=$qclass") if $qclass ne "IN";
|
||||
push(@vals, "type=$qtype") if $qtype ne "A";
|
||||
my $uri = "dns:$qname" . (@vals ? "?" . join(";", @vals) : "");
|
||||
push @{ $self->{dnsuri}->{$uri} }, $rdatastr;
|
||||
|
||||
$self->{dnsuri}{$uri}{$rdatastr} = 1;
|
||||
dbg("dns: hit <$uri> $rdatastr");
|
||||
}
|
||||
}
|
||||
@ -251,19 +225,17 @@ sub process_dnsbl_result {
|
||||
my $question = ($pkt->question)[0];
|
||||
return if !$question;
|
||||
|
||||
my $sets = $ent->{sets} || [];
|
||||
my $rules = $ent->{rules};
|
||||
my $rulename = $ent->{rulename};
|
||||
|
||||
# NO_DNS_FOR_FROM
|
||||
if ($self->{sender_host} &&
|
||||
# fishy, qname should have been "RFC 1035 zone format" -decoded first
|
||||
lc($question->qname) eq lc($self->{sender_host}) &&
|
||||
$question->qtype =~ /^(?:A|MX)$/ &&
|
||||
$pkt->header->rcode =~ /^(?:NXDOMAIN|SERVFAIL)$/ &&
|
||||
++$self->{sender_host_fail} == 2)
|
||||
{
|
||||
for my $rule (@{$rules}) {
|
||||
$self->got_hit($rule, "DNS: ", ruletype => "dns");
|
||||
# Mark rule ready for meta rules, but only if this was the last lookup
|
||||
# pending, rules can have many lookups launched for different IPs
|
||||
if (!$self->get_async_pending_rules($rulename)) {
|
||||
$self->rule_ready($rulename);
|
||||
# Mark depending check_rbl_sub rules too
|
||||
if (exists $self->{rbl_subs}{$ent->{set}}) {
|
||||
foreach (@{$self->{rbl_subs}{$ent->{set}}}) {
|
||||
$self->rule_ready($_->[1]);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@ -274,44 +246,69 @@ sub process_dnsbl_result {
|
||||
$self->dnsbl_uri($question, $answer);
|
||||
my $answ_type = $answer->type;
|
||||
# TODO: there are some CNAME returns that might be useful
|
||||
next if ($answ_type ne 'A' && $answ_type ne 'TXT');
|
||||
if ($answ_type eq 'A') {
|
||||
# Net::DNS::RR::A::address() is available since Net::DNS 0.69
|
||||
my $ip_address = $answer->UNIVERSAL::can('address') ? $answer->address
|
||||
: $answer->rdatastr;
|
||||
next if $answ_type ne 'A' && $answ_type ne 'TXT';
|
||||
|
||||
my $rdatastr;
|
||||
if ($answer->UNIVERSAL::can('txtdata')) {
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
$rdatastr = join('', $answer->txtdata);
|
||||
} else {
|
||||
$rdatastr = $answer->rdstring;
|
||||
# encoded in a RFC 1035 zone file format (escaped), decode it
|
||||
$rdatastr =~ s{ \\ ( [0-9]{3} | (?![0-9]{3}) . ) }
|
||||
{ length($1)==3 && $1 <= 255 ? chr($1) : $1 }xgse;
|
||||
}
|
||||
|
||||
# Bug 7236: Net::DNS attempts to decode text strings in a TXT record as
|
||||
# UTF-8 since version 0.69, which is undesired: octets failing the UTF-8
|
||||
# decoding are converted to a Unicode "replacement character" U+FFFD, and
|
||||
# ASCII text is unnecessarily flagged as perl native characters.
|
||||
utf8::encode($rdatastr) if utf8::is_utf8($rdatastr);
|
||||
|
||||
# skip any A record that isn't on 127.0.0.0/8
|
||||
next if $ip_address !~ /^127\./;
|
||||
next if $answ_type eq 'A' && $rdatastr !~ /^127\./;
|
||||
|
||||
# check_rbl tests
|
||||
if (defined $ent->{subtest}) {
|
||||
if ($self->check_subtest($rdatastr, $ent->{subtest})) {
|
||||
$self->dnsbl_hit($rulename, $question, $answer);
|
||||
}
|
||||
for my $rule (@{$rules}) {
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
}
|
||||
for my $set (@{$sets}) {
|
||||
if ($self->{dnspost}->{$set}) {
|
||||
$self->process_dnsbl_set($set, $question, $answer);
|
||||
} else {
|
||||
$self->dnsbl_hit($rulename, $question, $answer);
|
||||
}
|
||||
|
||||
# check_rbl_sub tests
|
||||
if (exists $self->{rbl_subs}{$ent->{set}}) {
|
||||
$self->process_dnsbl_set($ent->{set}, $question, $answer, $rdatastr);
|
||||
}
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub process_dnsbl_set {
|
||||
my ($self, $set, $question, $answer) = @_;
|
||||
my ($self, $set, $question, $answer, $rdatastr) = @_;
|
||||
|
||||
# txtdata returns a non- zone-file-format encoded result, unlike rdstring;
|
||||
# avoid space-separated RDATA <character-string> fields if possible,
|
||||
# txtdata provides a list of strings in a list context since Net::DNS 0.69
|
||||
#
|
||||
# rdatastr() is historical/undocumented, use rdstring() since Net::DNS 0.69
|
||||
my $rdatastr = $answer->UNIVERSAL::can('txtdata') ? join('',$answer->txtdata)
|
||||
: $answer->UNIVERSAL::can('rdstring') ? $answer->rdstring
|
||||
: $answer->rdatastr;
|
||||
|
||||
while (my ($subtest, $rule) = each %{ $self->{dnspost}->{$set} }) {
|
||||
foreach my $args (@{$self->{rbl_subs}{$set}}) {
|
||||
my $subtest = $args->[0];
|
||||
my $rule = $args->[1];
|
||||
next if $self->{tests_already_hit}->{$rule};
|
||||
if ($self->check_subtest($rdatastr, $subtest)) {
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if ($subtest =~ /^\d+\.\d+\.\d+\.\d+$/) {
|
||||
# test for exact equality, not a regexp (an IPv4 address)
|
||||
$self->dnsbl_hit($rule, $question, $answer) if $subtest eq $rdatastr;
|
||||
sub check_subtest {
|
||||
my ($self, $rdatastr, $subtest) = @_;
|
||||
|
||||
# regular expression
|
||||
if (ref($subtest) eq 'Regexp') {
|
||||
if ($rdatastr =~ $subtest) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
# bitmask
|
||||
elsif ($subtest =~ /^\d+$/) {
|
||||
@ -319,40 +316,21 @@ sub process_dnsbl_set {
|
||||
if ($rdatastr =~ m/^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/ &&
|
||||
Mail::SpamAssassin::Util::my_inet_aton($rdatastr) & $subtest)
|
||||
{
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
# regular expression
|
||||
else {
|
||||
my $test = qr/$subtest/;
|
||||
if ($rdatastr =~ /$test/) {
|
||||
$self->dnsbl_hit($rule, $question, $answer);
|
||||
}
|
||||
# test for exact equality (an IPv4 address)
|
||||
if ($subtest eq $rdatastr) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub harvest_until_rule_completes {
|
||||
my ($self, $rule) = @_;
|
||||
|
||||
dbg("dns: harvest_until_rule_completes");
|
||||
my $result = 0;
|
||||
|
||||
for (my $first=1; ; $first=0) {
|
||||
# complete_lookups() may call completed_callback(), which may
|
||||
# call start_lookup() again (like in Plugin::URIDNSBL)
|
||||
my ($alldone,$anydone) =
|
||||
$self->{async}->complete_lookups($first ? 0 : 1.0, 1);
|
||||
|
||||
$result = 1 if $self->is_rule_complete($rule);
|
||||
last if $result || $alldone;
|
||||
|
||||
dbg("dns: harvest_until_rule_completes - check_tick");
|
||||
$self->{main}->call_plugins ("check_tick", { permsgstatus => $self });
|
||||
}
|
||||
|
||||
return $result;
|
||||
}
|
||||
# Deprecated since 4.0, meta rules do not depend on priorities anymore
|
||||
sub harvest_until_rule_completes {}
|
||||
|
||||
sub harvest_dnsbl_queries {
|
||||
my ($self) = @_;
|
||||
@ -370,7 +348,7 @@ sub harvest_dnsbl_queries {
|
||||
my ($alldone,$anydone) =
|
||||
$self->{async}->complete_lookups($first ? 0 : 1.0, 1);
|
||||
|
||||
last if $alldone;
|
||||
last if $alldone || $self->{deadline_exceeded} || $self->{shortcircuited};
|
||||
|
||||
dbg("dns: harvest_dnsbl_queries - check_tick");
|
||||
$self->{main}->call_plugins ("check_tick", { permsgstatus => $self });
|
||||
@ -379,7 +357,6 @@ sub harvest_dnsbl_queries {
|
||||
# explicitly abort anything left
|
||||
$self->{async}->abort_remaining_lookups();
|
||||
$self->{async}->log_lookups_timing();
|
||||
$self->mark_all_async_rules_complete();
|
||||
1;
|
||||
}
|
||||
|
||||
@ -403,12 +380,14 @@ sub harvest_completed_queries {
|
||||
sub set_rbl_tag_data {
|
||||
my ($self) = @_;
|
||||
|
||||
return if !$self->{dnsuri};
|
||||
|
||||
# DNS URIs
|
||||
my $rbl_tag = $self->{tag_data}->{RBL}; # just in case, should be empty
|
||||
$rbl_tag = '' if !defined $rbl_tag;
|
||||
while (my ($dnsuri, $answers) = each %{ $self->{dnsuri} }) {
|
||||
while (my ($dnsuri, $answers) = each %{$self->{dnsuri}}) {
|
||||
# when parsing, look for elements of \".*?\" or \S+ with ", " as separator
|
||||
$rbl_tag .= "<$dnsuri>" . " [" . join(", ", @{ $answers }) . "]\n";
|
||||
$rbl_tag .= "<$dnsuri>" . " [" . join(", ", keys %$answers) . "]\n";
|
||||
}
|
||||
if (defined $rbl_tag && $rbl_tag ne '') {
|
||||
chomp $rbl_tag;
|
||||
@ -423,7 +402,7 @@ sub rbl_finish {
|
||||
|
||||
$self->set_rbl_tag_data();
|
||||
|
||||
delete $self->{dnspost};
|
||||
delete $self->{rbl_subs};
|
||||
delete $self->{dnsuri};
|
||||
}
|
||||
|
||||
@ -442,55 +421,103 @@ sub clear_resolver {
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Deprecated since 4.0.0
|
||||
sub lookup_ns {
|
||||
warn "dns: deprecated lookup_ns called, query ignored\n";
|
||||
return;
|
||||
}
|
||||
|
||||
sub test_dns_a_aaaa {
|
||||
my ($self, $dom) = @_;
|
||||
|
||||
return unless $self->load_resolver();
|
||||
return if ($self->server_failed_to_respond_for_domain ($dom));
|
||||
|
||||
my $nsrecords;
|
||||
dbg("dns: looking up NS for '$dom'");
|
||||
my ($a, $aaaa) = (0, 0);
|
||||
|
||||
if ($self->{conf}->{dns_options}->{v4}) {
|
||||
eval {
|
||||
my $query = $self->{resolver}->send($dom, 'NS');
|
||||
my @nses;
|
||||
my $query = $self->{resolver}->send($dom, 'A');
|
||||
if ($query) {
|
||||
foreach my $rr ($query->answer) {
|
||||
if ($rr->type eq "NS") { push (@nses, $rr->nsdname); }
|
||||
if ($rr->type eq 'A') { $a = 1; last; }
|
||||
}
|
||||
}
|
||||
$nsrecords = [ @nses ];
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("dns: NS lookup failed horribly, perhaps bad resolv.conf setting? (%s)", $eval_stat);
|
||||
return;
|
||||
dbg("dns: test A lookup failed horribly, perhaps bad resolv.conf setting? (%s)", $eval_stat);
|
||||
return (undef, undef);
|
||||
};
|
||||
if (!$a) {
|
||||
dbg("dns: test A lookup returned no results, use \"dns_options nov4\" if resolver doesn't support A queries");
|
||||
}
|
||||
} else {
|
||||
$a = 1;
|
||||
}
|
||||
|
||||
$nsrecords;
|
||||
if ($self->{conf}->{dns_options}->{v6}) {
|
||||
eval {
|
||||
my $query = $self->{resolver}->send($dom, 'AAAA');
|
||||
if ($query) {
|
||||
foreach my $rr ($query->answer) {
|
||||
if ($rr->type eq 'AAAA') { $aaaa = 1; last; }
|
||||
}
|
||||
}
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("dns: test AAAA lookup failed horribly, perhaps bad resolv.conf setting? (%s)", $eval_stat);
|
||||
return (undef, undef);
|
||||
};
|
||||
if (!$aaaa) {
|
||||
dbg("dns: test AAAA lookup returned no results, use \"dns_options nov6\" if resolver doesn't support AAAA queries");
|
||||
}
|
||||
} else {
|
||||
$aaaa = 1;
|
||||
}
|
||||
|
||||
return ($a, $aaaa);
|
||||
}
|
||||
|
||||
sub is_dns_available {
|
||||
my ($self) = @_;
|
||||
my $dnsopt = $self->{conf}->{dns_available};
|
||||
my $dnsint = $self->{conf}->{dns_test_interval} || 600;
|
||||
my @domains;
|
||||
|
||||
$LAST_DNS_CHECK ||= 0;
|
||||
my $diff = time() - $LAST_DNS_CHECK;
|
||||
# Fast response for the most common cases
|
||||
return 1 if $IS_DNS_AVAILABLE && $dnsopt eq "yes";
|
||||
return 0 if defined $IS_DNS_AVAILABLE && $dnsopt eq "no";
|
||||
|
||||
# croak on misconfigured flags
|
||||
if (!$self->{conf}->{dns_options}->{v4} &&
|
||||
!$self->{conf}->{dns_options}->{v6})
|
||||
{
|
||||
warn 'dns: error: dns_options "nov4" and "nov6" are both set, '.
|
||||
' only use either, or use "dns_available no" to really disable DNS'.
|
||||
"\n";
|
||||
$IS_DNS_AVAILABLE = 0;
|
||||
$self->{conf}->{dns_available} = "no";
|
||||
return 0;
|
||||
}
|
||||
|
||||
# undef $IS_DNS_AVAILABLE if we should be testing for
|
||||
# working DNS and our check interval time has passed
|
||||
if ($dnsopt eq "test" && $diff > $dnsint) {
|
||||
if ($dnsopt eq "test") {
|
||||
my $diff = time - $LAST_DNS_CHECK;
|
||||
if ($diff > ($self->{conf}->{dns_test_interval}||600)) {
|
||||
$IS_DNS_AVAILABLE = undef;
|
||||
dbg("dns: is_dns_available() last checked %.1f seconds ago; re-checking",
|
||||
$diff);
|
||||
if ($LAST_DNS_CHECK) {
|
||||
dbg("dns: is_dns_available() last checked %.1f seconds ago; re-checking", $diff);
|
||||
} else {
|
||||
dbg("dns: is_dns_available() initial check");
|
||||
}
|
||||
}
|
||||
$LAST_DNS_CHECK = time;
|
||||
}
|
||||
|
||||
return $IS_DNS_AVAILABLE if (defined $IS_DNS_AVAILABLE);
|
||||
$LAST_DNS_CHECK = time();
|
||||
return $IS_DNS_AVAILABLE if defined $IS_DNS_AVAILABLE;
|
||||
|
||||
$IS_DNS_AVAILABLE = 0;
|
||||
|
||||
if ($dnsopt eq "no") {
|
||||
dbg("dns: dns_available set to no in config file, skipping test");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
@ -498,26 +525,16 @@ sub is_dns_available {
|
||||
|
||||
# Even if "dns_available" is explicitly set to "yes", we want to ignore
|
||||
# DNS if we're only supposed to be looking at local tests.
|
||||
goto done if ($self->{main}->{local_tests_only});
|
||||
|
||||
# Check version numbers - runtime check only
|
||||
if (defined $Net::DNS::VERSION) {
|
||||
if (am_running_on_windows()) {
|
||||
if ($Net::DNS::VERSION < 0.46) {
|
||||
warn("dns: Net::DNS version is $Net::DNS::VERSION, but need 0.46 for Win32");
|
||||
if ($self->{main}->{local_tests_only}) {
|
||||
dbg("dns: using local tests only, DNS not available");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
}
|
||||
else {
|
||||
if ($Net::DNS::VERSION < 0.34) {
|
||||
warn("dns: Net::DNS version is $Net::DNS::VERSION, but need 0.34");
|
||||
|
||||
#$self->clear_resolver();
|
||||
if (!$self->load_resolver()) {
|
||||
dbg("dns: could not load resolver, DNS not available");
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
$self->clear_resolver();
|
||||
goto done unless $self->load_resolver();
|
||||
|
||||
if ($dnsopt eq "yes") {
|
||||
# optionally shuffle the list of nameservers to distribute the load
|
||||
@ -532,13 +549,18 @@ sub is_dns_available {
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
|
||||
my @domains;
|
||||
my @rtypes;
|
||||
push @rtypes, 'A' if $self->{main}->{conf}->{dns_options}->{v4};
|
||||
push @rtypes, 'AAAA' if $self->{main}->{conf}->{dns_options}->{v6};
|
||||
if ($dnsopt =~ /^test:\s*(\S.*)$/) {
|
||||
@domains = split (/\s+/, $1);
|
||||
dbg("dns: looking up NS records for user specified domains: %s",
|
||||
join(", ", @domains));
|
||||
dbg("dns: testing %s records for user specified domains: %s",
|
||||
join("/", @rtypes), join(", ", @domains));
|
||||
} else {
|
||||
@domains = @EXISTING_DOMAINS;
|
||||
dbg("dns: looking up NS records for built-in domains");
|
||||
dbg("dns: testing %s records for built-in domains: %s",
|
||||
join("/", @rtypes), join(", ", @domains));
|
||||
}
|
||||
|
||||
# do the test with a full set of configured nameservers
|
||||
@ -556,19 +578,19 @@ sub is_dns_available {
|
||||
my @good_nameservers;
|
||||
foreach my $ns (@nameservers) {
|
||||
$self->{resolver}->available_nameservers($ns); # try just this one
|
||||
for (my $retry = 3; $retry > 0 && @domains; $retry--) {
|
||||
for (my $retry = 0; $retry < 3 && @domains; $retry++) {
|
||||
my $domain = splice(@domains, rand(@domains), 1);
|
||||
dbg("dns: trying ($retry) $domain, server $ns ...");
|
||||
my $result = $self->lookup_ns($domain);
|
||||
dbg("dns: trying $domain, server $ns ..." .
|
||||
($retry ? " (retry $retry)" : ""));
|
||||
my ($ok_a, $ok_aaaa) = $self->test_dns_a_aaaa($domain);
|
||||
$self->{resolver}->finish_socket();
|
||||
if (!$result) {
|
||||
dbg("dns: NS lookup of $domain using $ns failed horribly, ".
|
||||
"may not be a valid nameserver");
|
||||
if (!defined $ok_a || !defined $ok_aaaa) {
|
||||
# error printed already
|
||||
last;
|
||||
} elsif (!@$result) {
|
||||
dbg("dns: NS lookup of $domain using $ns failed, no results found");
|
||||
} elsif (!$ok_a && !$ok_aaaa) {
|
||||
dbg("dns: lookup of $domain using $ns failed, no results found");
|
||||
} else {
|
||||
dbg("dns: NS lookup of $domain using $ns succeeded => DNS available".
|
||||
dbg("dns: lookup of $domain using $ns succeeded => DNS available".
|
||||
" (set dns_available to override)");
|
||||
push(@good_nameservers, $ns);
|
||||
last;
|
||||
@ -585,8 +607,6 @@ sub is_dns_available {
|
||||
$self->{resolver}->available_nameservers(@good_nameservers);
|
||||
}
|
||||
|
||||
done:
|
||||
# jm: leaving this in!
|
||||
dbg("dns: is DNS available? " . $IS_DNS_AVAILABLE);
|
||||
return $IS_DNS_AVAILABLE;
|
||||
}
|
||||
@ -676,55 +696,26 @@ sub cleanup_kids {
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub register_async_rule_start {
|
||||
# Deprecated async functions, everything is handled automatically
|
||||
# now by bgsend .. $self->{async}->{pending_rules}
|
||||
sub register_async_rule_start {}
|
||||
sub register_async_rule_finish {}
|
||||
sub mark_all_async_rules_complete {}
|
||||
sub is_rule_complete {}
|
||||
|
||||
# Return number of pending DNS lookups for a rule,
|
||||
# or list all of rules still pending
|
||||
sub get_async_pending_rules {
|
||||
my ($self, $rule) = @_;
|
||||
dbg("dns: $rule lookup start");
|
||||
$self->{rule_to_rblkey}->{$rule} = '*ASYNC_START';
|
||||
}
|
||||
|
||||
sub register_async_rule_finish {
|
||||
my ($self, $rule) = @_;
|
||||
dbg("dns: $rule lookup finished");
|
||||
delete $self->{rule_to_rblkey}->{$rule};
|
||||
}
|
||||
|
||||
sub mark_all_async_rules_complete {
|
||||
my ($self) = @_;
|
||||
$self->{rule_to_rblkey} = { };
|
||||
}
|
||||
|
||||
sub is_rule_complete {
|
||||
my ($self, $rule) = @_;
|
||||
|
||||
my $key = $self->{rule_to_rblkey}->{$rule};
|
||||
if (!defined $key) {
|
||||
# dbg("dns: $rule lookup complete, not in list");
|
||||
return 1;
|
||||
if (defined $rule) {
|
||||
return 0 if !exists $self->{async}->{pending_rules}{$rule};
|
||||
return scalar keys %{$self->{async}->{pending_rules}{$rule}};
|
||||
} else {
|
||||
return grep { %{$self->{async}->{pending_rules}{$_}} }
|
||||
keys %{$self->{async}->{pending_rules}};
|
||||
}
|
||||
|
||||
if ($key eq '*ASYNC_START') {
|
||||
dbg("dns: $rule lookup not yet complete");
|
||||
return 0; # not yet complete
|
||||
}
|
||||
|
||||
my $ent = $self->{async}->get_lookup($key);
|
||||
if (!defined $ent) {
|
||||
dbg("dns: $rule lookup complete, $key no longer pending");
|
||||
return 1;
|
||||
}
|
||||
|
||||
dbg("dns: $rule lookup not yet complete");
|
||||
return 0; # not yet complete
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
# interface called by SPF plugin
|
||||
sub check_for_from_dns {
|
||||
my ($self, $pms) = @_;
|
||||
if (defined $pms->{sender_host_fail}) {
|
||||
return ($pms->{sender_host_fail} == 2); # both MX and A need to fail
|
||||
}
|
||||
}
|
||||
|
||||
1;
|
||||
|
@ -40,21 +40,24 @@ use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
require 5.008001; # needs utf8::is_utf8()
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var decode_dns_question_entry);
|
||||
use Mail::SpamAssassin::Util qw(untaint_var decode_dns_question_entry
|
||||
idn_to_ascii reverse_ip_address
|
||||
domain_to_search_list);
|
||||
|
||||
use Socket;
|
||||
use Errno qw(EADDRINUSE EACCES);
|
||||
use Time::HiRes qw(time);
|
||||
use version 0.77;
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
our $have_net_dns;
|
||||
our $io_socket_module_name;
|
||||
BEGIN {
|
||||
$have_net_dns = eval { require Net::DNS; };
|
||||
if (eval { require IO::Socket::IP }) {
|
||||
$io_socket_module_name = 'IO::Socket::IP';
|
||||
} elsif (eval { require IO::Socket::INET6 }) {
|
||||
@ -78,7 +81,6 @@ sub new {
|
||||
};
|
||||
bless ($self, $class);
|
||||
|
||||
$self->load_resolver();
|
||||
$self;
|
||||
}
|
||||
|
||||
@ -94,8 +96,8 @@ Load the C<Net::DNS::Resolver> object. Returns 0 if Net::DNS cannot be used,
|
||||
sub load_resolver {
|
||||
my ($self) = @_;
|
||||
|
||||
if ($self->{res}) { return 1; }
|
||||
$self->{no_resolver} = 1;
|
||||
return 0 if $self->{no_resolver};
|
||||
return 1 if $self->{res};
|
||||
|
||||
# force only ipv4 if no IO::Socket::INET6 or ipv6 doesn't work
|
||||
my $force_ipv4 = $self->{main}->{force_ipv4};
|
||||
@ -112,7 +114,7 @@ sub load_resolver {
|
||||
if ($io_socket_module_name) {
|
||||
$sock6 = $io_socket_module_name->new(LocalAddr=>'::', Proto=>'udp');
|
||||
}
|
||||
if ($sock6) { $sock6->close() or warn "error closing socket: $!" }
|
||||
if ($sock6) { $sock6->close() or warn "dns: error closing socket: $!\n" }
|
||||
$sock6;
|
||||
} or do {
|
||||
dbg("dns: socket module %s is available, but no host support for IPv6",
|
||||
@ -123,13 +125,14 @@ sub load_resolver {
|
||||
}
|
||||
|
||||
eval {
|
||||
require Net::DNS;
|
||||
die "Net::DNS required\n" if !$have_net_dns;
|
||||
die "Net::DNS 0.69 required\n"
|
||||
if (version->parse(Net::DNS->VERSION) < version->parse(0.69));
|
||||
# force_v4 is set in new() to avoid error in older versions of Net::DNS
|
||||
# that don't have it; other options are set by function calls so a typo
|
||||
# or API change will cause an error here
|
||||
my $res = $self->{res} = Net::DNS::Resolver->new(force_v4 => $force_ipv4);
|
||||
if ($res) {
|
||||
$self->{no_resolver} = 0;
|
||||
$self->{force_ipv4} = $force_ipv4;
|
||||
$self->{force_ipv6} = $force_ipv6;
|
||||
$self->{retry} = 1; # retries for non-backgrounded query
|
||||
@ -164,7 +167,7 @@ sub load_resolver {
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("dns: eval failed: $eval_stat");
|
||||
warn("dns: resolver create failed: $eval_stat\n");
|
||||
};
|
||||
|
||||
dbg("dns: using socket module: %s version %s%s",
|
||||
@ -173,12 +176,13 @@ sub load_resolver {
|
||||
$self->{force_ipv4} ? ', forced IPv4' :
|
||||
$self->{force_ipv6} ? ', forced IPv6' : '');
|
||||
dbg("dns: is Net::DNS::Resolver available? %s",
|
||||
$self->{no_resolver} ? "no" : "yes" );
|
||||
if (!$self->{no_resolver} && defined $Net::DNS::VERSION) {
|
||||
$self->{res} ? "yes" : "no" );
|
||||
if ($self->{res} && defined $Net::DNS::VERSION) {
|
||||
dbg("dns: Net::DNS version: %s", $Net::DNS::VERSION);
|
||||
}
|
||||
|
||||
return (!$self->{no_resolver});
|
||||
$self->{no_resolver} = !$self->{res};
|
||||
return defined $self->{res};
|
||||
}
|
||||
|
||||
=item $resolver = $res->get_resolver()
|
||||
@ -237,18 +241,17 @@ sub available_nameservers {
|
||||
}
|
||||
if ($self->{force_ipv4} || $self->{force_ipv6}) {
|
||||
# filter the list according to a chosen protocol family
|
||||
my $ip4_re = IPV4_ADDRESS;
|
||||
my(@filtered_addr_port);
|
||||
for (@{$self->{available_dns_servers}}) {
|
||||
local($1,$2);
|
||||
/^ \[ (.*) \] : (\d+) \z/xs or next;
|
||||
my($addr,$port) = ($1,$2);
|
||||
if ($addr =~ /^${ip4_re}\z/o) {
|
||||
if ($addr =~ IS_IPV4_ADDRESS) {
|
||||
push(@filtered_addr_port, $_) unless $self->{force_ipv6};
|
||||
} elsif ($addr =~ /:.*:/) {
|
||||
push(@filtered_addr_port, $_) unless $self->{force_ipv4};
|
||||
} else {
|
||||
warn "Unrecognized DNS server specification: $_";
|
||||
warn "dns: Unrecognized DNS server specification: $_\n";
|
||||
}
|
||||
}
|
||||
if (@filtered_addr_port < @{$self->{available_dns_servers}}) {
|
||||
@ -361,7 +364,7 @@ sub connect_sock {
|
||||
|
||||
if ($self->{sock}) {
|
||||
$self->{sock}->close()
|
||||
or info("connect_sock: error closing socket %s: %s", $self->{sock}, $!);
|
||||
or info("dns: connect_sock: error closing socket %s: %s", $self->{sock}, $!);
|
||||
$self->{sock} = undef;
|
||||
}
|
||||
my $sock;
|
||||
@ -378,13 +381,12 @@ sub connect_sock {
|
||||
# is unspecified, causing EINVAL failure when automatically assigned local
|
||||
# IP address and a remote address do not belong to the same address family.
|
||||
# Let's choose a suitable source address if possible.
|
||||
my $ip4_re = IPV4_ADDRESS;
|
||||
my $srcaddr;
|
||||
if ($self->{force_ipv4}) {
|
||||
$srcaddr = "0.0.0.0";
|
||||
} elsif ($self->{force_ipv6}) {
|
||||
$srcaddr = "::";
|
||||
} elsif ($ns_addr =~ /^${ip4_re}\z/o) {
|
||||
} elsif ($ns_addr =~ IS_IPV4_ADDRESS) {
|
||||
$srcaddr = "0.0.0.0";
|
||||
} elsif ($ns_addr =~ /:.*:/) {
|
||||
$srcaddr = "::";
|
||||
@ -400,10 +402,10 @@ sub connect_sock {
|
||||
$lport = $self->pick_random_available_port();
|
||||
if (!defined $lport) {
|
||||
$lport = 0;
|
||||
dbg("no configured local ports for DNS queries, letting OS choose");
|
||||
dbg("dns: no configured local ports for DNS queries, letting OS choose");
|
||||
}
|
||||
if ($attempts+1 > 50) { # sanity check
|
||||
warn "could not create a DNS resolver socket in $attempts attempts\n";
|
||||
warn "dns: could not create a DNS resolver socket in $attempts attempts\n";
|
||||
$errno = 0;
|
||||
last;
|
||||
}
|
||||
@ -431,12 +433,12 @@ sub connect_sock {
|
||||
$self->disable_available_port($lport);
|
||||
}
|
||||
} else {
|
||||
warn "error creating a DNS resolver socket: $errno";
|
||||
warn "dns: error creating a DNS resolver socket: $errno";
|
||||
goto no_sock;
|
||||
}
|
||||
}
|
||||
if (!$sock) {
|
||||
warn "could not create a DNS resolver socket in $attempts attempts: $errno";
|
||||
warn "dns: could not create a DNS resolver socket in $attempts attempts: $errno\n";
|
||||
goto no_sock;
|
||||
}
|
||||
|
||||
@ -534,9 +536,8 @@ sub new_dns_packet {
|
||||
|
||||
# construct a PTR query if it looks like an IPv4 address
|
||||
if (!defined($type) || $type eq 'PTR') {
|
||||
local($1,$2,$3,$4);
|
||||
if ($domain =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) {
|
||||
$domain = "$4.$3.$2.$1.in-addr.arpa.";
|
||||
if ($domain =~ IS_IPV4_ADDRESS) {
|
||||
$domain = reverse_ip_address($domain).".in-addr.arpa.";
|
||||
$type = 'PTR';
|
||||
}
|
||||
}
|
||||
@ -598,13 +599,7 @@ sub new_dns_packet {
|
||||
my $udp_payload_size = $self->{conf}->{dns_options}->{edns};
|
||||
if ($udp_payload_size && $udp_payload_size > 512) {
|
||||
# dbg("dns: adding EDNS ext, UDP payload size %d", $udp_payload_size);
|
||||
if ($packet->UNIVERSAL::can('edns')) { # available since Net::DNS 0.69
|
||||
$packet->edns->size($udp_payload_size);
|
||||
} else { # legacy mechanism
|
||||
my $optrr = Net::DNS::RR->new(Type => 'OPT', Name => '', TTL => 0,
|
||||
Class => $udp_payload_size);
|
||||
$packet->push('additional', $optrr);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@ -658,6 +653,8 @@ sub _packet_id {
|
||||
|
||||
=item $id = $res->bgsend($domain, $type, $class, $cb)
|
||||
|
||||
DIRECT USE DISCOURAGED, please use bgsend_and_start_lookup in plugins.
|
||||
|
||||
Quite similar to C<Net::DNS::Resolver::bgsend>, except that when a reply
|
||||
packet eventually arrives, and C<poll_responses> is called, the callback
|
||||
sub reference C<$cb> will be called.
|
||||
@ -673,7 +670,7 @@ be used, like so:
|
||||
|
||||
my $id = $self->{resolver}->bgsend($domain, $type, undef, sub {
|
||||
my ($reply, $reply_id, $timestamp) = @_;
|
||||
$self->got_a_reply ($reply, $reply_id);
|
||||
$self->got_a_reply($reply, $reply_id);
|
||||
});
|
||||
|
||||
The callback can ignore the reply as an invalid packet sent to the listening
|
||||
@ -685,6 +682,19 @@ sub bgsend {
|
||||
my ($self, $domain, $type, $class, $cb) = @_;
|
||||
return if $self->{no_resolver};
|
||||
|
||||
my $dns_query_blockages = $self->{main}->{conf}->{dns_query_blocked};
|
||||
if ($dns_query_blockages) {
|
||||
my $search_list = domain_to_search_list($domain);
|
||||
foreach my $parent_domain ((@$search_list, '*')) {
|
||||
my $blocked = $dns_query_blockages->{$parent_domain};
|
||||
next if !defined $blocked; # not listed
|
||||
last if !$blocked; # allowed
|
||||
# blocked
|
||||
dbg("dns: bgsend, query $type/$domain blocked by dns_query_restriction: $parent_domain");
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
$self->{send_timed_out} = 0;
|
||||
|
||||
my $pkt = $self->new_dns_packet($domain, $type, $class);
|
||||
@ -747,7 +757,7 @@ sub bgread {
|
||||
$answerpkt or die "bgread: decoding DNS packet failed: $@";
|
||||
$answerpkt->answerfrom($peerhost);
|
||||
if (defined $decoded_length && $decoded_length ne "" && $decoded_length != length($data)) {
|
||||
warn sprintf("bgread: received a %d bytes packet from %s, decoded %d bytes\n",
|
||||
warn sprintf("dns: bgread: received a %d bytes packet from %s, decoded %d bytes\n",
|
||||
length($data), $peerhost, $decoded_length);
|
||||
}
|
||||
return $answerpkt;
|
||||
@ -767,13 +777,16 @@ sub poll_responses {
|
||||
return if $self->{no_resolver};
|
||||
return if !$self->{sock};
|
||||
my $cnt = 0;
|
||||
my $cnt_cb = 0;
|
||||
|
||||
my $rin = $self->{sock_as_vec};
|
||||
my $rout;
|
||||
|
||||
for (;;) {
|
||||
my ($nfound, $timeleft, $eval_stat);
|
||||
eval { # use eval to catch alarm signal
|
||||
# if a restartable signal is caught, retry 3 times before aborting
|
||||
my $eintrcount = 3;
|
||||
eval { # use eval to caught alarm signal
|
||||
my $timer; # collects timestamp when variable goes out of scope
|
||||
if (!defined($timeout) || $timeout > 0)
|
||||
{ $timer = $self->{main}->time_method("poll_dns_idle") }
|
||||
@ -787,16 +800,21 @@ sub poll_responses {
|
||||
# most likely due to an alarm signal, resignal if so
|
||||
die "dns: (2) $eval_stat\n" if $eval_stat =~ /__alarm__ignore__\(.*\)/s;
|
||||
warn "dns: select aborted: $eval_stat\n";
|
||||
return;
|
||||
last;
|
||||
} elsif (!defined $nfound || $nfound < 0) {
|
||||
if ($!{EINTR} and $eintrcount > 0) {
|
||||
$eintrcount--;
|
||||
next;
|
||||
}
|
||||
if ($!) { warn "dns: select failed: $!\n" }
|
||||
else { info("dns: select interrupted") } # shouldn't happen
|
||||
return;
|
||||
last;
|
||||
} elsif (!$nfound) {
|
||||
if (!defined $timeout) { warn("dns: select returned empty-handed\n") }
|
||||
elsif ($timeout > 0) { dbg("dns: select timed out %.3f s", $timeout) }
|
||||
return;
|
||||
last;
|
||||
}
|
||||
$cnt += $nfound;
|
||||
|
||||
my $now = time;
|
||||
$timeout = 0; # next time around collect whatever is available, then exit
|
||||
@ -853,12 +871,12 @@ sub poll_responses {
|
||||
|
||||
if ($cb) {
|
||||
$cb->($packet, $id, $now);
|
||||
$cnt++;
|
||||
$cnt_cb++;
|
||||
} else { # no match, report the problem
|
||||
if ($rcode eq 'REFUSED' || $id =~ m{^\d+/NO_QUESTION_IN_PACKET\z}) {
|
||||
# the failure was already reported above
|
||||
} else {
|
||||
dbg("dns: no callback for id $id, ignored, packet on next debug line");
|
||||
info("dns: no callback for id $id, ignored, packet on next debug line");
|
||||
# prevent filling normal logs with huge packet dumps
|
||||
dbg("dns: %s", $packet ? $packet->string : "undef");
|
||||
}
|
||||
@ -867,11 +885,11 @@ sub poll_responses {
|
||||
if ($id =~ m{^(\d+)/}) {
|
||||
my $dnsid = $1; # the raw DNS packet id
|
||||
my @matches =
|
||||
grep(m{^\Q$dnsid\E/}, keys %{$self->{id_to_callback}});
|
||||
grep(m{^\Q$dnsid\E/}o, keys %{$self->{id_to_callback}});
|
||||
if (!@matches) {
|
||||
dbg("dns: no likely matching queries for id %s", $dnsid);
|
||||
info("dns: no likely matching queries for id %s", $dnsid);
|
||||
} else {
|
||||
dbg("dns: a likely matching query: %s", join(', ', @matches));
|
||||
info("dns: a likely matching query: %s", join(', ', @matches));
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -879,7 +897,35 @@ sub poll_responses {
|
||||
}
|
||||
}
|
||||
|
||||
return $cnt;
|
||||
return ($cnt, $cnt_cb);
|
||||
}
|
||||
|
||||
use constant RECV_FLAGS => eval { MSG_DONTWAIT } || 0; # Not in Windows
|
||||
|
||||
# Used to flush stale DNS responses, which we don't need to process
|
||||
sub flush_responses {
|
||||
my ($self) = @_;
|
||||
return if $self->{no_resolver};
|
||||
return if !$self->{sock};
|
||||
|
||||
my $rin = $self->{sock_as_vec};
|
||||
my $rout;
|
||||
my $nfound;
|
||||
|
||||
my $packetsize = $self->{res}->udppacketsize;
|
||||
$packetsize = 512 if $packetsize < 512; # just in case
|
||||
$self->{sock}->blocking(0) unless(RECV_FLAGS);
|
||||
for (;;) {
|
||||
eval { # use eval to catch alarm signal
|
||||
($nfound, undef) = select($rout=$rin, undef, undef, 0);
|
||||
1;
|
||||
} or do {
|
||||
last;
|
||||
};
|
||||
last if !$nfound;
|
||||
last if !$self->{sock}->recv(my $data, $packetsize+256, RECV_FLAGS);
|
||||
}
|
||||
$self->{sock}->blocking(1) unless(RECV_FLAGS);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
@ -920,8 +966,9 @@ sub send {
|
||||
# using some arbitrary encoding (they are normally just 7-bit ascii
|
||||
# characters anyway, just need to get rid of the utf8 flag). Bug 6959
|
||||
# Most if not all af these come from a SPF plugin.
|
||||
# (was a call to utf8::encode($name), now we prefer a proper idn_to_ascii)
|
||||
#
|
||||
utf8::encode($name);
|
||||
$name = idn_to_ascii($name);
|
||||
|
||||
my $retrans = $self->{retrans};
|
||||
my $retries = $self->{retry};
|
||||
@ -985,7 +1032,7 @@ sub finish_socket {
|
||||
my ($self) = @_;
|
||||
if ($self->{sock}) {
|
||||
$self->{sock}->close()
|
||||
or warn "finish_socket: error closing socket $self->{sock}: $!";
|
||||
or warn "dns: finish_socket: error closing socket $self->{sock}: $!\n";
|
||||
undef $self->{sock};
|
||||
}
|
||||
}
|
||||
@ -1014,7 +1061,7 @@ sub fhs_to_vec {
|
||||
foreach my $sock (@fhlist) {
|
||||
my $fno = fileno($sock);
|
||||
if (!defined $fno) {
|
||||
warn "dns: oops! fileno now undef for $sock";
|
||||
warn "dns: oops! fileno now undef for $sock\n";
|
||||
} else {
|
||||
vec ($rin, $fno, 1) = 1;
|
||||
}
|
||||
|
899
upstream/lib/Mail/SpamAssassin/GeoDB.pm
Normal file
899
upstream/lib/Mail/SpamAssassin/GeoDB.pm
Normal file
@ -0,0 +1,899 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::GeoDB - unified interface for geoip modules
|
||||
|
||||
Plugins need to signal SA main package the modules they want loaded
|
||||
|
||||
package Mail::SpamAssassin::Plugin::MyPlugin;
|
||||
sub new {
|
||||
...
|
||||
$self->{main}->{geodb_wanted}->{country} = 1;
|
||||
$self->{main}->{geodb_wanted}->{isp} = 1;
|
||||
)
|
||||
|
||||
(internal stuff still subject to change)
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::GeoDB;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Socket;
|
||||
use version 0.77;
|
||||
|
||||
our @ISA = qw();
|
||||
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
my @geoip_default_path = qw(
|
||||
/usr/local/share/GeoIP
|
||||
/usr/share/GeoIP
|
||||
/var/lib/GeoIP
|
||||
/opt/share/GeoIP
|
||||
);
|
||||
|
||||
# load order (city contains country, isp contains asn)
|
||||
my @geoip_types = qw( city country isp asn );
|
||||
|
||||
# v6 is not needed, automatically tries *v6.dat also
|
||||
my %geoip_default_files = (
|
||||
'city' => ['GeoIPCity.dat','GeoLiteCity.dat'],
|
||||
'country' => ['GeoIP.dat'],
|
||||
'isp' => ['GeoIPISP.dat'],
|
||||
'asn' => ['GeoIPASNum.dat'],
|
||||
);
|
||||
|
||||
my %geoip2_default_files = (
|
||||
'city' => ['GeoIP2-City.mmdb','GeoLite2-City.mmdb',
|
||||
'dbip-city.mmdb','dbip-city-lite.mmdb'],
|
||||
'country' => ['GeoIP2-Country.mmdb','GeoLite2-Country.mmdb',
|
||||
'dbip-country.mmdb','dbip-country-lite.mmdb'],
|
||||
'isp' => ['GeoIP2-ISP.mmdb','GeoLite2-ISP.mmdb'],
|
||||
'asn' => ['GeoIP2-ASN.mmdb','GeoLite2-ASN.mmdb'],
|
||||
);
|
||||
|
||||
my %country_to_continent = (
|
||||
'AP'=>'AS','EU'=>'EU','AD'=>'EU','AE'=>'AS','AF'=>'AS','AG'=>'NA',
|
||||
'AI'=>'NA','AL'=>'EU','AM'=>'AS','CW'=>'NA','AO'=>'AF','AQ'=>'AN',
|
||||
'AR'=>'SA','AS'=>'OC','AT'=>'EU','AU'=>'OC','AW'=>'NA','AZ'=>'AS',
|
||||
'BA'=>'EU','BB'=>'NA','BD'=>'AS','BE'=>'EU','BF'=>'AF','BG'=>'EU',
|
||||
'BH'=>'AS','BI'=>'AF','BJ'=>'AF','BM'=>'NA','BN'=>'AS','BO'=>'SA',
|
||||
'BR'=>'SA','BS'=>'NA','BT'=>'AS','BV'=>'AN','BW'=>'AF','BY'=>'EU',
|
||||
'BZ'=>'NA','CA'=>'NA','CC'=>'AS','CD'=>'AF','CF'=>'AF','CG'=>'AF',
|
||||
'CH'=>'EU','CI'=>'AF','CK'=>'OC','CL'=>'SA','CM'=>'AF','CN'=>'AS',
|
||||
'CO'=>'SA','CR'=>'NA','CU'=>'NA','CV'=>'AF','CX'=>'AS','CY'=>'AS',
|
||||
'CZ'=>'EU','DE'=>'EU','DJ'=>'AF','DK'=>'EU','DM'=>'NA','DO'=>'NA',
|
||||
'DZ'=>'AF','EC'=>'SA','EE'=>'EU','EG'=>'AF','EH'=>'AF','ER'=>'AF',
|
||||
'ES'=>'EU','ET'=>'AF','FI'=>'EU','FJ'=>'OC','FK'=>'SA','FM'=>'OC',
|
||||
'FO'=>'EU','FR'=>'EU','FX'=>'EU','GA'=>'AF','GB'=>'EU','GD'=>'NA',
|
||||
'GE'=>'AS','GF'=>'SA','GH'=>'AF','GI'=>'EU','GL'=>'NA','GM'=>'AF',
|
||||
'GN'=>'AF','GP'=>'NA','GQ'=>'AF','GR'=>'EU','GS'=>'AN','GT'=>'NA',
|
||||
'GU'=>'OC','GW'=>'AF','GY'=>'SA','HK'=>'AS','HM'=>'AN','HN'=>'NA',
|
||||
'HR'=>'EU','HT'=>'NA','HU'=>'EU','ID'=>'AS','IE'=>'EU','IL'=>'AS',
|
||||
'IN'=>'AS','IO'=>'AS','IQ'=>'AS','IR'=>'AS','IS'=>'EU','IT'=>'EU',
|
||||
'JM'=>'NA','JO'=>'AS','JP'=>'AS','KE'=>'AF','KG'=>'AS','KH'=>'AS',
|
||||
'KI'=>'OC','KM'=>'AF','KN'=>'NA','KP'=>'AS','KR'=>'AS','KW'=>'AS',
|
||||
'KY'=>'NA','KZ'=>'AS','LA'=>'AS','LB'=>'AS','LC'=>'NA','LI'=>'EU',
|
||||
'LK'=>'AS','LR'=>'AF','LS'=>'AF','LT'=>'EU','LU'=>'EU','LV'=>'EU',
|
||||
'LY'=>'AF','MA'=>'AF','MC'=>'EU','MD'=>'EU','MG'=>'AF','MH'=>'OC',
|
||||
'MK'=>'EU','ML'=>'AF','MM'=>'AS','MN'=>'AS','MO'=>'AS','MP'=>'OC',
|
||||
'MQ'=>'NA','MR'=>'AF','MS'=>'NA','MT'=>'EU','MU'=>'AF','MV'=>'AS',
|
||||
'MW'=>'AF','MX'=>'NA','MY'=>'AS','MZ'=>'AF','NA'=>'AF','NC'=>'OC',
|
||||
'NE'=>'AF','NF'=>'OC','NG'=>'AF','NI'=>'NA','NL'=>'EU','NO'=>'EU',
|
||||
'NP'=>'AS','NR'=>'OC','NU'=>'OC','NZ'=>'OC','OM'=>'AS','PA'=>'NA',
|
||||
'PE'=>'SA','PF'=>'OC','PG'=>'OC','PH'=>'AS','PK'=>'AS','PL'=>'EU',
|
||||
'PM'=>'NA','PN'=>'OC','PR'=>'NA','PS'=>'AS','PT'=>'EU','PW'=>'OC',
|
||||
'PY'=>'SA','QA'=>'AS','RE'=>'AF','RO'=>'EU','RU'=>'EU','RW'=>'AF',
|
||||
'SA'=>'AS','SB'=>'OC','SC'=>'AF','SD'=>'AF','SE'=>'EU','SG'=>'AS',
|
||||
'SH'=>'AF','SI'=>'EU','SJ'=>'EU','SK'=>'EU','SL'=>'AF','SM'=>'EU',
|
||||
'SN'=>'AF','SO'=>'AF','SR'=>'SA','ST'=>'AF','SV'=>'NA','SY'=>'AS',
|
||||
'SZ'=>'AF','TC'=>'NA','TD'=>'AF','TF'=>'AN','TG'=>'AF','TH'=>'AS',
|
||||
'TJ'=>'AS','TK'=>'OC','TM'=>'AS','TN'=>'AF','TO'=>'OC','TL'=>'AS',
|
||||
'TR'=>'EU','TT'=>'NA','TV'=>'OC','TW'=>'AS','TZ'=>'AF','UA'=>'EU',
|
||||
'UG'=>'AF','UM'=>'OC','US'=>'NA','UY'=>'SA','UZ'=>'AS','VA'=>'EU',
|
||||
'VC'=>'NA','VE'=>'SA','VG'=>'NA','VI'=>'NA','VN'=>'AS','VU'=>'OC',
|
||||
'WF'=>'OC','WS'=>'OC','YE'=>'AS','YT'=>'AF','RS'=>'EU','ZA'=>'AF',
|
||||
'ZM'=>'AF','ME'=>'EU','ZW'=>'AF','AX'=>'EU','GG'=>'EU','IM'=>'EU',
|
||||
'JE'=>'EU','BL'=>'NA','MF'=>'NA','BQ'=>'NA','SS'=>'AF','**'=>'**',
|
||||
);
|
||||
|
||||
sub new {
|
||||
my ($class, $conf) = @_;
|
||||
$class = ref($class) || $class;
|
||||
|
||||
my $self = {};
|
||||
bless ($self, $class);
|
||||
|
||||
$self->{cache} = ();
|
||||
$self->init_database($conf || {});
|
||||
$self;
|
||||
}
|
||||
|
||||
sub init_database {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
# Try city too if country wanted
|
||||
$opts->{wanted}->{city} = 1 if $opts->{wanted}->{country};
|
||||
# Try isp too if asn wanted
|
||||
$opts->{wanted}->{isp} = 1 if $opts->{wanted}->{asn};
|
||||
|
||||
my $geodb_opts = {
|
||||
'module' => $opts->{conf}->{module} || undef,
|
||||
'dbs' => $opts->{conf}->{options} || undef,
|
||||
'wanted' => $opts->{wanted} || undef,
|
||||
'search_path' => defined $opts->{conf}->{geodb_search_path} ?
|
||||
$opts->{conf}->{geodb_search_path} : \@geoip_default_path,
|
||||
};
|
||||
|
||||
my ($db, $dbapi, $loaded);
|
||||
|
||||
## GeoIP2
|
||||
if (!$db && (!$geodb_opts->{module} || $geodb_opts->{module} eq 'geoip2')) {
|
||||
($db, $dbapi) = $self->load_geoip2($geodb_opts);
|
||||
$loaded = 'geoip2' if $db;
|
||||
}
|
||||
|
||||
## Geo::IP
|
||||
if (!$db && (!$geodb_opts->{module} || $geodb_opts->{module} eq 'geoip')) {
|
||||
($db, $dbapi) = $self->load_geoip($geodb_opts);
|
||||
$loaded = 'geoip' if $db;
|
||||
}
|
||||
|
||||
## IP::Country::DB_File
|
||||
if (!$db && $geodb_opts->{module} && $geodb_opts->{module} eq 'dbfile') {
|
||||
# Only try if geodb_module and path to ipcc.db specified
|
||||
($db, $dbapi) = $self->load_dbfile($geodb_opts);
|
||||
$loaded = 'dbfile' if $db;
|
||||
}
|
||||
|
||||
## IP::Country::Fast
|
||||
if (!$db && (!$geodb_opts->{module} || $geodb_opts->{module} eq 'fast')) {
|
||||
($db, $dbapi) = $self->load_fast($geodb_opts);
|
||||
$loaded = 'fast' if $db;
|
||||
}
|
||||
|
||||
if (!$db) {
|
||||
dbg("geodb: No supported database could be loaded");
|
||||
die("No supported GeoDB database could be loaded\n");
|
||||
}
|
||||
|
||||
# country can be aliased to city
|
||||
if (!$dbapi->{country} && $dbapi->{city}) {
|
||||
$dbapi->{country} = $dbapi->{city};
|
||||
}
|
||||
if (!$dbapi->{country_v6} && $dbapi->{city_v6}) {
|
||||
$dbapi->{country_v6} = $dbapi->{city_v6}
|
||||
}
|
||||
# GeoIP2 asn can be aliased to isp
|
||||
if ($loaded eq 'geoip2') {
|
||||
if (!$dbapi->{asn} && $dbapi->{isp}) {
|
||||
$dbapi->{asn} = $dbapi->{isp};
|
||||
}
|
||||
if (!$dbapi->{asn_v6} && $dbapi->{isp_v6}) {
|
||||
$dbapi->{asn_v6} = $dbapi->{isp_v6}
|
||||
}
|
||||
}
|
||||
|
||||
$self->{db} = $db;
|
||||
$self->{dbapi} = $dbapi;
|
||||
|
||||
foreach (@{$self->get_dbinfo()}) {
|
||||
dbg("geodb: database info: ".$_);
|
||||
}
|
||||
#dbg("geodb: apis available: ".join(', ', sort keys %{$self->{dbapi}}));
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub load_geoip2 {
|
||||
my ($self, $geodb_opts) = @_;
|
||||
my ($db, $dbapi, $ok);
|
||||
|
||||
# Warn about fatal errors if this module was specifically requested
|
||||
my $errwarn = ($geodb_opts->{module}||'') eq 'geoip2';
|
||||
|
||||
eval {
|
||||
require MaxMind::DB::Reader;
|
||||
} or do {
|
||||
my $err = $@;
|
||||
$err =~ s/ at .*//s;
|
||||
$err = "geodb: MaxMind::DB::Reader (GeoIP2) module load failed: $err";
|
||||
$errwarn ? warn("$err\n") : dbg($err);
|
||||
return (undef, undef);
|
||||
};
|
||||
|
||||
my %path;
|
||||
foreach my $dbtype (@geoip_types) {
|
||||
# skip country if city already loaded
|
||||
next if $dbtype eq 'country' && $db->{city};
|
||||
# skip asn if isp already loaded
|
||||
next if $dbtype eq 'asn' && $db->{isp};
|
||||
# skip if not needed
|
||||
next if $geodb_opts->{wanted} && !$geodb_opts->{wanted}->{$dbtype};
|
||||
# only autosearch if no absolute path given
|
||||
if (!defined $geodb_opts->{dbs}->{$dbtype}) {
|
||||
# Try some default locations
|
||||
PATHS_GEOIP2: foreach my $p (@{$geodb_opts->{search_path}}) {
|
||||
foreach my $f (@{$geoip2_default_files{$dbtype}}) {
|
||||
if (-f "$p/$f") {
|
||||
$path{$dbtype} = "$p/$f";
|
||||
dbg("geodb: GeoIP2: search found $dbtype $p/$f");
|
||||
last PATHS_GEOIP2;
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
if (!-f $geodb_opts->{dbs}->{$dbtype}) {
|
||||
dbg("geodb: GeoIP2: $dbtype database requested, but not found: ".
|
||||
$geodb_opts->{dbs}->{$dbtype});
|
||||
next;
|
||||
}
|
||||
$path{$dbtype} = $geodb_opts->{dbs}->{$dbtype};
|
||||
}
|
||||
|
||||
if (defined $path{$dbtype}) {
|
||||
eval {
|
||||
$db->{$dbtype} = MaxMind::DB::Reader->new(
|
||||
file => $path{$dbtype},
|
||||
);
|
||||
die "unknown error" unless $db->{$dbtype};
|
||||
1;
|
||||
};
|
||||
if ($@ || !$db->{$dbtype}) {
|
||||
my $err = $@;
|
||||
$err =~ s/\s+Trace begun.*//s;
|
||||
$err =~ s/ at .*//s;
|
||||
dbg("geodb: GeoIP2: $dbtype load failed: $err");
|
||||
} else {
|
||||
dbg("geodb: GeoIP2: loaded $dbtype from $path{$dbtype}");
|
||||
$ok = 1;
|
||||
}
|
||||
} else {
|
||||
my $from = defined $geodb_opts->{dbs}->{$dbtype} ?
|
||||
$geodb_opts->{dbs}->{$dbtype} : "default locations";
|
||||
dbg("geodb: GeoIP2: $dbtype database not found from $from");
|
||||
}
|
||||
}
|
||||
|
||||
if (!$ok) {
|
||||
warn("geodb: GeoIP2 requested, but no databases could be loaded\n") if $errwarn;
|
||||
return (undef, undef)
|
||||
}
|
||||
|
||||
# dbinfo_DBTYPE()
|
||||
$db->{city} and $dbapi->{dbinfo_city} = sub {
|
||||
my $m = $_[0]->{db}->{city}->metadata();
|
||||
return "GeoIP2 city: ".$m->description()->{en}." / ".localtime($m->build_epoch());
|
||||
};
|
||||
$db->{country} and $dbapi->{dbinfo_country} = sub {
|
||||
my $m = $_[0]->{db}->{country}->metadata();
|
||||
return "GeoIP2 country: ".$m->description()->{en}." / ".localtime($m->build_epoch());
|
||||
};
|
||||
$db->{isp} and $dbapi->{dbinfo_isp} = sub {
|
||||
my $m = $_[0]->{db}->{isp}->metadata();
|
||||
return "GeoIP2 isp: ".$m->description()->{en}." / ".localtime($m->build_epoch());
|
||||
};
|
||||
$db->{asn} and $dbapi->{dbinfo_asn} = sub {
|
||||
my $m = $_[0]->{db}->{asn}->metadata();
|
||||
return "GeoIP2 asn: ".$m->description()->{en}." / ".localtime($m->build_epoch());
|
||||
};
|
||||
|
||||
# city()
|
||||
$db->{city} and $dbapi->{city} = $dbapi->{city_v6} = sub {
|
||||
my $res = {};
|
||||
my $city;
|
||||
eval {
|
||||
$city = $_[0]->{db}->{city}->record_for_address($_[1]);
|
||||
1;
|
||||
} or do {
|
||||
$@ =~ s/\s+Trace begun.*//s;
|
||||
dbg("geodb: GeoIP2 city query failed for $_[1]: $@");
|
||||
return $res;
|
||||
};
|
||||
eval {
|
||||
$res->{city_name} = $city->{city}->{names}->{en};
|
||||
$res->{country} = $city->{country}->{iso_code};
|
||||
$res->{country_name} = $city->{country}->{names}->{en};
|
||||
$res->{continent} = $city->{continent}->{code};
|
||||
$res->{continent_name} = $city->{continent}->{names}->{en};
|
||||
1;
|
||||
};
|
||||
return $res;
|
||||
};
|
||||
|
||||
# country()
|
||||
$db->{country} and $dbapi->{country} = $dbapi->{country_v6} = sub {
|
||||
my $res = {};
|
||||
my $country;
|
||||
eval {
|
||||
$country = $_[0]->{db}->{country}->record_for_address($_[1]);
|
||||
1;
|
||||
} or do {
|
||||
$@ =~ s/\s+Trace begun.*//s;
|
||||
dbg("geodb: GeoIP2 country query failed for $_[1]: $@");
|
||||
return $res;
|
||||
};
|
||||
eval {
|
||||
$res->{country} = $country->{country}->{iso_code};
|
||||
$res->{country_name} = $country->{country}->{names}->{en};
|
||||
$res->{continent} = $country->{continent}->{code};
|
||||
$res->{continent_name} = $country->{continent}->{names}->{en};
|
||||
1;
|
||||
};
|
||||
return $res;
|
||||
};
|
||||
|
||||
# isp()
|
||||
$db->{isp} and $dbapi->{isp} = $dbapi->{isp_v6} = sub {
|
||||
my $res = {};
|
||||
my $isp;
|
||||
eval {
|
||||
$isp = $_[0]->{db}->{isp}->record_for_address($_[1]);
|
||||
1;
|
||||
} or do {
|
||||
$@ =~ s/\s+Trace begun.*//s;
|
||||
dbg("geodb: GeoIP2 isp query failed for $_[1]: $@");
|
||||
return $res;
|
||||
};
|
||||
eval {
|
||||
$res->{asn} = $isp->{autonomous_system_number};
|
||||
$res->{asn_organization} = $isp->{autonomous_system_organization};
|
||||
$res->{isp} = $isp->{isp};
|
||||
$res->{organization} = $isp->{organization};
|
||||
1;
|
||||
};
|
||||
return $res;
|
||||
};
|
||||
|
||||
# asn()
|
||||
$db->{asn} and $dbapi->{asn} = $dbapi->{asn_v6} = sub {
|
||||
my $res = {};
|
||||
my $asn;
|
||||
eval {
|
||||
$asn = $_[0]->{db}->{asn}->record_for_address($_[1]);
|
||||
1;
|
||||
} or do {
|
||||
$@ =~ s/\s+Trace begun.*//s;
|
||||
dbg("geodb: GeoIP2 asn query failed for $_[1]: $@");
|
||||
return $res;
|
||||
};
|
||||
eval {
|
||||
$res->{asn} = $asn->{autonomous_system_number};
|
||||
$res->{asn_organization} = $asn->{autonomous_system_organization};
|
||||
1;
|
||||
};
|
||||
return $res;
|
||||
};
|
||||
|
||||
return ($db, $dbapi);
|
||||
}
|
||||
|
||||
sub load_geoip {
|
||||
my ($self, $geodb_opts) = @_;
|
||||
my ($db, $dbapi, $ok);
|
||||
my ($gic_wanted, $gic_have, $gip_wanted, $gip_have);
|
||||
my ($flags, $fix_stderr, $can_ipv6);
|
||||
|
||||
# Warn about fatal errors if this module was specifically requested
|
||||
my $errwarn = ($geodb_opts->{module}||'') eq 'geoip';
|
||||
|
||||
eval {
|
||||
require Geo::IP;
|
||||
# need GeoIP C library 1.6.3 and GeoIP perl API 1.4.4 or later to avoid messages leaking - Bug 7153
|
||||
$gip_wanted = version->parse('v1.4.4');
|
||||
$gip_have = version->parse(Geo::IP->VERSION);
|
||||
$gic_wanted = version->parse('v1.6.3');
|
||||
eval { $gic_have = version->parse(Geo::IP->lib_version()); }; # might not have lib_version()
|
||||
$gic_have = 'none' if !defined $gic_have;
|
||||
dbg("geodb: GeoIP: versions: Geo::IP $gip_have, C library $gic_have");
|
||||
$flags = 0;
|
||||
$fix_stderr = 0;
|
||||
if (ref($gic_have) eq 'version') {
|
||||
# this code burps an ugly message if it fails, but that's redirected elsewhere
|
||||
eval '$flags = Geo::IP::GEOIP_SILENCE' if $gip_wanted >= $gip_have;
|
||||
$fix_stderr = $flags && $gic_wanted >= $gic_have;
|
||||
}
|
||||
$can_ipv6 = Geo::IP->VERSION >= 1.39 && Geo::IP->api eq 'CAPI';
|
||||
1;
|
||||
} or do {
|
||||
my $err = $@;
|
||||
$err =~ s/ at .*//s;
|
||||
$err = "geodb: Geo::IP module load failed: $err";
|
||||
$errwarn ? warn("$err\n") : dbg($err);
|
||||
return (undef, undef);
|
||||
};
|
||||
|
||||
my %path;
|
||||
foreach my $dbtype (@geoip_types) {
|
||||
# skip country if city already loaded
|
||||
next if $dbtype eq 'country' && $db->{city};
|
||||
# skip asn if isp already loaded
|
||||
next if $dbtype eq 'asn' && $db->{isp};
|
||||
# skip if not needed
|
||||
next if $geodb_opts->{wanted} && !$geodb_opts->{wanted}->{$dbtype};
|
||||
# only autosearch if no absolute path given
|
||||
if (!defined $geodb_opts->{dbs}->{$dbtype}) {
|
||||
# Try some default locations
|
||||
PATHS_GEOIP: foreach my $p (@{$geodb_opts->{search_path}}) {
|
||||
foreach my $f (@{$geoip_default_files{$dbtype}}) {
|
||||
if (-f "$p/$f") {
|
||||
$path{$dbtype} = "$p/$f";
|
||||
dbg("geodb: GeoIP: search found $dbtype $p/$f");
|
||||
if ($can_ipv6 && $f =~ s/\.(dat)$/v6.$1/i) {
|
||||
if (-f "$p/$f") {
|
||||
$path{$dbtype."_v6"} = "$p/$f";
|
||||
dbg("geodb: GeoIP: search found $dbtype $p/$f");
|
||||
}
|
||||
}
|
||||
last PATHS_GEOIP;
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
if (!-f $geodb_opts->{dbs}->{$dbtype}) {
|
||||
dbg("geodb: GeoIP: $dbtype database requested, but not found: ".
|
||||
$geodb_opts->{dbs}->{$dbtype});
|
||||
next;
|
||||
}
|
||||
$path{$dbtype} = $geodb_opts->{dbs}->{$dbtype};
|
||||
}
|
||||
}
|
||||
|
||||
if (!$can_ipv6) {
|
||||
dbg("geodb: GeoIP: IPv6 support not enabled, versions Geo::IP 1.39, GeoIP C API 1.4.7 required");
|
||||
}
|
||||
|
||||
if ($fix_stderr) {
|
||||
open(OLDERR, ">&STDERR");
|
||||
open(STDERR, ">/dev/null");
|
||||
}
|
||||
foreach my $dbtype (@geoip_types) {
|
||||
next unless defined $path{$dbtype};
|
||||
eval {
|
||||
$db->{$dbtype} = Geo::IP->open($path{$dbtype}, Geo::IP->GEOIP_STANDARD | $flags);
|
||||
if ($can_ipv6 && defined $path{$dbtype."_v6"}) {
|
||||
$db->{$dbtype."_v6"} = Geo::IP->open($path{$dbtype."_v6"}, Geo::IP->GEOIP_STANDARD | $flags);
|
||||
}
|
||||
};
|
||||
if ($@ || !$db->{$dbtype}) {
|
||||
my $err = $@;
|
||||
$err =~ s/ at .*//s;
|
||||
dbg("geodb: GeoIP: database $path{$dbtype} load failed: $err");
|
||||
} else {
|
||||
dbg("geodb: GeoIP: loaded $dbtype from $path{$dbtype}");
|
||||
$ok = 1;
|
||||
}
|
||||
}
|
||||
if ($fix_stderr) {
|
||||
open(STDERR, ">&OLDERR");
|
||||
close(OLDERR);
|
||||
}
|
||||
|
||||
if (!$ok) {
|
||||
warn("geodb: GeoIP requested, but no databases could be loaded\n") if $errwarn;
|
||||
return (undef, undef)
|
||||
}
|
||||
|
||||
# dbinfo_DBTYPE()
|
||||
$db->{city} and $dbapi->{dbinfo_city} = sub {
|
||||
return "Geo::IP IPv4 city: " . ($_[0]->{db}->{city}->database_info || '?')." / IPv6: ".
|
||||
($_[0]->{db}->{city_v6} ? $_[0]->{db}->{city_v6}->database_info || '?' : 'no')
|
||||
};
|
||||
$db->{country} and $dbapi->{dbinfo_country} = sub {
|
||||
return "Geo::IP IPv4 country: " . ($_[0]->{db}->{country}->database_info || '?')." / IPv6: ".
|
||||
($_[0]->{db}->{country_v6} ? $_[0]->{db}->{country_v6}->database_info || '?' : 'no')
|
||||
};
|
||||
$db->{isp} and $dbapi->{dbinfo_isp} = sub {
|
||||
return "Geo::IP IPv4 isp: " . ($_[0]->{db}->{isp}->database_info || '?')." / IPv6: ".
|
||||
($_[0]->{db}->{isp_v6} ? $_[0]->{db}->{isp_v6}->database_info || '?' : 'no')
|
||||
};
|
||||
$db->{asn} and $dbapi->{dbinfo_asn} = sub {
|
||||
return "Geo::IP IPv4 asn: " . ($_[0]->{db}->{asn}->database_info || '?')." / IPv6: ".
|
||||
($_[0]->{db}->{asn_v6} ? $_[0]->{db}->{asn_v6}->database_info || '?' : 'no')
|
||||
};
|
||||
|
||||
# city()
|
||||
$db->{city} and $dbapi->{city} = sub {
|
||||
my $res = {};
|
||||
my $city;
|
||||
if ($_[1] =~ IS_IPV4_ADDRESS) {
|
||||
$city = $_[0]->{db}->{city}->record_by_addr($_[1]);
|
||||
} elsif ($_[0]->{db}->{city_v6}) {
|
||||
$city = $_[0]->{db}->{city_v6}->record_by_addr_v6($_[1]);
|
||||
}
|
||||
if (!defined $city) {
|
||||
dbg("geodb: GeoIP city query failed for $_[1]");
|
||||
return $res;
|
||||
}
|
||||
$res->{city_name} = $city->city;
|
||||
$res->{country} = $city->country_code;
|
||||
$res->{country_name} = $city->country_name;
|
||||
$res->{continent} = $city->continent_code;
|
||||
return $res;
|
||||
};
|
||||
$dbapi->{city_v6} = $dbapi->{city} if $db->{city_v6};
|
||||
|
||||
# country()
|
||||
$db->{country} and $dbapi->{country} = sub {
|
||||
my $res = {};
|
||||
my $country;
|
||||
eval {
|
||||
if ($_[1] =~ IS_IPV4_ADDRESS) {
|
||||
$country = $_[0]->{db}->{country}->country_code_by_addr($_[1]);
|
||||
} elsif ($_[0]->{db}->{country_v6}) {
|
||||
$country = $_[0]->{db}->{country_v6}->country_code_by_addr_v6($_[1]);
|
||||
}
|
||||
1;
|
||||
};
|
||||
if (!defined $country) {
|
||||
dbg("geodb: GeoIP country query failed for $_[1]");
|
||||
return $res;
|
||||
};
|
||||
$res->{country} = $country || 'XX';
|
||||
$res->{continent} = $country_to_continent{$country} || 'XX';
|
||||
return $res;
|
||||
};
|
||||
$dbapi->{country_v6} = $dbapi->{country} if $db->{country_v6};
|
||||
|
||||
# isp()
|
||||
$db->{isp} and $dbapi->{isp} = sub {
|
||||
my $res = {};
|
||||
my $isp;
|
||||
eval {
|
||||
if ($_[1] =~ IS_IPV4_ADDRESS) {
|
||||
$isp = $_[0]->{db}->{isp}->isp_by_addr($_[1]);
|
||||
} else {
|
||||
# TODO?
|
||||
return $res;
|
||||
}
|
||||
1;
|
||||
};
|
||||
if (!defined $isp) {
|
||||
dbg("geodb: GeoIP isp query failed for $_[1]");
|
||||
return $res;
|
||||
};
|
||||
$res->{isp} = $isp;
|
||||
return $res;
|
||||
};
|
||||
|
||||
# asn()
|
||||
$db->{asn} and $dbapi->{asn} = sub {
|
||||
my $res = {};
|
||||
my $asn;
|
||||
eval {
|
||||
if ($_[1] =~ IS_IPV4_ADDRESS) {
|
||||
$asn = $_[0]->{db}->{asn}->isp_by_addr($_[1]);
|
||||
} else {
|
||||
# TODO?
|
||||
return $res;
|
||||
}
|
||||
1;
|
||||
};
|
||||
if (!defined $asn || $asn !~ /^((?:AS)?\d+)(?:\s+(.+))?/) {
|
||||
dbg("geodb: GeoIP asn query failed for $_[1]");
|
||||
return $res;
|
||||
};
|
||||
$res->{asn} = $1;
|
||||
$res->{asn_organization} = $2 if defined $2;
|
||||
return $res;
|
||||
};
|
||||
|
||||
return ($db, $dbapi);
|
||||
}
|
||||
|
||||
sub load_dbfile {
|
||||
my ($self, $geodb_opts) = @_;
|
||||
my ($db, $dbapi);
|
||||
|
||||
# Warn about fatal errors if this module was specifically requested
|
||||
my $errwarn = ($geodb_opts->{module}||'') eq 'dbfile';
|
||||
|
||||
if (!defined $geodb_opts->{dbs}->{country}) {
|
||||
my $err = "geodb: IP::Country::DB_File requires geodb_options country:/path/to/ipcc.db";
|
||||
$errwarn ? warn("$err\n") : dbg($err);
|
||||
return (undef, undef);
|
||||
}
|
||||
|
||||
if (!-f $geodb_opts->{dbs}->{country}) {
|
||||
my $err = "geodb: IP::Country::DB_File database not found: ".$geodb_opts->{dbs}->{country};
|
||||
$errwarn ? warn("$err\n") : dbg($err);
|
||||
return (undef, undef);
|
||||
}
|
||||
|
||||
eval {
|
||||
require IP::Country::DB_File;
|
||||
$db->{country} = IP::Country::DB_File->new($geodb_opts->{dbs}->{country});
|
||||
1;
|
||||
};
|
||||
if ($@ || !$db->{country}) {
|
||||
my $err = $@;
|
||||
$err =~ s/ at .*//s;
|
||||
$err = "geodb: IP::Country::DB_File country load failed: $err";
|
||||
$errwarn ? warn("$err\n") : dbg($err);
|
||||
return (undef, undef);
|
||||
} else {
|
||||
dbg("geodb: IP::Country::DB_File loaded country from ".$geodb_opts->{dbs}->{country});
|
||||
}
|
||||
|
||||
# dbinfo_DBTYPE()
|
||||
$db->{country} and $dbapi->{dbinfo_country} = sub {
|
||||
return "IP::Country::DB_File country: ".localtime($_[0]->{db}->{country}->db_time());
|
||||
};
|
||||
|
||||
# country();
|
||||
$db->{country} and $dbapi->{country} = $dbapi->{country_v6} = sub {
|
||||
my $res = {};
|
||||
my $country;
|
||||
if ($_[1] =~ IS_IPV4_ADDRESS) {
|
||||
$country = $_[0]->{db}->{country}->inet_atocc($_[1]);
|
||||
} else {
|
||||
$country = $_[0]->{db}->{country}->inet6_atocc($_[1]);
|
||||
}
|
||||
if (!defined $country) {
|
||||
dbg("geodb: IP::Country::DB_File country query failed for $_[1]");
|
||||
return $res;
|
||||
};
|
||||
$res->{country} = $country || 'XX';
|
||||
$res->{continent} = $country_to_continent{$country} || 'XX';
|
||||
return $res;
|
||||
};
|
||||
|
||||
return ($db, $dbapi);
|
||||
}
|
||||
|
||||
sub load_fast {
|
||||
my ($self, $geodb_opts) = @_;
|
||||
my ($db, $dbapi);
|
||||
|
||||
# Warn about fatal errors if this module was specifically requested
|
||||
my $errwarn = ($geodb_opts->{module}||'') eq 'fast';
|
||||
|
||||
eval {
|
||||
require IP::Country::Fast;
|
||||
$db->{country} = IP::Country::Fast->new();
|
||||
1;
|
||||
};
|
||||
if ($@ || !$db->{country}) {
|
||||
my $err = $@;
|
||||
$err =~ s/ at .*//s;
|
||||
$err = "geodb: IP::Country::Fast load failed: $err";
|
||||
$errwarn ? warn("$err\n") : dbg($err);
|
||||
return (undef, undef);
|
||||
}
|
||||
|
||||
# dbinfo_DBTYPE()
|
||||
$db->{country} and $dbapi->{dbinfo_country} = sub {
|
||||
return "IP::Country::Fast country: ".localtime($_[0]->{db}->{country}->db_time());
|
||||
};
|
||||
|
||||
# country();
|
||||
$db->{country} and $dbapi->{country} = sub {
|
||||
my $res = {};
|
||||
my $country;
|
||||
if ($_[1] =~ IS_IPV4_ADDRESS) {
|
||||
$country = $_[0]->{db}->{country}->inet_atocc($_[1]);
|
||||
} else {
|
||||
return $res;
|
||||
}
|
||||
if (!defined $country) {
|
||||
dbg("geodb: IP::Country::Fast country query failed for $_[1]");
|
||||
return $res;
|
||||
};
|
||||
$res->{country} = $country || 'XX';
|
||||
$res->{continent} = $country_to_continent{$country} || 'XX';
|
||||
return $res;
|
||||
};
|
||||
|
||||
return ($db, $dbapi);
|
||||
}
|
||||
|
||||
# return array, infoline per database type
|
||||
sub get_dbinfo {
|
||||
my ($self, $db) = @_;
|
||||
|
||||
my @lines;
|
||||
foreach (@geoip_types) {
|
||||
if (exists $self->{dbapi}->{"dbinfo_".$_}) {
|
||||
push @lines,
|
||||
$self->{dbapi}->{"dbinfo_".$_}->($self) || "$_ failed";
|
||||
}
|
||||
}
|
||||
|
||||
return \@lines;
|
||||
}
|
||||
|
||||
sub get_country {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
if ($ip =~ IS_IP_PRIVATE) {
|
||||
return '**';
|
||||
}
|
||||
|
||||
if ($ip !~ IS_IP_ADDRESS) {
|
||||
$ip = name_to_ip($ip);
|
||||
return 'XX' if !defined $ip;
|
||||
}
|
||||
|
||||
if ($self->{dbapi}->{city}) {
|
||||
return $self->_get('city',$ip)->{country} || 'XX';
|
||||
} elsif ($self->{dbapi}->{country}) {
|
||||
return $self->_get('country',$ip)->{country} || 'XX';
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
sub get_continent {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
# If it's already CC, use our own lookup table..
|
||||
if (length($ip) == 2) {
|
||||
return $country_to_continent{uc($ip)} || 'XX';
|
||||
}
|
||||
|
||||
if ($self->{dbapi}->{city}) {
|
||||
return $self->_get('city',$ip)->{continent} || 'XX';
|
||||
} elsif ($self->{dbapi}->{country}) {
|
||||
return $self->_get('country',$ip)->{continent} || 'XX';
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
sub get_isp {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
if ($self->{dbapi}->{isp}) {
|
||||
return $self->_get('isp',$ip)->{isp};
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
sub get_isp_org {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
if ($self->{dbapi}->{isp}) {
|
||||
return $self->_get('isp',$ip)->{organization};
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
sub get_asn {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
if ($self->{dbapi}->{asn}) {
|
||||
return $self->_get('asn',$ip)->{asn};
|
||||
} elsif ($self->{dbapi}->{isp}) {
|
||||
return $self->_get('isp',$ip)->{asn};
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
sub get_asn_org {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
if ($self->{dbapi}->{asn}) {
|
||||
return $self->_get('asn',$ip)->{asn_organization};
|
||||
} elsif ($self->{dbapi}->{isp}) {
|
||||
return $self->_get('isp',$ip)->{asn_organization};
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
sub get_all {
|
||||
my ($self, $ip) = @_;
|
||||
|
||||
return undef if !defined $ip || $ip !~ /\S/; ## no critic (ProhibitExplicitReturnUndef)
|
||||
|
||||
my $all = {};
|
||||
|
||||
if ($ip =~ IS_IP_PRIVATE) {
|
||||
return { 'country' => '**' };
|
||||
}
|
||||
|
||||
if ($ip !~ IS_IP_ADDRESS) {
|
||||
$ip = name_to_ip($ip);
|
||||
if (!defined $ip) {
|
||||
return { 'country' => 'XX' };
|
||||
}
|
||||
}
|
||||
|
||||
if ($self->{dbapi}->{city}) {
|
||||
my $res = $self->_get('city',$ip);
|
||||
$all->{$_} = $res->{$_} foreach (keys %$res);
|
||||
} elsif ($self->{dbapi}->{country}) {
|
||||
my $res = $self->_get('country',$ip);
|
||||
$all->{$_} = $res->{$_} foreach (keys %$res);
|
||||
}
|
||||
|
||||
if ($self->{dbapi}->{isp}) {
|
||||
my $res = $self->_get('isp',$ip);
|
||||
$all->{$_} = $res->{$_} foreach (keys %$res);
|
||||
}
|
||||
|
||||
if ($self->{dbapi}->{asn}) {
|
||||
my $res = $self->_get('asn',$ip);
|
||||
$all->{$_} = $res->{$_} foreach (keys %$res);
|
||||
}
|
||||
|
||||
return $all;
|
||||
}
|
||||
|
||||
sub can {
|
||||
my ($self, $check) = @_;
|
||||
|
||||
return defined $self->{dbapi}->{$check};
|
||||
}
|
||||
|
||||
# TODO: use SA internal dns synchronously?
|
||||
# This shouldn't be called much, as plugins
|
||||
# should do their own resolving if needed
|
||||
sub name_to_ip {
|
||||
my $name = shift;
|
||||
if (my $ip = inet_aton($name)) {
|
||||
$ip = inet_ntoa($ip);
|
||||
dbg("geodb: resolved internally $name: $ip");
|
||||
return $ip;
|
||||
}
|
||||
dbg("geodb: failed to internally resolve $name");
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
|
||||
sub _get {
|
||||
my ($self, $type, $ip) = @_;
|
||||
|
||||
# reset cache at 100 ips
|
||||
if (scalar keys %{$self->{cache}} >= 100) {
|
||||
$self->{cache} = ();
|
||||
}
|
||||
|
||||
if (!exists $self->{cache}{$ip}{$type}) {
|
||||
if ($self->{dbapi}->{$type}) {
|
||||
$self->{cache}{$ip}{$type} = $self->{dbapi}->{$type}->($self,$ip);
|
||||
} else {
|
||||
return undef; ## no critic (ProhibitExplicitReturnUndef)
|
||||
}
|
||||
}
|
||||
|
||||
return $self->{cache}{$ip}{$type};
|
||||
}
|
||||
|
||||
1;
|
@ -24,9 +24,6 @@ use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
require 5.008; # need basic Unicode support for HTML::Parser::utf8_mode
|
||||
# require 5.008008; # Bug 3787; [perl #37950]: Malformed UTF-8 character ...
|
||||
|
||||
use HTML::Parser 3.43 ();
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
@ -66,7 +63,7 @@ my %elements_whitespace = map {; $_ => 1 }
|
||||
|
||||
# elements that push URIs
|
||||
my %elements_uri = map {; $_ => 1 }
|
||||
qw( body table tr td a area link img frame iframe embed script form base bgsound ),
|
||||
qw( body table tr td a area link img frame iframe embed script form base bgsound meta ),
|
||||
;
|
||||
|
||||
# style attribute not accepted
|
||||
@ -248,6 +245,18 @@ sub parse {
|
||||
# the HTML::Parser API won't do it for us
|
||||
$text =~ s/<(\w+)\s*\/>/<$1>/gi;
|
||||
|
||||
# Normalize unicode quotes, messes up attributes parsing
|
||||
# U+201C e2 80 9c LEFT DOUBLE QUOTATION MARK
|
||||
# U+201D e2 80 9d RIGHT DOUBLE QUOTATION MARK
|
||||
# Examples of input:
|
||||
# <a href=\x{E2}\x{80}\x{9D}https://foobar.com\x{E2}\x{80}\x{9D}>
|
||||
# .. results in uri "\x{E2}\x{80}\x{9D}https://foobar.com\x{E2}\x{80}\x{9D}"
|
||||
if (utf8::is_utf8($text)) {
|
||||
$text =~ s/(?:\x{201C}|\x{201D})/"/g;
|
||||
} else {
|
||||
$text =~ s/\x{E2}\x{80}(?:\x{9C}|\x{9D})/"/g;
|
||||
}
|
||||
|
||||
if (!$self->UNIVERSAL::can('utf8_mode')) {
|
||||
# utf8_mode is cleared by default, only warn if it would need to be set
|
||||
warn "message: cannot set utf8_mode, module HTML::Parser is too old\n"
|
||||
@ -352,8 +361,8 @@ sub canon_uri {
|
||||
my ($self, $uri) = @_;
|
||||
|
||||
# URIs don't have leading/trailing whitespace ...
|
||||
$uri =~ s/^\s+//;
|
||||
$uri =~ s/\s+$//;
|
||||
$uri =~ s/^[\s\xA0]+//;
|
||||
$uri =~ s/[\s\xA0]+$//;
|
||||
|
||||
# Make sure all the URIs are nice and short
|
||||
if (length $uri > MAX_URI_LENGTH) {
|
||||
@ -414,6 +423,17 @@ sub html_uri {
|
||||
}
|
||||
}
|
||||
}
|
||||
elsif ($tag eq "meta" &&
|
||||
exists $attr->{'http-equiv'} &&
|
||||
exists $attr->{content} &&
|
||||
$attr->{'http-equiv'} =~ /refresh/i &&
|
||||
$attr->{content} =~ /\burl\s*=/i)
|
||||
{
|
||||
my $uri = $attr->{content};
|
||||
$uri =~ s/^.*\burl\s*=\s*//i;
|
||||
$uri =~ s/\s*;.*//i;
|
||||
$self->push_uri($tag, $uri);
|
||||
}
|
||||
}
|
||||
|
||||
# this might not be quite right, may need to pay attention to table nesting
|
||||
@ -516,7 +536,7 @@ sub text_style {
|
||||
my $whcolor = $1 ? 'bgcolor' : 'fgcolor';
|
||||
my $value = lc $2;
|
||||
|
||||
if ($value =~ /rgb/) {
|
||||
if (index($value, 'rgb') >= 0) {
|
||||
$value =~ tr/0-9,//cd;
|
||||
my @rgb = split(/,/, $value);
|
||||
$new{$whcolor} = sprintf("#%02x%02x%02x",
|
||||
@ -705,6 +725,8 @@ sub html_tests {
|
||||
{
|
||||
$self->{charsets} .= exists $self->{charsets} ? " $1" : $1;
|
||||
}
|
||||
|
||||
# todo: capture URI from meta refresh tag
|
||||
}
|
||||
|
||||
sub display_text {
|
||||
@ -1154,7 +1176,7 @@ sub _merge_uri {
|
||||
return "/" . $r_path;
|
||||
}
|
||||
else {
|
||||
if ($base_path =~ m|/|) {
|
||||
if (index($base_path, '/') >= 0) {
|
||||
$base_path =~ s|(?<=/)[^/]*$||;
|
||||
}
|
||||
else {
|
||||
|
@ -78,9 +78,6 @@ sub is_charset_ok_for_locales {
|
||||
$cs =~ s/:.*$//gs; # trim off multiple charsets, just use 1st
|
||||
dbg ("locales: is $cs ok for @locales?");
|
||||
|
||||
study $cs; # study is a no-op since perl 5.16.0, eliminating related bugs
|
||||
#warn "JMD $cs";
|
||||
|
||||
# always OK (the net speaks mostly roman charsets)
|
||||
return 1 if ($cs eq 'USASCII');
|
||||
return 1 if ($cs eq 'ASCII');
|
||||
|
@ -69,6 +69,11 @@ sub jittery_one_second_sleep {
|
||||
Time::HiRes::sleep(rand(1.0) + 0.5);
|
||||
}
|
||||
|
||||
sub jittery_half_second_sleep {
|
||||
my ($self) = @_;
|
||||
Time::HiRes::sleep(rand(0.5) + 0.25);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
||||
|
@ -55,7 +55,7 @@ sub safe_lock {
|
||||
|
||||
my $lock_file = "$path.mutex";
|
||||
my $umask = umask(~$mode);
|
||||
my $fh = new IO::File();
|
||||
my $fh = IO::File->new;
|
||||
|
||||
if (!$fh->open ($lock_file, O_RDWR|O_CREAT)) {
|
||||
umask $umask; # just in case
|
||||
|
@ -29,6 +29,7 @@ use Mail::SpamAssassin::Logger;
|
||||
use File::Spec;
|
||||
use Time::Local;
|
||||
use Fcntl qw(:DEFAULT :flock);
|
||||
use Errno qw(EEXIST);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Locker);
|
||||
|
||||
@ -60,7 +61,7 @@ sub safe_lock {
|
||||
$max_retries ||= 30;
|
||||
$mode ||= "0700";
|
||||
$mode = (oct $mode) & 0666;
|
||||
dbg ("locker: mode is $mode");
|
||||
dbg ("locker: mode is %03o", $mode);
|
||||
|
||||
my $lock_file = "$path.lock";
|
||||
my $hname = Mail::SpamAssassin::Util::fq_hostname();
|
||||
@ -76,11 +77,11 @@ sub safe_lock {
|
||||
die "locker: safe_lock: cannot create tmp lockfile $lock_tmp for $lock_file: $!\n";
|
||||
}
|
||||
umask $umask;
|
||||
autoflush LTMP 1;
|
||||
LTMP->autoflush(1);
|
||||
dbg("locker: safe_lock: created $lock_tmp");
|
||||
|
||||
for (my $retries = 0; $retries < $max_retries; $retries++) {
|
||||
if ($retries > 0) { $self->jittery_one_second_sleep(); }
|
||||
for (my $retries = 0; $retries < $max_retries * 2; $retries++) {
|
||||
if ($retries > 0) { $self->jittery_half_second_sleep(); }
|
||||
print LTMP "$hname.$$\n" or warn "Error writing to $lock_tmp: $!";
|
||||
dbg("locker: safe_lock: trying to get lock on $path with $retries retries");
|
||||
if (link($lock_tmp, $lock_file)) {
|
||||
@ -88,6 +89,10 @@ sub safe_lock {
|
||||
$is_locked = 1;
|
||||
last;
|
||||
}
|
||||
# if lock exists, it's already likely locked, no point complaining here
|
||||
unless ($!{EEXIST}) {
|
||||
warn "locker: creating link $lock_file to $lock_tmp failed: '$!'";
|
||||
}
|
||||
# link _may_ return false even if the link _is_ created
|
||||
@stat = lstat($lock_tmp);
|
||||
@stat or warn "locker: error accessing $lock_tmp: $!";
|
||||
@ -149,7 +154,7 @@ sub safe_unlock {
|
||||
warn "locker: safe_unlock: failed to create lock tmpfile $lock_tmp: $!";
|
||||
return;
|
||||
} else {
|
||||
autoflush LTMP 1;
|
||||
LTMP->autoflush(1);
|
||||
print LTMP "\n" or warn "Error writing to $lock_tmp: $!";
|
||||
|
||||
if (!(@stat_ourtmp = stat(LTMP)) || (scalar(@stat_ourtmp) < 11)) {
|
||||
@ -157,7 +162,7 @@ sub safe_unlock {
|
||||
warn "locker: safe_unlock: failed to create lock tmpfile $lock_tmp";
|
||||
close LTMP or die "error closing $lock_tmp: $!";
|
||||
unlink($lock_tmp)
|
||||
or warn "locker: safe_lock: unlink of lock file failed: $!\n";
|
||||
or warn "locker: safe_lock: unlink of lock file $lock_tmp failed: $!\n";
|
||||
return;
|
||||
}
|
||||
}
|
||||
@ -169,7 +174,7 @@ sub safe_unlock {
|
||||
|
||||
close LTMP or die "error closing $lock_tmp: $!";
|
||||
unlink($lock_tmp)
|
||||
or warn "locker: safe_lock: unlink of lock file failed: $!\n";
|
||||
or warn "locker: safe_lock: unlink of lock file $lock_tmp failed: $!\n";
|
||||
|
||||
# 2. If the ctime hasn't been modified, unlink the file and return. If the
|
||||
# lock has expired, sleep the usual random interval before returning. If we
|
||||
@ -191,7 +196,7 @@ sub safe_unlock {
|
||||
{
|
||||
# things are good: the ctimes match so it was our lock
|
||||
unlink($lock_file)
|
||||
or warn "locker: safe_unlock: unlink failed: $lock_file\n";
|
||||
or warn "locker: safe_unlock: unlinking $lock_file failed: $!\n";
|
||||
dbg("locker: safe_unlock: unlink $lock_file");
|
||||
|
||||
if ($ourtmp_ctime >= $lock_ctime + LOCK_MAX_AGE) {
|
||||
|
@ -72,7 +72,7 @@ sub safe_lock {
|
||||
return 1;
|
||||
}
|
||||
my @stat = stat($lock_file);
|
||||
@stat or warn "locker: error accessing $lock_file: $!";
|
||||
@stat or dbg("locker: error accessing $lock_file: $!");
|
||||
|
||||
# check age of lockfile ctime
|
||||
my $age = ($#stat < 11 ? undef : $stat[10]);
|
||||
|
@ -75,7 +75,22 @@ $LOG_SA{facility} = {}; # no dbg facilities turned on
|
||||
|
||||
# always log to stderr initially
|
||||
use Mail::SpamAssassin::Logger::Stderr;
|
||||
$LOG_SA{method}->{stderr} = Mail::SpamAssassin::Logger::Stderr->new();
|
||||
$LOG_SA{method}->{stderr} =
|
||||
Mail::SpamAssassin::Logger::Stderr->new(escape =>
|
||||
exists $ENV{'SA_LOGGER_ESCAPE'} ? $ENV{'SA_LOGGER_ESCAPE'} : 1
|
||||
);
|
||||
|
||||
# Use of M:SA:Util causes circular dependencies, separate helper here.
|
||||
my %escape_map =
|
||||
("\r" => '\\r', "\n" => '\\n', "\t" => '\\t', "\\" => '\\\\');
|
||||
sub escape_str {
|
||||
# Things are already forced as octets by _log, no utf8::encode needed
|
||||
# Control chars, DEL, backslash
|
||||
$_[0] =~ s@
|
||||
( [\x00-\x1F\x7F\x80-\xFF\\] )
|
||||
@ $escape_map{$1} || sprintf("\\x{%02X}",ord($1))
|
||||
@egsx;
|
||||
}
|
||||
|
||||
=head1 METHODS
|
||||
|
||||
@ -165,7 +180,7 @@ sub log_message {
|
||||
# don't log them -- this is caller 0, the use'ing package is 1, the eval is 2
|
||||
my @caller = caller 2;
|
||||
return if (defined $caller[3] && defined $caller[0] &&
|
||||
$caller[3] =~ /^\(eval\)$/ &&
|
||||
$caller[3] eq '(eval)' &&
|
||||
$caller[0] =~ m#^Mail::SpamAssassin(?:$|::)#);
|
||||
}
|
||||
|
||||
@ -215,13 +230,15 @@ sub _log_message {
|
||||
foreach my $line (split(/\n/, $_[1])) {
|
||||
# replace control characters with "_", tabs and spaces get
|
||||
# replaced with a single space.
|
||||
$line =~ tr/\x09\x20\x00-\x1f/ _/s;
|
||||
# Deprecated here, see new Bug 6583 escaping in Logger/*.pm modules
|
||||
#$line =~ tr/\x09\x20\x00-\x1f/ _/s;
|
||||
|
||||
if ($first) {
|
||||
$first = 0;
|
||||
} else {
|
||||
local $1;
|
||||
$line =~ s/^([^:]+?):/$1: [...]/;
|
||||
}
|
||||
|
||||
while (my ($name, $object) = each %{ $LOG_SA{method} }) {
|
||||
$object->log_message($_[0], $line, $_[2]);
|
||||
}
|
||||
@ -274,6 +291,9 @@ sub _log {
|
||||
}
|
||||
|
||||
my ($level, $message, @args) = @_;
|
||||
|
||||
utf8::encode($message) if utf8::is_utf8($message); # handle as octets
|
||||
|
||||
$message =~ s/^(?:[a-z0-9_-]*):\s*//i;
|
||||
|
||||
$message = sprintf($message,@args) if @args;
|
||||
@ -284,18 +304,28 @@ sub _log {
|
||||
log_message(($level == INFO ? "info" : "dbg"), $message);
|
||||
}
|
||||
|
||||
=item add(method => 'syslog', socket => $socket, facility => $facility)
|
||||
=item add(method => 'syslog', socket => $socket, facility => $facility, escape => $escape)
|
||||
|
||||
C<socket> is the type the syslog ("unix" or "inet"). C<facility> is the
|
||||
syslog facility (typically "mail").
|
||||
|
||||
=item add(method => 'file', filename => $file)
|
||||
If optional C<escape> is true, all non-ascii characters are escaped for safe
|
||||
output: backslashes change to \\ and non-ascii chars to \x{XX} or \x{XXXX}
|
||||
(Unicode). If not defined, pre-4.0 style sanitizing is used
|
||||
( tr/\x09\x20\x00-\x1f/_/s ).
|
||||
|
||||
C<filename> is the name of the log file.
|
||||
Escape value can be overridden with environment variable
|
||||
C<SA_LOGGER_ESCAPE>.
|
||||
|
||||
=item add(method => 'stderr')
|
||||
=item add(method => 'file', filename => $file, escape => $escape)
|
||||
|
||||
C<filename> is the name of the log file. C<escape> works as described
|
||||
above.
|
||||
|
||||
=item add(method => 'stderr', escape => $escape)
|
||||
|
||||
No options are needed for stderr logging, just don't close stderr first.
|
||||
C<escape> works as described above.
|
||||
|
||||
=cut
|
||||
|
||||
@ -307,6 +337,10 @@ sub add {
|
||||
|
||||
return 0 if $class !~ /^\w+$/; # be paranoid
|
||||
|
||||
if (exists $ENV{'SA_LOGGER_ESCAPE'}) {
|
||||
$params{escape} = $ENV{'SA_LOGGER_ESCAPE'}
|
||||
}
|
||||
|
||||
eval 'use Mail::SpamAssassin::Logger::'.$class.'; 1'
|
||||
or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
|
@ -37,13 +37,17 @@ use re 'taint';
|
||||
use POSIX ();
|
||||
use Time::HiRes ();
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(am_running_on_windows);
|
||||
|
||||
our @ISA = ();
|
||||
|
||||
# ADDING OS-DEPENDENT LINE TERMINATOR - BUG 6456
|
||||
|
||||
# Using Mail::SpamAssassin::Util::am_running_on_windows() leads to circular
|
||||
# dependencies. So, we are duplicating the code instead.
|
||||
use constant RUNNING_ON_WINDOWS => ($^O =~ /^(?:mswin|dos|os2)/oi);
|
||||
|
||||
my $eol = "\n";
|
||||
if (am_running_on_windows()) {
|
||||
if (RUNNING_ON_WINDOWS) {
|
||||
$eol = "\r\n";
|
||||
}
|
||||
|
||||
@ -58,6 +62,7 @@ sub new {
|
||||
my %params = @_;
|
||||
$self->{filename} = $params{filename} || 'spamassassin.log';
|
||||
$self->{timestamp_fmt} = $params{timestamp_fmt};
|
||||
$self->{escape} = $params{escape} if exists $params{escape};
|
||||
|
||||
if (! $self->init()) {
|
||||
die "logger: file initialization failed$eol";
|
||||
@ -101,6 +106,16 @@ sub log_message {
|
||||
}
|
||||
$timestamp .= ' ' if $timestamp ne '';
|
||||
|
||||
if ($self->{escape}) {
|
||||
# Bug 6583, escape
|
||||
Mail::SpamAssassin::Logger::escape_str($msg);
|
||||
} elsif (!exists $self->{escape}) {
|
||||
# Backwards compatible pre-4.0 escaping, if $escape not given.
|
||||
# replace control characters with "_", tabs and spaces get
|
||||
# replaced with a single space.
|
||||
$msg =~ tr/\x09\x20\x00-\x1f/ _/s;
|
||||
}
|
||||
|
||||
my($nwrite) = syswrite(STDLOG, sprintf("%s[%s] %s: %s%s",
|
||||
$timestamp, $$, $level, $msg, $eol));
|
||||
defined $nwrite or warn "error writing to log file: $!";
|
||||
|
@ -36,6 +36,7 @@ use re 'taint';
|
||||
|
||||
use POSIX ();
|
||||
use Time::HiRes ();
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = ();
|
||||
|
||||
@ -59,6 +60,7 @@ sub new {
|
||||
|
||||
my %params = @_;
|
||||
$self->{timestamp_fmt} = $params{timestamp_fmt};
|
||||
$self->{escape} = $params{escape} if exists $params{escape};
|
||||
|
||||
return($self);
|
||||
}
|
||||
@ -83,6 +85,16 @@ sub log_message {
|
||||
}
|
||||
$timestamp .= ' ' if $timestamp ne '';
|
||||
|
||||
if ($self->{escape}) {
|
||||
# Bug 6583, escape
|
||||
Mail::SpamAssassin::Logger::escape_str($msg);
|
||||
} elsif (!exists $self->{escape}) {
|
||||
# Backwards compatible pre-4.0 escaping, if $escape not given.
|
||||
# replace control characters with "_", tabs and spaces get
|
||||
# replaced with a single space.
|
||||
$msg =~ tr/\x09\x20\x00-\x1f/ _/s;
|
||||
}
|
||||
|
||||
my($nwrite) = syswrite(STDERR, sprintf("%s[%d] %s: %s%s",
|
||||
$timestamp, $$, $level, $msg, $eol));
|
||||
defined $nwrite or warn "error writing to log file: $!";
|
||||
|
@ -73,6 +73,7 @@ sub new {
|
||||
$self->{log_socket} = $params{socket};
|
||||
$self->{log_facility} = $params{facility};
|
||||
$self->{timestamp_fmt} = $params{timestamp_fmt};
|
||||
$self->{escape} = $params{escape} if exists $params{escape};
|
||||
|
||||
if (! $self->init()) {
|
||||
die "logger: syslog initialization failed\n";
|
||||
@ -148,6 +149,16 @@ sub log_message {
|
||||
$msg = '(bad prio: ' . $_[1] . ') ' . $msg;
|
||||
}
|
||||
|
||||
if ($self->{escape}) {
|
||||
# Bug 6583, escape
|
||||
Mail::SpamAssassin::Logger::escape_str($msg);
|
||||
} elsif (!exists $self->{escape}) {
|
||||
# Backwards compatible pre-4.0 escaping, if $escape not given
|
||||
# replace control characters with "_", tabs and spaces get
|
||||
# replaced with a single space.
|
||||
$msg =~ tr/\x09\x20\x00-\x1f/ _/s;
|
||||
}
|
||||
|
||||
# install a new handler for SIGPIPE -- this signal has been
|
||||
# found to occur with syslog-ng after syslog-ng restarts.
|
||||
local $SIG{'PIPE'} = sub {
|
||||
|
@ -46,10 +46,8 @@ use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1 sha1_hex); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1 sha1_hex) }
|
||||
}
|
||||
use Digest::SHA qw(sha1 sha1_hex);
|
||||
use Scalar::Util qw(tainted);
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Message::Node;
|
||||
@ -159,7 +157,7 @@ sub new {
|
||||
if (ref $message eq 'ARRAY') {
|
||||
@message = @{$message};
|
||||
}
|
||||
elsif (ref($message) eq 'GLOB' || ref($message) =~ /^IO::/) {
|
||||
elsif (ref($message) eq 'GLOB' || index(ref($message), 'IO::') == 0) {
|
||||
if (defined fileno $message) {
|
||||
|
||||
# sysread+split avoids a Perl I/O bug (Bug 5985)
|
||||
@ -207,14 +205,14 @@ sub new {
|
||||
# messages? Tainting the message is important because it prevents certain
|
||||
# exploits later.
|
||||
if (Mail::SpamAssassin::Util::am_running_in_taint_mode() &&
|
||||
grep { !Scalar::Util::tainted($_) } @message) {
|
||||
grep { !tainted($_) } @message) {
|
||||
local($_);
|
||||
# To preserve newlines, no joining and splitting here, process each line
|
||||
# directly as is.
|
||||
foreach (@message) {
|
||||
$_ = Mail::SpamAssassin::Util::taint_var($_);
|
||||
}
|
||||
if (grep { !Scalar::Util::tainted($_) } @message) {
|
||||
if (grep { !tainted($_) } @message) {
|
||||
die "Mail::SpamAssassin::Message failed to enforce message taintness";
|
||||
}
|
||||
}
|
||||
@ -256,7 +254,7 @@ sub new {
|
||||
# bug 4363
|
||||
# Check to see if we should do CRLF instead of just LF
|
||||
# For now, just check the first and last line and do whatever it does
|
||||
if (@message && ($message[0] =~ /\015\012/ || $message[-1] =~ /\015\012/)) {
|
||||
if (index($message[0], "\015\012") != -1 || index($message[-1], "\015\012") != -1) {
|
||||
$self->{line_ending} = "\015\012";
|
||||
dbg("message: line ending changed to CRLF");
|
||||
}
|
||||
@ -270,7 +268,12 @@ sub new {
|
||||
for (;;) {
|
||||
# make sure not to lose the last header field when there is no body
|
||||
my $eof = !@message;
|
||||
my $current = $eof ? "\n" : shift @message;
|
||||
my $current = $eof ? $self->{line_ending} : shift @message;
|
||||
|
||||
# Bug 7785: spamass-milter breaks wrapped headers, add any missing \r
|
||||
if ($squash_crlf) {
|
||||
$current =~ s/(?<!\015)\012/\015\012/gs;
|
||||
}
|
||||
|
||||
if ( $current =~ /^[ \t]/ ) {
|
||||
# This wasn't useful in terms of a rule, but we may want to treat it
|
||||
@ -306,7 +309,7 @@ sub new {
|
||||
}
|
||||
}
|
||||
|
||||
if ($current =~ /^\r?$/) { # a regular end of a header section
|
||||
if ($current eq $self->{line_ending}) { # a regular end of a header section
|
||||
if ($eof) {
|
||||
$self->{'missing_head_body_separator'} = 1;
|
||||
} else {
|
||||
@ -395,7 +398,7 @@ sub new {
|
||||
# either a blank line or the boundary (if defined), insert a blank line
|
||||
# to ensure proper parsing - do not consider MIME headers at the beginning of the body
|
||||
# to be part of the message headers.
|
||||
if ($self->{'type'} =~ /^multipart\//i && $#message > 0 && $message[0] =~ /\S/)
|
||||
if (index($self->{'type'}, 'multipart/') == 0 && $#message > 0 && $message[0] =~ /\S/)
|
||||
{
|
||||
if (!defined $boundary || $message[0] !~ /^--\Q$boundary\E/)
|
||||
{
|
||||
@ -537,6 +540,77 @@ sub get_pristine_body {
|
||||
return $self->{pristine_body};
|
||||
}
|
||||
|
||||
=item get_pristine_body_digest()
|
||||
|
||||
Returns SHA1 hex digest of the pristine message body.
|
||||
CRLF line endings are normalized to LF before hashing.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_pristine_body_digest {
|
||||
my ($self) = @_;
|
||||
|
||||
return $self->{pristine_body_digest} if exists $self->{pristine_body_digest};
|
||||
|
||||
if ($self->{line_ending} eq "\015\012") {
|
||||
# Don't make a copy, process line by line to save memory
|
||||
# CRLF should be exception, so it's not that critical here
|
||||
my $sha = Digest::SHA->new('sha1');
|
||||
while ($self->{pristine_body} =~ /(.*?)(\015\012)?/gs) {
|
||||
$sha->add($1.(defined $2 ? "\012" : ""));
|
||||
}
|
||||
$self->{pristine_body_digest} = $sha->hexdigest;
|
||||
} else {
|
||||
$self->{pristine_body_digest} = sha1_hex($self->{pristine_body});
|
||||
}
|
||||
|
||||
dbg("message: pristine body digest: ".$self->{pristine_body_digest});
|
||||
return $self->{pristine_body_digest};
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item get_msgid()
|
||||
|
||||
Returns Message-ID header for the message, with <> and surrounding
|
||||
whitespace removed. Returns undef, if nothing found between <>.
|
||||
|
||||
=cut
|
||||
|
||||
sub get_msgid {
|
||||
my ($self) = @_;
|
||||
|
||||
my $msgid = $self->get_header("Message-Id");
|
||||
if (defined $msgid && $msgid =~ /^\s*<(.+)>\s*$/s) {
|
||||
return $1;
|
||||
} else {
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
=item generate_msgid()
|
||||
|
||||
Generate a calculated "Message-ID" in B<sha1hex@sa_generated> format, using
|
||||
To, Date headers and pristine body as source for hashing.
|
||||
|
||||
=cut
|
||||
|
||||
sub generate_msgid {
|
||||
my ($self) = @_;
|
||||
|
||||
return $self->{msgid_generated} if exists $self->{msgid_generated};
|
||||
|
||||
# See Bug 5185, not using Received headers etc anymore
|
||||
my $to = $self->get_header("To") || '';
|
||||
my $date = $self->get_header("Date") || '';
|
||||
my $body_digest = $self->get_pristine_body_digest();
|
||||
|
||||
$self->{msgid_generated} =
|
||||
sha1_hex($to."\000".$date."\000".$body_digest).'@sa_generated';
|
||||
|
||||
return $self->{msgid_generated};
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
=item extract_message_metadata($permsgstatus)
|
||||
@ -706,6 +780,7 @@ sub finish {
|
||||
# temporary files are deleted even if the finish() method is omitted
|
||||
sub DESTROY {
|
||||
my $self = shift;
|
||||
|
||||
# best practices: prevent potential calls to eval and to system routines
|
||||
# in code of a DESTROY method from clobbering global variables $@ and $!
|
||||
local($@,$!); # keep outer error handling unaffected by DESTROY
|
||||
@ -774,7 +849,7 @@ sub parse_body {
|
||||
#
|
||||
my ($msg, $boundary, $body, $subparse) = @$toparse;
|
||||
|
||||
if ($msg->{'type'} =~ m{^multipart/}i && defined $boundary && $subparse > 0) {
|
||||
if (index($msg->{'type'}, 'multipart/') == 0 && defined $boundary && $subparse > 0) {
|
||||
$self->_parse_multipart($toparse);
|
||||
}
|
||||
else {
|
||||
@ -782,7 +857,8 @@ sub parse_body {
|
||||
$self->_parse_normal($toparse);
|
||||
|
||||
# bug 5041: process message/*, but exclude message/partial content types
|
||||
if ($msg->{'type'} =~ m{^message/(?!partial\z)}i && $subparse > 0)
|
||||
if (index($msg->{'type'}, 'message/') == 0 &&
|
||||
$msg->{'type'} ne 'message/partial' && $subparse > 0)
|
||||
{
|
||||
# Just decode the part, but we don't need the resulting string here.
|
||||
$msg->decode(0);
|
||||
@ -798,7 +874,7 @@ sub parse_body {
|
||||
# bug 5051, bug 3748: check $msg->{decoded}: sometimes message/* parts
|
||||
# have no content, and we get stuck waiting for STDIN, which is bad. :(
|
||||
|
||||
if ($msg->{'type'} =~ m{^message/(?:rfc822|global)\z}i &&
|
||||
if (($msg->{'type'} eq 'message/rfc822' || $msg->{'type'} eq 'message/global') &&
|
||||
defined $msg->{'decoded'} && $msg->{'decoded'} ne '')
|
||||
{
|
||||
# Ok, so this part is still semi-recursive, since M::SA::Message
|
||||
@ -857,6 +933,7 @@ sub _parse_multipart {
|
||||
my($self, $toparse) = @_;
|
||||
|
||||
my ($msg, $boundary, $body, $subparse) = @{$toparse};
|
||||
my $nested_boundary = 0;
|
||||
|
||||
# we're not supposed to be a leaf, so prep ourselves
|
||||
$msg->{'body_parts'} = [];
|
||||
@ -907,6 +984,7 @@ sub _parse_multipart {
|
||||
my $header;
|
||||
my $part_array;
|
||||
my $found_end_boundary;
|
||||
my $found_last_end_boundary;
|
||||
my $partcnt = 0;
|
||||
|
||||
my $line_count = @{$body};
|
||||
@ -915,7 +993,12 @@ sub _parse_multipart {
|
||||
# deal with the mime part;
|
||||
# a triage before an unlikely-to-match regexp avoids a CPU hotspot
|
||||
$found_end_boundary = defined $boundary && substr($_,0,2) eq '--'
|
||||
&& /^--\Q$boundary\E(?:--)?\s*$/;
|
||||
&& /^--\Q$boundary\E(--)?\s*$/;
|
||||
$found_last_end_boundary = $found_end_boundary && $1;
|
||||
if ($found_end_boundary && $nested_boundary) {
|
||||
$found_end_boundary = 0;
|
||||
$nested_boundary = 0 if ($found_last_end_boundary); # bug 7358 - handle one level of non-unique boundary string
|
||||
}
|
||||
if ( --$line_count == 0 || $found_end_boundary ) {
|
||||
my $line = $_; # remember the last line
|
||||
|
||||
@ -951,8 +1034,20 @@ sub _parse_multipart {
|
||||
$part_array = [];
|
||||
}
|
||||
|
||||
my($p_boundary);
|
||||
($part_msg->{'type'}, $p_boundary) = Mail::SpamAssassin::Util::parse_content_type($part_msg->header('content-type'));
|
||||
($part_msg->{'type'}, my $p_boundary, undef, undef, my $ct_was_missing) =
|
||||
Mail::SpamAssassin::Util::parse_content_type($part_msg->header('content-type'));
|
||||
|
||||
# bug 5741: if ct was missing and parent == multipart/digest, then
|
||||
# type should be set as message/rfc822
|
||||
if ($ct_was_missing) {
|
||||
if ($msg->{'type'} eq 'multipart/digest') {
|
||||
dbg("message: missing type, setting multipart/digest child as message/rfc822");
|
||||
$part_msg->{'type'} = 'message/rfc822';
|
||||
} else {
|
||||
dbg("message: missing type, setting as default text/plain");
|
||||
}
|
||||
}
|
||||
|
||||
$p_boundary ||= $boundary;
|
||||
dbg("message: found part of type ".$part_msg->{'type'}.", boundary: ".(defined $p_boundary ? $p_boundary : ''));
|
||||
|
||||
@ -962,12 +1057,8 @@ sub _parse_multipart {
|
||||
push(@{$self->{'parse_queue'}}, [ $part_msg, $p_boundary, $part_array, $subparse ]);
|
||||
$msg->add_body_part($part_msg);
|
||||
|
||||
# rfc 1521 says /^--boundary--$/, some MUAs may just require /^--boundary--/
|
||||
# but this causes problems with horizontal lines when the boundary is
|
||||
# made up of dashes as well, etc.
|
||||
if (defined $boundary) {
|
||||
# no re "strict"; # since perl 5.21.8: Ranges of ASCII printables...
|
||||
if ($line =~ /^--\Q${boundary}\E--\s*$/) {
|
||||
if ($found_last_end_boundary) {
|
||||
# Make a note that we've seen the end boundary
|
||||
$self->{mime_boundary_state}->{$boundary}--;
|
||||
last;
|
||||
@ -1016,6 +1107,12 @@ sub _parse_multipart {
|
||||
if ($header) {
|
||||
my ( $key, $value ) = split ( /:\s*/, $header, 2 );
|
||||
$part_msg->header( $key, $value );
|
||||
if (defined $boundary && lc $key eq 'content-type') {
|
||||
my (undef, $nested_bound) = Mail::SpamAssassin::Util::parse_content_type($part_msg->header('content-type'));
|
||||
if (defined $nested_bound && $nested_bound eq $boundary) {
|
||||
$nested_boundary = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
$in_body = 1;
|
||||
|
||||
@ -1070,12 +1167,18 @@ sub _parse_normal {
|
||||
|
||||
dbg("message: parsing normal part");
|
||||
|
||||
# 0: content-type, 1: boundary, 2: charset, 3: filename
|
||||
# 0: content-type, 1: boundary, 2: charset, 3: filename 4: ct_missing
|
||||
my @ct = Mail::SpamAssassin::Util::parse_content_type($msg->header('content-type'));
|
||||
|
||||
# multipart sections are required to have a boundary set ... If this
|
||||
# one doesn't, assume it's malformed and revert to text/plain
|
||||
$msg->{'type'} = ($ct[0] !~ m@^multipart/@i || defined $boundary ) ? $ct[0] : 'text/plain';
|
||||
# bug 5741: don't overwrite the default type assigned by _parse_multipart()
|
||||
if (!$ct[4]) {
|
||||
$msg->{'type'} = (index($ct[0], 'multipart/') != 0 || defined $boundary) ?
|
||||
$ct[0] : 'text/plain'
|
||||
} else {
|
||||
dbg("message: missing type, setting previous multipart type: %s", $msg->{'type'});
|
||||
}
|
||||
$msg->{'charset'} = $ct[2];
|
||||
|
||||
# attempt to figure out a name for this attachment if there is one ...
|
||||
@ -1086,9 +1189,6 @@ sub _parse_normal {
|
||||
elsif ($ct[3]) {
|
||||
$msg->{'name'} = $ct[3];
|
||||
}
|
||||
if ($msg->{'name'}) {
|
||||
$msg->{'name'} = Encode::decode("MIME-Header", $msg->{'name'});
|
||||
}
|
||||
|
||||
$msg->{'boundary'} = $boundary;
|
||||
|
||||
@ -1096,7 +1196,8 @@ sub _parse_normal {
|
||||
# ahead and write the part data out to a temp file -- why keep sucking
|
||||
# up RAM with something we're not going to use?
|
||||
#
|
||||
if ($msg->{'type'} !~ m@^(?:text/(?:plain|html)$|message\b)@) {
|
||||
unless ($msg->{'type'} eq 'text/plain' || $msg->{'type'} eq 'text/html' ||
|
||||
index($msg->{'type'}, 'message/') == 0) {
|
||||
my($filepath, $fh);
|
||||
eval {
|
||||
($filepath, $fh) = Mail::SpamAssassin::Util::secure_tmpfile(); 1;
|
||||
@ -1131,7 +1232,7 @@ sub get_mimepart_digests {
|
||||
if (!exists $self->{mimepart_digests}) {
|
||||
# traverse all parts which are leaves, recursively
|
||||
$self->{mimepart_digests} =
|
||||
[ map(sha1_hex($_->decode) . ':' . lc($_->{type}||''),
|
||||
[ map(sha1_hex($_->decode) . ':' . ($_->{type}||''),
|
||||
$self->find_parts(qr/^/,1,1)) ];
|
||||
}
|
||||
return $self->{mimepart_digests};
|
||||
@ -1202,16 +1303,19 @@ sub get_body_text_array_common {
|
||||
# text/plain rendered as html otherwise.
|
||||
if ($html_needs_setting && $type eq 'text/html') {
|
||||
$self->{metadata}->{html} = $p->{html_results};
|
||||
push @{$self->{metadata}->{html_all}}, $p->{html_results};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# whitespace handling (warning: small changes have large effects!)
|
||||
$text =~ s/\n+\s*\n+/\f/gs; # double newlines => form feed
|
||||
$text =~ s/\n+\s*\n+/\x00/gs; # double newlines => null
|
||||
# $text =~ tr/ \t\n\r\x0b\xa0/ /s; # whitespace (incl. VT, NBSP) => space
|
||||
$text =~ tr/ \t\n\r\x0b/ /s; # whitespace (incl. VT) => space
|
||||
$text =~ tr/\f/\n/; # form feeds => newline
|
||||
# $text =~ tr/ \t\n\r\x0b/ /s; # whitespace (incl. VT) => single space
|
||||
$text =~ s/\s+/ /gs; # Unicode whitespace => single space
|
||||
$text =~ tr/\x00/\n/; # null => newline
|
||||
|
||||
utf8::encode($text) if utf8::is_utf8($text);
|
||||
my @textary = split_into_array_of_short_lines($text);
|
||||
$self->{$key} = \@textary;
|
||||
|
||||
@ -1246,7 +1350,7 @@ sub get_decoded_body_text_array {
|
||||
my $scansize = $self->{rawbody_part_scan_size};
|
||||
|
||||
# Find all parts which are leaves
|
||||
my @parts = $self->find_parts(qr/^(?:text|message)\b/i,1);
|
||||
my @parts = $self->find_parts(qr/^(?:text|message)\b/,1);
|
||||
return $self->{text_decoded} unless @parts;
|
||||
|
||||
# Go through each part
|
||||
|
@ -50,6 +50,10 @@ use Mail::SpamAssassin::Dns;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
my $LOCALHOST = LOCALHOST;
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub parse_received_headers {
|
||||
@ -117,10 +121,6 @@ sub parse_received_headers {
|
||||
}
|
||||
}
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
my $LOCALHOST = LOCALHOST;
|
||||
|
||||
my @hdrs = $msg->get_header('Received');
|
||||
|
||||
# Now add the single line headers like X-Originating-IP. (bug 5680)
|
||||
@ -336,9 +336,6 @@ sub parse_received_line {
|
||||
my $ident = '';
|
||||
my $envfrom = undef;
|
||||
my $mta_looked_up_dns = 0;
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
my $LOCALHOST = LOCALHOST;
|
||||
my $auth = '';
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@ -430,7 +427,7 @@ sub parse_received_line {
|
||||
$auth = "GMX ($4 / $3)";
|
||||
}
|
||||
# Critical Path Messaging Server
|
||||
elsif (/ \(authenticated as /&&/\) by .+ \(\d{1,2}\.\d\.\d{3}(?:\.\d{1,3})?\) \(authenticated as .+\) id /) {
|
||||
elsif (/ \(authenticated as / && /\) by .+ \(\d{1,2}\.\d\.\d{3}(?:\.\d{1,3})?\) \(authenticated as .+\) id /) {
|
||||
$auth = 'CriticalPath';
|
||||
}
|
||||
# Postfix 2.3 and later with "smtpd_sasl_authenticated_header yes"
|
||||
@ -820,7 +817,7 @@ sub parse_received_line {
|
||||
|
||||
# Received: from [193.220.176.134] by web40310.mail.yahoo.com via HTTP;
|
||||
# Wed, 12 Feb 2003 14:22:21 PST
|
||||
if (/ via HTTP$/&&/^\[(${IP_ADDRESS})\] by (\S+) via HTTP$/) {
|
||||
if (/ via HTTP$/ && /^\[(${IP_ADDRESS})\] by (\S+) via HTTP$/) {
|
||||
$ip = $1; $by = $2; goto enough;
|
||||
}
|
||||
|
||||
@ -944,13 +941,13 @@ sub parse_received_line {
|
||||
|
||||
# Received: from [129.24.215.125] by ws1-7.us4.outblaze.com with http for
|
||||
# _bushisevil_@mail.com; Thu, 13 Feb 2003 15:59:28 -0500
|
||||
if (/ with http for /&&/^\[(${IP_ADDRESS})\] by (\S+) with http for /) {
|
||||
if (/ with http for / && /^\[(${IP_ADDRESS})\] by (\S+) with http for /) {
|
||||
$ip = $1; $by = $2; goto enough;
|
||||
}
|
||||
|
||||
# Received: from snake.corp.yahoo.com(216.145.52.229) by x.x.org via smap (V1.3)
|
||||
# id xma093673; Wed, 26 Mar 03 20:43:24 -0600
|
||||
if (/ via smap /&&/^(\S+)\((${IP_ADDRESS})\) by (\S+) via smap /) {
|
||||
if (/ via smap / && /^(\S+)\((${IP_ADDRESS})\) by (\S+) via smap /) {
|
||||
$mta_looked_up_dns = 1;
|
||||
$rdns = $1; $ip = $2; $by = $3; goto enough;
|
||||
}
|
||||
@ -965,13 +962,13 @@ sub parse_received_line {
|
||||
# Received: from [192.168.0.71] by web01-nyc.clicvu.com (Post.Office MTA
|
||||
# v3.5.3 release 223 ID# 0-64039U1000L100S0V35) with SMTP id com for
|
||||
# <x@x.org>; Tue, 25 Mar 2003 11:42:04 -0500
|
||||
if (/ \(Post/&&/^\[(${IP_ADDRESS})\] by (\S+) \(Post/) {
|
||||
if (/ \(Post/ && /^\[(${IP_ADDRESS})\] by (\S+) \(Post/) {
|
||||
$ip = $1; $by = $2; goto enough;
|
||||
}
|
||||
|
||||
# Received: from [127.0.0.1] by euphoria (ArGoSoft Mail Server
|
||||
# Freeware, Version 1.8 (1.8.2.5)); Sat, 8 Feb 2003 09:45:32 +0200
|
||||
if (/ \(ArGoSoft/&&/^\[(${IP_ADDRESS})\] by (\S+) \(ArGoSoft/) {
|
||||
if (/ \(ArGoSoft/ && /^\[(${IP_ADDRESS})\] by (\S+) \(ArGoSoft/) {
|
||||
$ip = $1; $by = $2; goto enough;
|
||||
}
|
||||
|
||||
@ -984,7 +981,7 @@ sub parse_received_line {
|
||||
|
||||
# Received: from faerber.muc.de by slarti.muc.de with BSMTP (rsmtp-qm-ot 0.4)
|
||||
# for asrg@ietf.org; 7 Mar 2003 21:10:38 -0000
|
||||
if (/ with BSMTP/&&/^\S+ by \S+ with BSMTP/) {
|
||||
if (/ with BSMTP/ && /^\S+ by \S+ with BSMTP/) {
|
||||
return 0; # BSMTP != a TCP/IP handover, ignore it
|
||||
}
|
||||
|
||||
|
@ -36,8 +36,6 @@ use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
require 5.008001; # needs utf8::is_utf8()
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
use Mail::SpamAssassin::HTML;
|
||||
@ -167,8 +165,8 @@ sub header {
|
||||
my $key = lc($rawkey);
|
||||
|
||||
# Trim whitespace off of the header keys
|
||||
$key =~ s/^\s+//;
|
||||
$key =~ s/\s+$//;
|
||||
#$key =~ s/^\s+//;
|
||||
#$key =~ s/\s+$//;
|
||||
|
||||
if (@_) {
|
||||
my $raw_value = shift;
|
||||
@ -388,6 +386,12 @@ sub detect_utf16 {
|
||||
my $sum_l_o = 0;
|
||||
my $decoder = undef;
|
||||
|
||||
# avoid scan if BOM present
|
||||
if( $data =~ /^(?:\xff\xfe|\xfe\xff)/ ) {
|
||||
dbg( "message: detect_utf16: found BOM" );
|
||||
return; # let perl figure it out from the BOM
|
||||
}
|
||||
|
||||
my @msg_h = unpack 'H' x length( $data ), $data;
|
||||
my @msg_l = unpack 'h' x length( $data ), $data;
|
||||
|
||||
@ -397,11 +401,11 @@ sub detect_utf16 {
|
||||
$sum_h_o += hex $msg_h[$i+1];
|
||||
$sum_l_e += hex $msg_l[$i];
|
||||
$sum_l_o += hex $msg_l[$i+1];
|
||||
if( $check_char =~ /20 00/ ) {
|
||||
if (index($check_char, '20 00') >= 0) {
|
||||
# UTF-16LE space char detected
|
||||
$utf16le_clues++;
|
||||
}
|
||||
if( $check_char =~ /00 20/ ) {
|
||||
if (index($check_char, '00 20') >= 0) {
|
||||
# UTF-16BE space char detected
|
||||
$utf16be_clues++;
|
||||
}
|
||||
@ -416,7 +420,7 @@ sub detect_utf16 {
|
||||
if( $utf16le_clues > $utf16be_clues ) {
|
||||
dbg( "message: detect_utf16: UTF-16LE" );
|
||||
$decoder = Encode::find_encoding("UTF-16LE");
|
||||
} elsif( $utf16le_clues > $utf16be_clues ) {
|
||||
} elsif( $utf16be_clues > $utf16le_clues ) {
|
||||
dbg( "message: detect_utf16: UTF-16BE" );
|
||||
$decoder = Encode::find_encoding("UTF-16BE");
|
||||
} else {
|
||||
@ -450,6 +454,7 @@ sub _normalize {
|
||||
# my $data = $_[0]; # avoid copying large strings
|
||||
my $charset_declared = $_[1];
|
||||
my $return_decoded = $_[2]; # true: Unicode characters, false: UTF-8 octets
|
||||
my $insist_on_declared_charset = $_[3]; # no FB_CROAK in Encode::decode
|
||||
|
||||
warn "message: _normalize() was given characters, expected bytes: $_[0]\n"
|
||||
if utf8::is_utf8($_[0]);
|
||||
@ -457,10 +462,6 @@ sub _normalize {
|
||||
# workaround for Encode::decode taint laundering bug [rt.cpan.org #84879]
|
||||
my $data_taint = substr($_[0], 0, 0); # empty string, tainted like $data
|
||||
|
||||
if (!defined $charset_declared || $charset_declared eq '') {
|
||||
$charset_declared = 'us-ascii';
|
||||
}
|
||||
|
||||
# number of characters with code above 127
|
||||
my $cnt_8bits = $_[0] =~ tr/\x00-\x7F//c;
|
||||
|
||||
@ -469,7 +470,8 @@ sub _normalize {
|
||||
/^(?: (?:US-)?ASCII | ANSI[_ ]? X3\.4- (?:1986|1968) |
|
||||
ISO646-US )\z/xsi)
|
||||
{ # declared as US-ASCII (a.k.a. ANSI X3.4-1986) and it really is
|
||||
dbg("message: kept, charset is US-ASCII as declared");
|
||||
dbg("message: contains only US-ASCII characters, declared %s, not decoding",
|
||||
$charset_declared);
|
||||
return $_[0]; # is all-ASCII, no need for decoding
|
||||
}
|
||||
|
||||
@ -479,17 +481,21 @@ sub _normalize {
|
||||
UTF-?8 | (KOI8|EUC)-[A-Z]{1,2} |
|
||||
Big5 | GBK | GB[ -]?18030 (?:-20\d\d)? )\z/xsi)
|
||||
{ # declared as extended ASCII, but it is actually a plain 7-bit US-ASCII
|
||||
dbg("message: kept, charset is US-ASCII, declared %s", $charset_declared);
|
||||
dbg("message: contains only US-ASCII characters, declared %s, not decoding",
|
||||
$charset_declared);
|
||||
return $_[0]; # is all-ASCII, no need for decoding
|
||||
}
|
||||
|
||||
# Try first to strictly decode based on a declared character set.
|
||||
|
||||
my $rv;
|
||||
if ($charset_declared =~ /^UTF-?8\z/i) {
|
||||
# attempt decoding as strict UTF-8 (flags: FB_CROAK | LEAVE_SRC)
|
||||
|
||||
# Try first as UTF-8 ignoring declaring?
|
||||
my $tried_utf8;
|
||||
if ($cnt_8bits && !$insist_on_declared_charset) {
|
||||
if (eval { $rv = $enc_utf8->decode($_[0], 1|8); defined $rv }) {
|
||||
dbg("message: decoded as declared charset UTF-8");
|
||||
dbg("message: decoded as charset UTF-8, declared %s",
|
||||
$charset_declared);
|
||||
return $_[0] if !$return_decoded;
|
||||
$rv .= $data_taint; # carry taintedness over, avoid Encode bug
|
||||
return $rv; # decoded
|
||||
@ -499,8 +505,16 @@ sub _normalize {
|
||||
$err = $@; $err =~ s/\s+/ /gs; $err =~ s/(.*) at .*/$1/;
|
||||
$err = " ($err)";
|
||||
}
|
||||
dbg("message: failed decoding as declared charset UTF-8 ($err)");
|
||||
dbg("message: failed decoding as charset UTF-8, declared %s%s",
|
||||
$charset_declared, $err);
|
||||
$tried_utf8 = 1;
|
||||
}
|
||||
}
|
||||
|
||||
if ($charset_declared =~ /^(?:US-)?ASCII\z/i
|
||||
&& !$insist_on_declared_charset) {
|
||||
# declared as US-ASCII but contains 8-bit characters, makes no sense
|
||||
# to attempt decoding first as strict US-ASCII as we know it would fail
|
||||
|
||||
} elsif ($charset_declared =~ /^UTF[ -]?16/i) {
|
||||
# Handle cases where spammers use UTF-16 encoding without including a BOM
|
||||
@ -508,8 +522,10 @@ sub _normalize {
|
||||
# https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7252
|
||||
|
||||
my $decoder = detect_utf16( $_[0] );
|
||||
if (defined $decoder) {
|
||||
if (eval { $rv = $decoder->decode($_[0], 1|8); defined $rv }) {
|
||||
dbg("message: declared charset %s decoded as charset %s", $charset_declared, $decoder->name);
|
||||
dbg("message: decoded as charset %s, declared %s",
|
||||
$decoder->name, $charset_declared);
|
||||
return $_[0] if !$return_decoded;
|
||||
$rv .= $data_taint; # carry taintedness over, avoid Encode bug
|
||||
return $rv; # decoded
|
||||
@ -519,20 +535,10 @@ sub _normalize {
|
||||
$err = $@; $err =~ s/\s+/ /gs; $err =~ s/(.*) at .*/$1/;
|
||||
$err = " ($err)";
|
||||
}
|
||||
dbg("message: failed decoding as declared charset %s%s", $charset_declared, $err);
|
||||
dbg("message: failed decoding as charset %s, declared %s%s",
|
||||
$decoder->name, $charset_declared, $err);
|
||||
}
|
||||
};
|
||||
|
||||
} elsif ($cnt_8bits &&
|
||||
eval { $rv = $enc_utf8->decode($_[0], 1|8); defined $rv }) {
|
||||
dbg("message: decoded as charset UTF-8, declared %s", $charset_declared);
|
||||
return $_[0] if !$return_decoded;
|
||||
$rv .= $data_taint; # carry taintedness over, avoid Encode bug
|
||||
return $rv; # decoded
|
||||
|
||||
} elsif ($charset_declared =~ /^(?:US-)?ASCII\z/i) {
|
||||
# declared as US-ASCII but contains 8-bit characters, makes no sense
|
||||
# to attempt decoding first as strict US-ASCII as we know it would fail
|
||||
|
||||
} else {
|
||||
# try decoding as a declared character set
|
||||
|
||||
@ -553,8 +559,11 @@ sub _normalize {
|
||||
my($chset, $decoder);
|
||||
if ($charset_declared =~ /^(?: ISO-?8859-1 | Windows-1252 | CP1252 )\z/xi) {
|
||||
$chset = 'Windows-1252'; $decoder = $enc_w1252;
|
||||
} elsif ($charset_declared =~ /^UTF-?8\z/i) {
|
||||
$chset = 'UTF-8'; $decoder = $enc_utf8;
|
||||
} else {
|
||||
$chset = $charset_declared; $decoder = Encode::find_encoding($chset);
|
||||
$chset = $charset_declared;
|
||||
$decoder = Encode::find_encoding($chset);
|
||||
if (!$decoder && $chset =~ /^GB[ -]?18030(?:-20\d\d)?\z/i) {
|
||||
$decoder = Encode::find_encoding('GBK'); # a subset of GB18030
|
||||
dbg("message: no decoder for a declared charset %s, using GBK",
|
||||
@ -564,20 +573,24 @@ sub _normalize {
|
||||
if (!$decoder) {
|
||||
dbg("message: failed decoding, no decoder for a declared charset %s",
|
||||
$chset);
|
||||
} else {
|
||||
}
|
||||
elsif ($tried_utf8 && $chset eq 'UTF-8') {
|
||||
# was already tried initially, no point doing again
|
||||
}
|
||||
else {
|
||||
my $check_flags = Encode::LEAVE_SRC; # 0x0008
|
||||
$check_flags |= Encode::FB_CROAK unless $insist_on_declared_charset;
|
||||
my $err = '';
|
||||
eval { $rv = $decoder->decode($_[0], 1|8) }; # FB_CROAK | LEAVE_SRC
|
||||
if (eval { $rv = $decoder->decode($_[0], $check_flags); defined $rv }) {
|
||||
dbg("message: decoded as charset %s, declared %s",
|
||||
$decoder->name, $charset_declared);
|
||||
} else {
|
||||
if ($@) {
|
||||
$err = $@; $err =~ s/\s+/ /gs; $err =~ s/(.*) at .*/$1/;
|
||||
$err = " ($err)";
|
||||
}
|
||||
if (lc $chset eq lc $charset_declared) {
|
||||
dbg("message: %s as declared charset %s%s",
|
||||
defined $rv ? 'decoded' : 'failed decoding', $charset_declared, $err);
|
||||
} else {
|
||||
dbg("message: %s as charset %s, declared %s%s",
|
||||
defined $rv ? 'decoded' : 'failed decoding',
|
||||
$chset, $charset_declared, $err);
|
||||
dbg("message: failed decoding as charset %s, declared %s%s",
|
||||
$decoder->name, $charset_declared, $err);
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -589,7 +602,7 @@ sub _normalize {
|
||||
# Note that Windows-1252 is a proper superset of ISO-8859-1.
|
||||
#
|
||||
if (!defined $rv && !$cnt_8bits) {
|
||||
dbg("message: kept, guessed charset is US-ASCII, declared %s",
|
||||
dbg("message: contains only US-ASCII characters, declared %s, not decoding",
|
||||
$charset_declared);
|
||||
return $_[0]; # is all-ASCII, no need for decoding
|
||||
|
||||
@ -675,7 +688,7 @@ sub _normalize {
|
||||
|
||||
=item rendered()
|
||||
|
||||
render_text() takes the given text/* type MIME part, and attempts to
|
||||
rendered() takes the given text/* type MIME part, and attempts to
|
||||
render it into a text scalar. It will always render text/html, and will
|
||||
use a heuristic to determine if other text/* parts should be considered
|
||||
text/html. Two scalars are returned: the rendered type (either text/html
|
||||
@ -686,18 +699,31 @@ or whatever the original type was), and the rendered text.
|
||||
sub rendered {
|
||||
my ($self) = @_;
|
||||
|
||||
if (!exists $self->{rendered}) {
|
||||
# Cached?
|
||||
if (exists $self->{rendered}) {
|
||||
return ($self->{rendered_type}, $self->{rendered});
|
||||
}
|
||||
|
||||
# We only know how to render text/plain and text/html ...
|
||||
# Note: for bug 4843, make sure to skip text/calendar parts
|
||||
# we also want to skip things like text/x-vcard
|
||||
# text/x-aol is ignored here, but looks like text/html ...
|
||||
return(undef,undef) unless ( $self->{'type'} =~ /^text\/(?:plain|html)$/i );
|
||||
my $type = lc $self->{'type'};
|
||||
unless ($type eq 'text/plain' || $type eq 'text/html') {
|
||||
return (undef,undef);
|
||||
}
|
||||
|
||||
my $text = $self->decode; # QP and Base64 decoding, bytes
|
||||
my $text_len = length($text); # num of bytes in original charset encoding
|
||||
|
||||
my $charset = $self->{charset};
|
||||
if (!defined $charset) {
|
||||
dbg("message: no charset declared, using us-ascii");
|
||||
$charset = 'us-ascii';
|
||||
}
|
||||
|
||||
# render text/html always
|
||||
if ($text ne '' && $self->{'type'} =~ m{^text/html$}i)
|
||||
if ($text ne '' && $type eq 'text/html')
|
||||
{
|
||||
$self->{rendered_type} = 'text/html';
|
||||
|
||||
@ -712,21 +738,39 @@ sub rendered {
|
||||
# subroutine _normalize() to return Unicode text. See Bug 7133
|
||||
#
|
||||
$character_semantics = 1; # $text will be in characters
|
||||
$text = _normalize($text, $self->{charset}, 1); # bytes to chars
|
||||
} elsif (!defined $self->{charset} ||
|
||||
$self->{charset} =~ /^(?:US-ASCII|UTF-8)\z/i) {
|
||||
# With some luck input can be interpreted as UTF-8, do not warn.
|
||||
# It is still possible to hit the HTML::Parses utf8_mode bug however.
|
||||
$text = _normalize($text, $charset, 1); # bytes to chars
|
||||
} elsif ($charset =~ /^(?:US-ASCII|UTF-8)\z/i) {
|
||||
if ($text !~ tr/\x00-\x7F//c) {
|
||||
# all-ASCII, keep as octets (utf8 flag off)
|
||||
dbg("message: contains only US-ASCII characters, declared %s, not decoding",
|
||||
$charset);
|
||||
} else { # non-ASCII, try UTF-8
|
||||
my $rv;
|
||||
# with some luck input can be interpreted as UTF-8
|
||||
if (eval { $rv = $enc_utf8->decode($text, 1|8); defined $rv }) {
|
||||
$text = $rv; # decoded to perl characters
|
||||
$character_semantics = 1; # $text will be in characters
|
||||
dbg("message: decoded as charset UTF-8, declared %s", $charset);
|
||||
} else {
|
||||
my $err = '';
|
||||
if ($@) {
|
||||
$err = $@; $err =~ s/\s+/ /gs; $err =~ s/(.*) at .*/$1/;
|
||||
$err = " ($err)";
|
||||
}
|
||||
dbg("message: failed decoding as charset UTF-8, declared %s%s",
|
||||
$charset, $err);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
dbg("message: 'normalize_charset' is off, encoding will likely ".
|
||||
"be misinterpreted; declared charset: %s", $self->{charset});
|
||||
"be misinterpreted; declared charset: %s", $charset);
|
||||
}
|
||||
# the 0 requires decoded HTML results to be in bytes (not characters)
|
||||
my $html = Mail::SpamAssassin::HTML->new($character_semantics,0); # object
|
||||
# the 1 requires decoded HTML results to be in characters (utf8 flag on)
|
||||
my $html = Mail::SpamAssassin::HTML->new($character_semantics,1); # object
|
||||
|
||||
$html->parse($text); # parse+render text
|
||||
|
||||
# resulting HTML-decoded text is in bytes, likely encoded as UTF-8
|
||||
# resulting HTML-decoded text is in perl characters (utf8 flag on)
|
||||
$self->{rendered} = $html->get_rendered_text();
|
||||
$self->{visible_rendered} = $html->get_rendered_text(invisible => 0);
|
||||
$self->{invisible_rendered} = $html->get_rendered_text(invisible => 1);
|
||||
@ -735,24 +779,56 @@ sub rendered {
|
||||
# end-of-document result values that require looking at the text
|
||||
my $r = $self->{html_results}; # temporary reference for brevity
|
||||
|
||||
# count the number of spaces in the rendered text (likely UTF-8 octets)
|
||||
my $space = $self->{rendered} =~ tr/ \t\n\r\x0b//;
|
||||
# count the number of spaces in the rendered text
|
||||
my $space;
|
||||
if (utf8::is_utf8($self->{rendered})) {
|
||||
my $str = $self->{rendered};
|
||||
$str =~ s/\S+//g; # delete non-whitespace Unicode characters
|
||||
$space = length $str; # count remaining Unicode space characters
|
||||
undef $str; # deallocate storage
|
||||
dbg("message: spaces (Unicode) in HTML: %d out of %d%s",
|
||||
$space, length $self->{rendered},
|
||||
$character_semantics ? '' : ', octets!?');
|
||||
} else {
|
||||
$space = $self->{rendered} =~ tr/ \t\n\r\x0b//;
|
||||
dbg("message: spaces (octets) in HTML: %d out of %d%s",
|
||||
$space, length $self->{rendered},
|
||||
$character_semantics ? ', chars!?' : '');
|
||||
}
|
||||
# we may want to add the count of other Unicode whitespace characters
|
||||
|
||||
$r->{html_length} = length $self->{rendered}; # bytes (likely UTF-8)
|
||||
$r->{html_length} = length $self->{rendered}; # perl characters count
|
||||
$r->{non_space_len} = $r->{html_length} - $space;
|
||||
$r->{ratio} = ($text_len - $r->{html_length}) / $text_len if $text_len;
|
||||
}
|
||||
|
||||
else { # plain text
|
||||
if ($self->{normalize} && $enc_utf8) {
|
||||
# request transcoded result as UTF-8 octets!
|
||||
$text = _normalize($text, $self->{charset}, 0);
|
||||
$text = _normalize($text, $charset, 1); # bytes to chars
|
||||
} elsif ($charset =~ /^(?:US-ASCII|UTF-8)\z/i) {
|
||||
if ($text =~ tr/\x00-\x7F//c) { # non-ASCII, try UTF-8
|
||||
my $rv;
|
||||
# with some luck input can be interpreted as UTF-8
|
||||
if (eval { $rv = $enc_utf8->decode($text, 1|8); defined $rv }) {
|
||||
$text = $rv; # decoded to perl characters
|
||||
dbg("message: decoded as charset UTF-8, declared %s", $charset);
|
||||
} else {
|
||||
my $err = '';
|
||||
if ($@) {
|
||||
$err = $@; $err =~ s/\s+/ /gs; $err =~ s/(.*) at .*/$1/;
|
||||
$err = " ($err)";
|
||||
}
|
||||
$self->{rendered_type} = $self->{type};
|
||||
$self->{rendered} = $self->{'visible_rendered'} = $text;
|
||||
$self->{'invisible_rendered'} = '';
|
||||
dbg("message: failed decoding as charset UTF-8, declared %s%s",
|
||||
$charset, $err);
|
||||
}
|
||||
} else {
|
||||
dbg("message: contains only US-ASCII characters, declared %s, not decoding",
|
||||
$charset);
|
||||
}
|
||||
}
|
||||
$self->{rendered_type} = $type;
|
||||
$self->{rendered} = $self->{visible_rendered} = $text;
|
||||
$self->{invisible_rendered} = '';
|
||||
}
|
||||
|
||||
return ($self->{rendered_type}, $self->{rendered});
|
||||
@ -841,24 +917,24 @@ Delete the specified header (decoded and raw) from the Node information.
|
||||
sub delete_header {
|
||||
my($self, $hdr) = @_;
|
||||
|
||||
foreach ( grep(/^${hdr}$/i, keys %{$self->{'headers'}}) ) {
|
||||
foreach ( grep(/^${hdr}$/io, keys %{$self->{'headers'}}) ) {
|
||||
delete $self->{'headers'}->{$_};
|
||||
delete $self->{'raw_headers'}->{$_};
|
||||
}
|
||||
|
||||
my @neworder = grep(!/^${hdr}$/i, @{$self->{'header_order'}});
|
||||
my @neworder = grep(!/^${hdr}$/io, @{$self->{'header_order'}});
|
||||
$self->{'header_order'} = \@neworder;
|
||||
}
|
||||
|
||||
# decode a header appropriately. don't bother adding it to the pod documents.
|
||||
sub __decode_header {
|
||||
# decode 'encoded-word' (RFC 2047, RFC 2231)
|
||||
sub _decode_mime_encoded_word {
|
||||
my ( $encoding, $cte, $data ) = @_;
|
||||
|
||||
if ( $cte eq 'B' ) {
|
||||
if ( uc $cte eq 'B' ) {
|
||||
# base 64 encoded
|
||||
$data = Mail::SpamAssassin::Util::base64_decode($data);
|
||||
}
|
||||
elsif ( $cte eq 'Q' ) {
|
||||
elsif ( uc $cte eq 'Q' ) {
|
||||
# quoted printable
|
||||
|
||||
# the RFC states that in the encoded text, "_" is equal to "=20"
|
||||
@ -868,12 +944,24 @@ sub __decode_header {
|
||||
}
|
||||
else {
|
||||
# not possible since the input has already been limited to 'B' and 'Q'
|
||||
die "message: unknown encoding type '$cte' in RFC2047 header";
|
||||
die "message: unknown encoding type '$cte' in RFC 2047 header";
|
||||
}
|
||||
return _normalize($data, $encoding, 0); # transcode to UTF-8 octets
|
||||
|
||||
if (defined $encoding) {
|
||||
# RFC 2231 section 5: Language specification in Encoded Words
|
||||
# =?US-ASCII*EN?Q?Keith_Moore?=
|
||||
# strip optional language information following an asterisk
|
||||
$encoding =~ s{ \* .* \z }{}xs;
|
||||
|
||||
$data = _normalize($data, $encoding, 0, 1); # transcode to UTF-8 octets
|
||||
}
|
||||
# dbg("message: _decode_mime_encoded_word (%s, %s): %s",
|
||||
# $cte, $encoding || '-', $data);
|
||||
|
||||
return $data; # as UTF-8 octets
|
||||
}
|
||||
|
||||
# Decode base64 and quoted-printable in headers according to RFC2047.
|
||||
# Decode base64 and quoted-printable in headers according to RFC 2047.
|
||||
#
|
||||
sub _decode_header {
|
||||
my($header_field_body, $header_field_name) = @_;
|
||||
@ -881,35 +969,86 @@ sub _decode_header {
|
||||
return '' unless defined $header_field_body && $header_field_body ne '';
|
||||
|
||||
# deal with folding and cream the newlines and such
|
||||
$header_field_body =~ s/\n[ \t]+/\n /g;
|
||||
$header_field_body =~ s/\n[ \t]/\n /g; # turning tab into space on folds
|
||||
$header_field_body =~ s/\015?\012//gs;
|
||||
|
||||
if ($header_field_body =~ tr/\x00-\x7F//c) {
|
||||
# Non-ASCII characters in header are not allowed by RFC 5322, but
|
||||
# RFC 6532 relaxed the rule and allows UTF-8 encoding in header
|
||||
# field bodies; no other encoding is allowed there (apart from
|
||||
# RFC 2047 MIME encoded words, which must be all-ASCII anyway).
|
||||
# The following call keeps UTF-8 octets if valid, otherwise tries
|
||||
# some decoding guesswork so that the result is valid UTF-8 (octets).
|
||||
$header_field_body = _normalize($header_field_body, 'UTF-8', 0);
|
||||
}
|
||||
|
||||
if ($header_field_name =~
|
||||
/^ (?: Received | (?:Resent-)? (?: Message-ID | Date ) |
|
||||
MIME-Version | References | In-Reply-To | List-.* ) \z /xsi ) {
|
||||
# Bug 6945: some header fields must not be processed for MIME encoding
|
||||
# Bug 7466: leave out the Content-*
|
||||
# Bug 7249: leave out the Content-*
|
||||
|
||||
} else {
|
||||
local($1,$2,$3);
|
||||
} elsif (index($header_field_body, '=?') != -1) { # triage for possible encoded-words
|
||||
local($1,$2,$3,$4);
|
||||
|
||||
# Multiple encoded sections must ignore the interim whitespace.
|
||||
# To avoid possible FPs with (\s+(?==\?))?, look for the whole RE
|
||||
# separated by whitespace.
|
||||
1 while $header_field_body =~
|
||||
s{ ( = \? [A-Za-z0-9_-]+ \? [bqBQ] \? [^?]* \? = ) \s+
|
||||
( = \? [A-Za-z0-9_-]+ \? [bqBQ] \? [^?]* \? = ) }
|
||||
{$1$2}xsg;
|
||||
|
||||
# transcode properly encoded RFC 2047 substrings into UTF-8 octets,
|
||||
# leave everything else unchanged as it is supposed to be UTF-8 (RFC 6532)
|
||||
# or plain US-ASCII
|
||||
$header_field_body =~
|
||||
s{ (?: = \? ([A-Za-z0-9_-]+) \? ([bqBQ]) \? ([^?]*) \? = ) }
|
||||
{ __decode_header($1, uc($2), $3) }xsge;
|
||||
s{ ( = \? [A-Za-z0-9*_-]+ \? [bqBQ] \? [^?]* \? = ) \s+
|
||||
(?= = \? [A-Za-z0-9*_-]+ \? [bqBQ] \? [^?]* \? = ) }{$1}xsg;
|
||||
|
||||
# Bug 7249: work around violations of the RFC 2047 section 5 requirement:
|
||||
# Each 'encoded-word' MUST represent an integral number of characters.
|
||||
# A multi-octet character may not be split across adjacent 'encoded-word's
|
||||
# Unfortunately such violations are not uncommon.
|
||||
#
|
||||
# Bug 7307: to deal with the above, base64/QP decoding must be decoupled
|
||||
# from decoding a specified multi-byte character set into UTF-8.
|
||||
# A previous simpler code could not handle base64 fill bits correctly
|
||||
# (merging of adjecent encoded sections before base64/QP decoding them).
|
||||
|
||||
my @sections; # array of pairs: [string, encoding]
|
||||
my $last_encoding = '';
|
||||
while ( $header_field_body =~
|
||||
m{ \G = \? ([A-Za-z0-9*_-]+) \? ([bqBQ]) \? ([^?]*) \? =
|
||||
| ( [^=]+ | . ) }xsg ) {
|
||||
my($encoding, $str);
|
||||
if (defined $1) { # we have an encoded section
|
||||
$encoding = lc $1;
|
||||
# decode base64 / QP decoding, remember encoding charset
|
||||
$str = _decode_mime_encoded_word(undef, $2, $3);
|
||||
} else { # non-encoded text
|
||||
$encoding = '';
|
||||
$str = $4;
|
||||
}
|
||||
if ($encoding eq $last_encoding && @sections) {
|
||||
# merge sections with same encoding - in violation of RFC 2047 sect.5
|
||||
$sections[$#sections]->[0] .= $str;
|
||||
} else {
|
||||
push(@sections, [$str, $encoding]);
|
||||
}
|
||||
$last_encoding = $encoding;
|
||||
}
|
||||
|
||||
# dbg("message: _decode_header %s: %s", $header_field_name, $header_field_body);
|
||||
# transcode encoded RFC 2047 substrings (already base64/QP-decoded)
|
||||
# into UTF-8 octets, leave everything else unchanged as it is supposed
|
||||
# to be UTF-8 (RFC 6532) or its plain US-ASCII subset (RFC 5322);
|
||||
#
|
||||
my $decoded_result = '';
|
||||
for my $sect (@sections) {
|
||||
my $encoding = $sect->[1];
|
||||
# RFC 2231 section 5: Language specification in Encoded Words
|
||||
# =?US-ASCII*EN?Q?Keith_Moore?=
|
||||
# strip optional language information following an asterisk
|
||||
$encoding =~ s{ \* .* \z }{}xs;
|
||||
$decoded_result .=
|
||||
$encoding eq '' ? $sect->[0] : _normalize($sect->[0], $encoding, 0, 1);
|
||||
}
|
||||
$header_field_body = $decoded_result;
|
||||
}
|
||||
|
||||
dbg("message: _decode_header %s: %s", $header_field_name, $header_field_body);
|
||||
return $header_field_body;
|
||||
}
|
||||
|
||||
|
@ -33,7 +33,7 @@ BEGIN {
|
||||
eval {
|
||||
require Net::Patricia;
|
||||
Net::Patricia->VERSION(1.16); # need AF_INET6 support
|
||||
import Net::Patricia;
|
||||
Net::Patricia->import;
|
||||
$have_patricia = 1;
|
||||
};
|
||||
}
|
||||
@ -59,6 +59,7 @@ sub new {
|
||||
|
||||
sub DESTROY {
|
||||
my($self) = shift;
|
||||
|
||||
if (exists $self->{cache}) {
|
||||
local($@, $!, $_); # protect outer layers from a potential surprise
|
||||
my($hits, $attempts) = ($self->{cache_hits}, $self->{cache_attempts});
|
||||
@ -76,7 +77,45 @@ sub add_cidr {
|
||||
my $numadded = 0;
|
||||
delete $self->{cache}; # invalidate cache (in case of late additions)
|
||||
|
||||
# Pre-parse x.x.x.x-x.x.x.x range notation into CIDR blocks
|
||||
# requires Net::CIDR::Lite
|
||||
my @nets2;
|
||||
foreach my $cidr_orig (@nets) {
|
||||
next if index($cidr_orig, '-') == -1; # Triage
|
||||
my $cidr = $cidr_orig;
|
||||
my $exclude = ($cidr =~ s/^!\s*//) ? 1 : 0;
|
||||
local($1);
|
||||
$cidr =~ s/\b0+(\d+)/$1/; # Strip leading zeroes
|
||||
eval { require Net::CIDR::Lite; }; # Only try to load now when it's necessary
|
||||
if ($@) {
|
||||
warn "netset: IP range notation '$cidr_orig' requires Net::CIDR::Lite module, ignoring\n";
|
||||
$cidr_orig = undef;
|
||||
next;
|
||||
}
|
||||
my $cidrs = Net::CIDR::Lite->new;
|
||||
eval { $cidrs->add_range($cidr); };
|
||||
if ($@) {
|
||||
my $err = $@; $err =~ s/ at .*//s;
|
||||
warn "netset: illegal IP range '$cidr_orig': $err\n";
|
||||
$cidr_orig = undef;
|
||||
next;
|
||||
}
|
||||
my @arr = $cidrs->list;
|
||||
if (!@arr) {
|
||||
my $err = $@; $err =~ s/ at .*//s;
|
||||
warn "netset: failed to parse IP range '$cidr_orig': $err\n";
|
||||
$cidr_orig = undef;
|
||||
next;
|
||||
}
|
||||
# Save exclude flag
|
||||
if ($exclude) { $_ = "!$_" foreach (@arr); }
|
||||
# Rewrite this @nets value directly, add any rest to @nets2
|
||||
$cidr_orig = shift @arr;
|
||||
push @nets2, @arr if @arr;
|
||||
}
|
||||
|
||||
foreach my $cidr_orig (@nets, @nets2) {
|
||||
next unless defined $cidr_orig;
|
||||
my $cidr = $cidr_orig; # leave original unchanged, useful for logging
|
||||
|
||||
# recognizes syntax:
|
||||
|
@ -21,7 +21,7 @@ Mail::SpamAssassin::PerMsgLearner - per-message status (spam or not-spam)
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
my $spamtest = new Mail::SpamAssassin ({
|
||||
my $spamtest = Mail::SpamAssassin->new({
|
||||
'rules_filename' => '/etc/spamassassin.rules',
|
||||
'userprefs_filename' => $ENV{HOME}.'/.spamassassin/user_prefs'
|
||||
});
|
||||
@ -97,8 +97,8 @@ sub learn_spam {
|
||||
my ($self, $id) = @_;
|
||||
|
||||
# bug 4096
|
||||
# if ($self->{main}->{learn_with_whitelist}) {
|
||||
# $self->{main}->add_all_addresses_to_blacklist ($self->{msg});
|
||||
# if ($self->{main}->{learn_with_welcomelist}) {
|
||||
# $self->{main}->add_all_addresses_to_blocklist ($self->{msg});
|
||||
# }
|
||||
|
||||
# use the real message-id here instead of mass-check's idea of an "id",
|
||||
@ -124,8 +124,8 @@ sub learn_ham {
|
||||
my ($self, $id) = @_;
|
||||
|
||||
# bug 4096
|
||||
# if ($self->{main}->{learn_with_whitelist}) {
|
||||
# $self->{main}->add_all_addresses_to_whitelist ($self->{msg});
|
||||
# if ($self->{main}->{learn_with_welcomelist}) {
|
||||
# $self->{main}->add_all_addresses_to_welcomelist ($self->{msg});
|
||||
# }
|
||||
|
||||
$self->{learned} = $self->{bayes_scanner}->learn (0, $self->{msg}, $id);
|
||||
@ -148,8 +148,8 @@ sub forget {
|
||||
my ($self, $id) = @_;
|
||||
|
||||
# bug 4096
|
||||
# if ($self->{main}->{learn_with_whitelist}) {
|
||||
# $self->{main}->remove_all_addresses_from_whitelist ($self->{msg});
|
||||
# if ($self->{main}->{learn_with_welcomelist}) {
|
||||
# $self->{main}->remove_all_addresses_from_welcomelist ($self->{msg});
|
||||
# }
|
||||
|
||||
$self->{learned} = $self->{bayes_scanner}->forget ($self->{msg}, $id);
|
||||
|
File diff suppressed because it is too large
Load Diff
@ -33,7 +33,7 @@ SpamAssassin will call:
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
All persistent address list implementations, used by the auto-whitelist
|
||||
All persistent address list implementations, used by the auto-welcomelist
|
||||
code to track known-good email addresses, use this as a base class.
|
||||
|
||||
See C<Mail::SpamAssassin::DBBasedAddrList> for an example.
|
||||
@ -81,7 +81,7 @@ SpamAssassin classes.
|
||||
|
||||
sub new_checker {
|
||||
my ($factory, $main) = @_;
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
die "auto-welcomelist: unimplemented base method"; # override this
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
@ -109,7 +109,7 @@ a C<count> key and a C<totscore> key.
|
||||
sub get_addr_entry {
|
||||
my ($self, $addr, $signedby) = @_;
|
||||
my $entry = { };
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
die "auto-welcomelist: unimplemented base method"; # override this
|
||||
return $entry;
|
||||
}
|
||||
|
||||
@ -117,27 +117,27 @@ sub get_addr_entry {
|
||||
|
||||
=item $entry = $addrlist->add_score($entry, $score);
|
||||
|
||||
This method should add the given score to the whitelist database for the
|
||||
This method should add the given score to the welcomelist database for the
|
||||
given entry, and then return the new entry.
|
||||
|
||||
=cut
|
||||
|
||||
sub add_score {
|
||||
my ($self, $entry, $score) = @_;
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
die "auto-welcomelist: unimplemented base method"; # override this
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
=item $entry = $addrlist->remove_entry ($entry);
|
||||
|
||||
This method should remove the given entry from the whitelist database.
|
||||
This method should remove the given entry from the welcomelist database.
|
||||
|
||||
=cut
|
||||
|
||||
sub remove_entry {
|
||||
my ($self, $entry) = @_;
|
||||
die "auto-whitelist: unimplemented base method"; # override this
|
||||
die "auto-welcomelist: unimplemented base method"; # override this
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
@ -145,7 +145,7 @@ sub remove_entry {
|
||||
=item $entry = $addrlist->finish ();
|
||||
|
||||
Clean up, if necessary. Called by SpamAssassin when it has finished
|
||||
checking, or adding to, the auto-whitelist database.
|
||||
checking, or adding to, the auto-welcomelist database.
|
||||
|
||||
=cut
|
||||
|
||||
|
@ -427,6 +427,20 @@ The C<Mail::SpamAssassin::PerMsgStatus> context object for this scan.
|
||||
|
||||
=back
|
||||
|
||||
=item $plugin->check_dnsbl ( { options ... } )
|
||||
|
||||
Called when DNSBL or other network lookups are being launched, implying
|
||||
current running priority of -100. This is the place to start your own
|
||||
asynchronously-started network lookups.
|
||||
|
||||
=over 4
|
||||
|
||||
=item permsgstatus
|
||||
|
||||
The C<Mail::SpamAssassin::PerMsgStatus> context object for this scan.
|
||||
|
||||
=back
|
||||
|
||||
=item $plugin->check_post_dnsbl ( { options ... } )
|
||||
|
||||
Called after the DNSBL results have been harvested. This is a good
|
||||
@ -440,6 +454,21 @@ The C<Mail::SpamAssassin::PerMsgStatus> context object for this scan.
|
||||
|
||||
=back
|
||||
|
||||
=item $plugin->check_cleanup ( { options ... } )
|
||||
|
||||
Called just before message check is finishing and before possible
|
||||
auto-learning. This is guaranteed to be always called, unlike check_tick
|
||||
and check_post_dnsbl. Used for cleaning up left callbacks or forked
|
||||
children etc, last chance to make rules hit.
|
||||
|
||||
=over 4
|
||||
|
||||
=item permsgstatus
|
||||
|
||||
The C<Mail::SpamAssassin::PerMsgStatus> context object for this scan.
|
||||
|
||||
=back
|
||||
|
||||
=item $plugin->check_post_learn ( { options ... } )
|
||||
|
||||
Called after auto-learning may (or may not) have taken place. If you
|
||||
@ -817,7 +846,9 @@ Reference to the original message object.
|
||||
|
||||
=back
|
||||
|
||||
=item $plugin->whitelist_address( { options ... } )
|
||||
=item $plugin->welcomelist_address( { options ... } )
|
||||
|
||||
Previously whitelist_address which will work interchangeably until 4.1.
|
||||
|
||||
Called when a request is made to add an address to a
|
||||
persistent address list.
|
||||
@ -834,7 +865,9 @@ Indicate if the call is being made from a command line interface.
|
||||
|
||||
=back
|
||||
|
||||
=item $plugin->blacklist_address( { options ... } )
|
||||
=item $plugin->blocklist_address( { options ... } )
|
||||
|
||||
Previously blacklist_address which will work interchangeably until 4.1.
|
||||
|
||||
Called when a request is made to add an address to a
|
||||
persistent address list.
|
||||
@ -882,7 +915,7 @@ only spamd calls this API.
|
||||
=item result
|
||||
|
||||
The C<'result: ...'> line for this scan. Format is as described
|
||||
at B<http://wiki.apache.org/spamassassin/SpamdSyslogFormat>.
|
||||
at B<https://wiki.apache.org/spamassassin/SpamdSyslogFormat>.
|
||||
|
||||
=back
|
||||
|
||||
@ -1013,17 +1046,27 @@ to receive specific events, or control the callback chain behaviour.
|
||||
|
||||
=over 4
|
||||
|
||||
=item $plugin->register_eval_rule ($nameofevalsub)
|
||||
=item $plugin->register_eval_rule ($nameofevalsub, $ruletype)
|
||||
|
||||
Plugins that implement an eval test will need to call this, so that
|
||||
SpamAssassin calls into the object when that eval test is encountered.
|
||||
See the B<REGISTERING EVAL RULES> section for full details.
|
||||
|
||||
Since 4.0, optional $ruletype can be specified to enforce that eval function
|
||||
cannot be called with wrong ruletype from configuration, for example user
|
||||
using "header FOO eval:foobar()" instead of "body FOO eval:foobar()".
|
||||
Mismatch will result in lint failure. $ruletype can be one of:
|
||||
|
||||
$Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS
|
||||
$Mail::SpamAssassin::Conf::TYPE_BODY_EVALS (allows both body and rawbody)
|
||||
$Mail::SpamAssassin::Conf::TYPE_RAWBODY_EVALS
|
||||
$Mail::SpamAssassin::Conf::TYPE_FULL_EVALS
|
||||
|
||||
=cut
|
||||
|
||||
sub register_eval_rule {
|
||||
my ($self, $nameofsub) = @_;
|
||||
$self->{main}->{conf}->register_eval_rule ($self, $nameofsub);
|
||||
my ($self, $nameofsub, $ruletype) = @_;
|
||||
$self->{main}->{conf}->register_eval_rule ($self, $nameofsub, $ruletype);
|
||||
}
|
||||
|
||||
=item $plugin->register_generated_rule_method ($nameofsub)
|
||||
@ -1132,7 +1175,7 @@ called from rules in the configuration files, in the plugin class' constructor.
|
||||
|
||||
For example,
|
||||
|
||||
$plugin->register_eval_rule ('check_for_foo')
|
||||
$plugin->register_eval_rule ('check_for_foo', $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS)
|
||||
|
||||
will cause C<$plugin-E<gt>check_for_foo()> to be called for this
|
||||
SpamAssassin rule:
|
||||
@ -1157,18 +1200,21 @@ In other words, the eval test method should look something like this:
|
||||
|
||||
sub check_for_foo {
|
||||
my ($self, $permsgstatus, ...arguments...) = @_;
|
||||
...code returning 0 or 1
|
||||
...code returning 0 (miss), 1 (hit), or undef (async function)
|
||||
}
|
||||
|
||||
The eval rule should return C<1> for a hit, or C<0> if the rule is not hit.
|
||||
Special case of "return undef" must be used when result is not yet ready and
|
||||
it will be later declared with PerMsgStatus functions got_hit() or
|
||||
rule_ready() - see their documentation for more info. Make sure not to
|
||||
return undef by mistake.
|
||||
|
||||
Note that the headers can be accessed using the C<get()> method on the
|
||||
C<Mail::SpamAssassin::PerMsgStatus> object, and the body by
|
||||
C<get_decoded_stripped_body_text_array()> and other similar methods.
|
||||
Similarly, the C<Mail::SpamAssassin::Conf> object holding the current
|
||||
configuration may be accessed through C<$permsgstatus-E<gt>{main}-E<gt>{conf}>.
|
||||
|
||||
The eval rule should return C<1> for a hit, or C<0> if the rule
|
||||
is not hit.
|
||||
|
||||
State for a single message being scanned should be stored on the C<$permsgstatus>
|
||||
object, not on the C<$self> object, since C<$self> persists between scan
|
||||
operations. See the 'lifecycle note' on the C<check_start()> method above.
|
||||
@ -1216,8 +1262,8 @@ Mail::SpamAssassin(3)
|
||||
|
||||
Mail::SpamAssassin::PerMsgStatus(3)
|
||||
|
||||
http://wiki.apache.org/spamassassin/PluginWritingTips
|
||||
https://wiki.apache.org/spamassassin/PluginWritingTips
|
||||
|
||||
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=2163
|
||||
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=2163
|
||||
|
||||
=cut
|
||||
|
@ -50,13 +50,30 @@ Autonomous System Number (ASN) of the connecting IP address.
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::ASN
|
||||
|
||||
# Default / recommended settings
|
||||
asn_use_geodb 1
|
||||
asn_use_dns 1
|
||||
asn_prefer_geodb 1
|
||||
|
||||
# Do lookups and add tags / X-Spam-ASN header
|
||||
asn_lookup asn.routeviews.org _ASN_ _ASNCIDR_
|
||||
|
||||
asn_lookup_ipv6 origin6.asn.cymru.com _ASN_ _ASNCIDR_
|
||||
|
||||
add_header all ASN _ASN_ _ASNCIDR_
|
||||
|
||||
header TEST_AS1234 X-ASN =~ /^1234$/
|
||||
# Rules to test ASN or Organization
|
||||
# NOTE: Do not use rules that check metadata X-ASN header,
|
||||
# only check_asn() eval function works correctly.
|
||||
# Rule argument is full regexp to match.
|
||||
|
||||
# ASN Number: GeoIP ASN or DNS
|
||||
# Matched string includes asn_prefix if defined, and normally
|
||||
# looks like "AS1234" (DNS) or "AS1234 Google LLC" (GeoIP)
|
||||
header AS_1234 eval:check_asn('/^AS1234\b/')
|
||||
|
||||
# ASN Organisation: GeoIP ASN has, DNS lists might not have
|
||||
# Note the second parameter which checks MYASN tag (default is ASN)
|
||||
asn_lookup myview.example.com _MYASN_ _MYASNCIDR_
|
||||
header AS_GOOGLE eval:check_asn('/\bGoogle\b/i', 'MYASN')
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
@ -68,12 +85,17 @@ high-volume environment> or that you should use a local mirror of the
|
||||
zone (see C<ftp://ftp.routeviews.org/dnszones/>). Other similar zones
|
||||
may also be used.
|
||||
|
||||
GeoDB (GeoIP ASN) database lookups are supported since SpamAssassin 4.0 and
|
||||
it's recommended to use them instead of DNS queries, unless C<_ASNCIDR_>
|
||||
is needed.
|
||||
|
||||
=head1 TEMPLATE TAGS
|
||||
|
||||
This plugin allows you to create template tags containing the connecting
|
||||
IP's AS number and route info for that AS number.
|
||||
|
||||
The default config will add a header field that looks like this:
|
||||
If you use add_header as documented in the example before, a header field is
|
||||
added that looks like this:
|
||||
|
||||
X-Spam-ASN: AS24940 213.239.192.0/18
|
||||
|
||||
@ -87,29 +109,19 @@ all be added to the C<_ASNCIDR_> tag, separated by spaces, eg:
|
||||
Note that the literal "AS" before the ASN in the _ASN_ tag is configurable
|
||||
through the I<asn_prefix> directive and may be set to an empty string.
|
||||
|
||||
=head1 CONFIGURATION
|
||||
C<_ASNCIDR_> is not available with local GeoDB ASN lookups.
|
||||
|
||||
The standard ruleset contains a configuration that will add a header field
|
||||
containing ASN data to scanned messages. The bayes tokenizer will use the
|
||||
added header field for bayes calculations, and thus affect which BAYES_* rule
|
||||
will trigger for a particular message.
|
||||
=head1 BAYES
|
||||
|
||||
B<Note> that in most cases you should not score on the ASN data directly.
|
||||
Bayes learning will probably trigger on the _ASNCIDR_ tag, but probably not
|
||||
very well on the _ASN_ tag alone.
|
||||
The bayes tokenizer will use ASN data for bayes calculations, and thus
|
||||
affect which BAYES_* rule will trigger for a particular message. No
|
||||
in-depth analysis of the usefulness of bayes tokenization of ASN data has
|
||||
been performed.
|
||||
|
||||
=head1 SEE ALSO
|
||||
|
||||
http://www.routeviews.org/ - all data regarding routing, ASNs, etc....
|
||||
|
||||
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4770 -
|
||||
SpamAssassin Issue #4770 concerning this plugin
|
||||
|
||||
=head1 STATUS
|
||||
|
||||
No in-depth analysis of the usefulness of bayes tokenization of ASN data has
|
||||
been performed.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::ASN;
|
||||
@ -120,27 +132,23 @@ use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address);
|
||||
use Mail::SpamAssassin::Dns;
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address compile_regexp);
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
our $txtdata_can_provide_a_list;
|
||||
|
||||
my $IPV4_ADDRESS = IPV4_ADDRESS;
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule("check_asn", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
$self->set_config($mailsa->{conf});
|
||||
|
||||
#$txtdata_can_provide_a_list = Net::DNS->VERSION >= 0.69;
|
||||
#more robust version check from Damyan Ivanov - Bug 7095
|
||||
$txtdata_can_provide_a_list = version->parse(Net::DNS->VERSION) >= version->parse('0.69');
|
||||
# we need GeoDB ASN
|
||||
$self->{main}->{geodb_wanted}->{asn} = 1;
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -201,6 +209,25 @@ is kept for backward compatibility with versions of SpamAssassin earlier
|
||||
than 3.4.0. A sensible setting is an empty string. The argument may be (but
|
||||
need not be) enclosed in single or double quotes for clarity.
|
||||
|
||||
=item asn_use_geodb ( 0 / 1 ) (default: 1)
|
||||
|
||||
Use Mail::SpamAssassin::GeoDB module to lookup ASN numbers. You need
|
||||
suitable supported module like GeoIP2 or GeoIP with ISP or ASN database
|
||||
installed (for example, add EditionIDs GeoLite2-ASN in GeoIP.conf for
|
||||
geoipupdate program).
|
||||
|
||||
GeoDB can only set _ASN_ tag, it has no data for _ASNCIDR_. If you need
|
||||
both, then set asn_prefer_geodb 0 so DNS rules are tried.
|
||||
|
||||
=item asn_prefer_geodb ( 0 / 1 ) (default: 1)
|
||||
|
||||
If set, DNS lookups (asn_lookup rules) will not be run if GeoDB successfully
|
||||
finds ASN. Set this to 0 to get _ASNCIDR_ even if GeoDB finds _ASN_.
|
||||
|
||||
=item asn_use_dns ( 0 / 1 ) (default: 1)
|
||||
|
||||
Set to 0 to never allow DNS queries.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
@ -272,26 +299,53 @@ need not be) enclosed in single or double quotes for clarity.
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'asn_use_geodb',
|
||||
default => 1,
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'asn_prefer_geodb',
|
||||
default => 1,
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'asn_use_dns',
|
||||
default => 1,
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub parsed_metadata {
|
||||
sub extract_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $conf = $self->{main}->{conf};
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
if (!$pms->is_dns_available()) {
|
||||
dbg("asn: DNS is not available, skipping ASN checks");
|
||||
my $geodb = $self->{main}->{geodb};
|
||||
my $has_geodb = $conf->{asn_use_geodb} && $geodb && $geodb->can('asn');
|
||||
if ($has_geodb) {
|
||||
dbg("asn: using GeoDB ASN for lookups");
|
||||
} else {
|
||||
dbg("asn: GeoDB ASN not available");
|
||||
if (!$conf->{asn_use_dns} || !$pms->is_dns_available()) {
|
||||
dbg("asn: DNS is not available, skipping ASN check");
|
||||
return;
|
||||
}
|
||||
|
||||
if (!$conf->{asnlookups} && !$conf->{asnlookups_ipv6}) {
|
||||
dbg("asn: no asn_lookups configured, skipping ASN lookups");
|
||||
if ($self->{main}->{learning}) {
|
||||
dbg("asn: learning message, skipping DNS-based ASN check");
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
# initialize the tag data so that if no result is returned from the DNS
|
||||
# query we won't end up with a missing tag. Don't use $pms->set_tag()
|
||||
@ -309,9 +363,11 @@ sub parsed_metadata {
|
||||
}
|
||||
}
|
||||
|
||||
# get reversed IP address of last external relay to lookup
|
||||
# don't return until we've initialized the template tags
|
||||
my $relay = $pms->{relays_external}->[0];
|
||||
# Initialize status
|
||||
$pms->{asn_results} = ();
|
||||
|
||||
# get IP address of last external relay to lookup
|
||||
my $relay = $opts->{msg}->{metadata}->{relays_external}->[0];
|
||||
if (!defined $relay) {
|
||||
dbg("asn: no first external relay IP available, skipping ASN check");
|
||||
return;
|
||||
@ -319,18 +375,45 @@ sub parsed_metadata {
|
||||
dbg("asn: first external relay is a private IP, skipping ASN check");
|
||||
return;
|
||||
}
|
||||
|
||||
my $ip = $relay->{ip};
|
||||
my $reversed_ip = reverse_ip_address($ip);
|
||||
if (defined $reversed_ip) {
|
||||
dbg("asn: using first external relay IP for lookups: %s", $ip);
|
||||
|
||||
# GeoDB lookup
|
||||
my $asn_found;
|
||||
if ($has_geodb) {
|
||||
my $asn = $geodb->get_asn($ip);
|
||||
my $org = $geodb->get_asn_org($ip);
|
||||
if (!defined $asn) {
|
||||
dbg("asn: GeoDB ASN lookup failed");
|
||||
} else {
|
||||
dbg("asn: could not parse first external relay IP: %s, skipping", $ip);
|
||||
$asn_found = 1;
|
||||
dbg("asn: GeoDB found ASN $asn");
|
||||
# Prevent double prefix
|
||||
my $asn_value =
|
||||
length($conf->{asn_prefix}) && index($asn, $conf->{asn_prefix}) != 0 ?
|
||||
$conf->{asn_prefix}.$asn : $asn;
|
||||
$asn_value .= ' '.$org if defined $org && length($org);
|
||||
$pms->set_tag('ASN', $asn_value);
|
||||
# For Bayes
|
||||
$pms->{msg}->put_metadata('X-ASN', $asn);
|
||||
}
|
||||
}
|
||||
|
||||
# Skip DNS if GeoDB was successful and preferred
|
||||
if ($asn_found && $conf->{asn_prefer_geodb}) {
|
||||
dbg("asn: GeoDB lookup successful, skipping DNS lookups");
|
||||
return;
|
||||
}
|
||||
|
||||
# No point continuing without DNS from now on
|
||||
if (!$conf->{asn_use_dns} || !$pms->is_dns_available()) {
|
||||
dbg("asn: skipping disabled DNS lookups");
|
||||
return;
|
||||
}
|
||||
|
||||
dbg("asn: using DNS for lookups");
|
||||
my $lookup_zone;
|
||||
if ($ip =~ /^$IPV4_ADDRESS$/o) {
|
||||
if ($ip =~ IS_IPV4_ADDRESS) {
|
||||
if (!defined $conf->{asnlookups}) {
|
||||
dbg("asn: asn_lookup for IPv4 not defined, skipping");
|
||||
return;
|
||||
@ -344,6 +427,12 @@ sub parsed_metadata {
|
||||
$lookup_zone = "asnlookups_ipv6";
|
||||
}
|
||||
|
||||
my $reversed_ip = reverse_ip_address($ip);
|
||||
if (!defined $reversed_ip) {
|
||||
dbg("asn: could not parse IP: %s, skipping", $ip);
|
||||
return;
|
||||
}
|
||||
|
||||
# we use arrays and array indices rather than hashes and hash keys
|
||||
# in case someone wants the same zone added to multiple sets of tags
|
||||
my $index = 0;
|
||||
@ -351,14 +440,12 @@ sub parsed_metadata {
|
||||
# do the DNS query, have the callback process the result
|
||||
my $zone_index = $index;
|
||||
my $zone = $reversed_ip . '.' . $entry->{zone};
|
||||
my $key = "asnlookup-${lookup_zone}-${zone_index}-".$entry->{zone};
|
||||
my $ent = $pms->{async}->bgsend_and_start_lookup($zone, 'TXT', undef,
|
||||
{ type => 'ASN', key => $key, zone => $lookup_zone },
|
||||
$pms->{async}->bgsend_and_start_lookup($zone, 'TXT', undef,
|
||||
{ rulename => 'asn_lookup', type => 'ASN' },
|
||||
sub { my($ent, $pkt) = @_;
|
||||
$self->process_dns_result($pms, $pkt, $zone_index, $lookup_zone) },
|
||||
master_deadline => $pms->{master_deadline}
|
||||
);
|
||||
$pms->register_async_rule_start($key) if $ent;
|
||||
$index++;
|
||||
}
|
||||
}
|
||||
@ -376,6 +463,9 @@ sub parsed_metadata {
|
||||
sub process_dns_result {
|
||||
my ($self, $pms, $pkt, $zone_index, $lookup_zone) = @_;
|
||||
|
||||
# NOTE: $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
return if !$pkt;
|
||||
|
||||
my $conf = $self->{main}->{conf};
|
||||
|
||||
my $zone = $conf->{$lookup_zone}[$zone_index]->{zone};
|
||||
@ -403,14 +493,10 @@ sub process_dns_result {
|
||||
%route_tag_data_seen = map(($_,1), @route_tag_data);
|
||||
}
|
||||
|
||||
# NOTE: $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
my @answer = !defined $pkt ? () : $pkt->answer;
|
||||
|
||||
foreach my $rr (@answer) {
|
||||
foreach my $rr ($pkt->answer) {
|
||||
#dbg("asn: %s: lookup result packet: %s", $zone, $rr->string);
|
||||
next if $rr->type ne 'TXT';
|
||||
my @strings = $txtdata_can_provide_a_list ? $rr->txtdata :
|
||||
$rr->char_str_list; # historical
|
||||
my @strings = $rr->txtdata;
|
||||
next if !@strings;
|
||||
for (@strings) { utf8::encode($_) if utf8::is_utf8($_) }
|
||||
|
||||
@ -485,7 +571,49 @@ sub process_dns_result {
|
||||
}
|
||||
}
|
||||
|
||||
sub check_asn {
|
||||
my ($self, $pms, $re, $asn_tag) = @_;
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
if (!defined $re) {
|
||||
warn "asn: rule $rulename eval argument missing\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
my ($rec, $err) = compile_regexp($re, 2);
|
||||
if (!$rec) {
|
||||
warn "asn: invalid regexp for $rulename '$re': $err\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
$asn_tag = 'ASN' unless defined $asn_tag;
|
||||
$pms->action_depends_on_tags($asn_tag,
|
||||
sub { my($pms,@args) = @_;
|
||||
$self->_check_asn($pms, $rulename, $rec, $asn_tag);
|
||||
}
|
||||
);
|
||||
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
sub _check_asn {
|
||||
my ($self, $pms, $rulename, $rec, $asn_tag) = @_;
|
||||
|
||||
$pms->rule_ready($rulename); # mark rule ready for metas
|
||||
|
||||
my $asn = $pms->get_tag($asn_tag);
|
||||
return if !defined $asn;
|
||||
|
||||
if ($asn =~ $rec) {
|
||||
$pms->test_log("$asn_tag: $asn", $rulename);
|
||||
$pms->got_hit($rulename, "");
|
||||
}
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_asn_lookup_ipv6 { 1 }
|
||||
sub has_asn_geodb { 1 }
|
||||
sub has_check_asn { 1 }
|
||||
sub has_check_asn_tag { 1 } # $asn_tag parameter for check_asn()
|
||||
|
||||
1;
|
||||
|
@ -17,7 +17,7 @@
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::AWL - Normalize scores via auto-whitelist
|
||||
Mail::SpamAssassin::Plugin::AWL - Normalize scores via auto-welcomelist
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
@ -28,14 +28,14 @@ To try this out, add this or uncomment this line in init.pre:
|
||||
Use the supplied 60_awl.cf file (ie you don't have to do anything) or
|
||||
add these lines to a .cf file:
|
||||
|
||||
header AWL eval:check_from_in_auto_whitelist()
|
||||
describe AWL From: address is in the auto white-list
|
||||
header AWL eval:check_from_in_auto_welcomelist()
|
||||
describe AWL From: address is in the auto welcome-list
|
||||
tflags AWL userconf noautolearn
|
||||
priority AWL 1000
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin module provides support for the auto-whitelist. It keeps
|
||||
This plugin module provides support for the auto-welcomelist. It keeps
|
||||
track of the average SpamAssassin score for senders. Senders are
|
||||
tracked using a combination of their From: address and their IP address.
|
||||
It then uses that average score to reduce the variability in scoring
|
||||
@ -63,7 +63,7 @@ use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::AutoWhitelist;
|
||||
use Mail::SpamAssassin::AutoWelcomelist;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
@ -80,7 +80,8 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_from_in_auto_whitelist");
|
||||
$self->register_eval_rule("check_from_in_auto_welcomelist", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_from_in_auto_whitelist", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS); # removed in 4.1
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
@ -99,18 +100,20 @@ SpamAssassin handles incoming email messages.
|
||||
|
||||
=over 4
|
||||
|
||||
=item use_auto_whitelist ( 0 | 1 ) (default: 1)
|
||||
=item use_auto_welcomelist ( 0 | 1 ) (default: 1)
|
||||
|
||||
Whether to use auto-whitelists. Auto-whitelists track the long-term
|
||||
Previously use_auto_whitelist which will work interchangeably until 4.1.
|
||||
|
||||
Whether to use auto-welcomelists. Auto-welcomelists track the long-term
|
||||
average score for each sender and then shift the score of new messages
|
||||
toward that long-term average. This can increase or decrease the score
|
||||
for messages, depending on the long-term behavior of the particular
|
||||
correspondent.
|
||||
|
||||
For more information about the auto-whitelist system, please look
|
||||
at the C<Automatic Whitelist System> section of the README file.
|
||||
The auto-whitelist is not intended as a general-purpose replacement
|
||||
for static whitelist entries added to your config files.
|
||||
For more information about the auto-welcomelist system, please look
|
||||
at the C<Automatic Welcomelist System> section of the README file.
|
||||
The auto-welcomelist is not intended as a general-purpose replacement
|
||||
for static welcomelist entries added to your config files.
|
||||
|
||||
Note that certain tests are ignored when determining the final
|
||||
message score:
|
||||
@ -120,12 +123,15 @@ message score:
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'use_auto_whitelist',
|
||||
setting => 'use_auto_welcomelist',
|
||||
aliases => ['use_auto_whitelist'], # backward compatible - to be removed for 4.1
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
|
||||
=item auto_whitelist_factor n (default: 0.5, range [0..1])
|
||||
=item auto_welcomelist_factor n (default: 0.5, range [0..1])
|
||||
|
||||
Previously auto_whitelist_factor which will work interchangeably until 4.1.
|
||||
|
||||
How much towards the long-term mean for the sender to regress a message.
|
||||
Basically, the algorithm is to track the long-term mean score of messages for
|
||||
@ -143,12 +149,15 @@ mean; C<factor> = 0 mean just use the calculated score.
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_factor',
|
||||
setting => 'auto_welcomelist_factor',
|
||||
aliases => ['auto_whitelist_factor'], # backward compatible - to be removed for 4.1
|
||||
default => 0.5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=item auto_whitelist_ipv4_mask_len n (default: 16, range [0..32])
|
||||
=item auto_welcomelist_ipv4_mask_len n (default: 16, range [0..32])
|
||||
|
||||
Previously auto_whitelist_ipv4_mask_len which will work interchangeably until 4.1.
|
||||
|
||||
The AWL database keeps only the specified number of most-significant bits
|
||||
of an IPv4 address in its fields, so that different individual IP addresses
|
||||
@ -164,7 +173,8 @@ of 8, any split is allowed.
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_ipv4_mask_len',
|
||||
setting => 'auto_welcomelist_ipv4_mask_len',
|
||||
aliases => ['auto_whitelist_ipv4_mask_len'], # removed in 4.1
|
||||
default => 16,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
@ -174,11 +184,13 @@ of 8, any split is allowed.
|
||||
} elsif ($value !~ /^\d+$/ || $value < 0 || $value > 32) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_ipv4_mask_len} = $value;
|
||||
$self->{auto_welcomelist_ipv4_mask_len} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=item auto_whitelist_ipv6_mask_len n (default: 48, range [0..128])
|
||||
=item auto_welcomelist_ipv6_mask_len n (default: 48, range [0..128])
|
||||
|
||||
Previously auto_whitelist_ipv6_mask_len which will work interchangeably until 4.1.
|
||||
|
||||
The AWL database keeps only the specified number of most-significant bits
|
||||
of an IPv6 address in its fields, so that different individual IP addresses
|
||||
@ -195,7 +207,8 @@ is allowed.
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_ipv6_mask_len',
|
||||
setting => 'auto_welcomelist_ipv6_mask_len',
|
||||
aliases => ['auto_whitelist_ipv6_mask_len'], # removed in 4.1
|
||||
default => 48,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
@ -205,7 +218,7 @@ is allowed.
|
||||
} elsif ($value !~ /^\d+$/ || $value < 0 || $value > 128) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_ipv6_mask_len} = $value;
|
||||
$self->{auto_welcomelist_ipv6_mask_len} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
@ -215,7 +228,7 @@ Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
If this option is set the SQLBasedAddrList module will override the set
|
||||
username with the value given. This can be useful for implementing global
|
||||
or group based auto-whitelist databases.
|
||||
or group based auto-welcomelist databases.
|
||||
|
||||
=cut
|
||||
|
||||
@ -225,7 +238,9 @@ or group based auto-whitelist databases.
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item auto_whitelist_distinguish_signed
|
||||
=item auto_welcomelist_distinguish_signed
|
||||
|
||||
Previously auto_whitelist_distinguish_signed which will work interchangeably until 4.1.
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
@ -240,7 +255,8 @@ turning on this option.
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_distinguish_signed',
|
||||
setting => 'auto_welcomelist_distinguish_signed',
|
||||
aliases => ['auto_whitelist_distinguish_signed'], # removed in 4.1
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
@ -256,32 +272,38 @@ user's C<user_prefs> file.
|
||||
|
||||
=over 4
|
||||
|
||||
=item auto_whitelist_factory module (default: Mail::SpamAssassin::DBBasedAddrList)
|
||||
=item auto_welcomelist_factory module (default: Mail::SpamAssassin::DBBasedAddrList)
|
||||
|
||||
Select alternative whitelist factory module.
|
||||
Previously auto_whitelist_factory which will work interchangeably until 4.1.
|
||||
|
||||
Select alternative welcomelist factory module.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_factory',
|
||||
setting => 'auto_welcomelist_factory',
|
||||
aliases => ['auto_whitelist_factory'], # removed in 4.1
|
||||
is_admin => 1,
|
||||
default => 'Mail::SpamAssassin::DBBasedAddrList',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item auto_whitelist_path /path/filename (default: ~/.spamassassin/auto-whitelist)
|
||||
=item auto_welcomelist_path /path/filename (default: ~/.spamassassin/auto-welcomelist)
|
||||
|
||||
This is the automatic-whitelist directory and filename. By default, each user
|
||||
has their own whitelist database in their C<~/.spamassassin> directory with
|
||||
Previously auto_whitelist_path which will work interchangeably until 4.1.
|
||||
|
||||
This is the automatic-welcomelist directory and filename. By default, each user
|
||||
has their own welcomelist database in their C<~/.spamassassin> directory with
|
||||
mode 0700. For system-wide SpamAssassin use, you may want to share this
|
||||
across all users, although that is not recommended.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_path',
|
||||
setting => 'auto_welcomelist_path',
|
||||
aliases => ['auto_whitelist_path'], # removed in 4.1
|
||||
is_admin => 1,
|
||||
default => '__userstate__/auto-whitelist',
|
||||
default => '__userstate__/auto-welcomelist',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
@ -291,13 +313,15 @@ across all users, although that is not recommended.
|
||||
if (-d $value) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_path} = $value;
|
||||
$self->{auto_welcomelist_path} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=item auto_whitelist_db_modules Module ... (default: see below)
|
||||
=item auto_welcomelist_db_modules Module ... (default: see below)
|
||||
|
||||
What database modules should be used for the auto-whitelist storage database
|
||||
Previously auto_whitelist_db_modules which will work interchangeably until 4.1.
|
||||
|
||||
What database modules should be used for the auto-welcomelist storage database
|
||||
file. The first named module that can be loaded from the perl include path
|
||||
will be used. The format is:
|
||||
|
||||
@ -313,15 +337,18 @@ preclude its use for the AWL (see SpamAssassin bug 4353).
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_db_modules',
|
||||
setting => 'auto_welcomelist_db_modules',
|
||||
aliases => ['auto_whitelist_db_modules'], # removed in 4.1
|
||||
is_admin => 1,
|
||||
default => 'DB_File GDBM_File SDBM_File',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=item auto_whitelist_file_mode (default: 0700)
|
||||
=item auto_welcomelist_file_mode (default: 0700)
|
||||
|
||||
The file mode bits used for the automatic-whitelist directory or file.
|
||||
Previously auto_whitelist_file_mode which will work interchangeably until 4.1.
|
||||
|
||||
The file mode bits used for the automatic-welcomelist directory or file.
|
||||
|
||||
Make sure you specify this using the 'x' mode bits set, as it may also be used
|
||||
to create directories. However, if a file is created, the resulting file will
|
||||
@ -330,7 +357,8 @@ not have any execute bits set (the umask is set to 0111).
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'auto_whitelist_file_mode',
|
||||
setting => 'auto_welcomelist_file_mode',
|
||||
aliases => ['auto_whitelist_file_mode'], # removed in 4.1
|
||||
is_admin => 1,
|
||||
default => '0700',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
@ -339,7 +367,7 @@ not have any execute bits set (the umask is set to 0111).
|
||||
if ($value !~ /^0?[0-7]{3}$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{auto_whitelist_file_mode} = untaint_var($value);
|
||||
$self->{auto_welcomelist_file_mode} = untaint_var($value);
|
||||
}
|
||||
});
|
||||
|
||||
@ -390,7 +418,7 @@ The password for the database username, for the above DSN.
|
||||
|
||||
Used by the SQLBasedAddrList storage implementation.
|
||||
|
||||
The table user auto-whitelists are stored in, for the above DSN.
|
||||
The table user auto-welcomelists are stored in, for the above DSN.
|
||||
|
||||
=cut
|
||||
|
||||
@ -404,15 +432,15 @@ The table user auto-whitelists are stored in, for the above DSN.
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub check_from_in_auto_whitelist {
|
||||
sub check_from_in_auto_welcomelist {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return 0 unless ($pms->{conf}->{use_auto_whitelist});
|
||||
return 0 unless ($pms->{conf}->{use_auto_welcomelist});
|
||||
|
||||
my $timer = $self->{main}->time_method("total_awl");
|
||||
|
||||
my $from = lc $pms->get('From:addr');
|
||||
# dbg("auto-whitelist: From: $from");
|
||||
# dbg("auto-welcomelist: From: $from");
|
||||
return 0 unless $from =~ /\S/;
|
||||
|
||||
# find the earliest usable "originating IP". ignore private nets
|
||||
@ -443,18 +471,18 @@ sub check_from_in_auto_whitelist {
|
||||
my $awlpoints = (sprintf "%0.3f", $points) + 0;
|
||||
|
||||
# Create the AWL object
|
||||
my $whitelist;
|
||||
my $welcomelist;
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($pms->{main});
|
||||
$welcomelist = Mail::SpamAssassin::AutoWelcomelist->new($pms->{main});
|
||||
|
||||
my $meanscore;
|
||||
{ # check
|
||||
my $timer = $self->{main}->time_method("check_awl");
|
||||
$meanscore = $whitelist->check_address($from, $origip, $signedby);
|
||||
$meanscore = $welcomelist->check_address($from, $origip, $signedby);
|
||||
}
|
||||
my $delta = 0;
|
||||
|
||||
dbg("auto-whitelist: AWL active, pre-score: %s, autolearn score: %s, ".
|
||||
dbg("auto-welcomelist: AWL active, pre-score: %s, autolearn score: %s, ".
|
||||
"mean: %s, IP: %s, address: %s %s",
|
||||
$pms->{score}, $awlpoints,
|
||||
!defined $meanscore ? 'undef' : sprintf("%.3f",$meanscore),
|
||||
@ -463,13 +491,13 @@ sub check_from_in_auto_whitelist {
|
||||
|
||||
if (defined $meanscore) {
|
||||
$delta = $meanscore - $awlpoints;
|
||||
$delta *= $pms->{main}->{conf}->{auto_whitelist_factor};
|
||||
$delta *= $pms->{main}->{conf}->{auto_welcomelist_factor};
|
||||
|
||||
$pms->set_tag('AWL', sprintf("%2.1f",$delta));
|
||||
if (defined $meanscore) {
|
||||
$pms->set_tag('AWLMEAN', sprintf("%2.1f", $meanscore));
|
||||
}
|
||||
$pms->set_tag('AWLCOUNT', sprintf("%2.1f", $whitelist->count()));
|
||||
$pms->set_tag('AWLCOUNT', sprintf("%2.1f", $welcomelist->count()));
|
||||
$pms->set_tag('AWLPRESCORE', sprintf("%2.1f", $pms->{score}));
|
||||
}
|
||||
|
||||
@ -478,7 +506,7 @@ sub check_from_in_auto_whitelist {
|
||||
# later ones. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=159704
|
||||
if (!$pms->{disable_auto_learning}) {
|
||||
my $timer = $self->{main}->time_method("update_awl");
|
||||
$whitelist->add_score($awlpoints);
|
||||
$welcomelist->add_score($awlpoints);
|
||||
}
|
||||
|
||||
# now redundant, got_hit() takes care of it
|
||||
@ -491,135 +519,138 @@ sub check_from_in_auto_whitelist {
|
||||
score => sprintf("%0.3f", $delta));
|
||||
}
|
||||
|
||||
$whitelist->finish();
|
||||
$welcomelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
warn("auto-welcomelist: open of auto-welcomelist file failed: $eval_stat\n");
|
||||
# try an unlock, in case we got that far
|
||||
eval { $whitelist->finish(); } if $whitelist;
|
||||
eval { $welcomelist->finish(); } if $welcomelist;
|
||||
return 0;
|
||||
};
|
||||
|
||||
dbg("auto-whitelist: post auto-whitelist score: %.3f", $pms->{score});
|
||||
dbg("auto-welcomelist: post auto-welcomelist score: %.3f", $pms->{score});
|
||||
|
||||
# test hit is above
|
||||
return 0;
|
||||
}
|
||||
*check_from_in_auto_whitelist = \&check_from_in_auto_welcomelist; # removed in 4.1
|
||||
|
||||
sub blacklist_address {
|
||||
sub blocklist_address {
|
||||
my ($self, $args) = @_;
|
||||
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_whitelist});
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_welcomelist});
|
||||
|
||||
unless ($args->{address}) {
|
||||
print "SpamAssassin auto-whitelist: failed to add address to blacklist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: failed to add address to blacklist");
|
||||
print "SpamAssassin auto-welcomelist: failed to add address to blocklist\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: failed to add address to blocklist");
|
||||
return;
|
||||
}
|
||||
|
||||
my $whitelist;
|
||||
my $welcomelist;
|
||||
my $status;
|
||||
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($self->{main});
|
||||
$welcomelist = Mail::SpamAssassin::AutoWelcomelist->new($self->{main});
|
||||
|
||||
if ($whitelist->add_known_bad_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-whitelist: adding address to blacklist: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: adding address to blacklist: " . $args->{address});
|
||||
if ($welcomelist->add_known_bad_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-welcomelist: adding address to blocklist: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: adding address to blocklist: " . $args->{address});
|
||||
$status = 0;
|
||||
}
|
||||
else {
|
||||
print "SpamAssassin auto-whitelist: error adding address to blacklist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: error adding address to blacklist");
|
||||
print "SpamAssassin auto-welcomelist: error adding address to blocklist\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: error adding address to blocklist");
|
||||
$status = 1;
|
||||
}
|
||||
$whitelist->finish();
|
||||
$welcomelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
eval { $whitelist->finish(); };
|
||||
warn("auto-welcomelist: open of auto-welcomelist file failed: $eval_stat\n");
|
||||
eval { $welcomelist->finish(); };
|
||||
return 0;
|
||||
};
|
||||
|
||||
return $status;
|
||||
}
|
||||
*blacklist_address = \&blocklist_address; # removed in 4.1
|
||||
|
||||
sub whitelist_address {
|
||||
sub welcomelist_address {
|
||||
my ($self, $args) = @_;
|
||||
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_whitelist});
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_welcomelist});
|
||||
|
||||
unless ($args->{address}) {
|
||||
print "SpamAssassin auto-whitelist: failed to add address to whitelist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: failed to add address to whitelist");
|
||||
print "SpamAssassin auto-welcomelist: failed to add address to welcomelist\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: failed to add address to welcomelist");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $whitelist;
|
||||
my $welcomelist;
|
||||
my $status;
|
||||
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($self->{main});
|
||||
$welcomelist = Mail::SpamAssassin::AutoWelcomelist->new($self->{main});
|
||||
|
||||
if ($whitelist->add_known_good_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-whitelist: adding address to whitelist: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: adding address to whitelist: " . $args->{address});
|
||||
if ($welcomelist->add_known_good_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-welcomelist: adding address to welcomelist: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: adding address to welcomelist: " . $args->{address});
|
||||
$status = 1;
|
||||
}
|
||||
else {
|
||||
print "SpamAssassin auto-whitelist: error adding address to whitelist\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: error adding address to whitelist");
|
||||
print "SpamAssassin auto-welcomelist: error adding address to welcomelist\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: error adding address to welcomelist");
|
||||
$status = 0;
|
||||
}
|
||||
|
||||
$whitelist->finish();
|
||||
$welcomelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
eval { $whitelist->finish(); };
|
||||
warn("auto-welcomelist: open of auto-welcomelist file failed: $eval_stat\n");
|
||||
eval { $welcomelist->finish(); };
|
||||
return 0;
|
||||
};
|
||||
|
||||
return $status;
|
||||
}
|
||||
*whitelist_address = \&welcomelist_address; # removed in 4.1
|
||||
|
||||
sub remove_address {
|
||||
my ($self, $args) = @_;
|
||||
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_whitelist});
|
||||
return 0 unless ($self->{main}->{conf}->{use_auto_welcomelist});
|
||||
|
||||
unless ($args->{address}) {
|
||||
print "SpamAssassin auto-whitelist: failed to remove address\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: failed to remove address");
|
||||
print "SpamAssassin auto-welcomelist: failed to remove address\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: failed to remove address");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $whitelist;
|
||||
my $welcomelist;
|
||||
my $status;
|
||||
|
||||
eval {
|
||||
$whitelist = Mail::SpamAssassin::AutoWhitelist->new($self->{main});
|
||||
$welcomelist = Mail::SpamAssassin::AutoWelcomelist->new($self->{main});
|
||||
|
||||
if ($whitelist->remove_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-whitelist: removing address: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: removing address: " . $args->{address});
|
||||
if ($welcomelist->remove_address($args->{address}, $args->{signedby})) {
|
||||
print "SpamAssassin auto-welcomelist: removing address: " . $args->{address} . "\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: removing address: " . $args->{address});
|
||||
$status = 1;
|
||||
}
|
||||
else {
|
||||
print "SpamAssassin auto-whitelist: error removing address\n" if ($args->{cli_p});
|
||||
dbg("auto-whitelist: error removing address");
|
||||
print "SpamAssassin auto-welcomelist: error removing address\n" if ($args->{cli_p});
|
||||
dbg("auto-welcomelist: error removing address");
|
||||
$status = 0;
|
||||
}
|
||||
|
||||
$whitelist->finish();
|
||||
$welcomelist->finish();
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
warn("auto-whitelist: open of auto-whitelist file failed: $eval_stat\n");
|
||||
eval { $whitelist->finish(); };
|
||||
warn("auto-welcomelist: open of auto-welcomelist file failed: $eval_stat\n");
|
||||
eval { $welcomelist->finish(); };
|
||||
return 0;
|
||||
};
|
||||
|
||||
|
@ -72,7 +72,7 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule("check_access_database");
|
||||
$self->register_eval_rule("check_access_database", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
@ -64,8 +64,8 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule("check_microsoft_executable");
|
||||
$self->register_eval_rule("check_suspect_name");
|
||||
$self->register_eval_rule("check_microsoft_executable", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_suspect_name", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -107,7 +107,7 @@ sub _check_attachments {
|
||||
# file extension indicates an executable
|
||||
$pms->{antivirus_microsoft_exe} = 1;
|
||||
}
|
||||
elsif ($cte =~ /base64/ && defined $p->raw()->[0] &&
|
||||
elsif (index($cte, 'base64') >= 0 && defined $p->raw()->[0] &&
|
||||
$p->raw()->[0] =~ /^TV[opqr].A..[AB].[AQgw][A-H].A/)
|
||||
{
|
||||
# base64-encoded executable
|
||||
|
@ -56,18 +56,30 @@ See the C<Mail::SpamAssassin::Conf> POD for details on C<rbl_timeout>.
|
||||
A query template is a string which will be expanded to produce a domain name
|
||||
to be used in a DNS query. The template may include SpamAssassin tag names,
|
||||
which will be replaced by their values to form a final query domain.
|
||||
|
||||
The final query domain must adhere to rules governing DNS domains, i.e.
|
||||
must consist of fields each up to 63 characters long, delimited by dots.
|
||||
There may be a trailing dot at the end, but it is redundant / carries
|
||||
no semantics, because SpamAssassin uses a Net::DSN::Resolver::send method
|
||||
for querying DNS, which ignores any 'search' or 'domain' DNS resolver options.
|
||||
must consist of fields each up to 63 characters long, delimited by dots,
|
||||
not exceeding 255 characters. International domain names (in UTF-8) are
|
||||
allowed and will be encoded to ASCII-compatible encoding (ACE) according
|
||||
to IDN rules. Syntactically invalid resulting queries will be discarded
|
||||
by the DNS resolver code (with some info warnings).
|
||||
|
||||
There may be a trailing dot at the end, but it is redundant / carries no
|
||||
semantics, because SpamAssassin uses a Net::DSN::Resolver::send method for
|
||||
querying DNS, which ignores any 'search' or 'domain' DNS resolver options.
|
||||
Domain names in DNS queries are case-insensitive.
|
||||
|
||||
A tag name is a string of capital letters, preceded and followed by an
|
||||
underscore character. This syntax mirrors the add_header setting, except that
|
||||
tags cannot have parameters in parenthesis when used in askdns templates.
|
||||
Tag names may appear anywhere in the template - each queried DNS zone
|
||||
prescribes how a query should be formed.
|
||||
underscore character. This syntax mirrors the add_header setting, except
|
||||
that tags cannot have parameters in parenthesis when used in askdns
|
||||
templates (exceptions found below). Tag names may appear anywhere in the
|
||||
template - each queried DNS zone prescribes how a query should be formed.
|
||||
|
||||
Special supported tag HEADER() can be used to query any header content,
|
||||
using same header names/modifiers that as header rules support. For example
|
||||
_HEADER(Reply-To:addr:domain)_ can be used to query the trimmed domain part
|
||||
of Reply-To address. See Mail::SpamAssassin::Conf documentation about
|
||||
header rules.
|
||||
|
||||
A query template may contain any number of tag names including none,
|
||||
although in the most common anticipated scenario exactly one tag name would
|
||||
@ -114,7 +126,8 @@ parameter will only act as a filter on a result.
|
||||
|
||||
Currently recognized RR types in the rr_type parameter are: ANY, A, AAAA,
|
||||
MX, TXT, PTR, NAPTR, NS, SOA, CERT, CNAME, DNAME, DHCID, HINFO, MINFO,
|
||||
RP, HIP, IPSECKEY, KX, LOC, SRV, SSHFP, SPF.
|
||||
RP, HIP, IPSECKEY, KX, LOC, GPOS, SRV, OPENPGPKEY, SSHFP, SPF, TLSA, URI,
|
||||
CAA, CSYNC.
|
||||
|
||||
https://www.iana.org/assignments/dns-parameters/dns-parameters.xml
|
||||
|
||||
@ -189,8 +202,10 @@ use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Util qw(decode_dns_question_entry);
|
||||
use Mail::SpamAssassin::Util qw(decode_dns_question_entry idn_to_ascii
|
||||
compile_regexp is_fqdn_valid);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use version 0.77;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
@ -202,8 +217,6 @@ our %rcode_value = ( # https://www.iana.org/assignments/dns-parameters, RFC 619
|
||||
BADMODE => 19, BADNAME => 20, BADALG => 21, BADTRUNC => 22,
|
||||
);
|
||||
|
||||
our $txtdata_can_provide_a_list;
|
||||
|
||||
sub new {
|
||||
my($class,$sa_main) = @_;
|
||||
|
||||
@ -213,10 +226,6 @@ sub new {
|
||||
|
||||
$self->set_config($sa_main->{conf});
|
||||
|
||||
#$txtdata_can_provide_a_list = Net::DNS->VERSION >= 0.69;
|
||||
#more robust version check from Damyan Ivanov - Bug 7095
|
||||
$txtdata_can_provide_a_list = version->parse(Net::DNS->VERSION) >= version->parse('0.69');
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
@ -253,20 +262,14 @@ sub parse_and_canonicalize_subtest {
|
||||
my $result;
|
||||
|
||||
local($1,$2,$3);
|
||||
# modifiers /a, /d, /l, /u in suffix form were added with perl 5.13.10 (5.14)
|
||||
# currently known modifiers are [msixoadlu], but let's not be too picky here
|
||||
if ( $subtest =~ m{^ / (.+) / ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* \( (.+) \) ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* \[ (.+) \] ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* \{ (.+) \} ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* < (.+) > ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
} elsif ($subtest =~ m{^ m \s* (\S) (.+) \1 ([a-z]*) \z}xs) {
|
||||
$result = $2 ne '' ? qr{(?$2)$1} : qr{$1};
|
||||
if ($subtest =~ m{^/ .+ / [a-z]* \z}xs ||
|
||||
$subtest =~ m{^m (\W) .+ (\W) [a-z]* \z}xs) {
|
||||
my ($rec, $err) = compile_regexp($subtest, 1);
|
||||
if (!$rec) {
|
||||
warn "askdns: subtest compile failed: '$subtest': $err\n";
|
||||
} else {
|
||||
$result = $rec;
|
||||
}
|
||||
} elsif ($subtest =~ m{^ (["']) (.*) \1 \z}xs) { # quoted string
|
||||
$result = $2;
|
||||
} elsif ($subtest =~ m{^ \[ ( (?:[A-Z]+|\d+)
|
||||
@ -324,8 +327,9 @@ sub set_config {
|
||||
my @answer_types = split(/,/, $query_type);
|
||||
# https://www.iana.org/assignments/dns-parameters/dns-parameters.xml
|
||||
if (grep(!/^(?:ANY|A|AAAA|MX|TXT|PTR|NAPTR|NS|SOA|CERT|CNAME|DNAME|
|
||||
DHCID|HINFO|MINFO|RP|HIP|IPSECKEY|KX|LOC|SRV|
|
||||
SSHFP|SPF)\z/x, @answer_types)) {
|
||||
DHCID|HINFO|MINFO|RP|HIP|IPSECKEY|KX|LOC|GPOS|SRV|
|
||||
OPENPGPKEY|SSHFP|SPF|TLSA|URI|CAA|CSYNC)\z/x,
|
||||
@answer_types)) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$query_type = 'ANY' if @answer_types > 1 || $answer_types[0] eq 'ANY';
|
||||
@ -333,25 +337,19 @@ sub set_config {
|
||||
$subtest = parse_and_canonicalize_subtest($subtest);
|
||||
defined $subtest or return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
|
||||
# initialize rule structure
|
||||
$self->{askdns}{$rulename}{query} = $query_template;
|
||||
$self->{askdns}{$rulename}{q_type} = $query_type;
|
||||
$self->{askdns}{$rulename}{a_types} = \@answer_types;
|
||||
$self->{askdns}{$rulename}{subtest} = $subtest;
|
||||
$self->{askdns}{$rulename}{tags} = ();
|
||||
|
||||
# collect tag names as used in each query template
|
||||
my @tags = $query_template =~ /_([A-Z][A-Z0-9]*)_/g;
|
||||
my %seen; @tags = grep(!$seen{$_}++, @tags); # filter out duplicates
|
||||
|
||||
# group rules by tag names used in them (to be used as a hash key)
|
||||
my $depends_on_tags = !@tags ? '' : join(',',@tags);
|
||||
|
||||
# subgroup rules by a DNS RR type and a nonexpanded query template
|
||||
my $query_template_key = $query_type . ':' . $query_template;
|
||||
|
||||
$self->{askdns}{$depends_on_tags}{$query_template_key} ||=
|
||||
{ query => $query_template, rules => {}, q_type => $query_type,
|
||||
a_types => # optimization: undef means "same as q_type"
|
||||
@answer_types == 1 && $answer_types[0] eq $query_type ? undef
|
||||
: \@answer_types };
|
||||
$self->{askdns}{$depends_on_tags}{$query_template_key}{rules}{$rulename}
|
||||
= $subtest;
|
||||
# dbg("askdns: rule: %s, config dep: %s, domkey: %s, subtest: %s",
|
||||
# $rulename, $depends_on_tags, $query_template_key, $subtest);
|
||||
# also support common HEADER(arg) tag which does $pms->get(arg)
|
||||
my @tags = $query_template =~ /_([A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*(?:\(.*?\))?)_/g;
|
||||
# save rule to tag dependencies
|
||||
$self->{askdns}{$rulename}{tags}{$_} = 1 foreach (@tags);
|
||||
|
||||
# just define the test so that scores and lint works
|
||||
$self->{parser}->add_test($rulename, undef,
|
||||
@ -366,182 +364,127 @@ sub set_config {
|
||||
# run as early as possible, launching DNS queries as soon as their
|
||||
# dependencies are fulfilled
|
||||
#
|
||||
sub parsed_metadata {
|
||||
sub check_dnsbl {
|
||||
my($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
return if !$pms->is_dns_available;
|
||||
$pms->{askdns_map_dnskey_to_rules} = {};
|
||||
return if !$pms->is_dns_available();
|
||||
|
||||
# walk through all collected askdns rules, obtain tag values whenever
|
||||
# they may become available, and launch DNS queries right after
|
||||
#
|
||||
for my $depends_on_tags (keys %{$conf->{askdns}}) {
|
||||
my @tags;
|
||||
@tags = split(/,/, $depends_on_tags) if $depends_on_tags ne '';
|
||||
|
||||
if (would_log("dbg","askdns")) {
|
||||
while ( my($query_template_key, $struct) =
|
||||
each %{$conf->{askdns}{$depends_on_tags}} ) {
|
||||
my($query_template, $query_type, $answer_types_ref, $rules) =
|
||||
@$struct{qw(query q_type a_types rules)};
|
||||
dbg("askdns: depend on tags %s, rules: %s ",
|
||||
$depends_on_tags, join(', ', keys %$rules));
|
||||
foreach my $rulename (keys %{$conf->{askdns}}) {
|
||||
if (!$conf->{scores}->{$rulename}) {
|
||||
dbg("askdns: skipping disabled rule $rulename");
|
||||
next;
|
||||
}
|
||||
}
|
||||
|
||||
if (!@tags) {
|
||||
# no dependencies on tags, just call directly
|
||||
$self->launch_queries($pms,$depends_on_tags);
|
||||
} else {
|
||||
# enqueue callback for tags needed
|
||||
my @tags = sort keys %{$conf->{askdns}{$rulename}{tags}};
|
||||
if (@tags) {
|
||||
dbg("askdns: rule %s depends on tags: %s", $rulename,
|
||||
join(', ', @tags));
|
||||
$pms->action_depends_on_tags(@tags == 1 ? $tags[0] : \@tags,
|
||||
sub { my($pms,@args) = @_;
|
||||
$self->launch_queries($pms,$depends_on_tags) }
|
||||
$self->launch_queries($pms,$rulename,\@tags) }
|
||||
);
|
||||
} else {
|
||||
# no dependencies on tags, just call directly
|
||||
$self->launch_queries($pms,$rulename,[]);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# generate DNS queries - called for each set of rules
|
||||
# when their tag dependencies are met
|
||||
# generate DNS queries - called for each rule when its tag dependencies
|
||||
# are met
|
||||
#
|
||||
sub launch_queries {
|
||||
my($self, $pms, $depends_on_tags) = @_;
|
||||
my $conf = $pms->{conf};
|
||||
my($self, $pms, $rulename, $tags) = @_;
|
||||
|
||||
my %tags;
|
||||
# obtain tag/value pairs of tags we depend upon in this set of rules
|
||||
if ($depends_on_tags ne '') {
|
||||
%tags = map( ($_,$pms->get_tag($_)), split(/,/,$depends_on_tags) );
|
||||
my $arule = $pms->{conf}->{askdns}{$rulename};
|
||||
my $query_tmpl = $arule->{query};
|
||||
my $queries;
|
||||
if (@$tags) {
|
||||
if (!exists $pms->{askdns_qtmpl_cache}{$query_tmpl}) {
|
||||
# replace tags in query template
|
||||
# iterate through each tag, replacing list of strings as we go
|
||||
my %q_iter = ( "$query_tmpl" => 1 );
|
||||
foreach my $tag (@$tags) {
|
||||
# cache tag values locally
|
||||
if (!exists $pms->{askdns_tag_cache}{$tag}) {
|
||||
my $valref = $pms->get_tag_raw($tag);
|
||||
my @vals = grep { defined $_ && $_ ne '' } (ref $valref ? @$valref : $valref);
|
||||
# Paranoid check for undefined tag
|
||||
if (!@vals) {
|
||||
dbg("askdns: skipping rule $rulename, no value found for tag: $tag");
|
||||
return;
|
||||
}
|
||||
dbg("askdns: preparing queries which depend on tags: %s",
|
||||
join(', ', map($_.' => '.$tags{$_}, keys %tags)));
|
||||
|
||||
# replace tag names in a query template with actual tag values
|
||||
# and launch DNS queries
|
||||
while ( my($query_template_key, $struct) =
|
||||
each %{$conf->{askdns}{$depends_on_tags}} ) {
|
||||
my($query_template, $query_type, $answer_types_ref, $rules) =
|
||||
@$struct{qw(query q_type a_types rules)};
|
||||
|
||||
my @rulenames = keys %$rules;
|
||||
if (grep($conf->{scores}->{$_}, @rulenames)) {
|
||||
dbg("askdns: query template %s, type %s, rules: %s",
|
||||
$query_template,
|
||||
!$answer_types_ref ? $query_type
|
||||
: $query_type.'/'.join(',',@$answer_types_ref),
|
||||
join(', ', @rulenames));
|
||||
$pms->{askdns_tag_cache}{$tag} = \@vals;
|
||||
}
|
||||
my %q_iter_new;
|
||||
foreach my $q (keys %q_iter) {
|
||||
# handle space separated multi-valued tags
|
||||
foreach my $val (@{$pms->{askdns_tag_cache}{$tag}}) {
|
||||
my $qtmp = $q;
|
||||
$qtmp =~ s/\Q_${tag}_\E/${val}/g;
|
||||
$q_iter_new{$qtmp} = 1;
|
||||
}
|
||||
}
|
||||
%q_iter = %q_iter_new;
|
||||
}
|
||||
# cache idn'd queries
|
||||
my @q_arr;
|
||||
push @q_arr, idn_to_ascii($_) foreach (keys %q_iter);
|
||||
$pms->{askdns_qtmpl_cache}{$query_tmpl} = \@q_arr;
|
||||
}
|
||||
$queries = $pms->{askdns_qtmpl_cache}{$query_tmpl};
|
||||
} else {
|
||||
dbg("askdns: query template %s, type %s, all rules disabled: %s",
|
||||
$query_template, $query_type, join(', ', @rulenames));
|
||||
push @$queries, idn_to_ascii($query_tmpl);
|
||||
}
|
||||
|
||||
foreach my $query (@$queries) {
|
||||
if (!is_fqdn_valid($query, 1)) {
|
||||
dbg("askdns: skipping invalid query ($rulename): $query");
|
||||
next;
|
||||
}
|
||||
|
||||
# collect all tag names from a template, each may occur more than once
|
||||
my @templ_tags = $query_template =~ /_([A-Z][A-Z0-9]*)_/gs;
|
||||
|
||||
# filter out duplicate tag names, and tags with undefined or empty value
|
||||
my %seen;
|
||||
@templ_tags = grep(!$seen{$_}++ && defined $tags{$_} && $tags{$_} ne '',
|
||||
@templ_tags);
|
||||
|
||||
my %templ_vals; # values that each tag takes
|
||||
for my $t (@templ_tags) {
|
||||
my %seen;
|
||||
# a tag value may be a space-separated list,
|
||||
# store it as an arrayref, removing duplicate values
|
||||
$templ_vals{$t} = [ grep(!$seen{$_}++, split(' ',$tags{$t})) ];
|
||||
}
|
||||
|
||||
# count through all tag value tuples
|
||||
my @digit = (0) x @templ_tags; # counting accumulator
|
||||
OUTER:
|
||||
for (;;) {
|
||||
my %current_tag_val; # maps a tag name to its current iteration value
|
||||
for my $j (0 .. $#templ_tags) {
|
||||
my $t = $templ_tags[$j];
|
||||
$current_tag_val{$t} = $templ_vals{$t}[$digit[$j]];
|
||||
}
|
||||
local $1;
|
||||
my $query_domain = $query_template;
|
||||
$query_domain =~ s{_([A-Z][A-Z0-9]*)_}
|
||||
{ defined $current_tag_val{$1} ? $current_tag_val{$1}
|
||||
: '' }ge;
|
||||
|
||||
# the $dnskey identifies this query in AsyncLoop's pending_lookups
|
||||
my $dnskey = join(':', 'askdns', $query_type, $query_domain);
|
||||
dbg("askdns: expanded query %s, dns key %s", $query_domain, $dnskey);
|
||||
|
||||
if ($query_domain eq '') {
|
||||
# ignore, just in case
|
||||
} else {
|
||||
if (!exists $pms->{askdns_map_dnskey_to_rules}{$dnskey}) {
|
||||
$pms->{askdns_map_dnskey_to_rules}{$dnskey} =
|
||||
[ [$query_type, $answer_types_ref, $rules] ];
|
||||
} else {
|
||||
push(@{$pms->{askdns_map_dnskey_to_rules}{$dnskey}},
|
||||
[$query_type, $answer_types_ref, $rules] );
|
||||
}
|
||||
# launch a new DNS query for $query_type and $query_domain
|
||||
my $ent = $pms->{async}->bgsend_and_start_lookup(
|
||||
$query_domain, $query_type, undef,
|
||||
{ key => $dnskey, zone => $query_domain },
|
||||
sub { my ($ent2,$pkt) = @_;
|
||||
$self->process_response_packet($pms, $ent2, $pkt, $dnskey) },
|
||||
master_deadline => $pms->{master_deadline} );
|
||||
# these rules are now underway; unless the rule hits, these will
|
||||
# not be considered "finished" until harvest_dnsbl_queries() completes
|
||||
$pms->register_async_rule_start($dnskey) if $ent;
|
||||
}
|
||||
|
||||
last if !@templ_tags;
|
||||
# increment accumulator, little-endian
|
||||
for (my $j = 0; ; $j++) {
|
||||
last if ++$digit[$j] <= $#{$templ_vals{$templ_tags[$j]}};
|
||||
$digit[$j] = 0; # and carry
|
||||
last OUTER if $j >= $#templ_tags;
|
||||
}
|
||||
}
|
||||
dbg("askdns: launching query ($rulename): $query");
|
||||
my $ret = $pms->{async}->bgsend_and_start_lookup(
|
||||
$query, $arule->{q_type}, undef,
|
||||
{ rulename => $rulename, type => 'AskDNS' },
|
||||
sub { my ($ent,$pkt) = @_;
|
||||
$self->process_response_packet($pms, $ent, $pkt, $rulename) },
|
||||
master_deadline => $pms->{master_deadline}
|
||||
);
|
||||
$pms->rule_ready($rulename) if !$ret; # mark ready if nothing launched
|
||||
}
|
||||
}
|
||||
|
||||
sub process_response_packet {
|
||||
my($self, $pms, $ent, $pkt, $dnskey) = @_;
|
||||
my($self, $pms, $ent, $pkt, $rulename) = @_;
|
||||
|
||||
my $conf = $pms->{conf};
|
||||
my %rulenames_hit;
|
||||
|
||||
# map a dnskey back to info on queries which caused this DNS lookup
|
||||
my $queries_ref = $pms->{askdns_map_dnskey_to_rules}{$dnskey};
|
||||
|
||||
my($header, @question, @answer, $qtype, $rcode);
|
||||
# NOTE: $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
if ($pkt) {
|
||||
@answer = $pkt->answer;
|
||||
$header = $pkt->header;
|
||||
@question = $pkt->question;
|
||||
$qtype = uc $question[0]->qtype if @question;
|
||||
$rcode = uc $header->rcode if $header; # 'NOERROR', 'NXDOMAIN', ...
|
||||
return if !$pkt;
|
||||
|
||||
my @question = $pkt->question;
|
||||
return if !@question;
|
||||
|
||||
$pms->rule_ready($rulename); # mark rule ready for metas
|
||||
|
||||
my @answer = $pkt->answer;
|
||||
my $rcode = uc $pkt->header->rcode; # 'NOERROR', 'NXDOMAIN', ...
|
||||
|
||||
# NOTE: qname is encoded in RFC 1035 zone format, decode it
|
||||
dbg("askdns: answer received, rcode %s, query %s, answer has %d records",
|
||||
$rcode,
|
||||
dbg("askdns: answer received (%s), rcode %s, query %s, answer has %d records",
|
||||
$rulename, $rcode,
|
||||
join(', ', map(join('/', decode_dns_question_entry($_)), @question)),
|
||||
scalar @answer);
|
||||
|
||||
if (defined $rcode && exists $rcode_value{$rcode}) {
|
||||
# Net::DNS return a rcode name for codes it knows about,
|
||||
# and returns a number for the rest; we deal with numbers from here on
|
||||
$rcode = $rcode_value{$rcode} if exists $rcode_value{$rcode};
|
||||
}
|
||||
}
|
||||
if (!@answer) {
|
||||
|
||||
# a trick to make the following loop run at least once, so that we can
|
||||
# evaluate also rules which only care for rcode status
|
||||
@answer = ( undef );
|
||||
}
|
||||
@answer = (undef) if !@answer;
|
||||
|
||||
# NOTE: $rr->rdstring returns the result encoded in a DNS zone file
|
||||
# format, i.e. enclosed in double quotes if a result contains whitespace
|
||||
@ -561,9 +504,12 @@ sub process_response_packet {
|
||||
# the code handling such reply from DNS MUST assemble all of these
|
||||
# marshaled text blocks into a single one before any syntactical
|
||||
# verification takes place.
|
||||
# The same goes for RFC 4408 (SPF), RFC 4871 (DKIM), RFC 5617 (ADSP),
|
||||
# The same goes for RFC 7208 (SPF), RFC 4871 (DKIM), RFC 5617 (ADSP),
|
||||
# draft-kucherawy-dmarc-base (DMARC), ...
|
||||
|
||||
my $arule = $pms->{conf}->{askdns}{$rulename};
|
||||
my $subtest = $arule->{subtest};
|
||||
|
||||
for my $rr (@answer) {
|
||||
my($rr_rdatastr, $rdatanum, $rr_type);
|
||||
if (!$rr) {
|
||||
@ -571,37 +517,32 @@ sub process_response_packet {
|
||||
} else {
|
||||
$rr_type = uc $rr->type;
|
||||
if ($rr_type eq 'A') {
|
||||
# Net::DNS::RR::A::address() is available since Net::DNS 0.69
|
||||
$rr_rdatastr = $rr->UNIVERSAL::can('address') ? $rr->address
|
||||
: $rr->rdatastr;
|
||||
$rr_rdatastr = $rr->address;
|
||||
if ($rr_rdatastr =~ m/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/) {
|
||||
$rdatanum = Mail::SpamAssassin::Util::my_inet_aton($rr_rdatastr);
|
||||
}
|
||||
|
||||
} elsif ($rr->UNIVERSAL::can('txtdata')) {
|
||||
# TXT, SPF: join with no intervening spaces, as per RFC 5518
|
||||
if ($txtdata_can_provide_a_list || $rr_type ne 'TXT') {
|
||||
$rr_rdatastr = join('', $rr->txtdata); # txtdata() in list context!
|
||||
} else { # char_str_list() is only available for TXT records
|
||||
$rr_rdatastr = join('', $rr->char_str_list); # historical
|
||||
}
|
||||
# Net::DNS attempts to decode text strings in a TXT record as UTF-8,
|
||||
# which is undesired: octets failing the UTF-8 decoding are converted
|
||||
# to a Unicode "replacement character" U+FFFD (encoded as octets
|
||||
# \x{EF}\x{BF}\x{BD} in UTF-8), and ASCII text is unnecessarily
|
||||
# flagged as perl native characters (utf8 flag on), which can be
|
||||
# disruptive on later processing, e.g. implicitly upgrading strings
|
||||
# on concatenation. Unfortunately there is no way of legally bypassing
|
||||
# the UTF-8 decoding by Net::DNS::RR::TXT in Net::DNS::RR::Text.
|
||||
# Try to minimize damage by encoding back to UTF-8 octets:
|
||||
utf8::encode($rr_rdatastr) if utf8::is_utf8($rr_rdatastr);
|
||||
|
||||
} else {
|
||||
# rdatastr() is historical, use rdstring() since Net::DNS 0.69
|
||||
$rr_rdatastr = $rr->UNIVERSAL::can('rdstring') ? $rr->rdstring
|
||||
: $rr->rdatastr;
|
||||
$rr_rdatastr = $rr->rdstring;
|
||||
utf8::encode($rr_rdatastr) if utf8::is_utf8($rr_rdatastr);
|
||||
}
|
||||
# dbg("askdns: received rr type %s, data: %s", $rr_type, $rr_rdatastr);
|
||||
}
|
||||
|
||||
for my $q_tuple (!ref $queries_ref ? () : @$queries_ref) {
|
||||
next if !$q_tuple;
|
||||
my($query_type, $answer_types_ref, $rules) = @$q_tuple;
|
||||
|
||||
next if !defined $qtype;
|
||||
$answer_types_ref = [$query_type] if !defined $answer_types_ref;
|
||||
|
||||
while (my($rulename,$subtest) = each %$rules) {
|
||||
my $match;
|
||||
local($1,$2,$3);
|
||||
if (ref $subtest eq 'HASH') { # a list of DNS rcodes (as hash keys)
|
||||
@ -609,7 +550,7 @@ sub process_response_packet {
|
||||
} elsif ($rcode != 0) {
|
||||
# skip remaining tests on DNS error
|
||||
} elsif (!defined($rr_type) ||
|
||||
!grep($_ eq 'ANY' || $_ eq $rr_type, @$answer_types_ref) ) {
|
||||
!grep($_ eq 'ANY' || $_ eq $rr_type, @{$arule->{a_types}}) ) {
|
||||
# skip remaining tests on wrong RR type
|
||||
} elsif (!defined $subtest) {
|
||||
$match = 1; # any valid response of the requested RR type matches
|
||||
@ -628,15 +569,10 @@ sub process_response_packet {
|
||||
: 0; # notice int($n1) to fix perl ~5.14 taint bug (Bug 7725)
|
||||
}
|
||||
if ($match) {
|
||||
$self->askdns_hit($pms, $ent->{query_domain}, $qtype,
|
||||
$self->askdns_hit($pms, $ent->{query_domain}, $question[0]->qtype,
|
||||
$rr_rdatastr, $rulename);
|
||||
$rulenames_hit{$rulename} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
# these rules have completed (since they got at least 1 hit)
|
||||
$pms->register_async_rule_finish($_) for keys %rulenames_hit;
|
||||
}
|
||||
|
||||
sub askdns_hit {
|
||||
@ -648,9 +584,11 @@ sub askdns_hit {
|
||||
|
||||
# only the first hit will show in the test log report, even if
|
||||
# an answer section matches more than once - got_hit() handles this
|
||||
$pms->clear_test_state;
|
||||
$pms->test_log(sprintf("%s %s:%s", $query_domain,$qtype,$rr_rdatastr));
|
||||
$pms->test_log(sprintf("%s %s:%s", $query_domain,$qtype,$rr_rdatastr), $rulename);
|
||||
$pms->got_hit($rulename, 'ASKDNS: ', ruletype => 'askdns'); # score=>$score
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_tag_header { 1 } # HEADER() was implemented together with Conf::feature_get_host # Bug 7734
|
||||
|
||||
1;
|
||||
|
553
upstream/lib/Mail/SpamAssassin/Plugin/AuthRes.pm
Normal file
553
upstream/lib/Mail/SpamAssassin/Plugin/AuthRes.pm
Normal file
@ -0,0 +1,553 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::AuthRes - use Authentication-Results header fields
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
=head2 SpamAssassin configuration:
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::AuthRes
|
||||
|
||||
authres_trusted_authserv myserv.example.com
|
||||
authres_networks all
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin parses Authentication-Results header fields and can supply the
|
||||
results obtained to other plugins, so as to avoid repeating checks that have
|
||||
been performed already.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::AuthRes;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# list of valid methods and values
|
||||
# https://www.iana.org/assignments/email-auth/email-auth.xhtml
|
||||
# some others not in that list:
|
||||
# dkim-atps=neutral
|
||||
my %method_result = (
|
||||
'auth' => {'fail'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1},
|
||||
'dkim' => {'fail'=>1,'neutral'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'policy'=>1,'temperror'=>1},
|
||||
'dkim-adsp' => {'discard'=>1,'fail'=>1,'none'=>1,'nxdomain'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1,'unknown'=>1},
|
||||
'dkim-atps' => {'fail'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1,'neutral'=>1},
|
||||
'dmarc' => {'fail'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1},
|
||||
'domainkeys' => {'fail'=>1,'neutral'=>1,'none'=>1,'permerror'=>1,'policy'=>1,'pass'=>1,'temperror'=>1},
|
||||
'iprev' => {'fail'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1},
|
||||
'rrvs' => {'fail'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1,'unknown'=>1},
|
||||
'sender-id' => {'fail'=>1,'hardfail'=>1,'neutral'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'policy'=>1,'softfail'=>1,'temperror'=>1},
|
||||
'smime' => {'fail'=>1,'neutral'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'policy'=>1,'temperror'=>1},
|
||||
'spf' => {'fail'=>1,'hardfail'=>1,'neutral'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'policy'=>1,'softfail'=>1,'temperror'=>1},
|
||||
'vbr' => {'fail'=>1,'none'=>1,'pass'=>1,'permerror'=>1,'temperror'=>1},
|
||||
);
|
||||
my %method_ptype_prop = (
|
||||
'auth' => {'smtp' => {'auth'=>1,'mailfrom'=>1}},
|
||||
'dkim' => {'header' => {'d'=>1,'i'=>1,'b'=>1}},
|
||||
'dkim-adsp' => {'header' => {'from'=>1}},
|
||||
'dkim-atps' => {'header' => {'from'=>1}},
|
||||
'dmarc' => {'header' => {'from'=>1}},
|
||||
'domainkeys' => {'header' => {'d'=>1,'from'=>1,'sender'=>1}},
|
||||
'iprev' => {'policy' => {'iprev'=>1}},
|
||||
'rrvs' => {'smtp' => {'rcptto'=>1}},
|
||||
'sender-id' => {'header' => {'*'=>1}},
|
||||
'smime' => {'body' => {'smime-part'=>1,'smime-identifer'=>1,'smime-serial'=>1,'smime-issuer'=>1}},
|
||||
'spf' => {'smtp' => {'mailfrom'=>1,'helo'=>1}},
|
||||
'vbr' => {'header' => {'md'=>1,'mv'=>1}},
|
||||
);
|
||||
|
||||
# Some MIME helpers
|
||||
my $QUOTED_STRING = qr/"((?:[^"\\]++|\\.)*+)"?/;
|
||||
my $TOKEN = qr/[^\s\x00-\x1f\x80-\xff\(\)\<\>\@\,\;\:\/\[\]\?\=\"]+/;
|
||||
my $ATOM = qr/[a-zA-Z0-9\@\!\#\$\%\&\\\'\*\+\-\/\=\?\^\_\`\{\|\}\~]+/;
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
|
||||
# the usual perlobj boilerplate to create a subclass object
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsa->{conf});
|
||||
|
||||
# process first as other plugins might depend on us
|
||||
$self->register_method_priority("parsed_metadata", -10);
|
||||
|
||||
$self->register_eval_rule("check_authres_result", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
=head1 ADMINISTRATOR OPTIONS
|
||||
|
||||
=over
|
||||
|
||||
=item authres_networks internal/trusted/all (default: internal)
|
||||
|
||||
Process Authenticated-Results headers set by servers from these networks
|
||||
(refers to SpamAssassin *_networks zones). Any header outside this is
|
||||
completely ignored (affects all module settings).
|
||||
|
||||
internal = internal_networks
|
||||
trusted = internal_networks + trusted_networks
|
||||
all = all above + all external
|
||||
|
||||
Setting "all" is safe only if your MX servers filter properly all incoming
|
||||
A-R headers, and you use authres_trusted_authserv to match your authserv-id.
|
||||
This is suitable for default OpenDKIM for example. These settings might
|
||||
also be required if your filters do not insert A-R header to correct
|
||||
position above the internal Received header (some known offenders: OpenDKIM,
|
||||
OpenDMARC, amavisd-milter).
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'authres_networks',
|
||||
is_admin => 1,
|
||||
default => 'internal',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || $value =~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
$value = lc($value);
|
||||
if ($value =~ /^(?:internal|trusted|all)$/) {
|
||||
$self->{authres_networks} = $value;
|
||||
} else {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
=over
|
||||
|
||||
=item authres_trusted_authserv authservid1 id2 ... (default: none)
|
||||
|
||||
Trusted authentication server IDs (the domain-name-like first word of
|
||||
Authentication-Results field, also known as C<authserv-id>).
|
||||
|
||||
Note that if set, ALL A-R headers are ignored unless a match is found.
|
||||
|
||||
Use strongly recommended, possibly along with authres_networks all.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'authres_trusted_authserv',
|
||||
is_admin => 1,
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || $value =~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
foreach my $id (split(/\s+/, lc $value)) {
|
||||
$self->{authres_trusted_authserv}->{$id} = 1;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
=over
|
||||
|
||||
=item authres_ignored_authserv authservid1 id2 ... (default: none)
|
||||
|
||||
Ignored authentication server IDs (the domain-name-like first word of
|
||||
Authentication-Results field, also known as C<authserv-id>).
|
||||
|
||||
Any A-R header is ignored if match is found.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'authres_ignored_authserv',
|
||||
is_admin => 1,
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || $value =~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
foreach my $id (split(/\s+/, lc $value)) {
|
||||
$self->{authres_ignored_authserv}->{$id} = 1;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
=head1 METADATA
|
||||
|
||||
Parsed headers are stored in $pms-E<gt>{authres_parsed}, as a hash of array
|
||||
of hashes where results are collected by method. For example, the header
|
||||
field:
|
||||
|
||||
Authentication-Results: server.example.com;
|
||||
spf=pass smtp.mailfrom=bounce.example.org;
|
||||
dkim=pass header.i=@example.org;
|
||||
dkim=fail header.i=@another.signing.domain.example
|
||||
|
||||
Produces the following structure:
|
||||
|
||||
$pms->{authres_parsed} = {
|
||||
'dkim' => [
|
||||
{
|
||||
'properties' => {
|
||||
'header' => {
|
||||
'i' => '@example.org'
|
||||
}
|
||||
},
|
||||
'authserv' => 'server.example.com',
|
||||
'result' => 'pass',
|
||||
'version' => 1,
|
||||
'reason' => ''
|
||||
},
|
||||
{
|
||||
'properties' => {
|
||||
'header' => {
|
||||
'i' => '@another.signing.domain.example'
|
||||
}
|
||||
},
|
||||
'result' => 'fail',
|
||||
'authserv' => 'server.example.com',
|
||||
'version' => 1,
|
||||
'reason' => ''
|
||||
},
|
||||
],
|
||||
}
|
||||
|
||||
Within each array, the order of results is the original, which should be most
|
||||
recent results first.
|
||||
|
||||
For checking result of methods, $pms-E<gt>{authres_result} is available:
|
||||
|
||||
$pms->{authres_result} = {
|
||||
'dkim' => 'pass',
|
||||
'spf' => 'fail',
|
||||
}
|
||||
|
||||
=head1 EVAL FUNCTIONS
|
||||
|
||||
=over 4
|
||||
|
||||
=item header RULENAME eval:check_authres_result(method, result)
|
||||
|
||||
Can be used to check results.
|
||||
|
||||
ifplugin Mail::SpamAssassin::Plugin::AuthRes
|
||||
ifplugin !(Mail::SpamAssassin::Plugin::SPF)
|
||||
header SPF_PASS eval:check_authres_result('spf', 'pass')
|
||||
header SPF_FAIL eval:check_authres_result('spf', 'fail')
|
||||
header SPF_SOFTFAIL eval:check_authres_result('spf', 'softfail')
|
||||
header SPF_TEMPFAIL eval:check_authres_result('spf', 'tempfail')
|
||||
endif
|
||||
ifplugin !(Mail::SpamAssassin::Plugin::DKIM)
|
||||
header DKIM_VERIFIED eval:check_authres_result('dkim', 'pass')
|
||||
header DKIM_INVALID eval:check_authres_result('dkim', 'fail')
|
||||
endif
|
||||
endif
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub check_authres_result {
|
||||
my ($self, $pms, $method, $wanted_result) = @_;
|
||||
|
||||
my $result = $pms->{authres_result}->{$method};
|
||||
$wanted_result = lc($wanted_result);
|
||||
|
||||
if ($wanted_result eq 'missing') {
|
||||
return !defined($result) ? 1 : 0;
|
||||
}
|
||||
|
||||
return ($wanted_result eq $result);
|
||||
}
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
my @authres;
|
||||
my $nethdr;
|
||||
|
||||
if ($pms->{conf}->{authres_networks} eq 'internal') {
|
||||
$nethdr = 'ALL-INTERNAL';
|
||||
} elsif ($pms->{conf}->{authres_networks} eq 'trusted') {
|
||||
$nethdr = 'ALL-TRUSTED';
|
||||
} else {
|
||||
$nethdr = 'ALL';
|
||||
}
|
||||
|
||||
foreach my $hdr (split(/^/m, $pms->get($nethdr))) {
|
||||
if ($hdr =~ /^(?:Arc\-)?Authentication-Results:\s*(.+)/i) {
|
||||
push @authres, $1;
|
||||
}
|
||||
}
|
||||
|
||||
if (!@authres) {
|
||||
dbg("authres: no Authentication-Results headers found from %s",
|
||||
$pms->{conf}->{authres_networks});
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach (@authres) {
|
||||
eval {
|
||||
$self->parse_authres($pms, $_);
|
||||
} or do {
|
||||
dbg("authres: skipping header, $@");
|
||||
}
|
||||
}
|
||||
|
||||
$pms->{authres_result} = {};
|
||||
# Set $pms->{authres_result} info for all found methods
|
||||
# 'pass' will always win if multiple results
|
||||
foreach my $method (keys %method_result) {
|
||||
my $parsed = $pms->{authres_parsed}->{$method};
|
||||
next if !$parsed;
|
||||
foreach my $pref (@$parsed) {
|
||||
if (!$pms->{authres_result}->{$method} ||
|
||||
$pref->{result} eq 'pass')
|
||||
{
|
||||
$pms->{authres_result}->{$method} = $pref->{result};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (%{$pms->{authres_result}}) {
|
||||
dbg("authres: results: %s",
|
||||
join(' ', map { $_.'='.$pms->{authres_result}->{$_} }
|
||||
sort keys %{$pms->{authres_result}}));
|
||||
} else {
|
||||
dbg("authres: no results");
|
||||
}
|
||||
}
|
||||
|
||||
sub parse_authres {
|
||||
my ($self, $pms, $hdr) = @_;
|
||||
|
||||
dbg("authres: parsing Authentication-Results: $hdr");
|
||||
|
||||
my $authserv;
|
||||
my $version = 1;
|
||||
my @methods;
|
||||
|
||||
local $_ = $hdr;
|
||||
|
||||
# authserv-id
|
||||
if (!/\G($TOKEN)/gcs) {
|
||||
die("invalid authserv\n");
|
||||
}
|
||||
$authserv = lc($1);
|
||||
|
||||
if (%{$pms->{conf}->{authres_trusted_authserv}}) {
|
||||
if (!$pms->{conf}->{authres_trusted_authserv}->{$authserv}) {
|
||||
die("authserv not trusted: $authserv\n");
|
||||
}
|
||||
}
|
||||
if ($pms->{conf}->{authres_ignored_authserv}->{$authserv}) {
|
||||
die("ignored authserv: $authserv\n");
|
||||
}
|
||||
|
||||
skip_cfws();
|
||||
if (/\G\d+/gcs) { # skip authserv version
|
||||
skip_cfws();
|
||||
}
|
||||
if (!/\G;/gcs) {
|
||||
die("missing delimiter\n");
|
||||
}
|
||||
skip_cfws();
|
||||
|
||||
while (pos() < length()) {
|
||||
my ($method, $result);
|
||||
my $reason = '';
|
||||
my $props = {};
|
||||
|
||||
# skip none method
|
||||
if (/\Gnone\b/igcs) {
|
||||
die("method none\n");
|
||||
}
|
||||
|
||||
# method / version = result
|
||||
if (!/\G([\w-]+)/gcs) {
|
||||
die("invalid method\n");
|
||||
}
|
||||
$method = lc($1);
|
||||
if (!exists $method_result{$method}) {
|
||||
die("unknown method: $method\n");
|
||||
}
|
||||
skip_cfws();
|
||||
if (/\G\//gcs) {
|
||||
skip_cfws();
|
||||
if (!/\G\d+/gcs) {
|
||||
die("invalid $method version\n");
|
||||
}
|
||||
$version = $1;
|
||||
skip_cfws();
|
||||
}
|
||||
if (!/\G=/gcs) {
|
||||
die("missing result for $method: ".substr($_, pos())."\n");
|
||||
}
|
||||
skip_cfws();
|
||||
if (!/\G(\w+)/gcs) {
|
||||
die("invalid result for $method\n");
|
||||
}
|
||||
$result = $1;
|
||||
if (!exists $method_result{$method}{$result}) {
|
||||
die("unknown result for $method: $result\n");
|
||||
}
|
||||
skip_cfws();
|
||||
|
||||
# reason = value
|
||||
if (/\Greason\b/igcs) {
|
||||
skip_cfws();
|
||||
if (!/\G=/gcs) {
|
||||
die("invalid reason\n");
|
||||
}
|
||||
skip_cfws();
|
||||
if (!/\G$QUOTED_STRING|($TOKEN)/gcs) {
|
||||
die("invalid reason\n");
|
||||
}
|
||||
$reason = defined $1 ? $1 : $2;
|
||||
skip_cfws();
|
||||
}
|
||||
|
||||
# ptype.property = value
|
||||
while (pos() < length()) {
|
||||
my ($ptype, $property, $value);
|
||||
|
||||
# ptype
|
||||
if (!/\G(\w+)/gcs) {
|
||||
die("invalid ptype: ".substr($_,pos())."\n");
|
||||
}
|
||||
$ptype = lc($1);
|
||||
if (!exists $method_ptype_prop{$method}{$ptype}) {
|
||||
die("unknown ptype: $ptype\n");
|
||||
}
|
||||
skip_cfws();
|
||||
|
||||
# dot
|
||||
if (!/\G\./gcs) {
|
||||
die("missing property\n");
|
||||
}
|
||||
skip_cfws();
|
||||
|
||||
# property
|
||||
if (!/\G(\w+)/gcs) {
|
||||
die("invalid property\n");
|
||||
}
|
||||
$property = lc($1);
|
||||
if (!exists $method_ptype_prop{$method}{$ptype}{$property} &&
|
||||
!exists $method_ptype_prop{$method}{$ptype}{'*'}) {
|
||||
die("unknown property for $ptype: $property\n");
|
||||
}
|
||||
skip_cfws();
|
||||
|
||||
# =
|
||||
if (!/\G=/gcs) {
|
||||
die("missing property value\n");
|
||||
}
|
||||
skip_cfws();
|
||||
|
||||
# value:
|
||||
# The grammar is ( value / [ [ local-part ] "@" ] domain-name )
|
||||
# where value := token / quoted-string
|
||||
# and local-part := dot-atom / quoted-string / obs-local-part
|
||||
if (!/\G$QUOTED_STRING|($ATOM(?:\.$ATOM)*|$TOKEN)(?=(?:[\s;]|$))/gcs) {
|
||||
die("invalid $ptype.$property value\n");
|
||||
}
|
||||
$value = defined $1 ? $1 : $2;
|
||||
skip_cfws();
|
||||
|
||||
$props->{$ptype}->{$property} = $value;
|
||||
|
||||
if (/\G(?:;|$)/gcs) {
|
||||
skip_cfws();
|
||||
last;
|
||||
}
|
||||
}
|
||||
|
||||
push @methods, [$method, {
|
||||
'authserv' => $authserv,
|
||||
'version' => $version,
|
||||
'result' => $result,
|
||||
'reason' => $reason,
|
||||
'properties' => $props,
|
||||
}];
|
||||
}
|
||||
|
||||
# paranoid check..
|
||||
if (pos() < length()) {
|
||||
die("parse ended prematurely?\n");
|
||||
}
|
||||
|
||||
# Pushed to pms only if header parsed completely
|
||||
foreach my $marr (@methods) {
|
||||
push @{$pms->{authres_parsed}->{$marr->[0]}}, $marr->[1];
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
# skip whitespace and comments
|
||||
sub skip_cfws {
|
||||
/\G\s*/gcs;
|
||||
if (/\G\(/gcs) {
|
||||
my $i = 1;
|
||||
while (/\G.*?([()]|\z)/gcs) {
|
||||
$1 eq ')' ? $i-- : $i++;
|
||||
last if !$i;
|
||||
}
|
||||
die("comment not ended\n") if $i;
|
||||
/\G\s*/gcs;
|
||||
}
|
||||
}
|
||||
|
||||
#sub check_cleanup {
|
||||
# my ($self, $opts) = @_;
|
||||
# my $pms = $opts->{permsgstatus};
|
||||
# use Data::Dumper;
|
||||
# print STDERR Dumper($pms->{authres_parsed});
|
||||
# print STDERR Dumper($pms->{authres_result});
|
||||
#}
|
||||
|
||||
1;
|
@ -105,7 +105,11 @@ Note: SpamAssassin requires at least 3 points from the header, and 3
|
||||
points from the body to auto-learn as spam. Therefore, the minimum
|
||||
working value for this option is 6.
|
||||
|
||||
If the test option autolearn_force is set, the minimum value will
|
||||
If test option C<autolearn_header> or C<autolearn_body> is set, points from
|
||||
that rule are forced to count as coming from header or body accordingly.
|
||||
This can be useful for adjusting some meta rules.
|
||||
|
||||
If the test option C<autolearn_force> is set, the minimum value will
|
||||
remain at 6 points but there is no requirement that the points come
|
||||
from body and header rules. This option is useful for autolearning
|
||||
with rules that are considered to be extremely safe indicators of
|
||||
|
@ -37,7 +37,43 @@ And the chi-square probability combiner as described here:
|
||||
|
||||
The results are incorporated into SpamAssassin as the BAYES_* rules.
|
||||
|
||||
=head1 METHODS
|
||||
=head1 ADMINISTRATOR SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item bayes_stopword_languages lang (default: en)
|
||||
|
||||
Languages enabled in bayes stopwords processing, every language have a
|
||||
default stopwords regexp, tokens matching this regular expression will not
|
||||
be considered in bayes processing.
|
||||
|
||||
Custom regular expressions for additional languages can be defined in C<local.cf>.
|
||||
|
||||
Custom regular expressions can be specified by using the C<bayes_stopword_lang>
|
||||
keyword like in the following example:
|
||||
|
||||
bayes_stopword_languages en se
|
||||
bayes_stopword_en (?:you|me)
|
||||
bayes_stopword_se (?:du|mig)
|
||||
|
||||
Regexps are case-insensitive will be anchored automatically at beginning and
|
||||
end.
|
||||
|
||||
To disable stopwords usage, specify C<bayes_stopword_languages disable>.
|
||||
|
||||
Only one bayes_stopword_languages or bayes_stopword_xx configuration line
|
||||
can be used. New configuration line will override the old one, for example
|
||||
the ones from SpamAssassin default ruleset (60_bayes_stopwords.cf).
|
||||
|
||||
=back
|
||||
|
||||
=over 4
|
||||
|
||||
=item bayes_max_token_length (default: 15)
|
||||
|
||||
Configure the maximum number of character a token could contain
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
@ -48,16 +84,12 @@ use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1 sha1_hex); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1 sha1_hex) }
|
||||
}
|
||||
use Digest::SHA qw(sha1 sha1_hex);
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::PerMsgStatus;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp untaint_var);
|
||||
|
||||
# pick ONLY ONE of these combining implementations.
|
||||
use Mail::SpamAssassin::Bayes::CombineChi;
|
||||
@ -135,13 +167,16 @@ our $IGNORED_HDRS = qr{(?: (?:X-)?Sender # misc noise
|
||||
| X-Gnus-Mail-Source
|
||||
| Xref
|
||||
|
||||
)}x;
|
||||
)}ix;
|
||||
|
||||
# Note only the presence of these headers, in order to reduce the
|
||||
# hapaxen they generate.
|
||||
our $MARK_PRESENCE_ONLY_HDRS = qr{(?: X-Face
|
||||
|X-(?:Gnu-?PG|PGP|GPG)(?:-Key)?-Fingerprint
|
||||
|D(?:KIM|omainKey)-Signature
|
||||
|X-Google-DKIM-Signature
|
||||
|ARC-(?:Message-Signature|Seal)
|
||||
|Autocrypt
|
||||
)}ix;
|
||||
|
||||
# tweaks tested as of Nov 18 2002 by jm posted to -devel at
|
||||
@ -220,6 +255,8 @@ use constant REQUIRE_SIGNIFICANT_TOKENS_TO_SCORE => -1;
|
||||
|
||||
# How long a token should we hold onto? (note: German speakers typically
|
||||
# will require a longer token than English ones.)
|
||||
# This is just a default value, option can be changed using
|
||||
# bayes_max_token_length option
|
||||
use constant MAX_TOKEN_LENGTH => 15;
|
||||
|
||||
###########################################################################
|
||||
@ -236,10 +273,100 @@ sub new {
|
||||
$self->{conf} = $main->{conf};
|
||||
$self->{use_ignores} = 1;
|
||||
|
||||
$self->register_eval_rule("check_bayes");
|
||||
# Old default stopword list, need to have hardcoded one incase sa-update is not available
|
||||
$self->{bayes_stopword}{en} = qr/(?:a(?:ble|l(?:ready|l)|n[dy]|re)|b(?:ecause|oth)|c(?:an|ome)|e(?:ach|mail|ven)|f(?:ew|irst|or|rom)|give|h(?:a(?:ve|s)|ttp)|i(?:n(?:formation|to)|t\'s)|just|know|l(?:ike|o(?:ng|ok))|m(?:a(?:de|il(?:(?:ing|to))?|ke|ny)|o(?:re|st)|uch)|n(?:eed|o[tw]|umber)|o(?:ff|n(?:ly|e)|ut|wn)|p(?:eople|lace)|right|s(?:ame|ee|uch)|t(?:h(?:at|is|rough|e)|ime)|using|w(?:eb|h(?:ere|y)|ith(?:out)?|or(?:ld|k))|y(?:ears?|ou(?:(?:\'re|r))?))/;
|
||||
|
||||
$self->set_config($self->{conf});
|
||||
$self->register_eval_rule("check_bayes", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'bayes_max_token_length',
|
||||
default => MAX_TOKEN_LENGTH,
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'bayes_stopword_languages',
|
||||
default => ['en'],
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRINGLIST,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
my @langs;
|
||||
if ($value eq 'disable') {
|
||||
@{$self->{bayes_stopword_languages}} = ();
|
||||
}
|
||||
else {
|
||||
foreach my $lang (split(/(?:\s*,\s*|\s+)/, lc($value))) {
|
||||
if ($lang !~ /^([a-z]{2})$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
push @langs, $lang;
|
||||
}
|
||||
if (!@langs) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
@{$self->{bayes_stopword_languages}} = @langs;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub parse_config {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
# Ignore users's configuration lines
|
||||
return 0 if $opts->{user_config};
|
||||
|
||||
if ($opts->{key} =~ /^bayes_stopword_([a-z]{2})$/i) {
|
||||
$self->inhibit_further_callbacks();
|
||||
my $lang = lc($1);
|
||||
foreach my $re (split(/\s+/, $opts->{value})) {
|
||||
my ($rec, $err) = compile_regexp('^(?i)'.$re.'$', 0);
|
||||
if (!$rec) {
|
||||
warn "bayes: invalid regexp for $opts->{key}: $err\n";
|
||||
return 0;
|
||||
}
|
||||
$self->{bayes_stopword}{$lang} = $rec;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub finish_parsing_end {
|
||||
my ($self, $opts) = @_;
|
||||
my $conf = $opts->{conf};
|
||||
|
||||
my @langs;
|
||||
foreach my $lang (@{$conf->{bayes_stopword_languages}}) {
|
||||
if (defined $self->{bayes_stopword}{$lang}) {
|
||||
push @langs, $lang;
|
||||
} else {
|
||||
warn "bayes: missing stopwords regexp for language '$lang'\n";
|
||||
}
|
||||
}
|
||||
if (@langs) {
|
||||
dbg("bayes: stopwords for languages enabled: ".join(' ', @langs));
|
||||
@{$conf->{bayes_stopword_languages}} = @langs;
|
||||
} else {
|
||||
dbg("bayes: no stopword languages enabled");
|
||||
$conf->{bayes_stopword_languages} = [];
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub finish {
|
||||
my $self = shift;
|
||||
if ($self->{store}) {
|
||||
@ -410,10 +537,11 @@ sub _learn_trapped {
|
||||
my @msgid = ( $msgid );
|
||||
|
||||
if (!defined $msgid) {
|
||||
@msgid = $self->get_msgid($msg);
|
||||
@msgid = ( $msg->generate_msgid(), $msg->get_msgid() );
|
||||
}
|
||||
|
||||
foreach my $msgid_t ( @msgid ) {
|
||||
next if !defined $msgid_t;
|
||||
my $seen = $self->{store}->seen_get ($msgid_t);
|
||||
|
||||
if (defined ($seen)) {
|
||||
@ -545,7 +673,7 @@ sub _forget_trapped {
|
||||
my $isspam;
|
||||
|
||||
if (!defined $msgid) {
|
||||
@msgid = $self->get_msgid($msg);
|
||||
@msgid = ( $msg->generate_msgid(), $msg->get_msgid() );
|
||||
}
|
||||
|
||||
while( $msgid = shift @msgid ) {
|
||||
@ -671,7 +799,6 @@ sub learner_is_scan_available {
|
||||
|
||||
sub scan {
|
||||
my ($self, $permsgstatus, $msg) = @_;
|
||||
my $score;
|
||||
|
||||
return unless $self->{conf}->{use_learner};
|
||||
|
||||
@ -756,6 +883,7 @@ sub scan {
|
||||
if (@pw_keys > N_SIGNIFICANT_TOKENS) { $#pw_keys = N_SIGNIFICANT_TOKENS - 1 }
|
||||
|
||||
my @sorted;
|
||||
my $score;
|
||||
foreach my $tok (@pw_keys) {
|
||||
next if $tok_strength{$tok} <
|
||||
$Mail::SpamAssassin::Bayes::Combine::MIN_PROB_STRENGTH;
|
||||
@ -964,49 +1092,6 @@ sub learner_dump_database {
|
||||
###########################################################################
|
||||
# TODO: these are NOT public, but the test suite needs to call them.
|
||||
|
||||
sub get_msgid {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
my @msgid;
|
||||
|
||||
my $msgid = $msg->get_header("Message-Id");
|
||||
if (defined $msgid && $msgid ne '' && $msgid !~ /^\s*<\s*(?:\@sa_generated)?>.*$/) {
|
||||
# remove \r and < and > prefix/suffixes
|
||||
chomp $msgid;
|
||||
$msgid =~ s/^<//; $msgid =~ s/>.*$//g;
|
||||
push(@msgid, $msgid);
|
||||
}
|
||||
|
||||
# Modified 2012-01-17 per bug 5185 to remove last received from msg_id calculation
|
||||
|
||||
# Use sha1_hex(Date: and top N bytes of body)
|
||||
# where N is MIN(1024 bytes, 1/2 of body length)
|
||||
#
|
||||
my $date = $msg->get_header("Date");
|
||||
$date = "None" if (!defined $date || $date eq ''); # No Date?
|
||||
|
||||
#Removed per bug 5185
|
||||
#my @rcvd = $msg->get_header("Received");
|
||||
#my $rcvd = $rcvd[$#rcvd];
|
||||
#$rcvd = "None" if (!defined $rcvd || $rcvd eq ''); # No Received?
|
||||
|
||||
# Make a copy since pristine_body is a reference ...
|
||||
my $body = join('', $msg->get_pristine_body());
|
||||
|
||||
if (length($body) > 64) { # Small Body?
|
||||
my $keep = ( length $body > 2048 ? 1024 : int(length($body) / 2) );
|
||||
substr($body, $keep) = '';
|
||||
}
|
||||
|
||||
#Stripping all CR and LF so that testing midstream from MTA and post delivery don't
|
||||
#generate different id's simply because of LF<->CR<->CRLF changes.
|
||||
$body =~ s/[\r\n]//g;
|
||||
|
||||
unshift(@msgid, sha1_hex($date."\000".$body).'@sa_generated');
|
||||
|
||||
return wantarray ? @msgid : $msgid[0];
|
||||
}
|
||||
|
||||
sub get_body_from_msg {
|
||||
my ($self, $msg) = @_;
|
||||
|
||||
@ -1024,7 +1109,7 @@ sub get_body_from_msg {
|
||||
|
||||
if (!defined $msgdata) {
|
||||
# why?!
|
||||
warn "bayes: failed to get body for ".scalar($self->get_msgid($self->{msg}))."\n";
|
||||
warn "bayes: failed to get body for ".scalar($self->{msg}->generate_msgid())."\n";
|
||||
return { };
|
||||
}
|
||||
|
||||
@ -1052,8 +1137,10 @@ sub _get_msgdata_from_permsgstatus {
|
||||
# The calling functions expect a uniq'ed array of tokens ...
|
||||
sub tokenize {
|
||||
my ($self, $msg, $msgdata) = @_;
|
||||
my $conf = $self->{conf};
|
||||
my $t_src = $conf->{bayes_token_sources};
|
||||
|
||||
my $t_src = $self->{conf}->{bayes_token_sources};
|
||||
$self->{stopword_cache} = ();
|
||||
|
||||
# visible tokens from the body
|
||||
my @tokens_body;
|
||||
@ -1117,6 +1204,8 @@ sub tokenize {
|
||||
dbg("bayes: tokenized header: %d tokens", scalar @tokens_header);
|
||||
}
|
||||
|
||||
delete $self->{stopword_cache};
|
||||
|
||||
# Go ahead and uniq the array, skip null tokens (can happen sometimes)
|
||||
# generate an SHA1 hash and take the lower 40 bits as our token
|
||||
my %tokens;
|
||||
@ -1137,6 +1226,7 @@ sub _tokenize_line {
|
||||
my $region = $_[3];
|
||||
local ($_) = $_[1];
|
||||
|
||||
my $conf = $self->{conf};
|
||||
my @rettokens;
|
||||
|
||||
# include quotes, .'s and -'s for URIs, and [$,]'s for Nigerian-scam strings,
|
||||
@ -1174,7 +1264,7 @@ sub _tokenize_line {
|
||||
# cleared, even if the source string has perl characters semantics !!!
|
||||
# Is this really still desirable?
|
||||
|
||||
foreach my $token (split) {
|
||||
TOKEN: foreach my $token (split) {
|
||||
$token =~ s/^[-'"\.,]+//; # trim non-alphanum chars at start or end
|
||||
$token =~ s/[-'"\.,]+$//; # so we don't get loads of '"foo' tokens
|
||||
|
||||
@ -1183,7 +1273,7 @@ sub _tokenize_line {
|
||||
# tokens, so the SQL BayesStore returns undef. I really want a way
|
||||
# of optimizing that out, but I haven't come up with anything yet.
|
||||
#
|
||||
next if ( defined $magic_re && $token =~ /$magic_re/ );
|
||||
next if ( defined $magic_re && $token =~ /$magic_re/o );
|
||||
|
||||
# *do* keep 3-byte tokens; there's some solid signs in there
|
||||
my $len = length($token);
|
||||
@ -1192,8 +1282,24 @@ sub _tokenize_line {
|
||||
# area, and it just slows us down to record them.
|
||||
# See http://wiki.apache.org/spamassassin/BayesStopList for more info.
|
||||
#
|
||||
next if $len < 3 ||
|
||||
($token =~ /^(?:a(?:ble|l(?:ready|l)|n[dy]|re)|b(?:ecause|oth)|c(?:an|ome)|e(?:ach|mail|ven)|f(?:ew|irst|or|rom)|give|h(?:a(?:ve|s)|ttp)|i(?:n(?:formation|to)|t\'s)|just|know|l(?:ike|o(?:ng|ok))|m(?:a(?:de|il(?:(?:ing|to))?|ke|ny)|o(?:re|st)|uch)|n(?:eed|o[tw]|umber)|o(?:ff|n(?:ly|e)|ut|wn)|p(?:eople|lace)|right|s(?:ame|ee|uch)|t(?:h(?:at|is|rough|e)|ime)|using|w(?:eb|h(?:ere|y)|ith(?:out)?|or(?:ld|k))|y(?:ears?|ou(?:(?:\'re|r))?))$/i);
|
||||
next if $len < 3;
|
||||
|
||||
# check stopwords regexp if not cached
|
||||
if (@{$conf->{bayes_stopword_languages}}) {
|
||||
if (!exists $self->{stopword_cache}{$token}) {
|
||||
foreach my $lang (@{$conf->{bayes_stopword_languages}}) {
|
||||
if ($token =~ $self->{bayes_stopword}{$lang}) {
|
||||
dbg("bayes: skipped token '$token' because it's in stopword list for language '$lang'");
|
||||
$self->{stopword_cache}{$token} = 1;
|
||||
next TOKEN;
|
||||
}
|
||||
}
|
||||
$self->{stopword_cache}{$token} = 0;
|
||||
} else {
|
||||
# bail out if cached known
|
||||
next if $self->{stopword_cache}{$token};
|
||||
}
|
||||
}
|
||||
|
||||
# are we in the body? If so, apply some body-specific breakouts
|
||||
if ($region == 1 || $region == 2) {
|
||||
@ -1212,7 +1318,7 @@ sub _tokenize_line {
|
||||
# used as part of split tokens such as "HTo:D*net" indicating that
|
||||
# the domain ".net" appeared in the To header.
|
||||
#
|
||||
if ($len > MAX_TOKEN_LENGTH && $token !~ /\*/) {
|
||||
if ($len > $conf->{bayes_max_token_length} && index($token, '*') == -1) {
|
||||
|
||||
if (TOKENIZE_LONG_8BIT_SEQS_AS_UTF8_CHARS && $token =~ /[\x80-\xBF]{2}/) {
|
||||
# Bug 7135
|
||||
@ -1287,9 +1393,6 @@ sub _tokenize_headers {
|
||||
|
||||
my %parsed;
|
||||
|
||||
my %user_ignore;
|
||||
$user_ignore{lc $_} = 1 for @{$self->{main}->{conf}->{bayes_ignore_headers}};
|
||||
|
||||
# get headers in array context
|
||||
my @hdrs;
|
||||
my @rcvdlines;
|
||||
@ -1317,7 +1420,7 @@ sub _tokenize_headers {
|
||||
|
||||
# remove user-specified headers here, after Received, in case they
|
||||
# want to ignore that too
|
||||
next if exists $user_ignore{lc $hdr};
|
||||
next if exists $self->{conf}->{bayes_ignore_header}->{lc $hdr};
|
||||
|
||||
# Prep the header value
|
||||
$val ||= '';
|
||||
@ -1344,22 +1447,38 @@ sub _tokenize_headers {
|
||||
elsif ($hdr =~ /^${MARK_PRESENCE_ONLY_HDRS}$/i) {
|
||||
$val = "1"; # just mark the presence, they create lots of hapaxen
|
||||
}
|
||||
elsif ($hdr =~ /^x-spam-relays-(?:external|internal|trusted|untrusted)$/) {
|
||||
# remove redundant rdns helo ident envfrom intl auth msa words
|
||||
$val =~ s/ [a-z]+=/ /g;
|
||||
}
|
||||
|
||||
if (MAP_HEADERS_MID) {
|
||||
if ($hdr =~ /^(?:In-Reply-To|References|Message-ID)$/i) {
|
||||
if (exists $parsed{"*MI"}) {
|
||||
$parsed{"*MI"} .= " ".$val;
|
||||
} else {
|
||||
$parsed{"*MI"} = $val;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (MAP_HEADERS_FROMTOCC) {
|
||||
if ($hdr =~ /^(?:From|To|Cc)$/i) {
|
||||
if (exists $parsed{"*Ad"}) {
|
||||
$parsed{"*Ad"} .= " ".$val;
|
||||
} else {
|
||||
$parsed{"*Ad"} = $val;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (MAP_HEADERS_USERAGENT) {
|
||||
if ($hdr =~ /^(?:X-Mailer|User-Agent)$/i) {
|
||||
if (exists $parsed{"*UA"}) {
|
||||
$parsed{"*UA"} .= " ".$val;
|
||||
} else {
|
||||
$parsed{"*UA"} = $val;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# replace hdr name with "compressed" version if possible
|
||||
if (defined $HEADER_NAME_COMPRESSION{$hdr}) {
|
||||
@ -1371,11 +1490,13 @@ sub _tokenize_headers {
|
||||
} else {
|
||||
$parsed{$hdr} = $val;
|
||||
}
|
||||
}
|
||||
|
||||
if (would_log('dbg', 'bayes') > 1) {
|
||||
foreach my $hdr (sort keys %parsed) {
|
||||
dbg("bayes: header tokens for $hdr = \"$parsed{$hdr}\"");
|
||||
}
|
||||
}
|
||||
|
||||
return %parsed;
|
||||
}
|
||||
|
||||
@ -1393,7 +1514,7 @@ sub _pre_chew_content_type {
|
||||
}
|
||||
|
||||
# stop-list words for Content-Type header: these wind up totally gray
|
||||
$val =~ s/\b(?:text|charset)\b//;
|
||||
$val =~ s/\b(?:text|charset)\b/ /g;
|
||||
|
||||
$val;
|
||||
}
|
||||
@ -1468,10 +1589,17 @@ sub _pre_chew_addr_header {
|
||||
my ($self, $val) = @_;
|
||||
local ($_);
|
||||
|
||||
my @addrs = $self->{main}->find_all_addrs_in_line ($val);
|
||||
my @addrs = Mail::SpamAssassin::Util::parse_header_addresses($val);
|
||||
my @toks;
|
||||
foreach (@addrs) {
|
||||
push (@toks, $self->_tokenize_mail_addrs ($_));
|
||||
foreach my $addr (@addrs) {
|
||||
if (defined $addr->{phrase}) {
|
||||
foreach (split(/\s+/, $addr->{phrase})) {
|
||||
push @toks, "N*".$_; # Bug 6319
|
||||
}
|
||||
}
|
||||
if (defined $addr->{address}) {
|
||||
push @toks, $self->_tokenize_mail_addrs($addr->{address});
|
||||
}
|
||||
}
|
||||
return join (' ', @toks);
|
||||
}
|
||||
|
@ -39,16 +39,16 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("multipart_alternative_difference");
|
||||
$self->register_eval_rule("multipart_alternative_difference_count");
|
||||
$self->register_eval_rule("check_blank_line_ratio");
|
||||
$self->register_eval_rule("tvd_vertical_words");
|
||||
$self->register_eval_rule("check_stock_info");
|
||||
$self->register_eval_rule("check_body_length");
|
||||
$self->register_eval_rule("multipart_alternative_difference", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("multipart_alternative_difference_count", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_blank_line_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("tvd_vertical_words", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_stock_info", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_body_length", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
$self->register_eval_rule("plaintext_body_length");
|
||||
$self->register_eval_rule("plaintext_sig_length");
|
||||
$self->register_eval_rule("plaintext_body_sig_ratio");
|
||||
$self->register_eval_rule("plaintext_body_length", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("plaintext_sig_length", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("plaintext_body_sig_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -69,7 +69,7 @@ sub multipart_alternative_difference_count {
|
||||
my ($self, $pms, $fulltext, $ratio, $minhtml) = @_;
|
||||
$self->_multipart_alternative_difference($pms) unless (exists $pms->{madiff});
|
||||
return 0 unless $pms->{madiff_html} > $minhtml;
|
||||
return(($pms->{madiff_text} / $pms->{madiff_html}) > $ratio);
|
||||
return (($pms->{madiff_text} / $pms->{madiff_html}) > $ratio);
|
||||
}
|
||||
|
||||
sub _multipart_alternative_difference {
|
||||
@ -81,7 +81,7 @@ sub _multipart_alternative_difference {
|
||||
my $msg = $pms->{msg};
|
||||
|
||||
# Find all multipart/alternative parts in the message
|
||||
my @ma = $msg->find_parts(qr@^multipart/alternative\b@i);
|
||||
my @ma = $msg->find_parts(qr@^multipart/alternative\b@);
|
||||
|
||||
# If there are no multipart/alternative sections, skip this test.
|
||||
return if (!@ma);
|
||||
@ -104,7 +104,7 @@ sub _multipart_alternative_difference {
|
||||
my %text;
|
||||
|
||||
# limit our search to text-based parts
|
||||
my @txt = $part->find_parts(qr@^text\b@i);
|
||||
my @txt = $part->find_parts(qr@^text\b@);
|
||||
foreach my $text (@txt) {
|
||||
# we only care about the rendered version of the part
|
||||
my ($type, $rnd) = $text->rendered();
|
||||
@ -123,7 +123,7 @@ sub _multipart_alternative_difference {
|
||||
}
|
||||
|
||||
# If there are no words, mark if there's at least 1 image ...
|
||||
if (!%html && exists $pms->{html}{inside}{img}) {
|
||||
if (!%html && exists $text->{html_results}{inside}{img}) {
|
||||
# Use "\n" as the mark since it can't ever occur normally
|
||||
$html{"\n"}=1;
|
||||
}
|
||||
@ -222,7 +222,7 @@ sub tvd_vertical_words {
|
||||
}
|
||||
|
||||
dbg("eval: tvd_vertical_words value: $pms->{tvd_vertical_words} / min: $min / max: $max - value must be >= min and < max");
|
||||
return 1 if ($pms->{tvd_vertical_words} >= $min && $pms->{tvd_vertical_words} < $max);
|
||||
return ($pms->{tvd_vertical_words} >= $min && $pms->{tvd_vertical_words} < $max);
|
||||
}
|
||||
|
||||
sub check_stock_info {
|
||||
@ -241,7 +241,7 @@ sub _check_stock_info {
|
||||
$pms->{stock_info} = 0;
|
||||
|
||||
# Find all multipart/alternative parts in the message
|
||||
my @parts = $pms->{msg}->find_parts(qr@^text/plain$@i);
|
||||
my @parts = $pms->{msg}->find_parts(qr@^text/plain$@);
|
||||
return if (!@parts);
|
||||
|
||||
# Go through each of the multipart parts
|
||||
@ -360,11 +360,17 @@ sub _plaintext_body_sig_ratio {
|
||||
|
||||
# Find the last occurence of a signature delimiter and get the body and
|
||||
# signature lengths.
|
||||
my ($len_b, $len_s) = map { length } $text =~ /(^|.*\n)-- \n(.*?)$/s;
|
||||
|
||||
if (! defined $len_b) { # no sig marker, all body
|
||||
$len_b = length $text;
|
||||
$len_s = 0;
|
||||
my $len_b = length($text);
|
||||
my $len_s = 0;
|
||||
|
||||
while ($text =~ /^-- ?\r?$/mg) {
|
||||
|
||||
# ignore decoy marker at the end
|
||||
next if ( length($text) - $+[0] <= 4 );
|
||||
|
||||
$len_b = $-[0];
|
||||
$len_s = length($text) - $+[0];
|
||||
}
|
||||
|
||||
$pms->{plaintext_body_sig_ratio}->{body_length} = $len_b;
|
||||
|
@ -119,6 +119,12 @@ sub extract_set {
|
||||
my $nicepri = $pri; $nicepri =~ s/-/neg/g;
|
||||
$self->extract_set_pri($conf, $test_set->{$pri}, $ruletype.'_'.$nicepri);
|
||||
}
|
||||
|
||||
# Clear extract_hints tmpfile
|
||||
if ($self->{tmpf}) {
|
||||
unlink $self->{tmpf};
|
||||
delete $self->{tmpf};
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
@ -190,11 +196,13 @@ NEXT_RULE:
|
||||
next NEXT_RULE;
|
||||
}
|
||||
|
||||
# ignore ReplaceTags rules
|
||||
my $is_a_replacetags_rule = $conf->{replace_rules}->{$name};
|
||||
# ignore ReplaceTags rules, and regex capture template rules
|
||||
my $is_a_replace_rule = $conf->{replace_rules}->{$name} ||
|
||||
$conf->{capture_rules}->{$name} ||
|
||||
$conf->{capture_template_rules}->{$name};
|
||||
my ($minlen, $lossy, @bases);
|
||||
|
||||
if (!$is_a_replacetags_rule) {
|
||||
if (!$is_a_replace_rule) {
|
||||
eval { # catch die()s
|
||||
my ($qr, $mods) = $self->simplify_and_qr_regexp($rule);
|
||||
($lossy, @bases) = $self->extract_hints($rule, $qr, $mods);
|
||||
@ -202,6 +210,7 @@ NEXT_RULE:
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
$eval_stat =~ s/ at .*//s;
|
||||
dbg("zoom: giving up on regexp: $eval_stat");
|
||||
};
|
||||
|
||||
@ -220,9 +229,9 @@ NEXT_RULE:
|
||||
}
|
||||
}
|
||||
|
||||
if ($is_a_replacetags_rule || !$minlen || !@bases) {
|
||||
if ($is_a_replace_rule || !$minlen || !@bases) {
|
||||
dbg("zoom: ignoring rule %s, %s", $name,
|
||||
$is_a_replacetags_rule ? 'is a replace rule'
|
||||
$is_a_replace_rule ? 'is a replace rule'
|
||||
: !@bases ? 'no bases' : 'no minlen');
|
||||
push @failed, { orig => $rule };
|
||||
$cached->{rule_bases}->{$cachekey} = { };
|
||||
@ -469,7 +478,7 @@ sub simplify_and_qr_regexp {
|
||||
}
|
||||
else {
|
||||
die "case-i" if $rule =~ /\(\?i\)/;
|
||||
die "case-i" if $mods =~ /i/;
|
||||
die "case-i" if index($mods, 'i') >= 0;
|
||||
|
||||
# always case-i: /A(?i:ct) N(?i:ow)/ => /Act Now/
|
||||
$rule =~ s/(?<!\\)\(\?i\:(.*?)\)/$1/gs and die "case-i";
|
||||
@ -504,10 +513,7 @@ sub simplify_and_qr_regexp {
|
||||
}
|
||||
|
||||
sub extract_hints {
|
||||
my $self = shift;
|
||||
my $rawrule = shift;
|
||||
my $rule = shift;
|
||||
my $mods = shift;
|
||||
my ($self, $rawrule, $rule, $mods) = @_;
|
||||
|
||||
my $main = $self->{main};
|
||||
my $orig = $rule;
|
||||
@ -534,24 +540,30 @@ sub extract_hints {
|
||||
# r? => (r|)
|
||||
$rule =~ s/(?<!\\)(\w)\?/\($1\|\)/gs;
|
||||
|
||||
my ($tmpf, $tmpfh) = Mail::SpamAssassin::Util::secure_tmpfile();
|
||||
# Create single tmpfile for extract_hints to use, instead of thousands
|
||||
if (!$self->{tmpf}) {
|
||||
($self->{tmpf}, my $tmpfh) = Mail::SpamAssassin::Util::secure_tmpfile();
|
||||
$tmpfh or die "failed to create a temporary file";
|
||||
untaint_var(\$tmpf);
|
||||
close $tmpfh;
|
||||
$self->{tmpf} = untaint_var($self->{tmpf});
|
||||
}
|
||||
|
||||
open(my $tmpfh, '>'.$self->{tmpf})
|
||||
or die "error opening $self->{tmpf}: $!";
|
||||
binmode $tmpfh;
|
||||
print $tmpfh "use bytes; m{" . $rule . "}" . $mods
|
||||
or die "error writing to $tmpf: $!";
|
||||
close $tmpfh or die "error closing $tmpf: $!";
|
||||
or die "error writing to $self->{tmpf}: $!";
|
||||
close $tmpfh or die "error closing $self->{tmpf}: $!";
|
||||
|
||||
my $perl = $self->get_perl();
|
||||
$self->{perl} = $self->get_perl() if !exists $self->{perl};
|
||||
local *IN;
|
||||
open (IN, "$perl -c -Mre=debug $tmpf 2>&1 |")
|
||||
or die "cannot run $perl: ".exit_status_str($?,$!);
|
||||
open (IN, "$self->{perl} -c -Mre=debug $self->{tmpf} 2>&1 |")
|
||||
or die "cannot run $self->{perl}: ".exit_status_str($?,$!);
|
||||
|
||||
my($inbuf,$nread,$fullstr); $fullstr = '';
|
||||
while ( $nread=read(IN,$inbuf,16384) ) { $fullstr .= $inbuf }
|
||||
defined $nread or die "error reading from pipe: $!";
|
||||
|
||||
unlink $tmpf or die "cannot unlink $tmpf: $!";
|
||||
close IN or die "error closing pipe: $!";
|
||||
defined $fullstr or warn "empty result from a pipe";
|
||||
|
||||
@ -1110,7 +1122,7 @@ sub fixup_re {
|
||||
$output .= "\"$esc\"";
|
||||
}
|
||||
}
|
||||
else {
|
||||
elsif ($fixup_re_test) {
|
||||
print "PRE: $pre\nTOK: $tok\n" or die "error writing: $!";
|
||||
}
|
||||
}
|
||||
@ -1132,6 +1144,8 @@ sub fixup_re {
|
||||
$output =~ s/\*\*BACKSLASH\*\*/\\\\/gs;
|
||||
|
||||
if ($fixup_re_test) { print "OUTPUT: $output\n" or die "error writing: $!" }
|
||||
|
||||
utf8::encode($output) if utf8::is_utf8($output); # force octets
|
||||
return $output;
|
||||
}
|
||||
|
||||
|
File diff suppressed because it is too large
Load Diff
File diff suppressed because it is too large
Load Diff
@ -30,6 +30,12 @@ Taking into account signatures from any signing domains:
|
||||
full DKIM_VALID_AU eval:check_dkim_valid_author_sig()
|
||||
full DKIM_VALID_EF eval:check_dkim_valid_envelopefrom()
|
||||
|
||||
Taking into account ARC signatures (Authenticated Received Chain, RFC 8617)
|
||||
from any signing domains:
|
||||
|
||||
full ARC_SIGNED eval:check_arc_signed()
|
||||
full ARC_VALID eval:check_arc_valid()
|
||||
|
||||
Taking into account signatures from specified signing domains only:
|
||||
(quotes may be omitted on domain names consisting only of letters, digits,
|
||||
dots, and minus characters)
|
||||
@ -127,6 +133,7 @@ package Mail::SpamAssassin::Plugin::DKIM;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Timeout;
|
||||
use Mail::SpamAssassin::Util qw(idn_to_ascii);
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
@ -145,19 +152,23 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# signatures
|
||||
$self->register_eval_rule("check_dkim_signed");
|
||||
$self->register_eval_rule("check_dkim_valid");
|
||||
$self->register_eval_rule("check_dkim_valid_author_sig");
|
||||
$self->register_eval_rule("check_dkim_testing");
|
||||
$self->register_eval_rule("check_dkim_valid_envelopefrom");
|
||||
$self->register_eval_rule("check_dkim_signed", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
$self->register_eval_rule("check_arc_signed", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
$self->register_eval_rule("check_dkim_valid", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
$self->register_eval_rule("check_arc_valid", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
$self->register_eval_rule("check_dkim_valid_author_sig", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
$self->register_eval_rule("check_dkim_testing", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_dkim_valid_envelopefrom", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
|
||||
# author domain signing practices
|
||||
$self->register_eval_rule("check_dkim_adsp");
|
||||
$self->register_eval_rule("check_dkim_dependable");
|
||||
$self->register_eval_rule("check_dkim_adsp", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_dkim_dependable", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
|
||||
# whitelisting
|
||||
$self->register_eval_rule("check_for_dkim_whitelist_from");
|
||||
$self->register_eval_rule("check_for_def_dkim_whitelist_from");
|
||||
# welcomelisting
|
||||
$self->register_eval_rule("check_for_dkim_welcomelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_dkim_whitelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS); #Stub - Remove in SA 4.1
|
||||
$self->register_eval_rule("check_for_def_dkim_welcomelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_def_dkim_whitelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS); #Stub - Remove in SA 4.1
|
||||
|
||||
# old names (aliases) for compatibility
|
||||
$self->register_eval_rule("check_dkim_verified"); # = check_dkim_valid
|
||||
@ -179,17 +190,19 @@ sub set_config {
|
||||
|
||||
=over 4
|
||||
|
||||
=item whitelist_from_dkim author@example.com [signing-domain]
|
||||
=item welcomelist_from_dkim author@example.com [signing-domain]
|
||||
|
||||
Works similarly to whitelist_from, except that in addition to matching
|
||||
Previously whitelist_from_dkim which will work interchangeably until 4.1.
|
||||
|
||||
Works similarly to welcomelist_from, except that in addition to matching
|
||||
an author address (From) to the pattern in the first parameter, the message
|
||||
must also carry a valid Domain Keys Identified Mail (DKIM) signature made by
|
||||
a signing domain (SDID, i.e. the d= tag) that is acceptable to us.
|
||||
|
||||
Only one whitelist entry is allowed per line, as in C<whitelist_from_rcvd>.
|
||||
Multiple C<whitelist_from_dkim> lines are allowed. File-glob style characters
|
||||
Only one welcomelist entry is allowed per line, as in C<welcomelist_from_rcvd>.
|
||||
Multiple C<welcomelist_from_dkim> lines are allowed. File-glob style characters
|
||||
are allowed for the From address (the first parameter), just like with
|
||||
C<whitelist_from_rcvd>.
|
||||
C<welcomelist_from_rcvd>.
|
||||
|
||||
The second parameter (the signing-domain) does not accept full file-glob style
|
||||
wildcards, although a simple '*.' (or just a '.') prefix to a domain name
|
||||
@ -201,39 +214,43 @@ will be an Author Domain Signature (sometimes called first-party signature)
|
||||
which is a signature where the signing domain (SDID) of a signature matches
|
||||
the domain of the author's address (i.e. the address in a From header field).
|
||||
|
||||
Since this whitelist requires a DKIM check to be made, network tests must
|
||||
Since this welcomelist requires a DKIM check to be made, network tests must
|
||||
be enabled.
|
||||
|
||||
Examples of whitelisting based on an author domain signature (first-party):
|
||||
Examples of welcomelisting based on an author domain signature (first-party):
|
||||
|
||||
whitelist_from_dkim joe@example.com
|
||||
whitelist_from_dkim *@corp.example.com
|
||||
whitelist_from_dkim *@*.example.com
|
||||
welcomelist_from_dkim joe@example.com
|
||||
welcomelist_from_dkim *@corp.example.com
|
||||
welcomelist_from_dkim *@*.example.com
|
||||
|
||||
Examples of whitelisting based on third-party signatures:
|
||||
Examples of welcomelisting based on third-party signatures:
|
||||
|
||||
whitelist_from_dkim jane@example.net example.org
|
||||
whitelist_from_dkim rick@info.example.net example.net
|
||||
whitelist_from_dkim *@info.example.net example.net
|
||||
whitelist_from_dkim *@* mail7.remailer.example.com
|
||||
whitelist_from_dkim *@* *.remailer.example.com
|
||||
welcomelist_from_dkim jane@example.net example.org
|
||||
welcomelist_from_dkim rick@info.example.net example.net
|
||||
welcomelist_from_dkim *@info.example.net example.net
|
||||
welcomelist_from_dkim *@* mail7.remailer.example.com
|
||||
welcomelist_from_dkim *@* *.remailer.example.com
|
||||
|
||||
=item def_whitelist_from_dkim author@example.com [signing-domain]
|
||||
=item def_welcomelist_from_dkim author@example.com [signing-domain]
|
||||
|
||||
Same as C<whitelist_from_dkim>, but used for the default whitelist entries
|
||||
in the SpamAssassin distribution. The whitelist score is lower, because
|
||||
Previously def_whitelist_from_dkim which will work interchangeably until 4.1.
|
||||
|
||||
Same as C<welcomelist_from_dkim>, but used for the default welcomelist entries
|
||||
in the SpamAssassin distribution. The welcomelist score is lower, because
|
||||
these are often targets for abuse of public mailers which sign their mail.
|
||||
|
||||
=item unwhitelist_from_dkim author@example.com [signing-domain]
|
||||
=item unwelcomelist_from_dkim author@example.com [signing-domain]
|
||||
|
||||
Previously unwhitelist_from_dkim which will work interchangeably until 4.1.
|
||||
|
||||
Removes an email address with its corresponding signing-domain field
|
||||
from def_whitelist_from_dkim and whitelist_from_dkim tables, if it exists.
|
||||
Parameters to unwhitelist_from_dkim must exactly match the parameters of
|
||||
a corresponding whitelist_from_dkim or def_whitelist_from_dkim config
|
||||
from def_welcomelist_from_dkim and welcomelist_from_dkim tables, if it exists.
|
||||
Parameters to unwelcomelist_from_dkim must exactly match the parameters of
|
||||
a corresponding welcomelist_from_dkim or def_welcomelist_from_dkim config
|
||||
option which created the entry, for it to be removed (a domain name is
|
||||
matched case-insensitively); i.e. if a signing-domain parameter was
|
||||
specified in a whitelisting command, it must also be specified in the
|
||||
unwhitelisting command.
|
||||
specified in a welcomelisting command, it must also be specified in the
|
||||
unwelcomelisting command.
|
||||
|
||||
Useful for removing undesired default entries from a distributed configuration
|
||||
by a local or site-specific configuration or by C<user_prefs>.
|
||||
@ -375,7 +392,7 @@ Example:
|
||||
=item dkim_minimum_key_bits n (default: 1024)
|
||||
|
||||
The smallest size of a signing key (in bits) for a valid signature to be
|
||||
considered for whitelisting. Additionally, the eval function check_dkim_valid()
|
||||
considered for welcomelisting. Additionally, the eval function check_dkim_valid()
|
||||
will return false on short keys when called with explicitly listed domains,
|
||||
and the eval function check_dkim_valid_author_sig() will return false on short
|
||||
keys (regardless of its arguments). Setting the option to 0 disables a key
|
||||
@ -389,11 +406,13 @@ prepend its own signature on a copy of some third party mail and re-send it,
|
||||
which makes it no more trustworthy than without such signature. This is also
|
||||
a reason for a rule DKIM_VALID to have a near-zero score, i.e. a rule hit
|
||||
is only informational.
|
||||
This option is evaluated on ARC signatures checks as well.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'whitelist_from_dkim',
|
||||
setting => 'welcomelist_from_dkim',
|
||||
aliases => ['whitelist_from_dkim'], # removed in 4.1
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
@ -407,13 +426,14 @@ is only informational.
|
||||
my $address = $1;
|
||||
my $sdid = defined $2 ? $2 : ''; # empty implies author domain signature
|
||||
$address =~ s/(\@[^@]*)\z/lc($1)/e; # lowercase the email address domain
|
||||
$self->{parser}->add_to_addrlist_dkim('whitelist_from_dkim',
|
||||
$self->{parser}->add_to_addrlist_dkim('welcomelist_from_dkim',
|
||||
$address, lc $sdid);
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'def_whitelist_from_dkim',
|
||||
setting => 'def_welcomelist_from_dkim',
|
||||
aliases => ['def_whitelist_from_dkim'], # removed in 4.1
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
@ -427,13 +447,14 @@ is only informational.
|
||||
my $address = $1;
|
||||
my $sdid = defined $2 ? $2 : ''; # empty implies author domain signature
|
||||
$address =~ s/(\@[^@]*)\z/lc($1)/e; # lowercase the email address domain
|
||||
$self->{parser}->add_to_addrlist_dkim('def_whitelist_from_dkim',
|
||||
$self->{parser}->add_to_addrlist_dkim('def_welcomelist_from_dkim',
|
||||
$address, lc $sdid);
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'unwhitelist_from_dkim',
|
||||
setting => 'unwelcomelist_from_dkim',
|
||||
aliases => ['unwhitelist_from_dkim'], # removed in 4.1
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
@ -447,9 +468,9 @@ is only informational.
|
||||
my $address = $1;
|
||||
my $sdid = defined $2 ? $2 : ''; # empty implies author domain signature
|
||||
$address =~ s/(\@[^@]*)\z/lc($1)/e; # lowercase the email address domain
|
||||
$self->{parser}->remove_from_addrlist_dkim('whitelist_from_dkim',
|
||||
$self->{parser}->remove_from_addrlist_dkim('welcomelist_from_dkim',
|
||||
$address, lc $sdid);
|
||||
$self->{parser}->remove_from_addrlist_dkim('def_whitelist_from_dkim',
|
||||
$self->{parser}->remove_from_addrlist_dkim('def_welcomelist_from_dkim',
|
||||
$address, lc $sdid);
|
||||
}
|
||||
});
|
||||
@ -480,7 +501,7 @@ is only informational.
|
||||
}
|
||||
});
|
||||
|
||||
# minimal signing key size in bits that is acceptable for whitelisting
|
||||
# minimal signing key size in bits that is acceptable for welcomelisting
|
||||
push (@cmds, {
|
||||
setting => 'dkim_minimum_key_bits',
|
||||
default => 1024,
|
||||
@ -530,6 +551,18 @@ sub check_dkim_signed {
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub check_arc_signed {
|
||||
my ($self, $pms, $full_ref, @acceptable_domains) = @_;
|
||||
$self->_check_dkim_signature($pms) if !$pms->{arc_checked_signature};
|
||||
my $result = 0;
|
||||
if (!$pms->{arc_signed}) {
|
||||
# don't bother
|
||||
} elsif (!@acceptable_domains) {
|
||||
$result = 1; # no additional constraints, any signing domain will do
|
||||
}
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub check_dkim_valid {
|
||||
my ($self, $pms, $full_ref, @acceptable_domains) = @_;
|
||||
$self->_check_dkim_signature($pms) if !$pms->{dkim_checked_signature};
|
||||
@ -545,6 +578,19 @@ sub check_dkim_valid {
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub check_arc_valid {
|
||||
my ($self, $pms, $full_ref, @acceptable_domains) = @_;
|
||||
$self->_check_dkim_signature($pms) if !$pms->{arc_checked_signature};
|
||||
my $result = 0;
|
||||
if (!$pms->{arc_valid}) {
|
||||
# don't bother
|
||||
} elsif (!@acceptable_domains) {
|
||||
$result = 1; # no additional constraints, any signing domain will do,
|
||||
# also any signing key size will do
|
||||
}
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub check_dkim_valid_author_sig {
|
||||
my ($self, $pms, $full_ref, @acceptable_domains) = @_;
|
||||
$self->_check_dkim_signature($pms) if !$pms->{dkim_checked_signature};
|
||||
@ -560,9 +606,10 @@ sub check_dkim_valid_author_sig {
|
||||
sub check_dkim_valid_envelopefrom {
|
||||
my ($self, $pms, $full_ref) = @_;
|
||||
my $result = 0;
|
||||
my $envfrom=$self->{'main'}->{'registryboundaries'}->uri_to_domain($pms->get("EnvelopeFrom"));
|
||||
my ($envfrom) = ($pms->get('EnvelopeFrom:addr')||'') =~ /\@(\S+)/;
|
||||
# if no envelopeFrom, it cannot be valid
|
||||
return $result if !$envfrom;
|
||||
return $result if !defined $envfrom;
|
||||
$envfrom = lc $envfrom;
|
||||
$self->_check_dkim_signature($pms) if !$pms->{dkim_checked_signature};
|
||||
if (!$pms->{dkim_valid}) {
|
||||
# don't bother
|
||||
@ -646,19 +693,21 @@ sub check_dkim_testing {
|
||||
return $result;
|
||||
}
|
||||
|
||||
sub check_for_dkim_whitelist_from {
|
||||
sub check_for_dkim_welcomelist_from {
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_dkim_whitelist($pms) if !$pms->{whitelist_checked};
|
||||
return $pms->{dkim_match_in_whitelist_from_dkim} ||
|
||||
$pms->{dkim_match_in_whitelist_auth};
|
||||
$self->_check_dkim_welcomelist($pms) if !$pms->{welcomelist_checked};
|
||||
return ($pms->{dkim_match_in_welcomelist_from_dkim} ||
|
||||
$pms->{dkim_match_in_welcomelist_auth}) ? 1 : 0;
|
||||
}
|
||||
*check_for_dkim_whitelist_from = \&check_for_dkim_welcomelist_from; # removed in 4.1
|
||||
|
||||
sub check_for_def_dkim_whitelist_from {
|
||||
sub check_for_def_dkim_welcomelist_from {
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_dkim_whitelist($pms) if !$pms->{whitelist_checked};
|
||||
return $pms->{dkim_match_in_def_whitelist_from_dkim} ||
|
||||
$pms->{dkim_match_in_def_whitelist_auth};
|
||||
$self->_check_dkim_welcomelist($pms) if !$pms->{welcomelist_checked};
|
||||
return ($pms->{dkim_match_in_def_welcomelist_from_dkim} ||
|
||||
$pms->{dkim_match_in_def_welcomelist_auth}) ? 1 : 0;
|
||||
}
|
||||
*check_for_def_dkim_whitelist_from = \&check_for_def_dkim_welcomelist_from; # removed in 4.1
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@ -667,8 +716,7 @@ sub _dkim_load_modules {
|
||||
|
||||
if (!$self->{tried_loading}) {
|
||||
$self->{service_available} = 0;
|
||||
my $timemethod = $self->{main}->UNIVERSAL::can("time_method") &&
|
||||
$self->{main}->time_method("dkim_load_modules");
|
||||
my $timemethod = $self->{main}->time_method("dkim_load_modules");
|
||||
my $eval_stat;
|
||||
eval {
|
||||
# Have to do this so that RPM doesn't find these as required perl modules.
|
||||
@ -686,6 +734,9 @@ sub _dkim_load_modules {
|
||||
my $version = Mail::DKIM::Verifier->VERSION;
|
||||
if (version->parse($version) >= version->parse(0.31)) {
|
||||
dbg("dkim: using Mail::DKIM version $version");
|
||||
} elsif (version->parse($version) < version->parse(0.50)) {
|
||||
dbg("dkim: Mail::DKIM $version is older than 0.50 ".
|
||||
"ARC support will not be available, suggested upgrade to 0.50 or later!");
|
||||
} else {
|
||||
info("dkim: Mail::DKIM $version is older than the required ".
|
||||
"minimal version 0.31, suggested upgrade to 0.37 or later!");
|
||||
@ -698,6 +749,18 @@ sub _dkim_load_modules {
|
||||
eval { require Mail::DKIM::DkimPolicy } # ignoring status
|
||||
}
|
||||
}
|
||||
eval {
|
||||
# Have to do this so that RPM doesn't find these as required perl modules.
|
||||
{ require Mail::DKIM::ARC::Verifier }
|
||||
$self->{arc_available} = 1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
if (defined $eval_stat) {
|
||||
dbg("dkim: cannot load Mail::DKIM::ARC module, DKIM::ARC checks disabled: %s",
|
||||
$eval_stat);
|
||||
}
|
||||
$self->{arc_available} = 0;
|
||||
};
|
||||
}
|
||||
return $self->{service_available};
|
||||
}
|
||||
@ -720,8 +783,8 @@ sub _check_dkim_signed_by {
|
||||
next if $minimum_key_bits && $sig->{_spamassassin_key_size} &&
|
||||
$sig->{_spamassassin_key_size} < $minimum_key_bits;
|
||||
}
|
||||
my $sdid = $sig->domain;
|
||||
next if !defined $sdid; # a signature with a missing required tag 'd' ?
|
||||
my ($sdid) = (defined $sig->identity)? $sig->identity =~ /\@(\S+)/ : ($sig->domain);
|
||||
next if !defined $sdid; # a signature with a missing required tag 'd' or 'i' ?
|
||||
$sdid = lc $sdid;
|
||||
if ($must_be_author_domain_signature) {
|
||||
next if !$pms->{dkim_author_domains}->{$sdid};
|
||||
@ -765,9 +828,10 @@ sub _check_dkim_signature {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $conf = $pms->{conf};
|
||||
my($verifier, @signatures, @valid_signatures);
|
||||
my($verifier, $arc_verifier, @signatures, @arc_signatures, @valid_signatures, @arc_valid_signatures);
|
||||
|
||||
$pms->{dkim_checked_signature} = 1; # has this sub already been invoked?
|
||||
$pms->{arc_checked_signature} = 1; # has this sub already been invoked?
|
||||
$pms->{dkim_signatures_ready} = 0; # have we obtained & verified signatures?
|
||||
$pms->{dkim_signatures_dependable} = 0;
|
||||
# dkim_signatures_dependable =
|
||||
@ -776,16 +840,18 @@ sub _check_dkim_signature {
|
||||
# (no signatures, or message was not truncated) )
|
||||
$pms->{dkim_signatures} = \@signatures;
|
||||
$pms->{dkim_valid_signatures} = \@valid_signatures;
|
||||
$pms->{arc_signatures} = \@arc_signatures;
|
||||
$pms->{arc_valid_signatures} = \@arc_valid_signatures;
|
||||
$pms->{dkim_signed} = 0;
|
||||
$pms->{arc_signed} = 0;
|
||||
$pms->{dkim_valid} = 0;
|
||||
$pms->{arc_valid} = 0;
|
||||
$pms->{dkim_key_testing} = 0;
|
||||
# the following hashes are keyed by a signing domain (SDID):
|
||||
$pms->{dkim_author_sig_tempfailed} = {}; # DNS timeout verifying author sign.
|
||||
$pms->{dkim_has_valid_author_sig} = {}; # a valid author domain signature
|
||||
$pms->{dkim_has_any_author_sig} = {}; # valid or invalid author domain sign.
|
||||
|
||||
$self->_get_authors($pms) if !$pms->{dkim_author_addresses};
|
||||
|
||||
my $suppl_attrib = $pms->{msg}->{suppl_attrib};
|
||||
if (defined $suppl_attrib && exists $suppl_attrib->{dkim_signatures}) {
|
||||
# caller of SpamAssassin already supplied DKIM signature objects
|
||||
@ -793,48 +859,88 @@ sub _check_dkim_signature {
|
||||
@signatures = @$provided_signatures if ref $provided_signatures;
|
||||
$pms->{dkim_signatures_ready} = 1;
|
||||
$pms->{dkim_signatures_dependable} = 1;
|
||||
dbg("dkim: signatures provided by the caller, %d signatures",
|
||||
dbg("dkim: DKIM signatures provided by the caller, %d signatures",
|
||||
scalar(@signatures));
|
||||
}
|
||||
if (defined $suppl_attrib && exists $suppl_attrib->{arc_signatures}) {
|
||||
# caller of SpamAssassin already supplied ARC signature objects
|
||||
my $provided_arc_signatures = $suppl_attrib->{arc_signatures};
|
||||
@arc_signatures = @$provided_arc_signatures if ref $provided_arc_signatures;
|
||||
$pms->{arc_signatures_ready} = 1;
|
||||
$pms->{arc_signatures_dependable} = 1;
|
||||
dbg("dkim: ARC signatures provided by the caller, %d signatures",
|
||||
scalar(@arc_signatures));
|
||||
}
|
||||
|
||||
if ($pms->{dkim_signatures_ready}) {
|
||||
if ($pms->{dkim_signatures_ready} or $pms->{arc_signatures_ready}) {
|
||||
# signatures already available and verified
|
||||
_check_valid_signature($self, $pms, $verifier, 'DKIM', \@signatures) if $self->{service_available};
|
||||
_check_valid_signature($self, $pms, $arc_verifier, 'ARC', \@arc_signatures) if $self->{arc_available};
|
||||
} elsif (!$pms->is_dns_available()) {
|
||||
dbg("dkim: signature verification disabled, DNS resolving not available");
|
||||
} elsif (!$self->_dkim_load_modules()) {
|
||||
# Mail::DKIM module not available
|
||||
} else {
|
||||
# signature objects not provided by the caller, must verify for ourselves
|
||||
my $timemethod = $self->{main}->UNIVERSAL::can("time_method") &&
|
||||
$self->{main}->time_method("check_dkim_signature");
|
||||
use version 0.77;
|
||||
if (version->parse(Mail::DKIM::Verifier->VERSION) >= version->parse(0.40)) {
|
||||
my $timemethod = $self->{main}->time_method("check_dkim_signature");
|
||||
if (Mail::DKIM::Verifier->VERSION >= 0.40) {
|
||||
my $edns = $conf->{dns_options}->{edns};
|
||||
if ($edns && $edns >= 1024) {
|
||||
# Let Mail::DKIM use our interface to Net::DNS::Resolver.
|
||||
# Only do so if EDNS0 provides a reasonably-sized UDP payload size,
|
||||
# as our interface does not provide a DNS fallback to TCP, unlike
|
||||
# the Net::DNS::Resolver::send which does provide it.
|
||||
# See also Bug 7265 regarding a choice of a resolver.
|
||||
# my $res = $self->{main}->{resolver}->get_resolver;
|
||||
my $res = $self->{main}->{resolver};
|
||||
dbg("dkim: providing our own resolver: %s", ref $res);
|
||||
Mail::DKIM::DNS::resolver($res);
|
||||
}
|
||||
}
|
||||
$verifier = Mail::DKIM::Verifier->new;
|
||||
if (!$verifier) {
|
||||
dbg("dkim: cannot create Mail::DKIM::Verifier object");
|
||||
return;
|
||||
$verifier = Mail::DKIM::Verifier->new if $self->{service_available};
|
||||
_check_signature($self, $pms, $verifier, 'DKIM', \@signatures) if $self->{service_available};
|
||||
$arc_verifier = Mail::DKIM::ARC::Verifier->new if $self->{arc_available};
|
||||
_check_signature($self, $pms, $arc_verifier, 'ARC', \@arc_signatures) if $self->{arc_available};
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_signature {
|
||||
my($self, $pms, $verifier, $type, $signatures) = @_;
|
||||
|
||||
my $sig_type = lc $type;
|
||||
$self->_get_authors($pms) if !$pms->{"${sig_type}_author_addresses"};
|
||||
|
||||
my(@valid_signatures);
|
||||
my $conf = $pms->{conf};
|
||||
if (!$verifier) {
|
||||
if ($type eq 'DKIM') {
|
||||
dbg("dkim: cannot create Mail::DKIM::Verifier object");
|
||||
} elsif ($type eq 'ARC') {
|
||||
dbg("dkim: cannot create Mail::DKIM::ARC::Verifier object");
|
||||
}
|
||||
return;
|
||||
} else {
|
||||
if ($type eq 'DKIM') {
|
||||
$pms->{dkim_verifier} = $verifier;
|
||||
#
|
||||
} elsif ($type eq 'ARC') {
|
||||
$pms->{arc_verifier} = $verifier;
|
||||
}
|
||||
}
|
||||
# feed content of a message into verifier, using \r\n endings,
|
||||
# required by Mail::DKIM API (see bug 5300)
|
||||
# note: bug 5179 comment 28: perl does silly things on non-Unix platforms
|
||||
# unless we use \015\012 instead of \r\n
|
||||
eval {
|
||||
my $str = $pms->{msg}->get_pristine();
|
||||
$str =~ s/\r?\n/\015\012/sg; # ensure \015\012 ending
|
||||
if ($pms->{msg}->{line_ending} eq "\015\012") {
|
||||
# message already CRLF, just feed it
|
||||
$verifier->PRINT($str);
|
||||
} else {
|
||||
# feeding large chunk to Mail::DKIM is _much_ faster than line-by-line
|
||||
$str =~ s/\012/\015\012/gs; # LF -> CRLF
|
||||
$verifier->PRINT($str);
|
||||
undef $str;
|
||||
}
|
||||
1;
|
||||
} or do { # intercept die() exceptions and render safe
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
@ -847,7 +953,7 @@ sub _check_dkim_signature {
|
||||
{ secs => $timeout, deadline => $pms->{master_deadline} });
|
||||
|
||||
my $err = $timer->run_and_catch(sub {
|
||||
dbg("dkim: performing public key lookup and signature verification");
|
||||
dbg("dkim: performing public $type key lookup and signature verification");
|
||||
$verifier->CLOSE(); # the action happens here
|
||||
|
||||
# currently SpamAssassin's parsing is better than Mail::Address parsing,
|
||||
@ -856,8 +962,17 @@ sub _check_dkim_signature {
|
||||
|
||||
# versions before 0.29 only provided a public interface to fetch one
|
||||
# signature, newer versions allow access to all signatures of a message
|
||||
@signatures = $verifier->UNIVERSAL::can("signatures") ?
|
||||
@$signatures = $verifier->UNIVERSAL::can("signatures") ?
|
||||
$verifier->signatures : $verifier->signature;
|
||||
if (would_log("dbg","dkim")) {
|
||||
foreach my $signature (@$signatures) {
|
||||
dbg("dkim: $type signature i=%s d=%s",
|
||||
map(!defined $_ ? '(undef)' : $_,
|
||||
$signature->identity, $signature->domain
|
||||
)
|
||||
);
|
||||
}
|
||||
}
|
||||
});
|
||||
if ($timer->timed_out()) {
|
||||
dbg("dkim: public key lookup or verification timed out after %s s",
|
||||
@ -867,23 +982,43 @@ sub _check_dkim_signature {
|
||||
|
||||
} elsif ($err) {
|
||||
chomp $err;
|
||||
dbg("dkim: public key lookup or verification failed: $err");
|
||||
dbg("dkim: $type public key lookup or verification failed: $err");
|
||||
}
|
||||
if ($type eq 'DKIM') {
|
||||
$pms->{dkim_signatures_ready} = 1;
|
||||
if (!@signatures || !$pms->{tests_already_hit}->{'__TRUNCATED'}) {
|
||||
if (!@$signatures || !$pms->{tests_already_hit}->{'__TRUNCATED'}) {
|
||||
$pms->{dkim_signatures_dependable} = 1;
|
||||
}
|
||||
_check_valid_signature($self, $pms, $verifier, 'DKIM', \@$signatures) if $self->{service_available};
|
||||
} elsif ($type eq 'ARC') {
|
||||
$pms->{arc_signatures_ready} = 1;
|
||||
if (!@$signatures || !$pms->{tests_already_hit}->{'__TRUNCATED'}) {
|
||||
$pms->{arc_signatures_dependable} = 1;
|
||||
}
|
||||
_check_valid_signature($self, $pms, $verifier, 'ARC', \@$signatures) if $self->{arc_available};
|
||||
}
|
||||
}
|
||||
|
||||
if ($pms->{dkim_signatures_ready}) {
|
||||
sub _check_valid_signature {
|
||||
my($self, $pms, $verifier, $type, $signatures) = @_;
|
||||
|
||||
my $sig_type = lc $type;
|
||||
$self->_get_authors($pms) if !$pms->{"${sig_type}_author_addresses"};
|
||||
|
||||
my(@valid_signatures);
|
||||
my $conf = $pms->{conf};
|
||||
# DKIM signatures check
|
||||
if ($pms->{"${sig_type}_signatures_ready"}) {
|
||||
my $sig_result_supported;
|
||||
# dkim_minimum_key_bits is evaluated for ARC signatures as well
|
||||
my $minimum_key_bits = $conf->{dkim_minimum_key_bits};
|
||||
foreach my $signature (@signatures) {
|
||||
foreach my $signature (@$signatures) {
|
||||
# old versions of Mail::DKIM would give undef for an invalid signature
|
||||
next if !defined $signature;
|
||||
next if !$signature->selector; # empty selector
|
||||
|
||||
$sig_result_supported = $signature->UNIVERSAL::can("result_detail");
|
||||
# test for empty selector (must not treat a selector "0" as missing!)
|
||||
next if !defined $signature->selector || $signature->selector eq "";
|
||||
|
||||
my($info, $valid, $expired);
|
||||
$valid =
|
||||
($sig_result_supported ? $signature : $verifier)->result eq 'pass';
|
||||
@ -904,23 +1039,24 @@ sub _check_dkim_signature {
|
||||
push(@valid_signatures, $signature) if $valid && !$expired;
|
||||
|
||||
# check if we have a potential Author Domain Signature, valid or not
|
||||
my $d = $signature->domain;
|
||||
my ($d) = (defined $signature->identity)? $signature->identity =~ /\@(\S+)/ : ($signature->domain);
|
||||
if (!defined $d) {
|
||||
# can be undefined on a broken signature with missing required tags
|
||||
} else {
|
||||
$d = lc $d;
|
||||
if ($pms->{dkim_author_domains}->{$d}) { # SDID matches author domain
|
||||
$pms->{dkim_has_any_author_sig}->{$d} = 1;
|
||||
if ($pms->{"${sig_type}_author_domains"}->{$d}) { # SDID matches author domain
|
||||
$pms->{"${sig_type}_has_any_author_sig"}->{$d} = 1;
|
||||
if ($valid && !$expired &&
|
||||
$key_size && $key_size >= $minimum_key_bits) {
|
||||
$pms->{dkim_has_valid_author_sig}->{$d} = 1;
|
||||
$pms->{"${sig_type}_has_valid_author_sig"}->{$d} = 1;
|
||||
} elsif ( ($sig_result_supported ? $signature
|
||||
: $verifier)->result_detail
|
||||
=~ /\b(?:timed out|SERVFAIL)\b/i) {
|
||||
$pms->{dkim_author_sig_tempfailed}->{$d} = 1;
|
||||
$pms->{"${sig_type}_author_sig_tempfailed"}->{$d} = 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
if ($type eq 'DKIM') {
|
||||
if (would_log("dbg","dkim")) {
|
||||
dbg("dkim: %s %s, i=%s, d=%s, s=%s, a=%s, c=%s, %s, %s, %s",
|
||||
$info,
|
||||
@ -935,14 +1071,28 @@ sub _check_dkim_signature {
|
||||
: 'does not match author domain',
|
||||
);
|
||||
}
|
||||
} elsif ($type eq 'ARC') {
|
||||
if (would_log("dbg","dkim")) {
|
||||
dbg("dkim: %s %s, i=%s, d=%s, s=%s, a=%s, c=%s, %s, %s, %s",
|
||||
$info,
|
||||
$type,
|
||||
map(!defined $_ ? '(undef)' : $_,
|
||||
$signature->identity, $d, $signature->selector,
|
||||
$signature->algorithm, scalar($signature->canonicalization),
|
||||
$key_size ? "key_bits=$key_size" : "unknown key size",
|
||||
($sig_result_supported ? $signature : $verifier)->result ),
|
||||
defined $d && $pms->{arc_author_domains}->{$d}
|
||||
? 'matches author domain'
|
||||
: 'does not match author domain',
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (@valid_signatures) {
|
||||
if ($type eq 'DKIM') {
|
||||
$pms->{dkim_signed} = 1;
|
||||
$pms->{dkim_valid} = 1;
|
||||
# let the result stand out more clearly in the log, use uppercase
|
||||
my $sig = $valid_signatures[0];
|
||||
my $sig_res = ($sig_result_supported ? $sig : $verifier)->result_detail;
|
||||
dbg("dkim: signature verification result: %s", uc($sig_res));
|
||||
|
||||
# supply values for both tags
|
||||
my(%seen1, %seen2, %seen3, @identity_list, @domain_list, @selector_list);
|
||||
@ -958,14 +1108,27 @@ sub _check_dkim_signature {
|
||||
@domain_list == 1 ? $domain_list[0] : \@domain_list);
|
||||
$pms->set_tag('DKIMSELECTOR',
|
||||
@selector_list == 1 ? $selector_list[0] : \@selector_list);
|
||||
} elsif (@signatures) {
|
||||
} elsif ($type eq 'ARC') {
|
||||
$pms->{arc_signed} = 1;
|
||||
$pms->{arc_valid} = 1;
|
||||
}
|
||||
# let the result stand out more clearly in the log, use uppercase
|
||||
my $sig = $valid_signatures[0];
|
||||
my $sig_res = ($sig_result_supported ? $sig : $verifier)->result_detail;
|
||||
dbg("dkim: $type signature verification result: %s", uc($sig_res));
|
||||
|
||||
} elsif (@$signatures) {
|
||||
if ($type eq 'DKIM') {
|
||||
$pms->{dkim_signed} = 1;
|
||||
my $sig = $signatures[0];
|
||||
my $sig_res =
|
||||
($sig_result_supported && $sig ? $sig : $verifier)->result_detail;
|
||||
dbg("dkim: signature verification result: %s", uc($sig_res));
|
||||
} elsif ($type eq 'ARC') {
|
||||
$pms->{arc_signed} = 1;
|
||||
}
|
||||
my $sig = @$signatures[0];
|
||||
my $sig_res = ($sig_result_supported ? $sig : $verifier)->result_detail;
|
||||
dbg("dkim: $type signature verification result: %s", uc($sig_res));
|
||||
|
||||
} else {
|
||||
dbg("dkim: signature verification result: none");
|
||||
dbg("dkim: $type signature verification result: none");
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -1064,8 +1227,7 @@ sub _check_dkim_adsp {
|
||||
dbg("dkim: adsp not retrieved, module Mail::DKIM not available");
|
||||
|
||||
} else { # do the ADSP DNS lookup
|
||||
my $timemethod = $self->{main}->UNIVERSAL::can("time_method") &&
|
||||
$self->{main}->time_method("check_dkim_adsp");
|
||||
my $timemethod = $self->{main}->time_method("check_dkim_adsp");
|
||||
|
||||
my $practices; # author domain signing practices object
|
||||
my $timeout = $pms->{conf}->{dkim_timeout};
|
||||
@ -1074,12 +1236,13 @@ sub _check_dkim_adsp {
|
||||
my $err = $timer->run_and_catch(sub {
|
||||
eval {
|
||||
if (Mail::DKIM::AuthorDomainPolicy->UNIVERSAL::can("fetch")) {
|
||||
my $author_domain_ace = idn_to_ascii($author_domain);
|
||||
dbg("dkim: adsp: performing lookup on _adsp._domainkey.%s",
|
||||
$author_domain);
|
||||
$author_domain_ace);
|
||||
# get our Net::DNS::Resolver object
|
||||
my $res = $self->{main}->{resolver}->get_resolver;
|
||||
$practices = Mail::DKIM::AuthorDomainPolicy->fetch(
|
||||
Protocol => "dns", Domain => $author_domain,
|
||||
Protocol => "dns", Domain => $author_domain_ace,
|
||||
DnsResolver => $res);
|
||||
}
|
||||
1;
|
||||
@ -1128,36 +1291,36 @@ sub _check_dkim_adsp {
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_dkim_whitelist {
|
||||
sub _check_dkim_welcomelist {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
$pms->{whitelist_checked} = 1;
|
||||
$pms->{welcomelist_checked} = 1;
|
||||
|
||||
$self->_get_authors($pms) if !$pms->{dkim_author_addresses};
|
||||
|
||||
my $authors_str = join(", ", @{$pms->{dkim_author_addresses}});
|
||||
if ($authors_str eq '') {
|
||||
dbg("dkim: check_dkim_whitelist: could not find author address");
|
||||
dbg("dkim: check_dkim_weclomelist: could not find author address");
|
||||
return;
|
||||
}
|
||||
|
||||
# collect whitelist entries matching the author from all lists
|
||||
# collect welcomelist entries matching the author from all lists
|
||||
my @acceptable_sdid_tuples;
|
||||
$self->_wlcheck_acceptable_signature($pms, \@acceptable_sdid_tuples,
|
||||
'def_whitelist_from_dkim');
|
||||
'def_welcomelist_from_dkim');
|
||||
$self->_wlcheck_author_signature($pms, \@acceptable_sdid_tuples,
|
||||
'def_whitelist_auth');
|
||||
'def_welcomelist_auth');
|
||||
$self->_wlcheck_acceptable_signature($pms, \@acceptable_sdid_tuples,
|
||||
'whitelist_from_dkim');
|
||||
'welcomelist_from_dkim');
|
||||
$self->_wlcheck_author_signature($pms, \@acceptable_sdid_tuples,
|
||||
'whitelist_auth');
|
||||
'welcomelist_auth');
|
||||
if (!@acceptable_sdid_tuples) {
|
||||
dbg("dkim: no wl entries match author %s, no need to verify sigs",
|
||||
$authors_str);
|
||||
return;
|
||||
}
|
||||
|
||||
# if the message doesn't pass DKIM validation, it can't pass DKIM whitelist
|
||||
# if the message doesn't pass DKIM validation, it can't pass DKIM welcomelist
|
||||
|
||||
# trigger a DKIM check;
|
||||
# continue if one or more signatures are valid or we want the debug info
|
||||
@ -1177,38 +1340,38 @@ sub _check_dkim_whitelist {
|
||||
}
|
||||
}
|
||||
if (@valid) {
|
||||
dbg("dkim: author %s, WHITELISTED by %s",
|
||||
dbg("dkim: author %s, WELCOMELISTED by %s",
|
||||
$authors_str, join(", ",@valid));
|
||||
} elsif (@fail) {
|
||||
dbg("dkim: author %s, found in %s BUT IGNORED",
|
||||
$authors_str, join(", ",@fail));
|
||||
} else {
|
||||
dbg("dkim: author %s, not in any dkim whitelist", $authors_str);
|
||||
dbg("dkim: author %s, not in any dkim welcomelist", $authors_str);
|
||||
}
|
||||
}
|
||||
|
||||
# check for verifier-acceptable signatures; an empty (or undefined) signing
|
||||
# domain in a whitelist implies checking for an Author Domain Signature
|
||||
# domain in a welcomelist implies checking for an Author Domain Signature
|
||||
#
|
||||
sub _wlcheck_acceptable_signature {
|
||||
my ($self, $pms, $acceptable_sdid_tuples_ref, $wl) = @_;
|
||||
my $wl_ref = $pms->{conf}->{$wl};
|
||||
foreach my $author (@{$pms->{dkim_author_addresses}}) {
|
||||
foreach my $white_addr (keys %$wl_ref) {
|
||||
my $wl_addr_ref = $wl_ref->{$white_addr};
|
||||
my $re = qr/$wl_addr_ref->{re}/i;
|
||||
# dbg("dkim: WL %s %s, d: %s", $wl, $white_addr,
|
||||
my $author_lc = lc($author);
|
||||
foreach my $welcome_addr (keys %$wl_ref) {
|
||||
my $wl_addr_ref = $wl_ref->{$welcome_addr};
|
||||
# dbg("dkim: WL %s %s, d: %s", $wl, $welcome_addr,
|
||||
# join(", ", map { $_ eq '' ? "''" : $_ } @{$wl_addr_ref->{domain}}));
|
||||
if ($author =~ $re) {
|
||||
if ($author_lc =~ /$wl_addr_ref->{re}/) {
|
||||
foreach my $sdid (@{$wl_addr_ref->{domain}}) {
|
||||
push(@$acceptable_sdid_tuples_ref, [$author,$sdid,$wl,$re]);
|
||||
push(@$acceptable_sdid_tuples_ref, [$author,$sdid,$wl,$welcome_addr]);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# use a traditional whitelist_from -style addrlist, the only acceptable DKIM
|
||||
# use a traditional welcomelist_from -style addrlist, the only acceptable DKIM
|
||||
# signature is an Author Domain Signature. Note: don't pre-parse and store
|
||||
# domains; that's inefficient memory-wise and only saves one m//
|
||||
#
|
||||
@ -1216,11 +1379,11 @@ sub _wlcheck_author_signature {
|
||||
my ($self, $pms, $acceptable_sdid_tuples_ref, $wl) = @_;
|
||||
my $wl_ref = $pms->{conf}->{$wl};
|
||||
foreach my $author (@{$pms->{dkim_author_addresses}}) {
|
||||
foreach my $white_addr (keys %$wl_ref) {
|
||||
my $re = qr/$wl_ref->{$white_addr}/i;
|
||||
# dbg("dkim: WL %s %s", $wl, $white_addr);
|
||||
if ($author =~ $re) {
|
||||
push(@$acceptable_sdid_tuples_ref, [$author,undef,$wl,$re]);
|
||||
my $author_lc = lc($author);
|
||||
foreach my $welcome_addr (keys %$wl_ref) {
|
||||
# dbg("dkim: WL %s %s", $wl, $welcome_addr);
|
||||
if ($author_lc =~ /$wl_ref->{$welcome_addr}/) {
|
||||
push(@$acceptable_sdid_tuples_ref, [$author,undef,$wl,$welcome_addr]);
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -1238,9 +1401,10 @@ sub _wlcheck_list {
|
||||
foreach my $signature (@{$pms->{dkim_signatures}}) {
|
||||
# old versions of Mail::DKIM would give undef for an invalid signature
|
||||
next if !defined $signature;
|
||||
next if !$signature->selector; # empty selector
|
||||
|
||||
my $sig_result_supported = $signature->UNIVERSAL::can("result_detail");
|
||||
# test for empty selector (must not treat a selector "0" as missing!)
|
||||
next if !defined $signature->selector || $signature->selector eq "";
|
||||
|
||||
my($info, $valid, $expired, $key_size_weak);
|
||||
$valid =
|
||||
($sig_result_supported ? $signature : $verifier)->result eq 'pass';
|
||||
@ -1256,13 +1420,13 @@ sub _wlcheck_list {
|
||||
}
|
||||
}
|
||||
|
||||
my $sdid = $signature->domain;
|
||||
my ($sdid) = (defined $signature->identity)? $signature->identity =~ /\@(\S+)/ : ($signature->domain);
|
||||
$sdid = lc $sdid if defined $sdid;
|
||||
|
||||
my %tried_authors;
|
||||
foreach my $entry (@$acceptable_sdid_tuples_ref) {
|
||||
my($author, $acceptable_sdid, $wl, $re) = @$entry;
|
||||
# $re and $wl are here for logging purposes only, $re already checked.
|
||||
my($author, $acceptable_sdid, $wl, $welcome_addr) = @$entry;
|
||||
# $welcome_addr and $wl are here for logging purposes only, already checked.
|
||||
# The $acceptable_sdid is a verifier-acceptable signing domain
|
||||
# identifier (to be matched against a 'd' tag in signatures).
|
||||
# When $acceptable_sdid is undef or an empty string it implies
|
||||
@ -1274,7 +1438,7 @@ sub _wlcheck_list {
|
||||
|
||||
my $matches = 0;
|
||||
if (!defined $sdid) {
|
||||
# don't bother, invalid signature with a missing 'd' tag
|
||||
# don't bother, invalid signature with a missing 'd' or 'i' tag
|
||||
|
||||
} elsif (!defined $acceptable_sdid || $acceptable_sdid eq '') {
|
||||
# An "Author Domain Signature" (sometimes called a first-party
|
||||
@ -1287,7 +1451,7 @@ sub _wlcheck_list {
|
||||
$matches = 1 if $sdid eq $author_domain;
|
||||
|
||||
} else { # checking for verifier-acceptable signature
|
||||
# The second argument to a 'whitelist_from_dkim' option is now (since
|
||||
# The second argument to a 'welcomelist_from_dkim' option is now (since
|
||||
# version 3.3.0) supposed to be a signing domain (SDID), no longer an
|
||||
# identity (AUID). Nevertheless, be prepared to accept the full e-mail
|
||||
# address there for compatibility, and just ignore its local-part.
|
||||
@ -1303,17 +1467,17 @@ sub _wlcheck_list {
|
||||
if (would_log("dbg","dkim")) {
|
||||
if ($sdid eq $author_domain) {
|
||||
dbg("dkim: %s author domain signature by %s, MATCHES %s %s",
|
||||
$info, $sdid, $wl, $re);
|
||||
$info, $sdid, $wl, $welcome_addr);
|
||||
} else {
|
||||
dbg("dkim: %s third-party signature by %s, author domain %s, ".
|
||||
"MATCHES %s %s", $info, $sdid, $author_domain, $wl, $re);
|
||||
"MATCHES %s %s", $info, $sdid, $author_domain, $wl, $welcome_addr);
|
||||
}
|
||||
}
|
||||
# a defined value indicates at least a match, not necessarily valid
|
||||
# (this complication servers to preserve logging compatibility)
|
||||
$any_match_by_wl{$wl} = '' if !exists $any_match_by_wl{$wl};
|
||||
}
|
||||
# only valid signature can cause whitelisting
|
||||
# only valid signature can cause welcomelisting
|
||||
$matches = 0 if !$valid || $expired || $key_size_weak;
|
||||
|
||||
if ($matches) {
|
||||
@ -1328,4 +1492,7 @@ sub _wlcheck_list {
|
||||
return ($any_match_at_all, \%any_match_by_wl);
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_arc { 1 }
|
||||
|
||||
1;
|
||||
|
360
upstream/lib/Mail/SpamAssassin/Plugin/DMARC.pm
Normal file
360
upstream/lib/Mail/SpamAssassin/Plugin/DMARC.pm
Normal file
@ -0,0 +1,360 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
#
|
||||
# Author: Giovanni Bechis <gbechis@apache.org>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::DMARC - check DMARC policy
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::DMARC
|
||||
|
||||
ifplugin Mail::SpamAssassin::Plugin::DMARC
|
||||
header DMARC_PASS eval:check_dmarc_pass()
|
||||
describe DMARC_PASS DMARC pass policy
|
||||
tflags DMARC_PASS net nice
|
||||
score DMARC_PASS -0.001
|
||||
|
||||
header DMARC_REJECT eval:check_dmarc_reject()
|
||||
describe DMARC_REJECT DMARC reject policy
|
||||
tflags DMARC_REJECT net
|
||||
score DMARC_REJECT 0.001
|
||||
|
||||
header DMARC_QUAR eval:check_dmarc_quarantine()
|
||||
describe DMARC_QUAR DMARC quarantine policy
|
||||
tflags DMARC_QUAR net
|
||||
score DMARC_QUAR 0.001
|
||||
|
||||
header DMARC_NONE eval:check_dmarc_none()
|
||||
describe DMARC_NONE DMARC none policy
|
||||
tflags DMARC_NONE net
|
||||
score DMARC_NONE 0.001
|
||||
|
||||
header DMARC_MISSING eval:check_dmarc_missing()
|
||||
describe DMARC_MISSING Missing DMARC policy
|
||||
tflags DMARC_MISSING net
|
||||
score DMARC_MISSING 0.001
|
||||
endif
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin checks if emails match DMARC policy, the plugin needs both DKIM
|
||||
and SPF plugins enabled.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::DMARC;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
my $VERSION = 0.2;
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub dbg { my $msg = shift; Mail::SpamAssassin::Logger::dbg("DMARC: $msg", @_); }
|
||||
sub info { my $msg = shift; Mail::SpamAssassin::Logger::info("DMARC: $msg", @_); }
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsa->{conf});
|
||||
$self->register_eval_rule("check_dmarc_pass", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_dmarc_reject", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_dmarc_quarantine", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_dmarc_none", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_dmarc_missing", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
=over 4
|
||||
|
||||
=item dmarc_save_reports ( 0 | 1 ) (default: 0)
|
||||
|
||||
Store DMARC reports using Mail::DMARC::Store, mail-dmarc.ini must be configured to save and send DMARC reports.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'dmarc_save_reports',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
# Force waiting of SPF and DKIM results
|
||||
$pms->{dmarc_async_queue} = [];
|
||||
}
|
||||
|
||||
sub _check_eval {
|
||||
my ($self, $pms, $result) = @_;
|
||||
|
||||
if (exists $pms->{dmarc_async_queue}) {
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
push @{$pms->{dmarc_async_queue}}, sub {
|
||||
if ($result->()) {
|
||||
$pms->got_hit($rulename, '', ruletype => 'header');
|
||||
} else {
|
||||
$pms->rule_ready($rulename);
|
||||
}
|
||||
};
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
$self->_check_dmarc($pms);
|
||||
# make sure not to return undef, as this is not async anymore
|
||||
return $result->() || 0;
|
||||
}
|
||||
|
||||
sub check_dmarc_pass {
|
||||
my ($self, $pms, $name) = @_;
|
||||
|
||||
my $result = sub {
|
||||
defined $pms->{dmarc_result} &&
|
||||
$pms->{dmarc_result} eq 'pass' &&
|
||||
$pms->{dmarc_policy} ne 'no policy available';
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_dmarc_reject {
|
||||
my ($self, $pms, $name) = @_;
|
||||
|
||||
my $result = sub {
|
||||
defined $pms->{dmarc_result} &&
|
||||
$pms->{dmarc_result} eq 'fail' &&
|
||||
$pms->{dmarc_policy} eq 'reject';
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_dmarc_quarantine {
|
||||
my ($self, $pms, $name) = @_;
|
||||
|
||||
my $result = sub {
|
||||
defined $pms->{dmarc_result} &&
|
||||
$pms->{dmarc_result} eq 'fail' &&
|
||||
$pms->{dmarc_policy} eq 'quarantine';
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_dmarc_none {
|
||||
my ($self, $pms, $name) = @_;
|
||||
|
||||
my $result = sub {
|
||||
defined $pms->{dmarc_result} &&
|
||||
$pms->{dmarc_result} eq 'fail' &&
|
||||
$pms->{dmarc_policy} eq 'none';
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_dmarc_missing {
|
||||
my ($self, $pms, $name) = @_;
|
||||
|
||||
my $result = sub {
|
||||
defined $pms->{dmarc_result} &&
|
||||
$pms->{dmarc_policy} eq 'no policy available';
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_tick {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
$self->_check_async_queue($opts->{permsgstatus});
|
||||
}
|
||||
|
||||
sub check_cleanup {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
# Finish it whether SPF and DKIM is ready or not
|
||||
$self->_check_async_queue($opts->{permsgstatus}, 1);
|
||||
}
|
||||
|
||||
sub _check_async_queue {
|
||||
my ($self, $pms, $finish) = @_;
|
||||
|
||||
return unless exists $pms->{dmarc_async_queue};
|
||||
|
||||
# Check if SPF or DKIM is ready
|
||||
if ($finish || ($pms->{spf_checked} && $pms->{dkim_checked_signature})) {
|
||||
$self->_check_dmarc($pms);
|
||||
$_->() foreach (@{$pms->{dmarc_async_queue}});
|
||||
# No more async queueing needed. If any evals are called later, they
|
||||
# will act on the results directly.
|
||||
delete $pms->{dmarc_async_queue};
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_dmarc {
|
||||
my ($self, $pms, $name) = @_;
|
||||
|
||||
return unless $pms->is_dns_available();
|
||||
|
||||
# Load DMARC module
|
||||
if (!exists $self->{has_mail_dmarc}) {
|
||||
my $eval_stat;
|
||||
eval {
|
||||
require Mail::DMARC::PurePerl;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
};
|
||||
if (!defined($eval_stat)) {
|
||||
dbg("using Mail::DMARC::PurePerl for DMARC checks");
|
||||
$self->{has_mail_dmarc} = 1;
|
||||
} else {
|
||||
dbg("cannot load Mail::DMARC::PurePerl: module: $eval_stat");
|
||||
dbg("Mail::DMARC::PurePerl is required for DMARC checks, DMARC checks disabled");
|
||||
$self->{has_mail_dmarc} = undef;
|
||||
}
|
||||
}
|
||||
|
||||
return if !$self->{has_mail_dmarc};
|
||||
return if $pms->{dmarc_checked};
|
||||
$pms->{dmarc_checked} = 1;
|
||||
|
||||
my $lasthop = $pms->{relays_external}->[0];
|
||||
if (!defined $lasthop) {
|
||||
dbg("no external relay found, skipping DMARC check");
|
||||
return;
|
||||
}
|
||||
|
||||
my $from_addr = ($pms->get('From:first:addr'))[0];
|
||||
return if not defined $from_addr;
|
||||
return if index($from_addr, '@') == -1;
|
||||
|
||||
my $mfrom_domain = ($pms->get('EnvelopeFrom:first:addr:host'))[0];
|
||||
if (!defined $mfrom_domain) {
|
||||
$mfrom_domain = ($pms->get('From:first:addr:domain'))[0];
|
||||
return if !defined $mfrom_domain;
|
||||
dbg("EnvelopeFrom header not found, using From");
|
||||
}
|
||||
|
||||
my $spf_status = 'none';
|
||||
if ($pms->{spf_pass}) { $spf_status = 'pass'; }
|
||||
elsif ($pms->{spf_fail}) { $spf_status = 'fail'; }
|
||||
elsif ($pms->{spf_permerror}) { $spf_status = 'fail'; }
|
||||
elsif ($pms->{spf_none}) { $spf_status = 'fail'; }
|
||||
elsif ($pms->{spf_neutral}) { $spf_status = 'neutral'; }
|
||||
elsif ($pms->{spf_softfail}) { $spf_status = 'softfail'; }
|
||||
|
||||
my $spf_helo_status = 'none';
|
||||
if ($pms->{spf_helo_pass}) { $spf_helo_status = 'pass'; }
|
||||
elsif ($pms->{spf_helo_fail}) { $spf_helo_status = 'fail'; }
|
||||
elsif ($pms->{spf_helo_permerror}) { $spf_helo_status = 'fail'; }
|
||||
elsif ($pms->{spf_helo_none}) { $spf_helo_status = 'fail'; }
|
||||
elsif ($pms->{spf_helo_neutral}) { $spf_helo_status = 'neutral'; }
|
||||
elsif ($pms->{spf_helo_softfail}) { $spf_helo_status = 'softfail'; }
|
||||
|
||||
my $dmarc = Mail::DMARC::PurePerl->new();
|
||||
$dmarc->source_ip($lasthop->{ip});
|
||||
$dmarc->header_from_raw($from_addr);
|
||||
|
||||
my $suppl_attrib = $pms->{msg}->{suppl_attrib};
|
||||
if (defined $suppl_attrib && exists $suppl_attrib->{dkim_signatures}) {
|
||||
my $dkim_signatures = $suppl_attrib->{dkim_signatures};
|
||||
foreach my $signature ( @$dkim_signatures ) {
|
||||
$dmarc->dkim( domain => $signature->domain, result => $signature->result );
|
||||
dbg("DKIM result for domain " . $signature->domain . ": " . $signature->result);
|
||||
}
|
||||
} else {
|
||||
$dmarc->dkim($pms->{dkim_verifier}) if (ref($pms->{dkim_verifier}));
|
||||
}
|
||||
|
||||
my $result;
|
||||
eval {
|
||||
$dmarc->spf([
|
||||
{
|
||||
scope => 'mfrom',
|
||||
domain => $mfrom_domain,
|
||||
result => $spf_status,
|
||||
},
|
||||
{
|
||||
scope => 'helo',
|
||||
domain => $lasthop->{lc_helo},
|
||||
result => $spf_helo_status,
|
||||
},
|
||||
]);
|
||||
$result = $dmarc->validate();
|
||||
};
|
||||
if ($@) {
|
||||
dbg("error while evaluating domain $mfrom_domain: $@");
|
||||
return;
|
||||
}
|
||||
|
||||
if (defined($pms->{dmarc_result} = $result->result)) {
|
||||
if ($pms->{conf}->{dmarc_save_reports}) {
|
||||
my $rua = eval { $result->published()->rua(); };
|
||||
if (defined $rua && index($rua, 'mailto:') >= 0) {
|
||||
eval { $dmarc->save_aggregate(); };
|
||||
if ($@) {
|
||||
info("report could not be saved: $@");
|
||||
} else {
|
||||
dbg("report will be sent to $rua");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (defined $result->reason->[0]{comment} &&
|
||||
$result->reason->[0]{comment} eq 'too many policies') {
|
||||
dbg("result: no policy available (too many policies)");
|
||||
$pms->{dmarc_policy} = 'no policy available';
|
||||
} elsif ($result->result eq 'pass') {
|
||||
dbg("result: pass");
|
||||
$pms->{dmarc_policy} = $result->published->p;
|
||||
} elsif ($result->result ne 'none') {
|
||||
dbg("result: $result->{result}, disposition: $result->{disposition}, dkim: $result->{dkim}, spf: $result->{spf} (spf: $spf_status, spf_helo: $spf_helo_status)");
|
||||
$pms->{dmarc_policy} = $result->disposition;
|
||||
} else {
|
||||
dbg("result: no policy available");
|
||||
$pms->{dmarc_policy} = 'no policy available';
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
1;
|
||||
|
@ -45,7 +45,7 @@ package Mail::SpamAssassin::Plugin::DNSEval;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address is_fqdn_valid);
|
||||
use Mail::SpamAssassin::Util qw(reverse_ip_address idn_to_ascii compile_regexp is_fqdn_valid);
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
@ -55,7 +55,6 @@ use re 'taint';
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
@ -75,7 +74,6 @@ sub new {
|
||||
'check_rbl_ns_from',
|
||||
'check_rbl_txt',
|
||||
'check_rbl_sub',
|
||||
'check_rbl_results_for',
|
||||
'check_rbl_from_host',
|
||||
'check_rbl_from_domain',
|
||||
'check_rbl_envfrom',
|
||||
@ -86,7 +84,7 @@ sub new {
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
foreach(@{$self->{'evalrules'}}) {
|
||||
$self->register_eval_rule($_);
|
||||
$self->register_eval_rule($_, $Mail::SpamAssassin::Conf::TYPE_RBL_EVALS);
|
||||
}
|
||||
|
||||
return $self;
|
||||
@ -154,21 +152,104 @@ sub set_config {
|
||||
# directly as part of PMS
|
||||
sub check_start {
|
||||
my ($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
foreach(@{$self->{'evalrules'}}) {
|
||||
$opts->{'permsgstatus'}->register_plugin_eval_glue($_);
|
||||
$pms->register_plugin_eval_glue($_);
|
||||
}
|
||||
|
||||
# Initialize check_rbl_sub tests
|
||||
$self->_init_rbl_subs($pms);
|
||||
}
|
||||
|
||||
sub _init_rbl_subs {
|
||||
my ($self, $pms) = @_;
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
# Very hacky stuff and direct rbl_evals usage for now, TODO rewrite everything
|
||||
foreach my $rule (@{$conf->{eval_to_rule}->{check_rbl_sub}||[]}) {
|
||||
next if !exists $conf->{rbl_evals}->{$rule};
|
||||
next if !$conf->{scores}->{$rule};
|
||||
# rbl_evals is [$function,[@args]]
|
||||
my $args = $conf->{rbl_evals}->{$rule}->[1];
|
||||
my ($set, $subtest) = @$args;
|
||||
if (!defined $subtest) {
|
||||
warn("dnseval: missing subtest for rule $rule\n");
|
||||
next;
|
||||
}
|
||||
if ($subtest =~ /^sb:/) {
|
||||
warn("dnseval: ignored $rule, SenderBase rules are deprecated\n");
|
||||
next;
|
||||
}
|
||||
# Compile as regex if not pure ip/bitmask (same check in process_dnsbl_result)
|
||||
if ($subtest !~ /^\d+(?:\.\d+\.\d+\.\d+)?$/) {
|
||||
my ($rec, $err) = compile_regexp($subtest, 0);
|
||||
if (!$rec) {
|
||||
warn("dnseval: invalid rule $rule subtest regexp '$subtest': $err\n");
|
||||
next;
|
||||
}
|
||||
$subtest = $rec;
|
||||
}
|
||||
dbg("dnseval: initialize check_rbl_sub for rule $rule, set $set, subtest $subtest");
|
||||
push @{$pms->{rbl_subs}{$set}}, [$subtest, $rule];
|
||||
}
|
||||
}
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
return 1 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 1 if !$pms->is_dns_available();
|
||||
|
||||
# Process relaylists only once, not everytime in check_rbl_backend
|
||||
#
|
||||
# ok, make a list of all the IPs in the untrusted set
|
||||
my @fullips = map { $_->{ip} } @{$pms->{relays_untrusted}};
|
||||
# now, make a list of all the IPs in the external set, for use in
|
||||
# notfirsthop testing. This will often be more IPs than found
|
||||
# in @fullips. It includes the IPs that are trusted, but
|
||||
# not in internal_networks.
|
||||
my @fullexternal = map {
|
||||
(!$_->{internal}) ? ($_->{ip}) : ()
|
||||
} @{$pms->{relays_trusted}};
|
||||
push @fullexternal, @fullips; # add untrusted set too
|
||||
# Make sure a header significantly improves results before adding here
|
||||
# X-Sender-Ip: could be worth using (very low occurence for me)
|
||||
# X-Sender: has a very low bang-for-buck for me
|
||||
my @originating;
|
||||
foreach my $header (@{$pms->{conf}->{originating_ip_headers}}) {
|
||||
my $str = $pms->get($header, undef);
|
||||
next unless defined $str && $str ne '';
|
||||
push @originating, ($str =~ m/($IP_ADDRESS)/g);
|
||||
}
|
||||
# Let's go ahead and trim away all private ips (KLC)
|
||||
# also uniq the list and strip dups. (jm)
|
||||
my @ips = $self->ip_list_uniq_and_strip_private(@fullips);
|
||||
# if there's no untrusted IPs, it means we trust all the open-internet
|
||||
# relays, so we skip checks
|
||||
if (scalar @ips + scalar @originating > 0) {
|
||||
dbg("dnseval: IPs found: full-external: ".join(", ", @fullexternal).
|
||||
" untrusted: ".join(", ", @ips).
|
||||
" originating: ".join(", ", @originating));
|
||||
@{$pms->{dnseval_fullexternal}} = @fullexternal;
|
||||
@{$pms->{dnseval_ips}} = @ips;
|
||||
@{$pms->{dnseval_originating}} = @originating;
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub ip_list_uniq_and_strip_private {
|
||||
my ($self, @origips) = @_;
|
||||
my @ips;
|
||||
my %seen;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
foreach my $ip (@origips) {
|
||||
next unless $ip;
|
||||
next if (exists ($seen{$ip})); $seen{$ip} = 1;
|
||||
next if ($ip =~ /$IP_PRIVATE/o);
|
||||
next if exists $seen{$ip};
|
||||
$seen{$ip} = 1;
|
||||
next if $ip =~ IS_IP_PRIVATE;
|
||||
push(@ips, $ip);
|
||||
}
|
||||
return @ips;
|
||||
@ -181,12 +262,17 @@ sub ip_list_uniq_and_strip_private {
|
||||
sub check_rbl_accreditor {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest, $accreditor) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
if (!defined $pms->{accreditor_tag}) {
|
||||
$self->message_accreditor_tag($pms);
|
||||
}
|
||||
if ($pms->{accreditor_tag}->{$accreditor}) {
|
||||
$self->check_rbl_backend($pms, $rule, $set, $rbl_server, 'A', $subtest);
|
||||
# return undef for async status
|
||||
return $self->_check_rbl_backend($pms, $rule, $set, $rbl_server, 'A', $subtest);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -226,58 +312,16 @@ sub message_accreditor_tag {
|
||||
$pms->{accreditor_tag} = \%acctags;
|
||||
}
|
||||
|
||||
sub check_rbl_backend {
|
||||
sub _check_rbl_backend {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $type, $subtest) = @_;
|
||||
local ($_);
|
||||
|
||||
# First check that DNS is available, if not do not perform this check
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
return if !exists $pms->{dnseval_ips}; # no untrusted ips
|
||||
|
||||
if (($rbl_server !~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) &&
|
||||
(index($rbl_server, '.') >= 0) &&
|
||||
($rbl_server !~ /\.$/)) {
|
||||
$rbl_server .= ".";
|
||||
}
|
||||
|
||||
dbg("dns: checking RBL $rbl_server, set $set");
|
||||
|
||||
# ok, make a list of all the IPs in the untrusted set
|
||||
my @fullips = map { $_->{ip} } @{$pms->{relays_untrusted}};
|
||||
|
||||
# now, make a list of all the IPs in the external set, for use in
|
||||
# notfirsthop testing. This will often be more IPs than found
|
||||
# in @fullips. It includes the IPs that are trusted, but
|
||||
# not in internal_networks.
|
||||
my @fullexternal = map {
|
||||
(!$_->{internal}) ? ($_->{ip}) : ()
|
||||
} @{$pms->{relays_trusted}};
|
||||
push (@fullexternal, @fullips); # add untrusted set too
|
||||
|
||||
# Make sure a header significantly improves results before adding here
|
||||
# X-Sender-Ip: could be worth using (very low occurence for me)
|
||||
# X-Sender: has a very low bang-for-buck for me
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my @originating;
|
||||
for my $header (@{$pms->{conf}->{originating_ip_headers}}) {
|
||||
my $str = $pms->get($header,undef);
|
||||
next unless defined $str && $str ne '';
|
||||
push (@originating, ($str =~ m/($IP_ADDRESS)/g));
|
||||
}
|
||||
|
||||
# Let's go ahead and trim away all private ips (KLC)
|
||||
# also uniq the list and strip dups. (jm)
|
||||
my @ips = $self->ip_list_uniq_and_strip_private(@fullips);
|
||||
|
||||
# if there's no untrusted IPs, it means we trust all the open-internet
|
||||
# relays, so we can return right now.
|
||||
return 0 unless (scalar @ips + scalar @originating > 0);
|
||||
|
||||
dbg("dns: IPs found: full-external: ".join(", ", @fullexternal).
|
||||
" untrusted: ".join(", ", @ips).
|
||||
" originating: ".join(", ", @originating));
|
||||
$rbl_server =~ s/\.+\z//; # strip unneeded trailing dot
|
||||
dbg("dnseval: checking RBL $rbl_server, set $set, rule $rule");
|
||||
|
||||
my $trusted = $self->{main}->{conf}->{trusted_networks};
|
||||
my @ips = @{$pms->{dnseval_ips}};
|
||||
|
||||
# If name is foo-notfirsthop, check all addresses except for
|
||||
# the originating one. Suitable for use with dialup lists, like the PDL.
|
||||
@ -293,9 +337,9 @@ sub check_rbl_backend {
|
||||
# specified some third-party relays as trusted. Also, don't use
|
||||
# @originating; those headers are added by a phase of relaying through
|
||||
# a server like Hotmail, which is not going to be in dialup lists anyway.
|
||||
@ips = $self->ip_list_uniq_and_strip_private(@fullexternal);
|
||||
@ips = $self->ip_list_uniq_and_strip_private(@{$pms->{dnseval_fullexternal}});
|
||||
if ($1 eq "lastexternal") {
|
||||
@ips = (defined $ips[0]) ? ($ips[0]) : ();
|
||||
@ips = defined $ips[0] ? ($ips[0]) : ();
|
||||
} else {
|
||||
pop @ips if (scalar @ips > 1);
|
||||
}
|
||||
@ -307,14 +351,14 @@ sub check_rbl_backend {
|
||||
elsif ($set =~ /-(first|un)trusted$/)
|
||||
{
|
||||
my @tips;
|
||||
foreach my $ip (@originating) {
|
||||
foreach my $ip (@{$pms->{dnseval_originating}}) {
|
||||
if ($ip && !$trusted->contains_ip($ip)) {
|
||||
push(@tips, $ip);
|
||||
}
|
||||
}
|
||||
@ips = $self->ip_list_uniq_and_strip_private (@ips, @tips);
|
||||
@ips = $self->ip_list_uniq_and_strip_private(@ips, @tips);
|
||||
if ($1 eq "first") {
|
||||
@ips = (defined $ips[0]) ? ($ips[0]) : ();
|
||||
@ips = defined $ips[0] ? ($ips[0]) : ();
|
||||
} else {
|
||||
shift @ips;
|
||||
}
|
||||
@ -322,7 +366,7 @@ sub check_rbl_backend {
|
||||
else
|
||||
{
|
||||
my @tips;
|
||||
foreach my $ip (@originating) {
|
||||
foreach my $ip (@{$pms->{dnseval_originating}}) {
|
||||
if ($ip && !$trusted->contains_ip($ip)) {
|
||||
push(@tips, $ip);
|
||||
}
|
||||
@ -333,71 +377,77 @@ sub check_rbl_backend {
|
||||
}
|
||||
|
||||
# How many IPs max you check in the received lines
|
||||
my $checklast=$self->{main}->{conf}->{num_check_received};
|
||||
my $checklast = $self->{main}->{conf}->{num_check_received};
|
||||
|
||||
if (scalar @ips > $checklast) {
|
||||
splice (@ips, $checklast); # remove all others
|
||||
}
|
||||
|
||||
my $tflags = $pms->{conf}->{tflags}->{$rule};
|
||||
|
||||
# Trusted relays should only be checked against nice rules (dnswls)
|
||||
if (defined $tflags && $tflags !~ /\bnice\b/) {
|
||||
if (($pms->{conf}->{tflags}->{$rule}||'') !~ /\bnice\b/) {
|
||||
# remove trusted hosts from beginning
|
||||
while (@ips && $trusted->contains_ip($ips[0])) { shift @ips }
|
||||
}
|
||||
|
||||
unless (scalar @ips > 0) {
|
||||
dbg("dns: no untrusted IPs to check");
|
||||
dbg("dnseval: no untrusted IPs to check");
|
||||
return 0;
|
||||
}
|
||||
|
||||
dbg("dns: only inspecting the following IPs: ".join(", ", @ips));
|
||||
dbg("dnseval: only inspecting the following IPs: ".join(", ", @ips));
|
||||
|
||||
eval {
|
||||
my $queries;
|
||||
foreach my $ip (@ips) {
|
||||
my $revip = reverse_ip_address($ip);
|
||||
$pms->do_rbl_lookup($rule, $set, $type,
|
||||
$revip.'.'.$rbl_server, $subtest) if defined $revip;
|
||||
if (defined(my $revip = reverse_ip_address($ip))) {
|
||||
my $ret = $pms->do_rbl_lookup($rule, $set, $type, $revip.'.'.$rbl_server, $subtest);
|
||||
$queries++ if defined $ret;
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
# note that results are not handled here, hits are handled directly
|
||||
# as DNS responses are harvested
|
||||
return 0;
|
||||
return 0 if !$queries; # no query started
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
sub check_rbl {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
$self->check_rbl_backend($pms, $rule, $set, $rbl_server, 'A', $subtest);
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
# return undef for async status
|
||||
return $self->_check_rbl_backend($pms, $rule, $set, $rbl_server, 'A', $subtest);
|
||||
}
|
||||
|
||||
sub check_rbl_txt {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
$self->check_rbl_backend($pms, $rule, $set, $rbl_server, 'TXT', $subtest);
|
||||
}
|
||||
|
||||
# run for first message
|
||||
sub check_rbl_sub {
|
||||
my ($self, $pms, $rule, $set, $subtest) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
$pms->register_rbl_subtest($rule, $set, $subtest);
|
||||
# return undef for async status
|
||||
return $self->_check_rbl_backend($pms, $rule, $set, $rbl_server, 'TXT', $subtest);
|
||||
}
|
||||
|
||||
# backward compatibility
|
||||
sub check_rbl_results_for {
|
||||
#warn "dns: check_rbl_results_for() is deprecated, use check_rbl_sub()\n";
|
||||
check_rbl_sub(@_);
|
||||
sub check_rbl_sub {
|
||||
my ($self, $pms, $rule, $set, $subtest) = @_;
|
||||
# just a dummy, _init_rbl_subs/do_rbl_lookup handles the subs
|
||||
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
# this only checks the address host name and not the domain name because
|
||||
# using the domain name had much worse results for dsn.rfc-ignorant.org
|
||||
sub check_rbl_from_host {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
_check_rbl_addresses($self, $pms, $rule, $set, $rbl_server, $subtest, $pms->all_from_addrs());
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
# return undef for async status
|
||||
return $self->_check_rbl_addresses($pms, $rule, $set, $rbl_server,
|
||||
$subtest, $pms->all_from_addrs());
|
||||
}
|
||||
|
||||
sub check_rbl_headers {
|
||||
@ -415,16 +465,18 @@ sub check_rbl_headers {
|
||||
@env_hdr = split(/,/, $conf->{rbl_headers});
|
||||
}
|
||||
|
||||
my $queries;
|
||||
foreach my $rbl_headers (@env_hdr) {
|
||||
my $addr = $pms->get($rbl_headers.':addr', undef);
|
||||
if ( defined $addr && $addr =~ /\@([^\@\s]+)/ ) {
|
||||
$self->_check_rbl_addresses($pms, $rule, $set, $rbl_server,
|
||||
my $ret = $self->_check_rbl_addresses($pms, $rule, $set, $rbl_server,
|
||||
$subtest, $addr);
|
||||
$queries++ if defined $ret;
|
||||
} else {
|
||||
my $unsplitted_host = $pms->get($rbl_headers);
|
||||
chomp($unsplitted_host);
|
||||
foreach my $host (split(/\n/, $unsplitted_host)) {
|
||||
if($host =~ /^$IP_ADDRESS$/ ) {
|
||||
if ($host =~ IS_IP_ADDRESS) {
|
||||
next if ($conf->{tflags}->{$rule}||'') =~ /\bdomains_only\b/;
|
||||
$host = reverse_ip_address($host);
|
||||
} else {
|
||||
@ -432,17 +484,23 @@ sub check_rbl_headers {
|
||||
next unless is_fqdn_valid($host);
|
||||
next unless $pms->{main}->{registryboundaries}->is_domain_valid($host);
|
||||
}
|
||||
$pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
my $ret = $pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
$queries++ if defined $ret;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return 0 if !$queries; # no query started
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
=over 4
|
||||
|
||||
=item check_rbl_from_domain
|
||||
|
||||
This checks all the from addrs domain names as an alternate to check_rbl_from_host. As of v3.4.1, it has been improved to include a subtest for a specific octet.
|
||||
This checks all the from addrs domain names as an alternate to
|
||||
check_rbl_from_host. As of v3.4.1, it has been improved to include a
|
||||
subtest for a specific octet.
|
||||
|
||||
=back
|
||||
|
||||
@ -450,9 +508,14 @@ This checks all the from addrs domain names as an alternate to check_rbl_from_ho
|
||||
|
||||
sub check_rbl_from_domain {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
_check_rbl_addresses($self, $pms, $rule, $set, $rbl_server, $subtest, $pms->all_from_addrs_domains());
|
||||
}
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
# return undef for async status
|
||||
return $self->_check_rbl_addresses($pms, $rule, $set, $rbl_server,
|
||||
$subtest, $pms->all_from_addrs_domains());
|
||||
}
|
||||
=over 4
|
||||
|
||||
=item check_rbl_ns_from
|
||||
@ -472,6 +535,7 @@ sub check_rbl_ns_from {
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
|
||||
dbg("dnseval: EnvelopeFrom header not found") unless defined (($pms->get("EnvelopeFrom:addr"))[0]);
|
||||
for my $from ($pms->get('EnvelopeFrom:addr')) {
|
||||
next unless defined $from;
|
||||
$from =~ tr/././s; # bug 3366
|
||||
@ -482,20 +546,20 @@ sub check_rbl_ns_from {
|
||||
}
|
||||
return 0 unless defined $domain;
|
||||
|
||||
dbg("dns: checking NS for host $domain");
|
||||
dbg("dnseval: checking NS for host $domain");
|
||||
|
||||
my $key = "NS:" . $domain;
|
||||
my $obj = { dom => $domain, rule => $rule, set => $set, rbl_server => $rbl_server, subtest => $subtest };
|
||||
my $ent = {
|
||||
rulename => $rule, key => $key, zone => $domain, obj => $obj, type => "URI-NS",
|
||||
rulename => $rule, zone => $domain, obj => $obj, type => "URI-NS",
|
||||
};
|
||||
# dig $dom ns
|
||||
$ent = $pms->{async}->bgsend_and_start_lookup(
|
||||
my $ret = $pms->{async}->bgsend_and_start_lookup(
|
||||
$domain, 'NS', undef, $ent,
|
||||
sub { my ($ent2,$pkt) = @_;
|
||||
$self->complete_ns_lookup($pms, $ent2, $pkt, $domain) },
|
||||
master_deadline => $pms->{master_deadline} );
|
||||
return $ent;
|
||||
return 0 if !defined $ret; # no query started
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
sub complete_ns_lookup {
|
||||
@ -508,11 +572,11 @@ sub complete_ns_lookup {
|
||||
|
||||
if (!$pkt) {
|
||||
# $pkt will be undef if the DNS query was aborted (e.g. timed out)
|
||||
dbg("DNSEval: complete_ns_lookup aborted %s", $ent->{key});
|
||||
dbg("dnseval: complete_ns_lookup aborted %s", $ent->{key});
|
||||
return;
|
||||
}
|
||||
|
||||
dbg("DNSEval: complete_ns_lookup %s", $ent->{key});
|
||||
dbg("dnseval: complete_ns_lookup %s", $ent->{key});
|
||||
my @ns = $pkt->authority;
|
||||
|
||||
foreach my $rr (@ns) {
|
||||
@ -521,9 +585,9 @@ sub complete_ns_lookup {
|
||||
chomp($nshost);
|
||||
if (is_fqdn_valid($nshost)) {
|
||||
if ( defined $subtest ) {
|
||||
dbg("dns: checking [$nshost] / $rule / $set / $rbl_server / $subtest");
|
||||
dbg("dnseval: checking [$nshost] / $rule / $set / $rbl_server / $subtest");
|
||||
} else {
|
||||
dbg("dns: checking [$nshost] / $rule / $set / $rbl_server");
|
||||
dbg("dnseval: checking [$nshost] / $rule / $set / $rbl_server");
|
||||
}
|
||||
$pms->do_rbl_lookup($rule, $set, 'A',
|
||||
"$nshost.$rbl_server", $subtest);
|
||||
@ -559,10 +623,11 @@ sub check_rbl_rcvd {
|
||||
}
|
||||
}
|
||||
|
||||
my $queries;
|
||||
foreach my $host ( @udnsrcvd ) {
|
||||
if((defined $host) and ($host ne "")) {
|
||||
chomp($host);
|
||||
if($host =~ /^$IP_ADDRESS$/ ) {
|
||||
if ($host =~ IS_IP_ADDRESS) {
|
||||
next if ($pms->{conf}->{tflags}->{$rule}||'') =~ /\bdomains_only\b/;
|
||||
$host = reverse_ip_address($host);
|
||||
} else {
|
||||
@ -572,49 +637,55 @@ sub check_rbl_rcvd {
|
||||
next unless $pms->{main}->{registryboundaries}->is_domain_valid($host);
|
||||
}
|
||||
if ( defined $subtest ) {
|
||||
dbg("dns: checking [$host] / $rule / $set / $rbl_server / $subtest");
|
||||
dbg("dnseval: checking [$host] / $rule / $set / $rbl_server / $subtest");
|
||||
} else {
|
||||
dbg("dns: checking [$host] / $rule / $set / $rbl_server");
|
||||
dbg("dnseval: checking [$host] / $rule / $set / $rbl_server");
|
||||
}
|
||||
$pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
my $ret = $pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
$queries++ if defined $ret;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
|
||||
return 0 if !$queries; # no query started
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
# this only checks the address host name and not the domain name because
|
||||
# using the domain name had much worse results for dsn.rfc-ignorant.org
|
||||
sub check_rbl_envfrom {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest) = @_;
|
||||
_check_rbl_addresses($self, $pms, $rule, $set, $rbl_server, $subtest, $pms->get('EnvelopeFrom:addr',undef));
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
# return undef for async status
|
||||
return $self->_check_rbl_addresses($pms, $rule, $set, $rbl_server,
|
||||
$subtest, $pms->get('EnvelopeFrom:addr',undef));
|
||||
}
|
||||
|
||||
sub _check_rbl_addresses {
|
||||
my ($self, $pms, $rule, $set, $rbl_server, $subtest, @addresses) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
$rbl_server =~ s/\.+\z//; # strip unneeded trailing dot
|
||||
|
||||
my %hosts;
|
||||
for (@addresses) {
|
||||
next if !defined($_) || !/ \@ ( [^\@\s]+ )/x;
|
||||
next if !defined($_) || !/\@([^\@\s]+)/;
|
||||
my $address = $1;
|
||||
# strip leading & trailing dots (as seen in some e-mail addresses)
|
||||
$address =~ s/^\.+//; $address =~ s/\.+\z//;
|
||||
$address =~ s/^\.+//;
|
||||
$address =~ s/\.+\z//;
|
||||
# squash duplicate dots to avoid an invalid DNS query with a null label
|
||||
$address =~ tr/.//s;
|
||||
$hosts{lc($address)} = 1 if $address =~ /\./; # must by a FQDN
|
||||
# Also checks it's FQDN
|
||||
if ($address =~ tr/.//s) {
|
||||
$hosts{lc($address)} = 1;
|
||||
}
|
||||
}
|
||||
return unless scalar keys %hosts;
|
||||
|
||||
if (($rbl_server !~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/) &&
|
||||
(index($rbl_server, '.') >= 0) &&
|
||||
($rbl_server !~ /\.$/)) {
|
||||
$rbl_server .= ".";
|
||||
}
|
||||
|
||||
my $queries;
|
||||
for my $host (keys %hosts) {
|
||||
if ($host =~ /^$IP_ADDRESS$/) {
|
||||
if ($host =~ IS_IP_ADDRESS) {
|
||||
next if ($pms->{conf}->{tflags}->{$rule}||'') =~ /\bdomains_only\b/;
|
||||
$host = reverse_ip_address($host);
|
||||
} else {
|
||||
@ -622,23 +693,26 @@ sub _check_rbl_addresses {
|
||||
next unless is_fqdn_valid($host);
|
||||
next unless $pms->{main}->{registryboundaries}->is_domain_valid($host);
|
||||
}
|
||||
dbg("dns: checking [$host] / $rule / $set / $rbl_server");
|
||||
$pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
dbg("dnseval: checking [$host] / $rule / $set / $rbl_server");
|
||||
my $ret = $pms->do_rbl_lookup($rule, $set, 'A', "$host.$rbl_server", $subtest);
|
||||
$queries++ if defined $ret;
|
||||
}
|
||||
|
||||
return 0 if !$queries; # no async
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
sub check_dns_sender {
|
||||
my ($self, $pms, $rule) = @_;
|
||||
|
||||
return 0 if $self->{main}->{conf}->{skip_rbl_checks};
|
||||
return 0 unless $pms->is_dns_available();
|
||||
return 0 if !$pms->is_dns_available();
|
||||
|
||||
my $host;
|
||||
for my $from ($pms->get('EnvelopeFrom:addr',undef)) {
|
||||
foreach my $from ($pms->get('EnvelopeFrom:addr', undef)) {
|
||||
next unless defined $from;
|
||||
|
||||
$from =~ tr/././s; # bug 3366
|
||||
if ($from =~ m/ \@ ( [^\@\s]+ \. [^\@\s]+ )/x ) {
|
||||
$from =~ tr/.//s; # bug 3366
|
||||
if ($from =~ m/\@([^\@\s]+\.[^\@\s]+)/) {
|
||||
$host = lc($1);
|
||||
last;
|
||||
}
|
||||
@ -650,15 +724,45 @@ sub check_dns_sender {
|
||||
return 0;
|
||||
}
|
||||
|
||||
dbg("dns: checking A and MX for host $host");
|
||||
$host = idn_to_ascii($host);
|
||||
dbg("dnseval: checking A and MX for host $host");
|
||||
|
||||
$pms->do_dns_lookup($rule, 'A', $host);
|
||||
$pms->do_dns_lookup($rule, 'MX', $host);
|
||||
my $queries;
|
||||
my $ret = $self->do_sender_lookup($pms, $rule, 'A', $host);
|
||||
$queries++ if defined $ret;
|
||||
$ret = $self->do_sender_lookup($pms, $rule, 'MX', $host);
|
||||
$queries++ if defined $ret;
|
||||
|
||||
# cache name of host for later checking
|
||||
$pms->{sender_host} = $host;
|
||||
return 0 if !$queries; # no query started
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
return 0;
|
||||
sub do_sender_lookup {
|
||||
my ($self, $pms, $rule, $type, $host) = @_;
|
||||
|
||||
my $ent = {
|
||||
rulename => $rule,
|
||||
type => "DNSBL-Sender",
|
||||
};
|
||||
return $pms->{async}->bgsend_and_start_lookup(
|
||||
$host, $type, undef, $ent, sub {
|
||||
my ($ent, $pkt) = @_;
|
||||
return if !$pkt; # aborted / timed out
|
||||
$pms->rule_ready($ent->{rulename}); # mark as run, could still hit
|
||||
foreach my $answer ($pkt->answer) {
|
||||
next if !$answer;
|
||||
next if $answer->type ne 'A' && $answer->type ne 'MX';
|
||||
if ($pkt->header->rcode eq 'NXDOMAIN' ||
|
||||
$pkt->header->rcode eq 'SERVFAIL')
|
||||
{
|
||||
if (++$pms->{sender_host_fail} == 2) {
|
||||
$pms->got_hit($ent->{rulename}, "DNS: ", ruletype => "dns");
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
master_deadline => $self->{master_deadline},
|
||||
);
|
||||
}
|
||||
|
||||
# capability checks for "if can(Mail::SpamAssassin::Plugin::DNSEval::XXX)":
|
||||
|
944
upstream/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm
Normal file
944
upstream/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm
Normal file
@ -0,0 +1,944 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
DecodeShortURLs - Check for shortened URLs
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::DecodeShortURLs
|
||||
|
||||
url_shortener tinyurl.com
|
||||
url_shortener_get bit.ly
|
||||
|
||||
body HAS_SHORT_URL eval:short_url()
|
||||
describe HAS_SHORT_URL Message has one or more shortened URLs
|
||||
|
||||
body SHORT_URL_REDIR eval:short_url_redir()
|
||||
describe SHORT_URL_REDIR Message has shortened URL that resulted in a valid redirection
|
||||
|
||||
body SHORT_URL_CHAINED eval:short_url_chained()
|
||||
describe SHORT_URL_CHAINED Message has shortened URL chained to other shorteners
|
||||
|
||||
body SHORT_URL_MAXCHAIN eval:short_url_maxchain()
|
||||
describe SHORT_URL_MAXCHAIN Message has shortened URL that causes too many redirections
|
||||
|
||||
body SHORT_URL_LOOP eval:short_url_loop()
|
||||
describe SHORT_URL_LOOP Message has short URL that loops back to itself
|
||||
|
||||
body SHORT_URL_200 eval:short_url_code('200') # Can check any non-redirect HTTP code
|
||||
describe SHORT_URL_200 Message has shortened URL returning HTTP 200
|
||||
|
||||
body SHORT_URL_404 eval:short_url_code('404') # Can check any non-redirect HTTP code
|
||||
describe SHORT_URL_404 Message has shortened URL returning HTTP 404
|
||||
|
||||
uri URI_TINYURL_BLOCKED m,https://tinyurl\.com/app/nospam,
|
||||
describe URI_TINYURL_BLOCKED Message contains a tinyurl that has been disabled due to abuse
|
||||
|
||||
uri URI_BITLY_BLOCKED m,^https://bitly\.com/a/blocked,
|
||||
describe URI_BITLY_BLOCKED Message contains a bit.ly URL that has been disabled due to abuse
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin looks for URLs shortened by a list of URL shortening services.
|
||||
Upon finding a matching URL, plugin will send a HTTP request to the
|
||||
shortening service and retrieve the Location-header which points to the
|
||||
actual shortened URL. It then adds this URL to the list of URIs extracted
|
||||
by SpamAssassin which can then be accessed by uri rules and plugins such as
|
||||
URIDNSBL.
|
||||
|
||||
This plugin will follow chained redirections, where a short URL redirects to
|
||||
another short URL. Redirection depth limit can be set with
|
||||
C<max_short_url_redirections>.
|
||||
|
||||
Maximum of C<max_short_urls> short URLs are checked in a message (10 by
|
||||
default). Setting it to 0 disables HTTP requests, allowing only short_url()
|
||||
test to work and report found shorteners.
|
||||
|
||||
All supported rule types for checking short URLs and redirection status are
|
||||
documented in L<SYNOPSIS> section.
|
||||
|
||||
=head1 NOTES
|
||||
|
||||
This plugin runs at the check_dnsbl hook (priority -100) so that it may
|
||||
modify the parsed URI list prior to normal uri rules or the URIDNSBL plugin.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::DecodeShortURLs;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use strict;
|
||||
use warnings;
|
||||
|
||||
use vars qw(@ISA);
|
||||
@ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $VERSION = 4.00;
|
||||
|
||||
use constant HAS_LWP_USERAGENT => eval { require LWP::UserAgent; };
|
||||
|
||||
sub dbg { my $msg = shift; return Mail::SpamAssassin::Logger::dbg("DecodeShortURLs: $msg", @_); }
|
||||
sub info { my $msg = shift; return Mail::SpamAssassin::Logger::info("DecodeShortURLs: $msg", @_); }
|
||||
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
if ($mailsaobject->{local_tests_only}) {
|
||||
dbg("local tests only, disabling HTTP requests");
|
||||
$self->{net_disabled} = 1;
|
||||
}
|
||||
elsif (!HAS_LWP_USERAGENT) {
|
||||
dbg("module LWP::UserAgent not installed, disabling HTTP requests");
|
||||
$self->{net_disabled} = 1;
|
||||
}
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
$self->register_method_priority ('check_dnsbl', -10);
|
||||
$self->register_eval_rule('short_url', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_redir', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_200', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_404', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_code', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_chained', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_maxchain', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_loop', $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule('short_url_tests'); # for legacy plugin compatibility warning
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
=head1 PRIVILEGED SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener domain [domain...] (default: none)
|
||||
|
||||
Domains that should be considered as an URL shortener. If the domain begins
|
||||
with a '.', 3rd level tld of the main domain will be checked.
|
||||
|
||||
Example:
|
||||
|
||||
url_shortener tinyurl.com
|
||||
url_shortener .page.link
|
||||
|
||||
=back
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_get domain [domain...] (default: none)
|
||||
|
||||
Alias to C<url_shortener>. HTTP request will be done with GET method,
|
||||
instead of default HEAD. Required for some services like bit.ly to return
|
||||
blocked URL correctly.
|
||||
|
||||
Example:
|
||||
|
||||
url_shortener_get bit.ly
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds = ();
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener',
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
foreach my $domain (split(/\s+/, $value)) {
|
||||
$self->{url_shortener}->{lc $domain} = 1; # 1 == head
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_get',
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
foreach my $domain (split(/\s+/, $value)) {
|
||||
$self->{url_shortener}->{lc $domain} = 2; # 2 == get
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item clear_url_shortener [domain] [domain...]
|
||||
|
||||
Clear configured url_shortener and url_shortener_get domains, for example to
|
||||
override default settings from an update channel. If domains are specified,
|
||||
then only those are removed from list.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'clear_url_shortener',
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
$self->{url_shortener} = {};
|
||||
} else {
|
||||
foreach my $domain (split(/\s+/, $value)) {
|
||||
delete $self->{url_shortener}->{lc $domain};
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_cache_type (default: none)
|
||||
|
||||
The cache type that is being utilized. Currently only supported value is
|
||||
C<dbi> that implies C<url_shortener_cache_dsn> is a DBI connect string.
|
||||
DBI module is required.
|
||||
|
||||
Example:
|
||||
url_shortener_cache_type dbi
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_cache_type',
|
||||
default => '',
|
||||
is_priv => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_cache_dsn (default: none)
|
||||
|
||||
The DBI dsn of the database to use.
|
||||
|
||||
For SQLite, the database will be created automatically if it does not
|
||||
already exist, the supplied path and file must be read/writable by the
|
||||
user running spamassassin or spamd.
|
||||
|
||||
For MySQL/MariaDB or PostgreSQL, see sql-directory for database table
|
||||
creation clauses.
|
||||
|
||||
You will need to have the proper DBI module for your database. For example
|
||||
DBD::SQLite, DBD::mysql, DBD::MariaDB or DBD::Pg.
|
||||
|
||||
Minimum required SQLite version is 3.24.0 (available from DBD::SQLite 1.59_01).
|
||||
|
||||
Examples:
|
||||
|
||||
url_shortener_cache_dsn dbi:SQLite:dbname=/var/lib/spamassassin/DecodeShortURLs.db
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_cache_dsn',
|
||||
default => '',
|
||||
is_priv => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_cache_username (default: none)
|
||||
|
||||
The username that should be used to connect to the database. Not used for
|
||||
SQLite.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_cache_username',
|
||||
default => '',
|
||||
is_priv => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_cache_password (default: none)
|
||||
|
||||
The password that should be used to connect to the database. Not used for
|
||||
SQLite.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_cache_password',
|
||||
default => '',
|
||||
is_priv => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_cache_ttl (default: 86400)
|
||||
|
||||
The length of time a cache entry will be valid for in seconds.
|
||||
Default is 86400 (1 day).
|
||||
|
||||
See C<url_shortener_cache_autoclean> for database cleaning.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_cache_ttl',
|
||||
is_admin => 1,
|
||||
default => 86400,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_cache_autoclean (default: 1000)
|
||||
|
||||
Automatically purge old entries from database. Value describes a random run
|
||||
chance of 1/x. The default value of 1000 means that cleaning is run
|
||||
approximately once for every 1000 messages processed. Value of 1 would mean
|
||||
database is cleaned every time a message is processed.
|
||||
|
||||
Set 0 to disable automatic cleaning and to do it manually.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_cache_autoclean',
|
||||
is_admin => 1,
|
||||
default => 1000,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_loginfo (default: 0 (off))
|
||||
|
||||
If this option is enabled (set to 1), then short URLs and the decoded URLs will be logged with info priority.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_loginfo',
|
||||
is_admin => 1,
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_timeout (default: 5)
|
||||
|
||||
Maximum time a short URL HTTP request can take, in seconds.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_timeout',
|
||||
is_admin => 1,
|
||||
default => 5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item max_short_urls (default: 10)
|
||||
|
||||
Maximum amount of short URLs that will be looked up per message. Chained
|
||||
redirections are not counted, only initial short URLs found.
|
||||
|
||||
Setting it to 0 disables HTTP requests, allowing only short_url() test to
|
||||
work and report any found shortener URLs.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'max_short_urls',
|
||||
is_admin => 1,
|
||||
default => 10,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item max_short_url_redirections (default: 10)
|
||||
|
||||
Maximum depth of chained redirections that a short URL can generate.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'max_short_url_redirections',
|
||||
is_admin => 1,
|
||||
default => 10,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item url_shortener_user_agent (default: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36)
|
||||
|
||||
Set User-Agent header for HTTP requests. Some services require it to look
|
||||
like a common browser.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'url_shortener_user_agent',
|
||||
is_admin => 1,
|
||||
default => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
=head1 ACKNOWLEDGEMENTS
|
||||
|
||||
Original DecodeShortURLs plugin was developed by Steve Freegard.
|
||||
|
||||
=cut
|
||||
|
||||
sub short_url_tests {
|
||||
# Legacy compatibility warning done in finish_parsing_start
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub finish_parsing_start {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
if ($opts->{conf}->{eval_to_rule}->{short_url_tests}) {
|
||||
warn "DecodeShortURLs: Legacy configuration format detected. ".
|
||||
"Eval function short_url_tests() is no longer supported, ".
|
||||
"please see documentation for the new rule format.\n";
|
||||
}
|
||||
}
|
||||
|
||||
sub initialise_url_shortener_cache {
|
||||
my ($self, $conf) = @_;
|
||||
|
||||
return if $self->{dbh};
|
||||
return if !$conf->{url_shortener_cache_type};
|
||||
|
||||
if (!$conf->{url_shortener_cache_dsn}) {
|
||||
warn "DecodeShortURLs: invalid cache configuration\n";
|
||||
return;
|
||||
}
|
||||
|
||||
##
|
||||
## SQLite
|
||||
##
|
||||
if ($conf->{url_shortener_cache_type} =~ /^(?:dbi|sqlite)$/i
|
||||
&& $conf->{url_shortener_cache_dsn} =~ /^dbi:SQLite/)
|
||||
{
|
||||
eval {
|
||||
local $SIG{'__DIE__'};
|
||||
require DBI;
|
||||
require DBD::SQLite;
|
||||
DBD::SQLite->VERSION(1.59_01); # Required for ON CONFLICT
|
||||
$self->{dbh} = DBI->connect_cached(
|
||||
$conf->{url_shortener_cache_dsn}, '', '',
|
||||
{RaiseError => 1, PrintError => 0, InactiveDestroy => 1, AutoCommit => 1}
|
||||
);
|
||||
$self->{dbh}->do("
|
||||
CREATE TABLE IF NOT EXISTS short_url_cache (
|
||||
short_url TEXT PRIMARY KEY NOT NULL,
|
||||
decoded_url TEXT NOT NULL,
|
||||
hits INTEGER NOT NULL DEFAULT 1,
|
||||
created INTEGER NOT NULL,
|
||||
modified INTEGER NOT NULL
|
||||
)
|
||||
");
|
||||
# Maintaining index for cleaning is likely more expensive than occasional full table scan
|
||||
#$self->{dbh}->do("
|
||||
# CREATE INDEX IF NOT EXISTS short_url_modified
|
||||
# ON short_url_cache(created)
|
||||
#");
|
||||
$self->{sth_insert} = $self->{dbh}->prepare("
|
||||
INSERT INTO short_url_cache (short_url, decoded_url, created, modified)
|
||||
VALUES (?,?,strftime('%s','now'),strftime('%s','now'))
|
||||
ON CONFLICT(short_url) DO UPDATE
|
||||
SET decoded_url = excluded.decoded_url,
|
||||
modified = excluded.modified,
|
||||
hits = hits + 1
|
||||
");
|
||||
$self->{sth_select} = $self->{dbh}->prepare("
|
||||
SELECT decoded_url FROM short_url_cache
|
||||
WHERE short_url = ?
|
||||
");
|
||||
$self->{sth_delete} = $self->{dbh}->prepare("
|
||||
DELETE FROM short_url_cache
|
||||
WHERE short_url = ? AND created < strftime('%s','now') - $conf->{url_shortener_cache_ttl}
|
||||
");
|
||||
$self->{sth_clean} = $self->{dbh}->prepare("
|
||||
DELETE FROM short_url_cache
|
||||
WHERE created < strftime('%s','now') - $conf->{url_shortener_cache_ttl}
|
||||
");
|
||||
};
|
||||
}
|
||||
##
|
||||
## MySQL/MariaDB
|
||||
##
|
||||
elsif (lc $conf->{url_shortener_cache_type} eq 'dbi'
|
||||
&& $conf->{url_shortener_cache_dsn} =~ /^dbi:(?:mysql|MariaDB)/i)
|
||||
{
|
||||
eval {
|
||||
local $SIG{'__DIE__'};
|
||||
require DBI;
|
||||
$self->{dbh} = DBI->connect_cached(
|
||||
$conf->{url_shortener_cache_dsn},
|
||||
$conf->{url_shortener_cache_username},
|
||||
$conf->{url_shortener_cache_password},
|
||||
{RaiseError => 1, PrintError => 0, InactiveDestroy => 1, AutoCommit => 1}
|
||||
);
|
||||
$self->{sth_insert} = $self->{dbh}->prepare("
|
||||
INSERT INTO short_url_cache (short_url, decoded_url, created, modified)
|
||||
VALUES (?,?,UNIX_TIMESTAMP(),UNIX_TIMESTAMP())
|
||||
ON DUPLICATE KEY UPDATE
|
||||
decoded_url = VALUES(decoded_url),
|
||||
modified = VALUES(modified),
|
||||
hits = hits + 1
|
||||
");
|
||||
$self->{sth_select} = $self->{dbh}->prepare("
|
||||
SELECT decoded_url FROM short_url_cache
|
||||
WHERE short_url = ?
|
||||
");
|
||||
$self->{sth_delete} = $self->{dbh}->prepare("
|
||||
DELETE FROM short_url_cache
|
||||
WHERE short_url = ? AND created < UNIX_TIMESTAMP() - $conf->{url_shortener_cache_ttl}
|
||||
");
|
||||
$self->{sth_clean} = $self->{dbh}->prepare("
|
||||
DELETE FROM short_url_cache
|
||||
WHERE created < UNIX_TIMESTAMP() - $conf->{url_shortener_cache_ttl}
|
||||
");
|
||||
};
|
||||
}
|
||||
##
|
||||
## PostgreSQL
|
||||
##
|
||||
elsif (lc $conf->{url_shortener_cache_type} eq 'dbi'
|
||||
&& $conf->{url_shortener_cache_dsn} =~ /^dbi:Pg/i)
|
||||
{
|
||||
eval {
|
||||
local $SIG{'__DIE__'};
|
||||
require DBI;
|
||||
$self->{dbh} = DBI->connect_cached(
|
||||
$conf->{url_shortener_cache_dsn},
|
||||
$conf->{url_shortener_cache_username},
|
||||
$conf->{url_shortener_cache_password},
|
||||
{RaiseError => 1, PrintError => 0, InactiveDestroy => 1, AutoCommit => 1}
|
||||
);
|
||||
$self->{sth_insert} = $self->{dbh}->prepare("
|
||||
INSERT INTO short_url_cache (short_url, decoded_url, created, modified)
|
||||
VALUES (?,?,CAST(EXTRACT(epoch FROM NOW()) AS INT),CAST(EXTRACT(epoch FROM NOW()) AS INT))
|
||||
ON CONFLICT (short_url) DO UPDATE SET
|
||||
decoded_url = EXCLUDED.decoded_url,
|
||||
modified = EXCLUDED.modified,
|
||||
hits = short_url_cache.hits + 1
|
||||
");
|
||||
$self->{sth_select} = $self->{dbh}->prepare("
|
||||
SELECT decoded_url FROM short_url_cache
|
||||
WHERE short_url = ?
|
||||
");
|
||||
$self->{sth_delete} = $self->{dbh}->prepare("
|
||||
DELETE FROM short_url_cache
|
||||
WHERE short_url ? = AND created < CAST(EXTRACT(epoch FROM NOW()) AS INT) - $conf->{url_shortener_cache_ttl}
|
||||
");
|
||||
$self->{sth_clean} = $self->{dbh}->prepare("
|
||||
DELETE FROM short_url_cache
|
||||
WHERE created < CAST(EXTRACT(epoch FROM NOW()) AS INT) - $conf->{url_shortener_cache_ttl}
|
||||
");
|
||||
};
|
||||
##
|
||||
## ...
|
||||
##
|
||||
} else {
|
||||
warn "DecodeShortURLs: invalid cache configuration\n";
|
||||
return;
|
||||
}
|
||||
|
||||
if ($@ || !$self->{sth_clean}) {
|
||||
warn "DecodeShortURLs: cache connect failed: $@\n";
|
||||
undef $self->{dbh};
|
||||
undef $self->{sth_insert};
|
||||
undef $self->{sth_select};
|
||||
undef $self->{sth_delete};
|
||||
undef $self->{sth_clean};
|
||||
}
|
||||
}
|
||||
|
||||
sub short_url {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_redir {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url_redir} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_200 {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url_200} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_404 {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url_404} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_code {
|
||||
my ($self, $pms, undef, $code) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return 0 unless defined $code && $code =~ /^\d{3}$/;
|
||||
return $pms->{"short_url_$code"} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_chained {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url_chained} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_maxchain {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url_maxchain} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub short_url_loop {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
# Make sure checks are run
|
||||
$self->_check_short($pms);
|
||||
|
||||
return $pms->{short_url_loop} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub _check_shortener_uri {
|
||||
my ($uri, $conf) = @_;
|
||||
|
||||
local($1,$2);
|
||||
return 0 unless $uri =~ m{^
|
||||
https?:// # Only http
|
||||
(?:[^\@/?#]*\@)? # Ignore user:pass@
|
||||
([^/?#:]+) # (Capture hostname)
|
||||
(?::\d+)? # Possible port
|
||||
(.*?\w)? # Some path wanted
|
||||
}ix;
|
||||
my $host = lc $1;
|
||||
my $has_path = defined $2;
|
||||
my $levels = $host =~ tr/.//;
|
||||
# No point looking at single level "xxx.yy" without a path
|
||||
return if $levels == 1 && !$has_path;
|
||||
if (exists $conf->{url_shortener}->{$host}) {
|
||||
return {
|
||||
'uri' => $uri,
|
||||
'method' => $conf->{url_shortener}->{$host} == 1 ? 'head' : 'get',
|
||||
};
|
||||
}
|
||||
# if domain is a 3rd level domain check if there is a url shortener
|
||||
# on the 2nd level tld
|
||||
elsif ($levels == 2 && $host =~ /^(?!www)[^.]+(\.[^.]+\.[^.]+)$/i &&
|
||||
exists $conf->{url_shortener}->{$1}) {
|
||||
return {
|
||||
'uri' => $uri,
|
||||
'method' => $conf->{url_shortener}->{$1} == 1 ? 'head' : 'get',
|
||||
};
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
sub check_dnsbl {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
$self->_check_short($opts->{permsgstatus});
|
||||
}
|
||||
|
||||
sub _check_short {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return if $pms->{short_url_checked}++;
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
# Sort short URLs into hash to de-dup them
|
||||
my %short_urls;
|
||||
my $uris = $pms->get_uri_detail_list();
|
||||
while (my($uri, $info) = each %{$uris}) {
|
||||
next unless $info->{domains} && $info->{cleaned};
|
||||
if (my $short_url_info = _check_shortener_uri($uri, $conf)) {
|
||||
$short_urls{$uri} = $short_url_info;
|
||||
last if scalar keys %short_urls >= $conf->{max_short_urls};
|
||||
}
|
||||
}
|
||||
|
||||
# Bail out if no shortener was found
|
||||
return unless %short_urls;
|
||||
|
||||
# Mark that a URL shortener was found
|
||||
$pms->{short_url} = 1;
|
||||
|
||||
# Bail out if network lookups not enabled or max_short_urls 0
|
||||
return if $self->{net_disabled};
|
||||
return if !$conf->{max_short_urls};
|
||||
|
||||
# Initialize cache
|
||||
$self->initialise_url_shortener_cache($conf);
|
||||
|
||||
# Initialize LWP
|
||||
my $ua = LWP::UserAgent->new(
|
||||
'agent' => $conf->{url_shortener_user_agent},
|
||||
'max_redirect' => 0,
|
||||
'timeout' => $conf->{url_shortener_timeout},
|
||||
);
|
||||
$ua->env_proxy;
|
||||
|
||||
# Launch HTTP requests
|
||||
foreach my $uri (keys %short_urls) {
|
||||
$self->recursive_lookup($short_urls{$uri}, $pms, $ua);
|
||||
}
|
||||
|
||||
# Automatically purge old entries
|
||||
if ($self->{dbh} && $conf->{url_shortener_cache_autoclean}
|
||||
&& rand() < 1/$conf->{url_shortener_cache_autoclean})
|
||||
{
|
||||
dbg("cleaning stale cache entries");
|
||||
eval { $self->{sth_clean}->execute(); };
|
||||
if ($@) { dbg("cache cleaning failed: $@"); }
|
||||
}
|
||||
}
|
||||
|
||||
sub recursive_lookup {
|
||||
my ($self, $short_url_info, $pms, $ua, %been_here) = @_;
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
my $count = scalar keys %been_here;
|
||||
dbg("redirection count $count") if $count;
|
||||
if ($count >= $conf->{max_short_url_redirections}) {
|
||||
dbg("found more than $conf->{max_short_url_redirections} shortener redirections");
|
||||
# Fire test
|
||||
$pms->{short_url_maxchain} = 1;
|
||||
return;
|
||||
}
|
||||
|
||||
my $short_url = $short_url_info->{uri};
|
||||
my $location;
|
||||
if (defined($location = $self->cache_get($short_url))) {
|
||||
if ($conf->{url_shortener_loginfo}) {
|
||||
info("found cached $short_url => $location");
|
||||
} else {
|
||||
dbg("found cached $short_url => $location");
|
||||
}
|
||||
# Cached http code?
|
||||
if ($location =~ /^\d{3}$/) {
|
||||
$pms->{"short_url_$location"} = 1;
|
||||
# Update cache
|
||||
$self->cache_add($short_url, $location);
|
||||
return;
|
||||
}
|
||||
} else {
|
||||
# Not cached; do lookup
|
||||
my $method = $short_url_info->{method};
|
||||
my $response = $ua->$method($short_url);
|
||||
if (!$response->is_redirect) {
|
||||
dbg("URL is not redirect: $short_url = ".$response->status_line);
|
||||
my $rcode = $response->code;
|
||||
if ($rcode =~ /^\d{3}$/) {
|
||||
$pms->{"short_url_$rcode"} = 1;
|
||||
# Update cache
|
||||
$self->cache_add($short_url, $rcode);
|
||||
}
|
||||
return;
|
||||
}
|
||||
$location = $response->headers->{location};
|
||||
if ($self->{url_shortener_loginfo}) {
|
||||
info("found $short_url => $location");
|
||||
} else {
|
||||
dbg("found $short_url => $location");
|
||||
}
|
||||
}
|
||||
|
||||
# Update cache
|
||||
$self->cache_add($short_url, $location);
|
||||
|
||||
# Bail out if $short_url redirects to itself
|
||||
if ($short_url eq $location) {
|
||||
dbg("URL is redirect to itself");
|
||||
return;
|
||||
}
|
||||
|
||||
# At this point we have a valid redirection and new URL in $response
|
||||
$pms->{short_url_redir} = 1;
|
||||
|
||||
# Set chained here otherwise we might mark a disabled page or
|
||||
# redirect back to the same host as chaining incorrectly.
|
||||
$pms->{short_url_chained} = 1 if $count;
|
||||
|
||||
# Check if we are being redirected to a local page
|
||||
# Don't recurse in this case...
|
||||
if ($location !~ m{^[a-z]+://}i) {
|
||||
my $orig_location = $location;
|
||||
my $orig_short_url = $short_url;
|
||||
# Strip to..
|
||||
if (index($location, '/') == 0) {
|
||||
$short_url =~ s{^([a-z]+://.*?)[/?#].*}{$1}; # ..absolute path
|
||||
} else {
|
||||
$short_url =~ s{^([a-z]+://.*)/}{$1}; # ..relative path
|
||||
}
|
||||
$location = "$short_url/$location";
|
||||
dbg("looks like a local redirection: $orig_short_url => $location ($orig_location)");
|
||||
$pms->add_uri_detail_list($location) if !$pms->{uri_detail_list}->{$location};
|
||||
return;
|
||||
}
|
||||
|
||||
if (exists $been_here{$location}) {
|
||||
# Loop detected
|
||||
dbg("error: loop detected: $location");
|
||||
$pms->{short_url_loop} = 1;
|
||||
return;
|
||||
}
|
||||
$been_here{$location} = 1;
|
||||
$pms->add_uri_detail_list($location) if !$pms->{uri_detail_list}->{$location};
|
||||
|
||||
# Check for recursion
|
||||
if (my $short_url_info = _check_shortener_uri($location, $conf)) {
|
||||
# Recurse...
|
||||
$self->recursive_lookup($short_url_info, $pms, $ua, %been_here);
|
||||
}
|
||||
}
|
||||
|
||||
sub cache_add {
|
||||
my ($self, $short_url, $decoded_url) = @_;
|
||||
|
||||
return if !$self->{dbh};
|
||||
return if length($short_url) > 256 || length($decoded_url) > 512;
|
||||
|
||||
# Upsert
|
||||
eval { $self->{sth_insert}->execute($short_url, $decoded_url); };
|
||||
if ($@) {
|
||||
dbg("could not add to cache: $@");
|
||||
}
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
sub cache_get {
|
||||
my ($self, $key) = @_;
|
||||
|
||||
return if !$self->{dbh};
|
||||
|
||||
# Make sure expired entries are gone. Just a quick check for primary key,
|
||||
# not that expensive.
|
||||
eval { $self->{sth_delete}->execute($key); };
|
||||
if ($@) {
|
||||
dbg("cache delete failed: $@");
|
||||
return;
|
||||
}
|
||||
|
||||
# Now try to get it (don't bother parsing if something was deleted above,
|
||||
# it would be rare event anyway)
|
||||
eval { $self->{sth_select}->execute($key); };
|
||||
if ($@) {
|
||||
dbg("cache get failed: $@");
|
||||
return;
|
||||
}
|
||||
|
||||
my @row = $self->{sth_select}->fetchrow_array();
|
||||
if (@row) {
|
||||
return $row[0];
|
||||
}
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_short_url { 1 }
|
||||
sub has_autoclean { 1 }
|
||||
sub has_short_url_code { 1 }
|
||||
sub has_user_agent { 1 } # url_shortener_user_agent
|
||||
sub has_get { 1 } # url_shortener_get
|
||||
sub has_clear { 1 } # clear_url_shortener
|
||||
sub has_timeout { 1 } # url_shortener_timeout
|
||||
sub has_max_redirections { 1 } # max_short_url_redirections
|
||||
# short_url() will always hit if matching url_shortener was found, even
|
||||
# without HTTP requests. To check if a valid HTTP redirection response was
|
||||
# seen, use short_url_redir().
|
||||
sub has_short_url_redir { 1 }
|
||||
|
||||
1;
|
713
upstream/lib/Mail/SpamAssassin/Plugin/ExtractText.pm
Normal file
713
upstream/lib/Mail/SpamAssassin/Plugin/ExtractText.pm
Normal file
@ -0,0 +1,713 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
# Authors: Jonas Eckerman, Dave Wreski, Giovanni Bechis
|
||||
|
||||
=head1 NAME
|
||||
|
||||
ExtractText - extracts text from documenmts.
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::ExtractText
|
||||
|
||||
ifplugin Mail::SpamAssassin::Plugin::ExtractText
|
||||
|
||||
extracttext_external pdftotext /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
|
||||
extracttext_use pdftotext .pdf application/pdf
|
||||
|
||||
# http://docx2txt.sourceforge.net
|
||||
extracttext_external docx2txt /usr/bin/docx2txt {} -
|
||||
extracttext_use docx2txt .docx application/docx
|
||||
|
||||
extracttext_external antiword /usr/bin/antiword -t -w 0 -m UTF-8.txt {}
|
||||
extracttext_use antiword .doc application/(?:vnd\.?)?ms-?word.*
|
||||
|
||||
extracttext_external unrtf /usr/bin/unrtf --nopict {}
|
||||
extracttext_use unrtf .doc .rtf application/rtf text/rtf
|
||||
|
||||
extracttext_external odt2txt /usr/bin/odt2txt --encoding=UTF-8 {}
|
||||
extracttext_use odt2txt .odt .ott application/.*?opendocument.*text
|
||||
extracttext_use odt2txt .sdw .stw application/(?:x-)?soffice application/(?:x-)?starwriter
|
||||
|
||||
extracttext_external tesseract {OMP_THREAD_LIMIT=1} /usr/bin/tesseract -c page_separator= {} -
|
||||
extracttext_use tesseract .jpg .png .bmp .tif .tiff image/(?:jpeg|png|x-ms-bmp|tiff)
|
||||
|
||||
add_header all ExtractText-Flags _EXTRACTTEXTFLAGS_
|
||||
header PDF_NO_TEXT X-ExtractText-Flags =~ /\bpdftotext_NoText\b/
|
||||
describe PDF_NO_TEXT PDF without text
|
||||
score PDF_NO_TEXT 0.001
|
||||
|
||||
header DOC_NO_TEXT X-ExtractText-Flags =~ /\b(?:antiword|openxml|unrtf|odt2txt)_NoText\b/
|
||||
describe DOC_NO_TEXT Document without text
|
||||
score DOC_NO_TEXT 0.001
|
||||
|
||||
header EXTRACTTEXT exists:X-ExtractText-Flags
|
||||
describe EXTRACTTEXT Email processed by extracttext plugin
|
||||
score EXTRACTTEXT 0.001
|
||||
|
||||
endif
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This module uses external tools to extract text from message parts,
|
||||
and then sets the text as the rendered part. External tool must output
|
||||
plain text, not HTML or other non-textual result.
|
||||
|
||||
How to extract text is completely configurable, and based on
|
||||
MIME part type and file name.
|
||||
|
||||
=head1 CONFIGURATION
|
||||
|
||||
All configuration lines in user_prefs files will be ignored.
|
||||
|
||||
=over 4
|
||||
|
||||
=item extracttext_maxparts (default: 10)
|
||||
|
||||
Configure the maximum mime parts number to analyze, a value of 0 means all mime parts
|
||||
will be analyzed
|
||||
|
||||
=item extracttext_timeout (default: 5 10)
|
||||
|
||||
Configure the timeout in seconds of external tool checks, per attachment.
|
||||
|
||||
Second argument speficies maximum total time for all checks.
|
||||
|
||||
=back
|
||||
|
||||
=head2 Tools
|
||||
|
||||
=over
|
||||
|
||||
=item extracttext_use
|
||||
|
||||
Specifies what tool to use for what message parts.
|
||||
|
||||
The general syntax is
|
||||
|
||||
extracttext_use C<name> C<specifiers>
|
||||
|
||||
=back
|
||||
|
||||
=over
|
||||
|
||||
=item name
|
||||
|
||||
the internal name of a tool.
|
||||
|
||||
=item specifiers
|
||||
|
||||
File extension and regular expressions for file names and MIME
|
||||
types. The regular expressions are anchored to beginning and end.
|
||||
|
||||
=back
|
||||
|
||||
=head3 Examples
|
||||
|
||||
extracttext_use antiword .doc application/(?:vnd\.?)?ms-?word.*
|
||||
extracttext_use openxml .docx .dotx .dotm application/(?:vnd\.?)openxml.*?word.*
|
||||
extracttext_use openxml .doc .dot application/(?:vnd\.?)?ms-?word.*
|
||||
extracttext_use unrtf .doc .rtf application/rtf text/rtf
|
||||
|
||||
=over
|
||||
|
||||
=item extracttext_external
|
||||
|
||||
Defines an external tool. The tool must read a document on standard input
|
||||
or from a file and write text to standard output.
|
||||
|
||||
The special keyword "{}" will be substituted at runtime with the temporary
|
||||
filename to be scanned by the external tool.
|
||||
|
||||
Environment variables can be defined with "{KEY=VALUE}", these strings will
|
||||
be removed from commandline.
|
||||
|
||||
It is required that commandline used outputs result directly to STDOUT.
|
||||
|
||||
The general syntax is
|
||||
|
||||
extracttext_external C<name> C<command> C<parameters>
|
||||
|
||||
=back
|
||||
|
||||
=over
|
||||
|
||||
=item name
|
||||
|
||||
The internal name of this tool.
|
||||
|
||||
=item command
|
||||
|
||||
The full path to the external command to run.
|
||||
|
||||
=item parameters
|
||||
|
||||
Parameters for the external command. The temporary file name containing
|
||||
the document will be automatically added as last parameter.
|
||||
|
||||
=back
|
||||
|
||||
=head3 Examples
|
||||
|
||||
extracttext_external antiword /usr/bin/antiword -t -w 0 -m UTF-8.txt {} -
|
||||
extracttext_external unrtf /usr/bin/unrtf --nopict {}
|
||||
extracttext_external odt2txt /usr/bin/odt2txt --encoding=UTF-8 {}
|
||||
|
||||
=head2 Metadata
|
||||
|
||||
The plugin adds some pseudo headers to the message. These headers are seen by
|
||||
the bayes system, and can be used in normal SpamAssassin rules.
|
||||
|
||||
The headers are also available as template tags as noted below.
|
||||
|
||||
=head3 Example
|
||||
|
||||
The fictional example headers below are based on a message containing this:
|
||||
|
||||
=over
|
||||
|
||||
=item 1
|
||||
A perfectly normal PDF.
|
||||
|
||||
=item 2
|
||||
An OpenXML document with a word document inside.
|
||||
Neither Office document contains text.
|
||||
|
||||
=back
|
||||
|
||||
=head3 Headers
|
||||
|
||||
=over
|
||||
|
||||
=item X-ExtractText-Chars
|
||||
|
||||
Tag: _EXTRACTTEXTCHARS_
|
||||
|
||||
Contains a count of characters that were extracted.
|
||||
|
||||
X-ExtractText-Chars: 10970
|
||||
|
||||
=item X-ExtractText-Words
|
||||
|
||||
Tag: _EXTRACTTEXTWORDS_
|
||||
|
||||
Contains a count of "words" that were extracted.
|
||||
|
||||
X-ExtractText-Chars: 1599
|
||||
|
||||
=item X-ExtractText-Tools
|
||||
|
||||
Tag: _EXTRACTTEXTTOOLS_
|
||||
|
||||
Contains chains of tools used for extraction.
|
||||
|
||||
X-ExtractText-Tools: pdftotext openxml_antiword
|
||||
|
||||
=item X-ExtractText-Types
|
||||
|
||||
Tag: _EXTRACTTEXTTYPES_
|
||||
|
||||
Contains chains of MIME types for parts found during extraction.
|
||||
|
||||
X-ExtractText-Types: application/pdf; application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/ms-word
|
||||
|
||||
=item X-ExtractText-Extensions
|
||||
|
||||
Tag: _EXTRACTTEXTEXTENSIONS_
|
||||
|
||||
Contains chains of canonicalized file extensions for parts
|
||||
found during extraction.
|
||||
|
||||
X-ExtractText-Extensions: pdf docx
|
||||
|
||||
=item X-ExtractText-Flags
|
||||
|
||||
Tag: _EXTRACTTEXTFLAGS_
|
||||
|
||||
Contains notes from the plugin.
|
||||
|
||||
X-ExtractText-Flags: openxml_NoText
|
||||
|
||||
=back
|
||||
|
||||
=head3 Rules
|
||||
|
||||
Example:
|
||||
|
||||
header PDF_NO_TEXT X-ExtractText-Flags =~ /\bpdftotext_Notext\b/
|
||||
describe PDF_NO_TEXT PDF without text
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::ExtractText;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
my $VERSION = 0.001;
|
||||
|
||||
use File::Basename;
|
||||
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw (compile_regexp untaint_var untaint_file_path
|
||||
proc_status_ok exit_status_str);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsa);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->{match} = [];
|
||||
$self->{tools} = {};
|
||||
$self->{magic} = 0;
|
||||
|
||||
$self->register_method_priority('post_message_parse', -1);
|
||||
$self->set_config($mailsa->{conf});
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'extracttext_maxparts',
|
||||
default => 10,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'extracttext_timeout',
|
||||
default => 5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
local ($1,$2);
|
||||
unless ($value =~ /^(\d+)(?:\s+(\d+))?$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{extracttext_timeout} = $1;
|
||||
$self->{extracttext_timeout_total} = $2;
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub parse_config {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
# Ignore users's configuration lines
|
||||
return 0 if $opts->{user_config};
|
||||
|
||||
if ($opts->{key} eq 'extracttext_use') {
|
||||
$self->inhibit_further_callbacks();
|
||||
# Temporary kludge to notify users. Double backslashes have zero benefit for this plugin config.
|
||||
if ($opts->{value} =~ s/\\\\/\\/g) {
|
||||
warn "extracttext: DOUBLE BACKSLASHES DEPRECATED, change config to single backslashes, autoconverted for backward compatibility: $opts->{key} $opts->{value}\n";
|
||||
}
|
||||
if ($opts->{value} =~ /(?:to|2)html\b/) {
|
||||
warn "extracttext: HTML tools are not supported, plain text output is required. Please remove: $opts->{key} $opts->{value}\n";
|
||||
return 1;
|
||||
}
|
||||
my @vals = split(/\s+/, $opts->{value});
|
||||
my $tool = lc(shift @vals);
|
||||
return 0 unless @vals;
|
||||
foreach my $what (@vals) {
|
||||
my $where;
|
||||
if (index($what, '/') >= 0) {
|
||||
$where = 'type';
|
||||
} else {
|
||||
$where = 'name';
|
||||
if ($what =~ /^\.[a-zA-Z0-9]+$/) {
|
||||
$what = ".*\\$what";
|
||||
}
|
||||
}
|
||||
my ($rec, $err) = compile_regexp('^(?i)'.$what.'$', 0);
|
||||
if (!$rec) {
|
||||
warn("invalid regexp '$what': $err\n");
|
||||
return 0;
|
||||
}
|
||||
push @{$self->{match}}, {where=>$where, what=>$rec, tool=>$tool};
|
||||
dbg('extracttext: use: %s %s %s', $tool, $where, $what);
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
if ($opts->{key} eq 'extracttext_external') {
|
||||
$self->inhibit_further_callbacks();
|
||||
# Temporary kludge to notify users. Double backslashes have zero benefit for this plugin config.
|
||||
if ($opts->{value} =~ s/\\\\/\\/g) {
|
||||
warn "extracttext: DOUBLE BACKSLASHES DEPRECATED, change config to single backslashes, autoconverted for backward compatibility: $opts->{key} $opts->{value}\n";
|
||||
}
|
||||
if ($opts->{value} =~ /(?:to|2)html\b/) {
|
||||
warn "extracttext: HTML tools are not supported, plain text output is required. Please remove: $opts->{key} $opts->{value}\n";
|
||||
return 1;
|
||||
}
|
||||
my %env;
|
||||
while ($opts->{value} =~ s/\{(.+?)\}/ /g) {
|
||||
my ($k,$v) = split(/=/, $1, 2);
|
||||
$env{$k} = defined $v ? $v : '';
|
||||
}
|
||||
my @vals = split(/\s+/, $opts->{value});
|
||||
my $name = lc(shift @vals);
|
||||
return 0 unless @vals > 1;
|
||||
if ($self->{tools}->{$name}) {
|
||||
warn "extracttext: duplicate tool defined: $name\n";
|
||||
return 0;
|
||||
}
|
||||
#unless (-x $vals[0]) {
|
||||
# warn "extracttext: missing tool: $name ($vals[0])\n";
|
||||
# return 0;
|
||||
#}
|
||||
$self->{tools}->{$name} = {
|
||||
'name' => $name,
|
||||
'type' => 'external',
|
||||
'env' => \%env,
|
||||
'cmd' => \@vals,
|
||||
};
|
||||
dbg('extracttext: external: %s "%s"', $name, join('","', @vals));
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Extract 'text' via running an external command.
|
||||
sub _extract_external {
|
||||
my ($self, $object, $tool) = @_;
|
||||
|
||||
my ($errno, $pipe_errno, $tmp_file, $err_file, $pid);
|
||||
my $resp = '';
|
||||
my @cmd = @{$tool->{cmd}};
|
||||
|
||||
Mail::SpamAssassin::PerMsgStatus::enter_helper_run_mode($self);
|
||||
|
||||
# Set environment variables
|
||||
foreach (keys %{$tool->{env}}) {
|
||||
$ENV{$_} = $tool->{env}{$_};
|
||||
}
|
||||
|
||||
my $timer = Mail::SpamAssassin::Timeout->new(
|
||||
{ secs => $self->{main}->{conf}->{extracttext_timeout},
|
||||
deadline => $self->{'master_deadline'} });
|
||||
|
||||
my $err = $timer->run_and_catch(sub {
|
||||
local $SIG{PIPE} = sub { die "__brokenpipe__ignore__\n" };
|
||||
|
||||
($tmp_file, my $tmp_fh) = Mail::SpamAssassin::Util::secure_tmpfile();
|
||||
$tmp_file or die "failed to create a temporary file";
|
||||
print $tmp_fh ${$object->{data}};
|
||||
close($tmp_fh);
|
||||
|
||||
($err_file, my $err_fh) = Mail::SpamAssassin::Util::secure_tmpfile();
|
||||
$err_file or die "failed to create a temporary file";
|
||||
close($err_fh);
|
||||
$err_file = untaint_file_path($err_file);
|
||||
|
||||
foreach (@cmd) {
|
||||
# substitute "{}" with the temporary file name to pass to the external software
|
||||
s/\{\}/$tmp_file/;
|
||||
$_ = untaint_var($_);
|
||||
}
|
||||
|
||||
$pid = Mail::SpamAssassin::Util::helper_app_pipe_open(*EXTRACT, undef, ">$err_file", @cmd);
|
||||
$pid or die "$!\n";
|
||||
|
||||
# read+split avoids a Perl I/O bug (Bug 5985)
|
||||
my($inbuf, $nread);
|
||||
|
||||
while ($nread = read(EXTRACT, $inbuf, 8192)) { $resp .= $inbuf }
|
||||
defined $nread or die "error reading from pipe: $!";
|
||||
|
||||
$errno = 0;
|
||||
close EXTRACT or $errno = $!;
|
||||
|
||||
if (proc_status_ok($?, $errno)) {
|
||||
dbg("extracttext: [%s] (%s) finished successfully", $pid, $cmd[0]);
|
||||
} elsif (proc_status_ok($?, $errno, 0, 1)) { # sometimes it exits with 1
|
||||
dbg("extracttext: [%s] (%s) finished: %s", $pid, $cmd[0], exit_status_str($?, $errno));
|
||||
} else {
|
||||
info("extracttext: [%s] (%s) error: %s", $pid, $cmd[0], exit_status_str($?, $errno));
|
||||
}
|
||||
# Save return status for later
|
||||
$pipe_errno = $?;
|
||||
});
|
||||
|
||||
if (defined(fileno(*EXTRACT))) { # still open
|
||||
if ($pid) {
|
||||
if (kill('TERM', $pid)) {
|
||||
dbg("extracttext: killed stale helper [$pid] ($cmd[0])");
|
||||
} else {
|
||||
dbg("extracttext: killing helper application [$pid] ($cmd[0]) failed: $!");
|
||||
}
|
||||
}
|
||||
$errno = 0;
|
||||
close EXTRACT or $errno = $!;
|
||||
proc_status_ok($?, $errno)
|
||||
or info("extracttext: [%s] (%s) error: %s", $pid, $cmd[0], exit_status_str($?, $errno));
|
||||
}
|
||||
|
||||
Mail::SpamAssassin::PerMsgStatus::leave_helper_run_mode($self);
|
||||
unlink($tmp_file);
|
||||
# Read first line from STDERR
|
||||
my $err_resp = -s $err_file ?
|
||||
do { open(ERRF, $err_file); $_ = <ERRF>; close(ERRF); chomp; $_; } : '';
|
||||
unlink($err_file);
|
||||
|
||||
if ($err_resp ne '') {
|
||||
dbg("extracttext: [$pid] ($cmd[0]) stderr output: $err_resp");
|
||||
}
|
||||
|
||||
# If the output starts with the command that has been run it's
|
||||
# probably an error message
|
||||
if ($pipe_errno) {
|
||||
if ($err_resp =~ /\b(?:Usage:|No such file or directory)/) {
|
||||
warn "extracttext: error from $cmd[0], please verify configuration: $err_resp\n";
|
||||
}
|
||||
elsif ($err_resp =~ /^Syntax (?:Warning|Error): (?:May not be a PDF file|Couldn't find trailer dictionary)/) {
|
||||
# Ignore pdftotext
|
||||
}
|
||||
elsif ($err_resp =~ /^Error in (?:findFileFormatStream|fopenReadStream): (?:truncated file|file not found)/) {
|
||||
# Ignore tesseract
|
||||
}
|
||||
elsif ($err_resp =~ /^libpng error:/) {
|
||||
# Ignore tesseract
|
||||
}
|
||||
elsif ($err_resp =~ /^Corrupt JPEG data:/) {
|
||||
# Ignore tesseract
|
||||
}
|
||||
elsif ($err_resp =~ /^\S+ is not a Word Document/) {
|
||||
# Ignore antiword
|
||||
}
|
||||
elsif (!$resp) {
|
||||
warn "extracttext: error (".($pipe_errno/256).") from $cmd[0]: $err_resp\n";
|
||||
}
|
||||
return (0, $resp);
|
||||
}
|
||||
return (1, $resp);
|
||||
}
|
||||
|
||||
sub _extract_object {
|
||||
my ($self, $object, $tool) = @_;
|
||||
my ($ok, $text);
|
||||
|
||||
if ($tool->{type} eq 'external') {
|
||||
($ok, $text) = $self->_extract_external($object, $tool);
|
||||
} else {
|
||||
warn "extracttext: bad tool type: $tool->{type}\n";
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 0 unless $ok;
|
||||
|
||||
if ($text =~ /^[\s\r\n]*$/s) {
|
||||
$text = '';
|
||||
} else {
|
||||
# Remove not important html elements
|
||||
#$text =~ s/(?=<!DOCTYPE)([\s\S]*?)>//g;
|
||||
#$text =~ s/(?=<!--)([\s\S]*?)-->//g;
|
||||
}
|
||||
|
||||
if ($text eq '') {
|
||||
dbg('extracttext: No text extracted');
|
||||
}
|
||||
|
||||
$text = untaint_var($text);
|
||||
utf8::encode($text) if utf8::is_utf8($text);
|
||||
|
||||
return (1, $text);
|
||||
}
|
||||
|
||||
sub _get_extension {
|
||||
my ($self, $object) = @_;
|
||||
my $fext;
|
||||
if ($object->{name} && $object->{name} =~ /\.([^.\\\/]+)$/) {
|
||||
$fext = $1;
|
||||
}
|
||||
elsif ($object->{file} && $object->{file} =~ /\.([^.\\\/]+)$/) {
|
||||
$fext = $1;
|
||||
}
|
||||
return $fext ? ($fext) : ();
|
||||
}
|
||||
|
||||
sub _extract {
|
||||
my ($self, $coll, $part, $type, $name, $data, $tool) = @_;
|
||||
my $object = {
|
||||
'data' => $data,
|
||||
'type' => $type,
|
||||
'name' => $name
|
||||
};
|
||||
my @fexts;
|
||||
my @types;
|
||||
|
||||
my @tools = ($tool->{name});
|
||||
my ($ok, $text) = $self->_extract_object($object,$tool);
|
||||
|
||||
# when url+text, script never returns to this point from _extract_object above
|
||||
#
|
||||
return 0 unless $ok;
|
||||
if ($text ne '' && would_log('dbg','extracttext') > 1) {
|
||||
dbg("extracttext: text extracted:\n$text");
|
||||
}
|
||||
|
||||
push @{$coll->{text}}, $text;
|
||||
push @types, $type;
|
||||
push @fexts, $self->_get_extension($object);
|
||||
if ($text eq '') {
|
||||
push @{$coll->{flags}}, 'NoText';
|
||||
push @{$coll->{text}}, 'NoText';
|
||||
} else {
|
||||
if ($text =~ /<a(?:\s+[^>]+)?\s+href="([^">]*)"/) {
|
||||
push @{$coll->{flags}}, 'ActionURI';
|
||||
dbg("extracttext: ActionURI: $1");
|
||||
push @{$coll->{text}}, $text;
|
||||
}
|
||||
if ($text =~ /NoText/) {
|
||||
push @{$coll->{flags}},'NoText';
|
||||
dbg("extracttext: NoText");
|
||||
push @{$coll->{text}}, $text;
|
||||
}
|
||||
$coll->{chars} += length($text);
|
||||
|
||||
# the following is safe (regarding clobbering the @_) since perl v5.11.0
|
||||
$coll->{words} += split(/\W+/s,$text) - 1;
|
||||
# $coll->{words} += scalar @{[split(/\W+/s,$text)]} - 1; # old perl hack
|
||||
|
||||
dbg("extracttext: rendering text for type $type with $tool->{name}");
|
||||
$part->set_rendered($text);
|
||||
}
|
||||
|
||||
if (@types) {
|
||||
push @{$coll->{types}}, join(', ', @types);
|
||||
}
|
||||
if (@fexts) {
|
||||
push @{$coll->{extensions}}, join('_', @fexts);
|
||||
}
|
||||
push @{$coll->{tools}}, join('_', @tools);
|
||||
return 1;
|
||||
}
|
||||
|
||||
#
|
||||
# check attachment type and match with the right tool
|
||||
#
|
||||
sub _check_extract {
|
||||
my ($self, $coll, $checked, $part, $decoded, $data, $type, $name) = @_;
|
||||
return 0 unless (defined $type || defined $name);
|
||||
foreach my $match (@{$self->{match}}) {
|
||||
next unless $self->{tools}->{$match->{tool}};
|
||||
next if $checked->{$match->{tool}};
|
||||
|
||||
if ($match->{where} eq 'name') {
|
||||
next unless (defined $name && $name =~ $match->{what});
|
||||
} elsif ($match->{where} eq 'type') {
|
||||
next unless (defined $type && $type =~ $match->{what});
|
||||
} else {
|
||||
next;
|
||||
}
|
||||
$checked->{$match->{tool}} = 1;
|
||||
# dbg("extracttext: coll: $coll, part: $part, type: $type, name: $name, data: $data, tool: $self->{tools}->{$match->{tool}}");
|
||||
return 1 if $self->_extract($coll,$part,$type,$name,$data,$self->{tools}->{$match->{tool}});
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub post_message_parse {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $timer = $self->{main}->time_method("extracttext");
|
||||
|
||||
my $msg = $opts->{'message'};
|
||||
$self->{'master_deadline'} = $msg->{'master_deadline'};
|
||||
my $starttime = time;
|
||||
|
||||
my %collect = (
|
||||
'tools' => [],
|
||||
'types' => [],
|
||||
'extensions' => [],
|
||||
'flags' => [],
|
||||
'chars' => 0,
|
||||
'words' => 0,
|
||||
'text' => [],
|
||||
);
|
||||
|
||||
my $conf = $self->{main}->{conf};
|
||||
my $maxparts = $conf->{extracttext_maxparts};
|
||||
my $ttimeout = $conf->{extracttext_timeout_total} ||
|
||||
$conf->{extracttext_timeout} > 10 ? $conf->{extracttext_timeout} : 10;
|
||||
my $nparts = 0;
|
||||
foreach my $part ($msg->find_parts(qr/./, 1)) {
|
||||
next unless $part->is_leaf;
|
||||
if ($maxparts > 0 && ++$nparts > $maxparts) {
|
||||
dbg("extracttext: Skipping MIME parts exceeding the ${maxparts}th");
|
||||
last;
|
||||
}
|
||||
if (time - $starttime >= $ttimeout) {
|
||||
dbg("extracttext: Skipping MIME parts, total execution timeout exceeded");
|
||||
last;
|
||||
}
|
||||
my (undef,$rtd) = $part->rendered;
|
||||
next if defined $rtd;
|
||||
my %checked = ();
|
||||
my $dat = $part->decode();
|
||||
my $typ = $part->{type};
|
||||
my $nam = $part->{name};
|
||||
my $dec = 1;
|
||||
next if $self->_check_extract(\%collect,\%checked,$part,\$dec,\$dat,$typ,$nam);
|
||||
}
|
||||
|
||||
return 1 unless @{$collect{tools}};
|
||||
|
||||
my @uniq_tools = do { my %seen; grep { !$seen{$_}++ } @{$collect{tools}} };
|
||||
my @uniq_types = do { my %seen; grep { !$seen{$_}++ } @{$collect{types}} };
|
||||
my @uniq_ext = do { my %seen; grep { !$seen{$_}++ } @{$collect{extensions}} };
|
||||
my @uniq_flags = do { my %seen; grep { !$seen{$_}++ } @{$collect{flags}} };
|
||||
|
||||
$msg->put_metadata('X-ExtractText-Words', $collect{words});
|
||||
$msg->put_metadata('X-ExtractText-Chars', $collect{chars});
|
||||
$msg->put_metadata('X-ExtractText-Tools', join(' ', @uniq_tools));
|
||||
$msg->put_metadata('X-ExtractText-Types', join(' ', @uniq_types));
|
||||
$msg->put_metadata('X-ExtractText-Extensions', join(' ', @uniq_ext));
|
||||
$msg->put_metadata('X-ExtractText-Flags', join(' ', @uniq_flags));
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $msg = $pms->get_message();
|
||||
foreach my $tag (('Words','Chars','Tools','Types','Extensions','Flags')) {
|
||||
my $v = $msg->get_metadata("X-ExtractText-$tag");
|
||||
if (defined $v) {
|
||||
$pms->set_tag("ExtractText$tag", $v);
|
||||
dbg("extracttext: tag: $tag $v");
|
||||
}
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
1;
|
@ -20,7 +20,7 @@ use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
my $VERSION = 2.003;
|
||||
my $VERSION = 4.000;
|
||||
|
||||
=head1 NAME
|
||||
|
||||
@ -48,19 +48,21 @@ freemail_domains domain ...
|
||||
For example:
|
||||
freemail_domains hotmail.com hotmail.co.?? yahoo.* yahoo.*.*
|
||||
|
||||
freemail_whitelist email/domain ...
|
||||
freemail_welcomelist email/domain ...
|
||||
|
||||
Previously freemail_whitelist which will work interchangeably until 4.1.
|
||||
|
||||
Emails or domains listed here are ignored (pretend they aren't
|
||||
freemail). No wildcards!
|
||||
|
||||
freemail_import_whitelist_auth 1/0
|
||||
freemail_import_welcomelist_auth 1/0
|
||||
|
||||
Entries in whitelist_auth will also be used to whitelist emails
|
||||
Entries in welcomelist_auth will also be used to welcomelist emails
|
||||
or domains from being freemail. Default is 0.
|
||||
|
||||
freemail_import_def_whitelist_auth 1/0
|
||||
freemail_import_def_welcomelist_auth 1/0
|
||||
|
||||
Entries in def_whitelist_auth will also be used to whitelist emails
|
||||
Entries in def_welcomelist_auth will also be used to welcomelist emails
|
||||
or domains from being freemail. Default is 0.
|
||||
|
||||
header FREEMAIL_REPLYTO eval:check_freemail_replyto(['option'])
|
||||
@ -96,17 +98,6 @@ header FREEMAIL_BODY eval:check_freemail_body(['regex'])
|
||||
|
||||
Searches body for freemail address. With optional regex to match.
|
||||
|
||||
=head1 CHANGELOG
|
||||
|
||||
1.996 - fix freemail_skip_bulk_envfrom
|
||||
1.997 - set freemail_skip_when_over_max to 1 by default
|
||||
1.998 - don't warn about missing freemail_domains when linting
|
||||
1.999 - default whitelist undisclosed-recipient@yahoo.com etc
|
||||
2.000 - some cleaning up
|
||||
2.001 - fix freemail_whitelist
|
||||
2.002 - _add_desc -> _got_hit, fix description email append bug
|
||||
2.003 - freemail_import_(def_)whitelist_auth
|
||||
|
||||
=cut
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
@ -115,8 +106,8 @@ use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
# default email whitelist
|
||||
our $email_whitelist = qr/
|
||||
# default email welcomelist
|
||||
our $email_welcomelist = qr/
|
||||
^(?:
|
||||
abuse|support|sales|info|helpdesk|contact|kontakt
|
||||
| (?:post|host|domain)master
|
||||
@ -142,7 +133,7 @@ our $skip_replyto_envfrom = qr/
|
||||
)\@
|
||||
/xi;
|
||||
|
||||
sub dbg { Mail::SpamAssassin::Plugin::dbg ("FreeMail: @_"); }
|
||||
sub dbg { my $msg = shift; Mail::SpamAssassin::Plugin::dbg("FreeMail: $msg", @_); }
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
@ -153,10 +144,10 @@ sub new {
|
||||
|
||||
$self->{freemail_available} = 1;
|
||||
$self->set_config($mailsa->{conf});
|
||||
$self->register_eval_rule("check_freemail_replyto");
|
||||
$self->register_eval_rule("check_freemail_from");
|
||||
$self->register_eval_rule("check_freemail_header");
|
||||
$self->register_eval_rule("check_freemail_body");
|
||||
$self->register_eval_rule("check_freemail_replyto", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_freemail_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_freemail_header", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_freemail_body", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -215,13 +206,15 @@ sub set_config {
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_import_whitelist_auth',
|
||||
setting => 'freemail_import_welcomelist_auth',
|
||||
aliases => ['freemail_import_whitelist_auth'], # removed in 4.1
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
push(@cmds, {
|
||||
setting => 'freemail_import_def_whitelist_auth',
|
||||
setting => 'freemail_import_def_welcomelist_auth',
|
||||
aliases => ['freemail_import_def_whitelist_auth'], # removed in 4.1
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
@ -234,9 +227,9 @@ sub parse_config {
|
||||
|
||||
if ($opts->{key} eq "freemail_domains") {
|
||||
foreach my $temp (split(/\s+/, $opts->{value})) {
|
||||
if ($temp =~ /^[a-z0-9.*?-]+$/i) {
|
||||
if ($temp !~ tr/a-zA-Z0-9.*?-//c) {
|
||||
my $value = lc($temp);
|
||||
if ($value =~ /[*?]/) { # separate wildcard list
|
||||
if ($value =~ tr/*?//) { # separate wildcard list
|
||||
$self->{freemail_temp_wc}{$value} = 1;
|
||||
}
|
||||
else {
|
||||
@ -244,21 +237,21 @@ sub parse_config {
|
||||
}
|
||||
}
|
||||
else {
|
||||
warn("invalid freemail_domains: $temp");
|
||||
warn("freemail: invalid freemail_domains: $temp\n");
|
||||
}
|
||||
}
|
||||
$self->inhibit_further_callbacks();
|
||||
return 1;
|
||||
}
|
||||
|
||||
if ($opts->{key} eq "freemail_whitelist") {
|
||||
if ($opts->{key} eq "freemail_welcomelist" || $opts->{key} eq "freemail_whitelist") {
|
||||
foreach my $temp (split(/\s+/, $opts->{value})) {
|
||||
my $value = lc($temp);
|
||||
if ($value =~ /\w[.@]\w/) {
|
||||
$self->{freemail_whitelist}{$value} = 1;
|
||||
$self->{freemail_welcomelist}{$value} = 1;
|
||||
}
|
||||
else {
|
||||
warn("invalid freemail_whitelist: $temp");
|
||||
warn("freemail: invalid freemail_welcomelist: $temp\n");
|
||||
}
|
||||
}
|
||||
$self->inhibit_further_callbacks();
|
||||
@ -292,12 +285,7 @@ sub finish_parsing_end {
|
||||
dbg("loaded freemail_domains entries: $count normal, $wcount wildcard");
|
||||
}
|
||||
else {
|
||||
if ($self->{main}->{lint_rules} ||1) {
|
||||
dbg("no freemail_domains entries defined, disabling plugin");
|
||||
}
|
||||
else {
|
||||
warn("no freemail_domains entries defined, disabling plugin");
|
||||
}
|
||||
$self->{freemail_available} = 0;
|
||||
}
|
||||
|
||||
@ -315,29 +303,29 @@ sub _is_freemail {
|
||||
|
||||
return 0 if $email eq '';
|
||||
|
||||
if (defined $self->{freemail_whitelist}{$email}) {
|
||||
dbg("whitelisted email: $email");
|
||||
if (defined $self->{freemail_welcomelist}{$email}) {
|
||||
dbg("welcomelisted email: $email");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $domain = $email;
|
||||
$domain =~ s/.*\@//;
|
||||
|
||||
if (defined $self->{freemail_whitelist}{$domain}) {
|
||||
dbg("whitelisted domain: $domain");
|
||||
if (defined $self->{freemail_welcomelist}{$domain}) {
|
||||
dbg("welcomelisted domain: $domain");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($email =~ $email_whitelist) {
|
||||
dbg("whitelisted email, default: $email");
|
||||
if ($email =~ $email_welcomelist) {
|
||||
dbg("welcomelisted email, default: $email");
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach my $list ('whitelist_auth','def_whitelist_auth') {
|
||||
foreach my $list ('welcomelist_auth','def_welcomelist_auth') {
|
||||
if ($pms->{conf}->{"freemail_import_$list"}) {
|
||||
foreach my $regexp (values %{$pms->{conf}->{$list}}) {
|
||||
if ($email =~ /$regexp/o) {
|
||||
dbg("whitelisted email, $list: $email");
|
||||
dbg("welcomelisted email, $list: $email");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
@ -422,21 +410,13 @@ sub _parse_body {
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub _got_hit {
|
||||
my ($self, $pms, $email, $desc) = @_;
|
||||
sub _test_log {
|
||||
my ($self, $pms, $email, $rulename) = @_;
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
|
||||
if (defined $pms->{conf}->{descriptions}->{$rulename}) {
|
||||
$desc = $pms->{conf}->{descriptions}->{$rulename};
|
||||
if ($pms->{conf}->{freemail_add_describe_email}) {
|
||||
$email =~ s/\@/(at)/g;
|
||||
$pms->test_log($email, $rulename);
|
||||
}
|
||||
|
||||
if ($pms->{main}->{conf}->{freemail_add_describe_email}) {
|
||||
$email =~ s/\@/[at]/g;
|
||||
$pms->test_log($email);
|
||||
}
|
||||
|
||||
$pms->got_hit($rulename, "", description => $desc, ruletype => 'eval');
|
||||
}
|
||||
|
||||
sub check_freemail_header {
|
||||
@ -448,7 +428,7 @@ sub check_freemail_header {
|
||||
dbg("RULE ($rulename) check_freemail_header".(defined $regex ? " regex:$regex" : ""));
|
||||
|
||||
unless (defined $header) {
|
||||
warn("check_freemail_header needs argument");
|
||||
warn("freemail: check_freemail_header needs argument\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -462,13 +442,13 @@ sub check_freemail_header {
|
||||
$re = $rec;
|
||||
}
|
||||
|
||||
my @emails = map (lc, $pms->{main}->find_all_addrs_in_line ($pms->get($header)));
|
||||
my @emails = map (lc, $pms->get("$header:addr"));
|
||||
|
||||
if (!scalar (@emails)) {
|
||||
dbg("header $header not found from mail");
|
||||
return 0;
|
||||
}
|
||||
dbg("addresses from header $header: ".join(';',@emails));
|
||||
dbg("addresses from header $header: ".join(', ', @emails));
|
||||
|
||||
foreach my $email (@emails) {
|
||||
if ($self->_is_freemail($email, $pms)) {
|
||||
@ -479,7 +459,7 @@ sub check_freemail_header {
|
||||
else {
|
||||
dbg("HIT! $email is freemail");
|
||||
}
|
||||
$self->_got_hit($pms, $email, "Header $header is freemail");
|
||||
$self->_test_log($pms, $email, $rulename);
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
@ -511,16 +491,16 @@ sub check_freemail_body {
|
||||
foreach my $email (keys %{$pms->{freemail_cache}{body}}) {
|
||||
if ($email =~ /$re/o) {
|
||||
dbg("HIT! email from body is freemail and matches regex: $email");
|
||||
$self->_got_hit($pms, $email, "Email from body is freemail");
|
||||
return 0;
|
||||
$self->_test_log($pms, $email, $rulename);
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
elsif (scalar keys %{$pms->{freemail_cache}{body}}) {
|
||||
my $emails = join(', ', keys %{$pms->{freemail_cache}{body}});
|
||||
dbg("HIT! body has freemails: $emails");
|
||||
$self->_got_hit($pms, $emails, "Body contains freemails");
|
||||
return 0;
|
||||
$self->_test_log($pms, $emails, $rulename);
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
@ -563,8 +543,8 @@ sub check_freemail_from {
|
||||
else {
|
||||
dbg("HIT! $email is freemail");
|
||||
}
|
||||
$self->_got_hit($pms, $email, "Sender address is freemail");
|
||||
return 0;
|
||||
$self->_test_log($pms, $email, $rulename);
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
@ -580,7 +560,7 @@ sub check_freemail_replyto {
|
||||
|
||||
if (defined $what) {
|
||||
if ($what ne 'replyto' and $what ne 'reply') {
|
||||
warn("invalid check_freemail_replyto option: $what");
|
||||
warn("freemail: invalid check_freemail_replyto option: $what\n");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
@ -590,25 +570,34 @@ sub check_freemail_replyto {
|
||||
|
||||
# Skip mailing-list etc looking requests, mostly FPs from them
|
||||
if ($pms->{main}->{conf}->{freemail_skip_bulk_envfrom}) {
|
||||
my $envfrom = lc($pms->get("EnvelopeFrom"));
|
||||
if ($envfrom =~ $skip_replyto_envfrom) {
|
||||
my $envfrom = ($pms->get("EnvelopeFrom"))[0];
|
||||
if (defined $envfrom && $envfrom =~ $skip_replyto_envfrom) {
|
||||
dbg("envelope sender looks bulk, skipping check: $envfrom");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
my $from = lc($pms->get("From:addr"));
|
||||
my $replyto = lc($pms->get("Reply-To:addr"));
|
||||
my $from_is_fm = $self->_is_freemail($from, $pms);
|
||||
my $replyto_is_fm = $self->_is_freemail($replyto, $pms);
|
||||
my @from_addrs = map (lc, $pms->get("From:addr"));
|
||||
dbg("From address: ".join(", ", @from_addrs)) if @from_addrs;
|
||||
|
||||
dbg("From address: $from") if $from ne '';
|
||||
dbg("Reply-To address: $replyto") if $replyto ne '';
|
||||
my @replyto_addrs = map (lc, $pms->get("Reply-To:addr"));
|
||||
dbg("Reply-To address: ".join(", ", @replyto_addrs)) if @replyto_addrs;
|
||||
|
||||
if ($from_is_fm and $replyto_is_fm and ($from ne $replyto)) {
|
||||
my $from_is_fm = grep { $self->_is_freemail($_, $pms) } @from_addrs;
|
||||
my $replyto_is_fm = grep { $self->_is_freemail($_, $pms) } @replyto_addrs;
|
||||
|
||||
my $from_not_in_replyto = 1;
|
||||
foreach my $from (@from_addrs) {
|
||||
next unless grep { $_ eq $from } @replyto_addrs;
|
||||
$from_not_in_replyto = 0;
|
||||
}
|
||||
|
||||
if ($from_is_fm and $replyto_is_fm and $from_not_in_replyto) {
|
||||
dbg("HIT! From and Reply-To are different freemails");
|
||||
$self->_got_hit($pms, "$from, $replyto", "From and Reply-To are different freemails");
|
||||
return 0;
|
||||
my $from = join(",", @from_addrs);
|
||||
my $replyto = join(",", @replyto_addrs);
|
||||
$self->_test_log($pms, "$from -> $replyto", $rulename);
|
||||
return 1;
|
||||
}
|
||||
|
||||
if ($what eq 'replyto') {
|
||||
@ -618,7 +607,7 @@ sub check_freemail_replyto {
|
||||
}
|
||||
}
|
||||
elsif ($what eq 'reply') {
|
||||
if ($replyto ne '' and !$replyto_is_fm) {
|
||||
if (@replyto_addrs and !$replyto_is_fm) {
|
||||
dbg("Reply-To defined and is not freemail, skipping check");
|
||||
return 0;
|
||||
}
|
||||
@ -627,19 +616,21 @@ sub check_freemail_replyto {
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
my $reply = $replyto_is_fm ? $replyto : $from;
|
||||
|
||||
return 0 unless $self->_parse_body($pms);
|
||||
|
||||
# Compare body to headers
|
||||
if (scalar keys %{$pms->{freemail_cache}{body}}) {
|
||||
my $check = $what eq 'replyto' ? $replyto : $reply;
|
||||
dbg("comparing $check to body freemails");
|
||||
foreach my $email (keys %{$pms->{freemail_cache}{body}}) {
|
||||
if ($email ne $check) {
|
||||
dbg("HIT! $check and $email are different freemails");
|
||||
$self->_got_hit($pms, "$check, $email", "Different freemails in reply header and body");
|
||||
return 0;
|
||||
my $reply_addrs = $what eq 'replyto' ? \@replyto_addrs :
|
||||
$replyto_is_fm ? \@replyto_addrs : \@from_addrs;
|
||||
dbg("comparing to body freemails: ".join(", ", @$reply_addrs));
|
||||
foreach my $body_email (keys %{$pms->{freemail_cache}{body}}) {
|
||||
foreach my $reply_email (@$reply_addrs) {
|
||||
if ($body_email ne $reply_email) {
|
||||
dbg("HIT! $reply_email (Reply) and $body_email (Body) are different freemails");
|
||||
$self->_test_log($pms, "$reply_email, $body_email", $rulename);
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -17,29 +17,33 @@
|
||||
|
||||
=head1 NAME
|
||||
|
||||
FromNameSpoof - perform various tests to detect spoof attempts using the From header name section
|
||||
FromNameSpoof - perform various tests to detect spoof attempts using the
|
||||
From header name section
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::FromNameSpoof
|
||||
|
||||
# Does the From:name look like it contains an email address
|
||||
# From:name and From:addr do not match, matching depends on C<fns_check> setting
|
||||
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
|
||||
|
||||
# From:name and From:addr do not match (same as above rule and C<fns_check 0>)
|
||||
header __PLUGIN_FROMNAME_DIFFERENT eval:check_fromname_different()
|
||||
|
||||
# From:name and From:addr domains differ
|
||||
header __PLUGIN_FROMNAME_DOMAIN_DIFFER eval:check_fromname_domain_differ()
|
||||
|
||||
# From:name looks like it contains an email address (not same as From:addr)
|
||||
header __PLUGIN_FROMNAME_EMAIL eval:check_fromname_contains_email()
|
||||
|
||||
# Is the From:name different to the From:addr header
|
||||
header __PLUGIN_FROMNAME_DIFFERENT eval:check_fromname_different()
|
||||
# From:name matches any To:addr
|
||||
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
|
||||
|
||||
# From:name and From:addr owners differ
|
||||
header __PLUGIN_FROMNAME_OWNERS_DIFFER eval:check_fromname_owners_differ()
|
||||
|
||||
# From:name domain differs to from header
|
||||
header __PLUGIN_FROMNAME_DOMAIN_DIFFER eval:check_fromname_domain_differ()
|
||||
|
||||
# From:name and From:address don't match and owners differ
|
||||
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
|
||||
|
||||
# From:name address matches To:address
|
||||
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
|
||||
# From:name matches Reply-To:addr
|
||||
header __PLUGIN_FROMNAME_EQUALS_REPLYTO eval:check_fromname_equals_replyto()
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
@ -61,17 +65,21 @@ B<Note> that FromNameSpoof detects the "owner" of a domain by the following sear
|
||||
|
||||
<owner>.<tld>
|
||||
|
||||
By default FromNameSpoof will ignore the TLD when testing if From:addr is spoofed.
|
||||
Default 1
|
||||
By default FromNameSpoof will ignore the TLD when comparing addresses:
|
||||
|
||||
fns_check 1
|
||||
|
||||
Check levels:
|
||||
|
||||
0 - Strict checking of From:name != From:addr
|
||||
1 - Allow for different tlds
|
||||
1 - Allow for different TLDs
|
||||
2 - Allow for different aliases but same domain
|
||||
|
||||
"Owner" info can also be mapped as aliases with C<fns_add_addrlist>. For
|
||||
example, to consider "googlemail.com" as "gmail":
|
||||
|
||||
fns_add_addrlist (gmail) *@googlemail.com
|
||||
|
||||
=head1 TAGS
|
||||
|
||||
The following tags are added to the set if a spoof is detected. They are available for
|
||||
@ -93,48 +101,35 @@ use in reports, header fields, other plugins, etc.:
|
||||
Actual From:addr domain
|
||||
|
||||
_FNSFADDROWNER_
|
||||
Actual From:addr detected owner
|
||||
Actual From:addr owner
|
||||
|
||||
=head1 EXAMPLE
|
||||
|
||||
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
|
||||
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
|
||||
|
||||
meta FROMNAME_SPOOF_EQUALS_TO (__PLUGIN_FROMNAME_SPOOF && __PLUGIN_FROMNAME_EQUALS_TO)
|
||||
describe FROMNAME_SPOOF_EQUALS_TO From:name is spoof to look like To: address
|
||||
score FROMNAME_SPOOF_EQUALS_TO 1.2
|
||||
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
|
||||
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
|
||||
meta FROMNAME_SPOOF_EQUALS_TO (__PLUGIN_FROMNAME_SPOOF && __PLUGIN_FROMNAME_EQUALS_TO)
|
||||
describe FROMNAME_SPOOF_EQUALS_TO From:name is spoof to look like To: address
|
||||
score FROMNAME_SPOOF_EQUALS_TO 1.2
|
||||
|
||||
=cut
|
||||
|
||||
use strict;
|
||||
|
||||
package Mail::SpamAssassin::Plugin::FromNameSpoof;
|
||||
my $VERSION = 0.9;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use List::Util ();
|
||||
use Mail::SpamAssassin::Util;
|
||||
|
||||
use vars qw(@ISA);
|
||||
@ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub dbg { Mail::SpamAssassin::Plugin::dbg ("FromNameSpoof: @_"); }
|
||||
my $VERSION = 1.0;
|
||||
|
||||
sub uri_to_domain {
|
||||
my ($self, $domain) = @_;
|
||||
|
||||
return unless defined $domain;
|
||||
|
||||
if ($Mail::SpamAssassin::VERSION <= 3.004000) {
|
||||
Mail::SpamAssassin::Util::uri_to_domain($domain);
|
||||
} else {
|
||||
$self->{main}->{registryboundaries}->uri_to_domain($domain);
|
||||
}
|
||||
}
|
||||
sub dbg { my $msg = shift; Mail::SpamAssassin::Plugin::dbg("FromNameSpoof: $msg", @_); }
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new
|
||||
{
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
@ -146,13 +141,13 @@ sub new
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_fromname_spoof");
|
||||
$self->register_eval_rule("check_fromname_different");
|
||||
$self->register_eval_rule("check_fromname_domain_differ");
|
||||
$self->register_eval_rule("check_fromname_contains_email");
|
||||
$self->register_eval_rule("check_fromname_equals_to");
|
||||
$self->register_eval_rule("check_fromname_owners_differ");
|
||||
$self->register_eval_rule("check_fromname_equals_replyto");
|
||||
$self->register_eval_rule("check_fromname_spoof", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_fromname_different", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_fromname_domain_differ", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_fromname_contains_email", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_fromname_equals_to", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_fromname_owners_differ", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_fromname_equals_replyto", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
return $self;
|
||||
}
|
||||
|
||||
@ -164,14 +159,13 @@ sub set_config {
|
||||
setting => 'fns_add_addrlist',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my($self, $key, $value, $line) = @_;
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
local($1,$2);
|
||||
if ($value !~ /^ \( (.*?) \) \s+ (.*) \z/sx) {
|
||||
if ($value !~ /^ \( (.+?) \) \s+ (.+) \z/sx) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my $listname = "FNS_$1";
|
||||
$value = $2;
|
||||
$self->{parser}->add_to_addrlist ($listname, split(/\s+/, lc($value)));
|
||||
my $listname = "FNS_".lc($1);
|
||||
$self->{parser}->add_to_addrlist($listname, split(/\s+/, lc $2));
|
||||
$self->{fns_addrlists}{$listname} = 1;
|
||||
}
|
||||
});
|
||||
@ -180,14 +174,13 @@ sub set_config {
|
||||
setting => 'fns_remove_addrlist',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my($self, $key, $value, $line) = @_;
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
local($1,$2);
|
||||
if ($value !~ /^ \( (.*?) \) \s+ (.*) \z/sx) {
|
||||
if ($value !~ /^ \( (.+?) \) \s+ (.+) \z/sx) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
my $listname = "FNS_$1";
|
||||
$value = $2;
|
||||
$self->{parser}->remove_from_addrlist ($listname, split (/\s+/, $value));
|
||||
my $listname = "FNS_".lc($1);
|
||||
$self->{parser}->remove_from_addrlist($listname, split (/\s+/, lc $2));
|
||||
}
|
||||
});
|
||||
|
||||
@ -206,7 +199,7 @@ sub set_config {
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
$self->{fns_ignore_dkim}->{$_} = 1 foreach (split(/\s+/, lc($value)));
|
||||
$self->{fns_ignore_dkim}->{$_} = 1 foreach (split(/\s+/, lc $value));
|
||||
}
|
||||
});
|
||||
|
||||
@ -227,6 +220,16 @@ sub set_config {
|
||||
setting => 'fns_check',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
if ($value !~ /^[012]$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{fns_check} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
@ -235,203 +238,287 @@ sub set_config {
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
$pms->action_depends_on_tags('DKIMDOMAIN',
|
||||
sub { my($pms,@args) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
|
||||
# If fns_ignore_dkim used, force wait for DKIM results
|
||||
if (%{$pms->{conf}->{fns_ignore_dkim}}) {
|
||||
if ($self->{main}->{local_tests_only}) {
|
||||
dbg("local tests only, ignoring fns_ignore_dkim setting");
|
||||
}
|
||||
# Check that DKIM module is loaded (a bit kludgy check)
|
||||
elsif (exists $pms->{conf}->{dkim_timeout}) {
|
||||
# Initialize async queue, any eval calls will queue their checks
|
||||
$pms->{fromname_async_queue} = [];
|
||||
# Process and finish queue as soon as DKIM is ready
|
||||
$pms->action_depends_on_tags('DKIMDOMAIN', sub {
|
||||
$self->_check_async_queue($pms);
|
||||
});
|
||||
} else {
|
||||
dbg("DKIM plugin not loaded, ignoring fns_ignore_dkim setting");
|
||||
}
|
||||
}
|
||||
);
|
||||
1;
|
||||
}
|
||||
|
||||
sub check_fromname_different
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
sub _check_eval {
|
||||
my ($self, $pms, $result) = @_;
|
||||
|
||||
if (exists $pms->{fromname_async_queue}) {
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
push @{$pms->{fromname_async_queue}}, sub {
|
||||
if ($result->()) {
|
||||
$pms->got_hit($rulename, '', ruletype => 'header');
|
||||
} else {
|
||||
$pms->rule_ready($rulename);
|
||||
}
|
||||
};
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_address_different};
|
||||
# make sure not to return undef, as this is not async anymore
|
||||
return $result->() || 0;
|
||||
}
|
||||
|
||||
sub check_fromname_domain_differ
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_domain_different};
|
||||
}
|
||||
|
||||
sub check_fromname_spoof
|
||||
{
|
||||
sub check_fromname_spoof {
|
||||
my ($self, $pms, $check_lvl) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
|
||||
if ( not defined $check_lvl ) {
|
||||
# Some deprecated eval parameter, was not documented?
|
||||
if (!defined $check_lvl || $check_lvl !~ /^[012]$/) {
|
||||
$check_lvl = $pms->{conf}->{fns_check};
|
||||
}
|
||||
|
||||
my $result = sub {
|
||||
my @array = (
|
||||
($pms->{fromname_address_different}) ,
|
||||
($pms->{fromname_address_different} && $pms->{fromname_owner_different}) ,
|
||||
($pms->{fromname_address_different}),
|
||||
($pms->{fromname_address_different} && $pms->{fromname_owner_different}),
|
||||
($pms->{fromname_address_different} && $pms->{fromname_domain_different})
|
||||
);
|
||||
$array[$check_lvl];
|
||||
};
|
||||
|
||||
return $array[$check_lvl];
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_fromname_contains_email
|
||||
{
|
||||
sub check_fromname_different {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $result = sub {
|
||||
$pms->{fromname_address_different};
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_fromname_domain_differ {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $result = sub {
|
||||
$pms->{fromname_domain_different};
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_fromname_contains_email {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $result = sub {
|
||||
$pms->{fromname_contains_email};
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_fromname_equals_to {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $result = sub {
|
||||
$pms->{fromname_equals_to_addr};
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_fromname_owners_differ {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $result = sub {
|
||||
$pms->{fromname_owner_different};
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_fromname_equals_replyto {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $result = sub {
|
||||
$pms->{fromname_equals_replyto};
|
||||
};
|
||||
|
||||
return $self->_check_eval($pms, $result);
|
||||
}
|
||||
|
||||
sub check_cleanup {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
$self->_check_async_queue($opts->{permsgstatus});
|
||||
}
|
||||
|
||||
# Shall only be called when DKIMDOMAIN is ready, or from check_cleanup() to
|
||||
# make sure _check_fromnamespoof is called if DKIMDOMAIN was never set
|
||||
sub _check_async_queue {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
if (exists $pms->{fromname_async_queue}) {
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_contains_email};
|
||||
$_->() foreach (@{$pms->{fromname_async_queue}});
|
||||
# No more async queueing needed. If any evals are called later, they
|
||||
# will act on the results directly.
|
||||
delete $pms->{fromname_async_queue};
|
||||
}
|
||||
}
|
||||
|
||||
sub check_fromname_equals_replyto
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_equals_replyto};
|
||||
}
|
||||
|
||||
sub check_fromname_equals_to
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_equals_to_addr};
|
||||
}
|
||||
|
||||
sub check_fromname_owners_differ
|
||||
{
|
||||
my ($self, $pms) = @_;
|
||||
$self->_check_fromnamespoof($pms);
|
||||
return $pms->{fromname_owner_different};
|
||||
}
|
||||
|
||||
sub _check_fromnamespoof
|
||||
{
|
||||
sub _check_fromnamespoof {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return if (defined $pms->{fromname_contains_email});
|
||||
return if $pms->{fromname_checked};
|
||||
$pms->{fromname_checked} = 1;
|
||||
|
||||
my $conf = $pms->{conf};
|
||||
|
||||
$pms->{fromname_contains_email} = 0;
|
||||
$pms->{fromname_address_different} = 0;
|
||||
$pms->{fromname_equals_to_addr} = 0;
|
||||
$pms->{fromname_domain_different} = 0;
|
||||
$pms->{fromname_owner_different} = 0;
|
||||
$pms->{fromname_equals_replyto} = 0;
|
||||
|
||||
foreach my $addr (split / /, $pms->get_tag('DKIMDOMAIN') || '') {
|
||||
if ($conf->{fns_ignore_dkim}->{lc($addr)}) {
|
||||
foreach my $addr (split(/\s+/, $pms->get_tag('DKIMDOMAIN')||'')) {
|
||||
if ($conf->{fns_ignore_dkim}->{lc $addr}) {
|
||||
dbg("ignoring, DKIM signed: $addr");
|
||||
return 0;
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
foreach my $iheader (keys %{$conf->{fns_ignore_header}}) {
|
||||
if ($pms->get($iheader)) {
|
||||
dbg("ignoring, header $iheader found");
|
||||
return 0 if ($pms->get($iheader));
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
# Parse From addr
|
||||
my $from_addr = lc $pms->get('From:addr');
|
||||
my $from_domain = $self->{main}->{registryboundaries}->uri_to_domain("mailto:$from_addr");
|
||||
return unless defined $from_domain;
|
||||
|
||||
# Parse From name
|
||||
my $fromname = lc $pms->get('From:name');
|
||||
# Very common to have From address cloned into name, ignore?
|
||||
#if ($fromname eq $from_addr) {
|
||||
# dbg("ignoring, From-name is exactly same as From addr: $fromname");
|
||||
# return;
|
||||
#}
|
||||
my ($fromname_addr, $fromname_domain);
|
||||
if ($fromname =~ /\b([\w\.\!\#\$\%\&\'\*\+\/\=\?\^\_\`\{\|\}\~-]+\@\w[\w-]*\.\w[\w.-]++)\b/i) {
|
||||
$fromname_addr = $1;
|
||||
$fromname_domain = $self->{main}->{registryboundaries}->uri_to_domain("mailto:$fromname_addr");
|
||||
# No valid domain/TLD found? Any reason to keep testing a possibly obfuscated one?
|
||||
if (!defined $fromname_domain) {
|
||||
dbg("no From-name addr found");
|
||||
return;
|
||||
}
|
||||
$pms->{fromname_contains_email} = 1; # check_fromname_contains_email hit
|
||||
# Calculate "closeness" (this really needs documentation, as it's hard to understand)
|
||||
my $nochar = ($fromname =~ y/a-z0-9//c);
|
||||
$nochar -= ($fromname_addr =~ y/a-z0-9//c);
|
||||
my $len = length($fromname) + $nochar - length($fromname_addr);
|
||||
unless ($len <= $conf->{fns_extrachars}) {
|
||||
dbg("not enough closeness for From-name/addr: $fromname <=> $fromname_addr ($len <= $conf->{fns_extrachars})");
|
||||
return;
|
||||
}
|
||||
} else {
|
||||
# No point continuing if email was not found inside name
|
||||
dbg("no From-name addr found");
|
||||
return;
|
||||
}
|
||||
|
||||
# Parse owners
|
||||
my $list_refs = {};
|
||||
|
||||
if ($conf->{fns_addrlists}) {
|
||||
my @lists = keys %{$conf->{fns_addrlists}};
|
||||
foreach my $list (@lists) {
|
||||
$list_refs->{$list} = $conf->{$list};
|
||||
}
|
||||
s/^FNS_// foreach (@lists);
|
||||
dbg("using addrlists: ".join(', ', @lists));
|
||||
dbg("using addrlists for owner aliases: ".join(', ', map { s/^FNS_//r; } @lists));
|
||||
}
|
||||
my $fromname_owner = $self->_find_address_owner($fromname_addr, $fromname_domain, $list_refs);
|
||||
my $from_owner = $self->_find_address_owner($from_addr, $from_domain, $list_refs);
|
||||
|
||||
my %fnd = ();
|
||||
my %fad = ();
|
||||
my %tod = ();
|
||||
dbg("Parsed From-name addr/domain/owner: $fromname_addr/$fromname_domain/$fromname_owner");
|
||||
dbg("Parsed From-addr addr/domain/owner: $from_addr/$from_domain/$from_owner");
|
||||
|
||||
$fnd{'addr'} = $pms->get("From:name");
|
||||
|
||||
if ($fnd{'addr'} =~ /\b((?>[\w\.\!\#\$\%\&\'\*\+\/\=\?\^\_\`\{\|\}\~\-]+@[\w\-\.]+\.[\w\-\.]+))\b/i) {
|
||||
my $nochar = ($fnd{'addr'} =~ y/A-Za-z0-9//c);
|
||||
$nochar -= ($1 =~ y/A-Za-z0-9//c);
|
||||
|
||||
return 0 unless ((length($fnd{'addr'})+$nochar) - length($1) <= $conf->{'fns_extrachars'});
|
||||
|
||||
$fnd{'addr'} = lc $1;
|
||||
} else {
|
||||
return 0;
|
||||
if ($fromname_addr ne $from_addr) {
|
||||
dbg("From-name addr differs from From addr: $fromname_addr != $from_addr");
|
||||
$pms->{fromname_address_different} = 1;
|
||||
}
|
||||
|
||||
my $replyto = lc $pms->get("Reply-To:addr");
|
||||
|
||||
$fad{'addr'} = lc $pms->get("From:addr");
|
||||
my @toaddrs = $pms->all_to_addrs();
|
||||
return 0 unless @toaddrs;
|
||||
|
||||
$tod{'addr'} = lc $toaddrs[0];
|
||||
|
||||
$fnd{'domain'} = $self->uri_to_domain($fnd{'addr'});
|
||||
$fad{'domain'} = $self->uri_to_domain($fad{'addr'});
|
||||
$tod{'domain'} = $self->uri_to_domain($tod{'addr'});
|
||||
|
||||
return 0 unless (defined $fnd{'domain'} && defined $fad{'domain'});
|
||||
|
||||
$pms->{fromname_contains_email} = 1;
|
||||
|
||||
$fnd{'owner'} = $self->_find_address_owner($fnd{'addr'}, $list_refs);
|
||||
|
||||
$fad{'owner'} = $self->_find_address_owner($fad{'addr'}, $list_refs);
|
||||
|
||||
$tod{'owner'} = $self->_find_address_owner($tod{'addr'}, $list_refs);
|
||||
|
||||
$pms->{fromname_address_different} = 1 if ($fnd{'addr'} ne $fad{'addr'});
|
||||
|
||||
$pms->{fromname_domain_different} = 1 if ($fnd{'domain'} ne $fad{'domain'});
|
||||
|
||||
$pms->{fromname_equals_to_addr} = 1 if ($fnd{'addr'} eq $tod{addr});
|
||||
|
||||
$pms->{fromname_equals_replyto} = 1 if ($fnd{'addr'} eq $replyto);
|
||||
|
||||
if ($fnd{'owner'} ne $fad{'owner'}) {
|
||||
if ($fromname_domain ne $from_domain) {
|
||||
dbg("From-name domain differs from From domain: $fromname_domain != $from_domain");
|
||||
$pms->{fromname_domain_different} = 1;
|
||||
}
|
||||
if ($fromname_owner ne $from_owner) {
|
||||
dbg("From-name owner differs from From owner: $fromname_owner != $from_owner");
|
||||
$pms->{fromname_owner_different} = 1;
|
||||
}
|
||||
|
||||
if ($pms->{fromname_address_different}) {
|
||||
$pms->set_tag("FNSFNAMEADDR", $fnd{'addr'});
|
||||
$pms->set_tag("FNSFADDRADDR", $fad{'addr'});
|
||||
$pms->set_tag("FNSFNAMEOWNER", $fnd{'owner'});
|
||||
$pms->set_tag("FNSFADDROWNER", $fad{'owner'});
|
||||
$pms->set_tag("FNSFNAMEDOMAIN", $fnd{'domain'});
|
||||
$pms->set_tag("FNSFADDRDOMAIN", $fad{'domain'});
|
||||
# Check Reply-To related
|
||||
my $replyto_addr = lc $pms->get('Reply-To:addr');
|
||||
if ($fromname_addr eq $replyto_addr) {
|
||||
dbg("From-name addr is same as Reply-To addr: $fromname_addr");
|
||||
$pms->{fromname_equals_replyto} = 1;
|
||||
}
|
||||
|
||||
dbg("From name spoof: $fnd{addr} $fnd{domain} $fnd{owner}");
|
||||
dbg("Actual From: $fad{addr} $fad{domain} $fad{owner}");
|
||||
dbg("To Address: $tod{addr} $tod{domain} $tod{owner}");
|
||||
# Check To related
|
||||
foreach my $to_addr ($pms->all_to_addrs()) {
|
||||
if ($fromname_addr eq $to_addr) {
|
||||
dbg("From-name addr is same as To addr: $fromname_addr");
|
||||
$pms->{fromname_equals_to_addr} = 1;
|
||||
last;
|
||||
}
|
||||
}
|
||||
|
||||
# Set tags
|
||||
if ($pms->{fromname_address_different} || $pms->{fromname_owner_different}) {
|
||||
$pms->set_tag("FNSFNAMEADDR", $fromname_addr);
|
||||
$pms->set_tag("FNSFNAMEDOMAIN", $fromname_domain);
|
||||
$pms->set_tag("FNSFNAMEOWNER", $fromname_owner);
|
||||
$pms->set_tag("FNSFADDRADDR", $from_addr);
|
||||
$pms->set_tag("FNSFADDRDOMAIN", $from_domain);
|
||||
$pms->set_tag("FNSFADDROWNER", $from_owner);
|
||||
}
|
||||
}
|
||||
|
||||
sub _find_address_owner
|
||||
{
|
||||
my ($self, $check, $list_refs) = @_;
|
||||
sub _find_address_owner {
|
||||
my ($self, $addr, $addr_domain, $list_refs) = @_;
|
||||
|
||||
# Check fns addrlist first for user defined mapping
|
||||
foreach my $owner (keys %{$list_refs}) {
|
||||
foreach my $white_addr (keys %{$list_refs->{$owner}}) {
|
||||
my $regexp = qr/$list_refs->{$owner}{$white_addr}/i;
|
||||
if ($check =~ /$regexp/) {
|
||||
$owner =~ s/^FNS_//i;
|
||||
foreach my $listaddr (keys %{$list_refs->{$owner}}) {
|
||||
if ($addr =~ $list_refs->{$owner}{$listaddr}) {
|
||||
$owner =~ s/^FNS_//;
|
||||
return lc $owner;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
my $owner = $self->uri_to_domain($check);
|
||||
|
||||
$check =~ /^([^\@]+)\@(.*)$/;
|
||||
|
||||
if ($owner ne $2) {
|
||||
return $self->_find_address_owner("$1\@$owner", $list_refs);
|
||||
# If we have subdomain addr foo.bar@sub.domain.com,
|
||||
# this will try to recheck foo.bar@domain.com from addrlist
|
||||
local($1,$2);
|
||||
if ($addr =~ /^([^\@]+)\@(.+)$/) {
|
||||
if ($2 ne $addr_domain) {
|
||||
return $self->_find_address_owner("$1\@$addr_domain", $addr_domain, $list_refs);
|
||||
}
|
||||
}
|
||||
|
||||
$owner =~ /^([^\.]+)\./;
|
||||
return lc $1;
|
||||
# Grab the first component of TLD
|
||||
if ($addr_domain =~ /^([^.]+)\./) {
|
||||
return $1;
|
||||
} else {
|
||||
return $addr_domain;
|
||||
}
|
||||
}
|
||||
|
||||
1;
|
||||
|
@ -39,18 +39,18 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("html_tag_balance");
|
||||
$self->register_eval_rule("html_image_only");
|
||||
$self->register_eval_rule("html_image_ratio");
|
||||
$self->register_eval_rule("html_charset_faraway");
|
||||
$self->register_eval_rule("html_tag_exists");
|
||||
$self->register_eval_rule("html_test");
|
||||
$self->register_eval_rule("html_eval");
|
||||
$self->register_eval_rule("html_text_match");
|
||||
$self->register_eval_rule("html_title_subject_ratio");
|
||||
$self->register_eval_rule("html_text_not_match");
|
||||
$self->register_eval_rule("html_range");
|
||||
$self->register_eval_rule("check_iframe_src");
|
||||
$self->register_eval_rule("html_tag_balance", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_image_only", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_image_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_charset_faraway", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_tag_exists", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_test", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_eval", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_text_match", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_title_subject_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_text_not_match", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("html_range", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_iframe_src", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -61,65 +61,88 @@ sub html_tag_balance {
|
||||
return 0 if $rawtag !~ /^([a-zA-Z0-9]+)$/;
|
||||
my $tag = $1;
|
||||
|
||||
return 0 unless exists $pms->{html}{inside}{$tag};
|
||||
|
||||
return 0 if $rawexpr !~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
my $expr = untaint_var($1);
|
||||
|
||||
$pms->{html}{inside}{$tag} =~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
next unless exists $html->{inside}{$tag};
|
||||
$html->{inside}{$tag} =~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
my $val = untaint_var($1);
|
||||
return 1 if eval "\$val $expr";
|
||||
}
|
||||
|
||||
return eval "\$val $expr";
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_image_only {
|
||||
my ($self, $pms, undef, $min, $max) = @_;
|
||||
|
||||
return (exists $pms->{html}{inside}{img} &&
|
||||
exists $pms->{html}{length} &&
|
||||
$pms->{html}{length} > $min &&
|
||||
$pms->{html}{length} <= $max);
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
if (exists $html->{inside}{img} && exists $html->{length} &&
|
||||
$html->{length} > $min && $html->{length} <= $max)
|
||||
{
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_image_ratio {
|
||||
my ($self, $pms, undef, $min, $max) = @_;
|
||||
|
||||
return 0 unless (exists $pms->{html}{non_space_len} &&
|
||||
exists $pms->{html}{image_area} &&
|
||||
$pms->{html}{image_area} > 0);
|
||||
my $ratio = $pms->{html}{non_space_len} / $pms->{html}{image_area};
|
||||
return ($ratio > $min && $ratio <= $max);
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
next unless (exists $html->{non_space_len} &&
|
||||
exists $html->{image_area} &&
|
||||
$html->{image_area} > 0);
|
||||
my $ratio = $html->{non_space_len} / $html->{image_area};
|
||||
return 1 if $ratio > $min && $ratio <= $max;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_charset_faraway {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
return 0 unless exists $pms->{html}{charsets};
|
||||
|
||||
my @locales = Mail::SpamAssassin::Util::get_my_locales($pms->{conf}->{ok_locales});
|
||||
return 0 if grep { $_ eq "all" } @locales;
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
next unless exists $html->{charsets};
|
||||
my $okay = 0;
|
||||
my $bad = 0;
|
||||
for my $c (split(' ', $pms->{html}{charsets})) {
|
||||
foreach my $c (split(/\s+/, $html->{charsets})) {
|
||||
if (Mail::SpamAssassin::Locales::is_charset_ok_for_locales($c, @locales)) {
|
||||
$okay++;
|
||||
}
|
||||
else {
|
||||
} else {
|
||||
$bad++;
|
||||
}
|
||||
}
|
||||
return ($bad && ($bad >= $okay));
|
||||
return 1 if $bad && $bad >= $okay;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_tag_exists {
|
||||
my ($self, $pms, undef, $tag) = @_;
|
||||
return exists $pms->{html}{inside}{$tag};
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
return 1 if exists $html->{inside}{$tag};
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_test {
|
||||
my ($self, $pms, undef, $test) = @_;
|
||||
return $pms->{html}{$test};
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
return 1 if $html->{$test};
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_eval {
|
||||
@ -128,29 +151,38 @@ sub html_eval {
|
||||
return 0 if $rawexpr !~ /^([\<\>\=\!\-\+ 0-9]+)$/;
|
||||
my $expr = untaint_var($1);
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
# workaround bug 3320: weird perl bug where additional, very explicit
|
||||
# untainting into a new var is required.
|
||||
my $tainted = $pms->{html}{$test};
|
||||
return 0 unless defined($tainted);
|
||||
my $tainted = $html->{$test};
|
||||
next unless defined($tainted);
|
||||
my $val = $tainted;
|
||||
|
||||
# just use the value in $val, don't copy it needlessly
|
||||
return eval "\$val $expr";
|
||||
return 1 if eval "\$val $expr";
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_text_match {
|
||||
my ($self, $pms, undef, $text, $regexp) = @_;
|
||||
|
||||
my ($rec, $err) = compile_regexp($regexp, 0);
|
||||
if (!$rec) {
|
||||
warn "htmleval: html_text_match invalid regexp '$regexp': $err";
|
||||
return 0;
|
||||
}
|
||||
foreach my $string (@{$pms->{html}{$text}}) {
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
next unless ref($html->{$text}) eq 'ARRAY';
|
||||
foreach my $string (@{$html->{$text}}) {
|
||||
next unless defined $string;
|
||||
if ($string =~ $rec) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -161,54 +193,74 @@ sub html_title_subject_ratio {
|
||||
if ($subject eq '') {
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
my $max = 0;
|
||||
for my $string (@{ $pms->{html}{title} }) {
|
||||
foreach my $string (@{$html->{title}}) {
|
||||
if ($string) {
|
||||
my $ratio = length($string) / length($subject);
|
||||
$max = $ratio if $ratio > $max;
|
||||
my $ratio_s = length($string) / length($subject);
|
||||
$max = $ratio_s if $ratio_s > $max;
|
||||
}
|
||||
}
|
||||
return $max > $ratio;
|
||||
return 1 if $max > $ratio;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_text_not_match {
|
||||
my ($self, $pms, undef, $text, $regexp) = @_;
|
||||
for my $string (@{ $pms->{html}{$text} }) {
|
||||
if (defined $string && $string !~ /${regexp}/) {
|
||||
|
||||
my ($rec, $err) = compile_regexp($regexp, 0);
|
||||
if (!$rec) {
|
||||
warn "htmleval: html_text_not_match invalid regexp '$regexp': $err";
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
next unless ref($html->{$text}) eq 'ARRAY';
|
||||
foreach my $string (@{$html->{$text}}) {
|
||||
if (defined $string && $string !~ $rec) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub html_range {
|
||||
my ($self, $pms, undef, $test, $min, $max) = @_;
|
||||
|
||||
return 0 unless exists $pms->{html}{$test};
|
||||
|
||||
$test = $pms->{html}{$test};
|
||||
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
next unless defined $html->{$test};
|
||||
my $value = $html->{$test};
|
||||
# not all perls understand what "inf" means, so we need to do
|
||||
# non-numeric tests! urg!
|
||||
if (!defined $max || $max eq "inf") {
|
||||
return ($test eq "inf") ? 1 : ($test > $min);
|
||||
return 1 if $value > $min;
|
||||
}
|
||||
elsif ($test eq "inf") {
|
||||
# $max < inf, so $test == inf means $test > $max
|
||||
return 0;
|
||||
elsif ($value eq "inf") {
|
||||
# $max < inf, so $value == inf means $value > $max
|
||||
next;
|
||||
}
|
||||
else {
|
||||
# if we get here everything should be a number
|
||||
return ($test > $min && $test <= $max);
|
||||
return 1 if $value > $min && $value <= $max;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub check_iframe_src {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
foreach my $v ( values %{$pms->{html}->{uri_detail}} ) {
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
foreach my $v (values %{$html->{uri_detail}}) {
|
||||
return 1 if $v->{types}->{iframe};
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
@ -38,7 +38,7 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule ("check_https_http_mismatch");
|
||||
$self->register_eval_rule ("check_https_http_mismatch", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -47,64 +47,62 @@ sub new {
|
||||
# ("<" and ">" replaced with "[lt]" and "[gt]" to avoid Kaspersky Desktop AV
|
||||
# false positive ;)
|
||||
sub check_https_http_mismatch {
|
||||
my ($self, $permsgstatus, undef, $minanchors, $maxanchors) = @_;
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my ($self, $pms, undef, $minanchors, $maxanchors) = @_;
|
||||
|
||||
$minanchors ||= 1;
|
||||
|
||||
if (!exists $permsgstatus->{chhm_hit}) {
|
||||
$permsgstatus->{chhm_hit} = 0;
|
||||
$permsgstatus->{chhm_anchors} = 0;
|
||||
foreach my $html (@{$pms->{html_all}}) {
|
||||
my $hit = 0;
|
||||
my $anchors = 0;
|
||||
foreach my $k (keys %{$html->{uri_detail}}) {
|
||||
my $v = $html->{uri_detail}->{$k};
|
||||
|
||||
foreach my $k ( keys %{$permsgstatus->{html}->{uri_detail}} ) {
|
||||
my %uri_detail = %{$permsgstatus->{html}->{uri_detail}};
|
||||
my $v = ${uri_detail}{$k};
|
||||
# if the URI wasn't used for an anchor tag, or the anchor text didn't
|
||||
# exist, skip this.
|
||||
next unless (exists $v->{anchor_text} && @{$v->{anchor_text}});
|
||||
next unless exists $v->{anchor_text} && @{$v->{anchor_text}};
|
||||
|
||||
my $uri;
|
||||
if ($k =~ m@^https?://([^/:]+)@i) {
|
||||
if ($k =~ m@^https?://([^/:?#]+)@i) {
|
||||
$uri = $1;
|
||||
# Skip IPs since there's another rule to catch that already
|
||||
if ($uri =~ /^$IP_ADDRESS+$/) {
|
||||
undef $uri;
|
||||
if ($uri =~ IS_IP_ADDRESS) {
|
||||
$uri = undef;
|
||||
next;
|
||||
}
|
||||
# want to compare whole hostnames instead of domains?
|
||||
# comment this next section to the blank line.
|
||||
$uri = $self->{main}->{registryboundaries}->trim_domain($uri);
|
||||
undef $uri unless ($self->{main}->{registryboundaries}->is_domain_valid($uri));
|
||||
my $domain = $self->{main}->{registryboundaries}->uri_to_domain($uri);
|
||||
$uri = undef unless $self->{main}->{registryboundaries}->is_domain_valid($domain);
|
||||
}
|
||||
|
||||
next unless $uri;
|
||||
$permsgstatus->{chhm_anchors}++ if exists $v->{anchor_text};
|
||||
|
||||
$anchors++ if exists $v->{anchor_text};
|
||||
foreach (@{$v->{anchor_text}}) {
|
||||
if (m@https://([^/:]+)@i) {
|
||||
if (m@https://([^\s/:?#]+)@i) {
|
||||
my $https = $1;
|
||||
|
||||
# want to compare whole hostnames instead of domains?
|
||||
# comment this next section to the blank line.
|
||||
if ($https !~ /^$IP_ADDRESS+$/) {
|
||||
if ($https !~ IS_IP_ADDRESS) {
|
||||
$https = $self->{main}->{registryboundaries}->trim_domain($https);
|
||||
undef $https unless ($self->{main}->{registryboundaries}->is_domain_valid($https));
|
||||
$https = undef unless $self->{main}->{registryboundaries}->is_domain_valid($https);
|
||||
}
|
||||
next unless $https;
|
||||
|
||||
dbg("https_http_mismatch: domains $uri -> $https");
|
||||
|
||||
next if $uri eq $https;
|
||||
$permsgstatus->{chhm_hit} = 1;
|
||||
$hit = 1;
|
||||
last;
|
||||
}
|
||||
}
|
||||
}
|
||||
dbg("https_http_mismatch: anchors ".$permsgstatus->{chhm_anchors});
|
||||
|
||||
dbg("https_http_mismatch: anchors $anchors");
|
||||
return 1 if $hit && $anchors >= $minanchors &&
|
||||
(!defined $maxanchors || $anchors < $maxanchors);
|
||||
}
|
||||
|
||||
return ( $permsgstatus->{chhm_hit} && $permsgstatus->{chhm_anchors} >= $minanchors && (defined $maxanchors && $permsgstatus->{chhm_anchors} < $maxanchors) );
|
||||
return 0;
|
||||
}
|
||||
|
||||
1;
|
||||
|
File diff suppressed because it is too large
Load Diff
@ -1,352 +0,0 @@
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
# contributor license agreements. See the NOTICE file distributed with
|
||||
# this work for additional information regarding copyright ownership.
|
||||
# The ASF licenses this file to you under the Apache License, Version 2.0
|
||||
# (the "License"); you may not use this file except in compliance with
|
||||
# the License. You may obtain a copy of the License at:
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
# </@LICENSE>
|
||||
|
||||
=head1 NAME
|
||||
|
||||
Mail::SpamAssassin::Plugin::Hashcash - perform hashcash verification tests
|
||||
|
||||
=head1 SYNOPSIS
|
||||
|
||||
loadplugin Mail::SpamAssassin::Plugin::Hashcash
|
||||
|
||||
=head1 DESCRIPTION
|
||||
|
||||
Hashcash is a payment system for email where CPU cycles used as the
|
||||
basis for an e-cash system. This plugin makes it possible to use valid
|
||||
hashcash tokens added by mail programs as a bonus for messages.
|
||||
|
||||
=cut
|
||||
|
||||
=head1 USER SETTINGS
|
||||
|
||||
=over 4
|
||||
|
||||
=item use_hashcash { 1 | 0 } (default: 1)
|
||||
|
||||
Whether to use hashcash, if it is available.
|
||||
|
||||
=cut
|
||||
|
||||
=item hashcash_accept user@example.com ...
|
||||
|
||||
Used to specify addresses that we accept HashCash tokens for. You should set
|
||||
it to match all the addresses that you may receive mail at.
|
||||
|
||||
Like whitelist and blacklist entries, the addresses are file-glob-style
|
||||
patterns, so C<friend@somewhere.com>, C<*@isp.com>, or C<*.domain.net> will all
|
||||
work. Specifically, C<*> and C<?> are allowed, but all other metacharacters
|
||||
are not. Regular expressions are not used for security reasons.
|
||||
|
||||
The sequence C<%u> is replaced with the current user's username, which
|
||||
is useful for ISPs or multi-user domains.
|
||||
|
||||
Multiple addresses per line, separated by spaces, is OK. Multiple
|
||||
C<hashcash_accept> lines is also OK.
|
||||
|
||||
=cut
|
||||
|
||||
=item hashcash_doublespend_path /path/to/file (default: ~/.spamassassin/hashcash_seen)
|
||||
|
||||
Path for HashCash double-spend database. HashCash tokens are only usable once,
|
||||
so their use is tracked in this database to avoid providing a loophole.
|
||||
|
||||
By default, each user has their own, in their C<~/.spamassassin> directory with
|
||||
mode 0700/0600. Note that once a token is 'spent' it is written to this file,
|
||||
and double-spending of a hashcash token makes it invalid, so this is not
|
||||
suitable for sharing between multiple users.
|
||||
|
||||
=cut
|
||||
|
||||
=item hashcash_doublespend_file_mode (default: 0700)
|
||||
|
||||
The file mode bits used for the HashCash double-spend database file.
|
||||
|
||||
Make sure you specify this using the 'x' mode bits set, as it may also be used
|
||||
to create directories. However, if a file is created, the resulting file will
|
||||
not have any execute bits set (the umask is set to 111).
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::Hashcash;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
|
||||
use Errno qw(ENOENT EACCES);
|
||||
use Fcntl;
|
||||
use File::Path;
|
||||
use File::Basename;
|
||||
|
||||
BEGIN {
|
||||
eval { require Digest::SHA; import Digest::SHA qw(sha1); 1 }
|
||||
or do { require Digest::SHA1; import Digest::SHA1 qw(sha1) }
|
||||
}
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
use constant HAS_DB_FILE => eval { require DB_File; };
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
my $mailsaobject = shift;
|
||||
|
||||
# some boilerplate...
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("check_hashcash_value");
|
||||
$self->register_eval_rule ("check_hashcash_double_spend");
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub set_config {
|
||||
my($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'use_hashcash',
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'hashcash_doublespend_path',
|
||||
default => '__userstate__/hashcash_seen',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'hashcash_doublespend_file_mode',
|
||||
default => "0700",
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'hashcash_accept',
|
||||
default => {},
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub check_hashcash_value {
|
||||
my ($self, $scanner, $valmin, $valmax) = @_;
|
||||
my $val = $self->_run_hashcash($scanner);
|
||||
return ($val >= $valmin && $val < $valmax);
|
||||
}
|
||||
|
||||
sub check_hashcash_double_spend {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_run_hashcash($scanner);
|
||||
return ($scanner->{hashcash_double_spent});
|
||||
}
|
||||
|
||||
############################################################################
|
||||
|
||||
sub _run_hashcash {
|
||||
my ($self, $scanner) = @_;
|
||||
|
||||
if (defined $scanner->{hashcash_value}) { return $scanner->{hashcash_value}; }
|
||||
|
||||
$scanner->{hashcash_value} = 0;
|
||||
|
||||
# X-Hashcash: 0:031118:camram-spam@camram.org:c068b58ade6dcbaf
|
||||
# or:
|
||||
# X-hashcash: 1:20:040803:hashcash@freelists.org::6dcdb3a3ad4e1b86:1519d
|
||||
# X-hashcash: 1:20:040803:jm@jmason.org::6b484d06469ccb28:8838a
|
||||
# X-hashcash: 1:20:040803:adam@cypherspace.org::a1cbc54bf0182ea8:5d6a0
|
||||
|
||||
# call down to {msg} so that we can get it as an array of
|
||||
# individual headers
|
||||
my @hdrs = $scanner->{msg}->get_header ("X-Hashcash");
|
||||
if (scalar @hdrs == 0) {
|
||||
@hdrs = $scanner->{msg}->get_header ("Hashcash");
|
||||
}
|
||||
|
||||
foreach my $hc (@hdrs) {
|
||||
my $value = $self->_run_hashcash_for_one_string($scanner, $hc);
|
||||
if ($value) {
|
||||
# remove the "double-spend" bool if we did find a usable string;
|
||||
# this happens when one string is already spent, but another
|
||||
# string has not yet been.
|
||||
delete $scanner->{hashcash_double_spent};
|
||||
return $value;
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _run_hashcash_for_one_string {
|
||||
my ($self, $scanner, $hc) = @_;
|
||||
|
||||
if (!$hc) { return 0; }
|
||||
$hc =~ s/\s+//gs; # remove whitespace from multiline, folded tokens
|
||||
|
||||
# untaint the string for paranoia, making sure not to allow \n \0 \' \"
|
||||
if ($hc =~ /^[-A-Za-z0-9\xA0-\xFF:_\/\%\@\.\,\= \*\+\;]+$/) {
|
||||
$hc = untaint_var($hc);
|
||||
}
|
||||
if (!$hc) { return 0; }
|
||||
|
||||
my ($ver, $bits, $date, $rsrc, $exts, $rand, $trial);
|
||||
if ($hc =~ /^0:/) {
|
||||
($ver, $date, $rsrc, $trial) = split (/:/, $hc, 4);
|
||||
}
|
||||
elsif ($hc =~ /^1:/) {
|
||||
($ver, $bits, $date, $rsrc, $exts, $rand, $trial) =
|
||||
split (/:/, $hc, 7);
|
||||
# extensions are, as yet, unused by SpamAssassin
|
||||
}
|
||||
else {
|
||||
dbg("hashcash: version $ver stamps not yet supported");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (!$trial) {
|
||||
dbg("hashcash: no trial in stamp '$hc'");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $accept = $scanner->{conf}->{hashcash_accept};
|
||||
if (!$self->_check_hashcash_resource ($scanner, $accept, $rsrc)) {
|
||||
dbg("hashcash: resource $rsrc not accepted here");
|
||||
return 0;
|
||||
}
|
||||
|
||||
# get the hash collision from the token. Computing the hash collision
|
||||
# is very easy (great!) -- just get SHA1(token) and count the 0 bits at
|
||||
# the start of the SHA1 hash, according to the draft at
|
||||
# http://www.hashcash.org/draft-hashcash.txt .
|
||||
my $value = 0;
|
||||
my $bitstring = unpack ("B*", sha1($hc));
|
||||
$bitstring =~ /^(0+)/ and $value = length $1;
|
||||
|
||||
# hashcash v1 tokens: if the "claimed value" of the token is less than
|
||||
# what the token actually contains (ie. token was accidentally generated
|
||||
# with 24 bits instead of the claimed 20), then cut it down to just the
|
||||
# claimed value. that way it's a bit tidier and more deterministic.
|
||||
if ($bits && $value > $bits) {
|
||||
$value = $bits;
|
||||
}
|
||||
|
||||
dbg("hashcash: token value: $value");
|
||||
|
||||
if ($self->was_hashcash_token_double_spent ($scanner, $hc)) {
|
||||
$scanner->{hashcash_double_spent} = 1;
|
||||
return 0;
|
||||
}
|
||||
|
||||
$scanner->{hashcash_value} = $value;
|
||||
return $value;
|
||||
}
|
||||
|
||||
sub was_hashcash_token_double_spent {
|
||||
my ($self, $scanner, $token) = @_;
|
||||
|
||||
my $main = $self->{main};
|
||||
if (!$main->{conf}->{hashcash_doublespend_path}) {
|
||||
dbg("hashcash: hashcash_doublespend_path not defined or empty");
|
||||
return 0;
|
||||
}
|
||||
if (!HAS_DB_FILE) {
|
||||
dbg("hashcash: DB_File module not installed, cannot use double-spend db");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $path = $main->sed_path ($main->{conf}->{hashcash_doublespend_path});
|
||||
my $parentdir = dirname ($path);
|
||||
my $stat_errn = stat($parentdir) ? 0 : 0+$!;
|
||||
if ($stat_errn == 0 && !-d _) {
|
||||
dbg("hashcash: parent dir $parentdir exists but is not a directory");
|
||||
} elsif ($stat_errn == ENOENT) {
|
||||
# run in an eval(); if mkpath has no perms, it calls die()
|
||||
eval {
|
||||
mkpath ($parentdir, 0, (oct ($main->{conf}->{hashcash_doublespend_file_mode}) & 0777));
|
||||
};
|
||||
}
|
||||
|
||||
my %spenddb;
|
||||
if (!tie %spenddb, "DB_File", $path, O_RDWR|O_CREAT,
|
||||
(oct ($main->{conf}->{hashcash_doublespend_file_mode}) & 0666))
|
||||
{
|
||||
dbg("hashcash: failed to tie to $path: $@ $!");
|
||||
# not a serious error. TODO?
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (exists $spenddb{$token}) {
|
||||
untie %spenddb;
|
||||
dbg("hashcash: token '$token' spent already");
|
||||
return 1;
|
||||
}
|
||||
|
||||
$spenddb{$token} = time;
|
||||
dbg("hashcash: marking token '$token' as spent");
|
||||
|
||||
# TODO: expiry?
|
||||
|
||||
untie %spenddb;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _check_hashcash_resource {
|
||||
my ($self, $scanner, $list, $addr) = @_;
|
||||
$addr = lc $addr;
|
||||
if (defined ($list->{$addr})) { return 1; }
|
||||
study $addr; # study is a no-op since perl 5.16.0, eliminating related bugs
|
||||
|
||||
foreach my $regexp (values %{$list})
|
||||
{
|
||||
# allow %u == current username
|
||||
# \\ is added by $conf->add_to_addrlist()
|
||||
$regexp =~ s/\\\%u/$scanner->{main}->{username}/gs;
|
||||
|
||||
if ($addr =~ /$regexp/i) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
# TODO: use "To" and "Cc" addresses gleaned from the mails in the Bayes
|
||||
# database trained as ham, as well.
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
############################################################################
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
@ -25,12 +25,15 @@ use Errno qw(EBADF);
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Locales;
|
||||
use Mail::SpamAssassin::Util qw(get_my_locales parse_rfc822_date);
|
||||
use Mail::SpamAssassin::Util qw(get_my_locales parse_rfc822_date
|
||||
is_valid_utf_8);
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:sa :ip);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
@ -42,51 +45,44 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_for_fake_aol_relay_in_rcvd");
|
||||
$self->register_eval_rule("check_for_faraway_charset_in_headers");
|
||||
$self->register_eval_rule("check_for_unique_subject_id");
|
||||
$self->register_eval_rule("check_illegal_chars");
|
||||
$self->register_eval_rule("check_for_forged_hotmail_received_headers");
|
||||
$self->register_eval_rule("check_for_no_hotmail_received_headers");
|
||||
$self->register_eval_rule("check_for_msn_groups_headers");
|
||||
$self->register_eval_rule("check_for_forged_eudoramail_received_headers");
|
||||
$self->register_eval_rule("check_for_forged_yahoo_received_headers");
|
||||
$self->register_eval_rule("check_for_forged_juno_received_headers");
|
||||
$self->register_eval_rule("check_for_forged_gmail_received_headers");
|
||||
$self->register_eval_rule("check_for_matching_env_and_hdr_from");
|
||||
$self->register_eval_rule("sorted_recipients");
|
||||
$self->register_eval_rule("similar_recipients");
|
||||
$self->register_eval_rule("check_for_missing_to_header");
|
||||
$self->register_eval_rule("check_for_forged_gw05_received_headers");
|
||||
$self->register_eval_rule("check_for_shifted_date");
|
||||
$self->register_eval_rule("subject_is_all_caps");
|
||||
$self->register_eval_rule("check_for_to_in_subject");
|
||||
$self->register_eval_rule("check_outlook_message_id");
|
||||
$self->register_eval_rule("check_messageid_not_usable");
|
||||
$self->register_eval_rule("check_header_count_range");
|
||||
$self->register_eval_rule("check_unresolved_template");
|
||||
$self->register_eval_rule("check_ratware_name_id");
|
||||
$self->register_eval_rule("check_ratware_envelope_from");
|
||||
$self->register_eval_rule("gated_through_received_hdr_remover");
|
||||
$self->register_eval_rule("received_within_months");
|
||||
$self->register_eval_rule("check_equal_from_domains");
|
||||
$self->register_eval_rule("check_for_fake_aol_relay_in_rcvd", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_faraway_charset_in_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_unique_subject_id", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_illegal_chars", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_hotmail_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_no_hotmail_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_msn_groups_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_eudoramail_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_yahoo_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_juno_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_gmail_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_matching_env_and_hdr_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("sorted_recipients", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("similar_recipients", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_missing_to_header", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_gw05_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_shifted_date", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("subject_is_all_caps", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_to_in_subject", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_outlook_message_id", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_messageid_not_usable", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_header_count_range", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_unresolved_template", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_ratware_name_id", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_ratware_envelope_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("gated_through_received_hdr_remover", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("received_within_months", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_equal_from_domains", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# load triplets.txt into memory
|
||||
sub compile_now_start {
|
||||
my ($self) = @_;
|
||||
|
||||
$self->word_is_in_dictionary("aba");
|
||||
}
|
||||
|
||||
sub check_for_fake_aol_relay_in_rcvd {
|
||||
my ($self, $pms) = @_;
|
||||
local ($_);
|
||||
|
||||
$_ = $pms->get('Received');
|
||||
s/\s/ /gs;
|
||||
s/\s+/ /gs;
|
||||
|
||||
# this is the hostname format used by AOL for their relays. Spammers love
|
||||
# forging it. Don't make it more specific to match aol.com only, though --
|
||||
@ -122,142 +118,20 @@ sub check_for_faraway_charset_in_headers {
|
||||
return 0 if grep { $_ eq "all" } @locales;
|
||||
|
||||
for my $h (qw(From Subject)) {
|
||||
my @hdrs = $pms->get("$h:raw"); # ??? get() returns a scalar ???
|
||||
if ($#hdrs >= 0) {
|
||||
$hdr = join(" ", @hdrs);
|
||||
} else {
|
||||
$hdr = '';
|
||||
}
|
||||
my @hdrs = $pms->get("$h:raw");
|
||||
foreach my $hdr (@hdrs) {
|
||||
while ($hdr =~ /=\?(.+?)\?.\?.*?\?=/g) {
|
||||
Mail::SpamAssassin::Locales::is_charset_ok_for_locales($1, @locales)
|
||||
or return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
0;
|
||||
}
|
||||
|
||||
# Deprecated (Bug 8051)
|
||||
sub check_for_unique_subject_id {
|
||||
my ($self, $pms) = @_;
|
||||
local ($_);
|
||||
$_ = lc $pms->get('Subject');
|
||||
study; # study is a no-op since perl 5.16.0, eliminating related bugs
|
||||
|
||||
my $id = 0;
|
||||
if (/[-_\.\s]{7,}([-a-z0-9]{4,})$/
|
||||
|| /\s{10,}(?:\S\s)?(\S+)$/
|
||||
|| /\s{3,}[-:\#\(\[]+([-a-z0-9]{4,})[\]\)]+$/
|
||||
|| /\s{3,}[:\#\(\[]*([a-f0-9]{4,})[\]\)]*$/
|
||||
|| /\s{3,}[-:\#]([a-z0-9]{5,})$/
|
||||
|| /[\s._]{3,}([^0\s._]\d{3,})$/
|
||||
|| /[\s._]{3,}\[(\S+)\]$/
|
||||
|
||||
# (7217vPhZ0-478TLdy5829qicU9-0@26) and similar
|
||||
|| /\(([-\w]{7,}\@\d+)\)$/
|
||||
|
||||
# Seven or more digits at the end of a subject is almost certainly a id
|
||||
|| /\b(\d{7,})\s*$/
|
||||
|
||||
# stuff at end of line after "!" or "?" is usually an id
|
||||
|| /[!\?]\s*(\d{4,}|\w+(-\w+)+)\s*$/
|
||||
|
||||
# 9095IPZK7-095wsvp8715rJgY8-286-28 and similar
|
||||
# excluding 'Re:', etc and the first word
|
||||
|| /(?:\w{2,3}:\s)?\w+\s+(\w{7,}-\w{7,}(-\w+)*)\s*$/
|
||||
|
||||
# #30D7 and similar
|
||||
|| /\s#\s*([a-f0-9]{4,})\s*$/
|
||||
)
|
||||
{
|
||||
$id = $1;
|
||||
# exempt online purchases
|
||||
if ($id =~ /\d{5,}/
|
||||
&& /(?:item|invoice|order|number|confirmation).{1,6}\Q$id\E\s*$/)
|
||||
{
|
||||
$id = 0;
|
||||
}
|
||||
|
||||
# for the "foo-bar-baz" case, otherwise it won't
|
||||
# be found in the dict:
|
||||
$id =~ s/-//;
|
||||
}
|
||||
|
||||
return ($id && !$self->word_is_in_dictionary($id));
|
||||
}
|
||||
|
||||
# word_is_in_dictionary()
|
||||
#
|
||||
# See if the word looks like an English word, by checking if each triplet
|
||||
# of letters it contains is one that can be found in the English language.
|
||||
# Does not include triplets only found in proper names, or in the Latin
|
||||
# and Greek terms that might be found in a larger dictionary
|
||||
|
||||
my %triplets;
|
||||
my $triplets_loaded = 0;
|
||||
|
||||
sub word_is_in_dictionary {
|
||||
my ($self, $word) = @_;
|
||||
local ($_);
|
||||
local $/ = "\n"; # Ensure $/ is set appropriately
|
||||
|
||||
# $word =~ tr/A-Z/a-z/; # already done by this stage
|
||||
$word =~ s/^\s+//;
|
||||
$word =~ s/\s+$//;
|
||||
|
||||
# If it contains a digit, dash, etc, it's not a valid word.
|
||||
# Don't reject words like "can't" and "I'll"
|
||||
return 0 if ($word =~ /[^a-z\']/);
|
||||
|
||||
# handle a few common "blah blah blah (comment)" styles
|
||||
return 1 if ($word eq "ot"); # off-topic
|
||||
return 1 if ($word =~ /(?:linux|nix|bsd)/); # not in most dicts
|
||||
return 1 if ($word =~ /(?:whew|phew|attn|tha?nx)/); # not in most dicts
|
||||
|
||||
my $word_len = length($word);
|
||||
|
||||
# Unique IDs probably aren't going to be only one or two letters long
|
||||
return 1 if ($word_len < 3);
|
||||
|
||||
if (!$triplets_loaded) {
|
||||
# take a copy to avoid modifying the real one
|
||||
my @default_triplets_path = @Mail::SpamAssassin::default_rules_path;
|
||||
s{$}{/triplets.txt} for @default_triplets_path;
|
||||
my $filename = $self->{main}->first_existing_path (@default_triplets_path);
|
||||
|
||||
if (!defined $filename) {
|
||||
dbg("eval: failed to locate the triplets.txt file");
|
||||
return 1;
|
||||
}
|
||||
|
||||
local *TRIPLETS;
|
||||
if (!open (TRIPLETS, "<$filename")) {
|
||||
dbg("eval: failed to open '$filename', cannot check dictionary: $!");
|
||||
return 1;
|
||||
}
|
||||
for($!=0; <TRIPLETS>; $!=0) {
|
||||
chomp;
|
||||
$triplets{$_} = 1;
|
||||
}
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("eval: error reading from $filename: $!")
|
||||
: die "error reading from $filename: $!";
|
||||
close(TRIPLETS) or die "error closing $filename: $!";
|
||||
|
||||
$triplets_loaded = 1;
|
||||
} # if (!$triplets_loaded)
|
||||
|
||||
|
||||
my $i;
|
||||
|
||||
for ($i = 0; $i < ($word_len - 2); $i++) {
|
||||
my $triplet = substr($word, $i, 3);
|
||||
if (!$triplets{$triplet}) {
|
||||
dbg("eval: unique ID: letter triplet '$triplet' from word '$word' not valid");
|
||||
return 0;
|
||||
}
|
||||
} # for ($i = 0; $i < ($word_len - 2); $i++)
|
||||
|
||||
# All letter triplets in word were found to be valid
|
||||
return 1;
|
||||
}
|
||||
|
||||
# look for 8-bit and other illegal characters that should be MIME
|
||||
@ -268,7 +142,18 @@ sub check_illegal_chars {
|
||||
|
||||
$header .= ":raw" unless $header =~ /:raw$/;
|
||||
my $str = $pms->get($header);
|
||||
return 0 if !defined $str || $str eq '';
|
||||
return 0 if !defined $str || $str !~ /\S/;
|
||||
|
||||
if ($str =~ tr/\x00-\x7F//c && is_valid_utf_8($str)) {
|
||||
# is non-ASCII and is valid UTF-8
|
||||
if ($str =~ tr/\x00-\x08\x0B\x0C\x0E-\x1F//) {
|
||||
dbg("eval: %s is valid UTF-8 but contains controls: %s", $header, $str);
|
||||
} else {
|
||||
# todo: only with a SMTPUTF8 mail
|
||||
dbg("eval: %s is valid UTF-8: %s", $header, $str);
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
# count illegal substrings (RFC 2045)
|
||||
# (non-ASCII + C0 controls except TAB, NL, CR)
|
||||
@ -291,12 +176,12 @@ sub gated_through_received_hdr_remover {
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
my $txt = $pms->get("Mailing-List",undef);
|
||||
if (defined $txt && $txt =~ /^contact \S+\@\S+\; run by ezmlm$/) {
|
||||
if (defined $txt && $txt =~ /^contact \S+\@\S+\; run by ezmlm$/m) {
|
||||
my $dlto = $pms->get("Delivered-To");
|
||||
my $rcvd = $pms->get("Received");
|
||||
|
||||
# ensure we have other indicative headers too
|
||||
if ($dlto =~ /^mailing list \S+\@\S+/ &&
|
||||
if ($dlto =~ /^mailing list \S+\@\S+/m &&
|
||||
$rcvd =~ /qmail \d+ invoked (?:from network|by .{3,20})\); \d+ ... \d+/)
|
||||
{
|
||||
return 1;
|
||||
@ -336,9 +221,8 @@ sub _check_for_forged_hotmail_received_headers {
|
||||
return if $self->check_for_msn_groups_headers($pms);
|
||||
|
||||
my $ip = $pms->get('X-Originating-Ip',undef);
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
my $orig = $pms->get('X-OriginatorOrg',undef);
|
||||
my $ORIGINATOR = 'hotmail.com';
|
||||
my $ORIGINATOR = qr/hotmail\.com|msonline\-outlook/;
|
||||
|
||||
if (defined $ip && $ip =~ /$IP_ADDRESS/) { $ip = 1; } else { $ip = 0; }
|
||||
if (defined $orig && $orig =~ /$ORIGINATOR/) { $orig = 1; } else { $orig = 0; }
|
||||
@ -347,6 +231,8 @@ sub _check_for_forged_hotmail_received_headers {
|
||||
# Received: from hotmail.com (f135.law8.hotmail.com [216.33.241.135])
|
||||
# or like
|
||||
# Received: from EUR01-VE1-obe.outbound.protection.outlook.com (mail-oln040092066056.outbound.protection.outlook.com [40.92.66.56])
|
||||
# or
|
||||
# Received: from VI1PR04MB3039.eurprd04.prod.outlook.com (2603:10a6:802:b::13)
|
||||
# spammers do not ;)
|
||||
|
||||
if ($self->gated_through_received_hdr_remover($pms)) { return; }
|
||||
@ -355,6 +241,8 @@ sub _check_for_forged_hotmail_received_headers {
|
||||
{ return; }
|
||||
if ($rcvd =~ /from \S*\.outbound\.protection\.outlook\.com \(\S+\.outbound\.protection\.outlook\.com[ \)]/ && $orig)
|
||||
{ return; }
|
||||
if ($rcvd =~ /from \S*\.eurprd\d+\.prod\.outlook\.com \($IP_ADDRESS\)/ && $orig)
|
||||
{ return; }
|
||||
if ($rcvd =~ /from \S*\.hotmail.com \(\[$IP_ADDRESS\][ \):]/ && $ip)
|
||||
{ return; }
|
||||
if ($rcvd =~ /from \S+ by \S+\.hotmail(?:\.msn)?\.com with HTTP\;/ && $ip)
|
||||
@ -465,7 +353,6 @@ sub check_for_forged_eudoramail_received_headers {
|
||||
$rcvd =~ s/\s+/ /gs; # just spaces, simplify the regexp
|
||||
|
||||
my $ip = $pms->get('X-Sender-Ip',undef);
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
if (defined $ip && $ip =~ /$IP_ADDRESS/) { $ip = 1; } else { $ip = 0; }
|
||||
|
||||
# Eudoramail formats its received headers like this:
|
||||
@ -518,7 +405,6 @@ sub check_for_forged_yahoo_received_headers {
|
||||
if ($rcvd =~ /by web\S+\.mail\S*\.yahoo\.com via HTTP/) { return 0; }
|
||||
if ($rcvd =~ /by sonic\S+\.consmr\.mail\S*\.yahoo\.com with HTTP/) { return 0; }
|
||||
if ($rcvd =~ /by smtp\S+\.yahoo\.com with SMTP/) { return 0; }
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
if ($rcvd =~
|
||||
/from \[$IP_ADDRESS\] by \S+\.(?:groups|scd|dcn)\.yahoo\.com with NNFMP/) {
|
||||
return 0;
|
||||
@ -554,13 +440,12 @@ sub check_for_forged_juno_received_headers {
|
||||
my $xorig = $pms->get('X-Originating-IP');
|
||||
my $xmailer = $pms->get('X-Mailer');
|
||||
my $rcvd = $pms->get('Received');
|
||||
my $IP_ADDRESS = IP_ADDRESS;
|
||||
|
||||
if ($xorig ne '') {
|
||||
# New style Juno has no X-Originating-IP header, and other changes
|
||||
if($rcvd !~ /from.*\b(?:juno|untd)\.com.*[\[\(]$IP_ADDRESS[\]\)].*by/
|
||||
&& $rcvd !~ / cookie\.(?:juno|untd)\.com /) { return 1; }
|
||||
if($xmailer !~ /Juno /) { return 1; }
|
||||
if(index($xmailer, 'Juno ') == -1) { return 1; }
|
||||
} else {
|
||||
if($rcvd =~ /from.*\bmail\.com.*\[$IP_ADDRESS\].*by/) {
|
||||
if($xmailer !~ /\bmail\.com/) { return 1; }
|
||||
@ -593,6 +478,7 @@ sub check_for_forged_gmail_received_headers {
|
||||
if ($received =~ /by smtp\.googlemail\.com with ESMTPSA id \S+/) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ( (length($xgms) >= GOOGLE_MESSAGE_STATE_LENGTH_MIN) &&
|
||||
(length($xss) >= GOOGLE_SMTP_SOURCE_LENGTH_MIN)) {
|
||||
return 0;
|
||||
@ -637,10 +523,9 @@ sub _check_recipients {
|
||||
my @inputs;
|
||||
|
||||
# ToCc: pseudo-header works best, but sometimes Bcc: is better
|
||||
for ('ToCc', 'Bcc') {
|
||||
my $to = $pms->get($_); # get recipients
|
||||
$to =~ s/\(.*?\)//g; # strip out the (comments)
|
||||
push(@inputs, ($to =~ m/([\w.=-]+\@\w+(?:[\w.-]+\.)+\w+)/g));
|
||||
for ('ToCc:addr', 'Bcc:addr') {
|
||||
my @to = $pms->get($_); # get recipients
|
||||
push @inputs, @to;
|
||||
last if scalar(@inputs) >= TOCC_SIMILAR_COUNT;
|
||||
}
|
||||
|
||||
|
@ -99,13 +99,13 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("image_count");
|
||||
$self->register_eval_rule ("pixel_coverage");
|
||||
$self->register_eval_rule ("image_size_exact");
|
||||
$self->register_eval_rule ("image_size_range");
|
||||
$self->register_eval_rule ("image_named");
|
||||
$self->register_eval_rule ("image_name_regex");
|
||||
$self->register_eval_rule ("image_to_text_ratio");
|
||||
$self->register_eval_rule ("image_count", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pixel_coverage", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("image_size_exact", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("image_size_range", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("image_named", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("image_name_regex", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("image_to_text_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -256,7 +256,7 @@ sub _get_images {
|
||||
|
||||
sub image_named {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
return unless (defined $name);
|
||||
return 0 unless (defined $name);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -272,7 +272,7 @@ sub image_named {
|
||||
|
||||
sub image_name_regex {
|
||||
my ($self,$pms,$body,$re) = @_;
|
||||
return unless (defined $re);
|
||||
return 0 unless (defined $re);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -300,7 +300,7 @@ sub image_name_regex {
|
||||
sub image_count {
|
||||
my ($self,$pms,$body,$type,$min,$max) = @_;
|
||||
|
||||
return unless defined $min;
|
||||
return 0 unless defined $min;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -316,7 +316,7 @@ sub image_count {
|
||||
sub pixel_coverage {
|
||||
my ($self,$pms,$body,$type,$min,$max) = @_;
|
||||
|
||||
return unless (defined $type && defined $min);
|
||||
return 0 unless (defined $type && defined $min);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -331,7 +331,7 @@ sub pixel_coverage {
|
||||
|
||||
sub image_to_text_ratio {
|
||||
my ($self,$pms,$body,$type,$min,$max) = @_;
|
||||
return unless (defined $type && defined $min && defined $max);
|
||||
return 0 unless (defined $type && defined $min && defined $max);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -353,7 +353,7 @@ sub image_to_text_ratio {
|
||||
|
||||
sub image_size_exact {
|
||||
my ($self,$pms,$body,$type,$height,$width) = @_;
|
||||
return unless (defined $type && defined $height && defined $width);
|
||||
return 0 unless (defined $type && defined $height && defined $width);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -369,7 +369,7 @@ sub image_size_exact {
|
||||
|
||||
sub image_size_range {
|
||||
my ($self,$pms,$body,$type,$minh,$minw,$maxh,$maxw) = @_;
|
||||
return unless (defined $type && defined $minh && defined $minw);
|
||||
return 0 unless (defined $type && defined $minh && defined $minw);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'imageinfo'}) {
|
||||
@ -377,7 +377,7 @@ sub image_size_range {
|
||||
}
|
||||
|
||||
my $name = 'dems_'.$type;
|
||||
return unless (exists $pms->{'imageinfo'}->{$name});
|
||||
return 0 unless (exists $pms->{'imageinfo'}->{$name});
|
||||
|
||||
foreach my $dem ( keys %{$pms->{'imageinfo'}->{"dems_$type"}}) {
|
||||
my ($h,$w) = split(/x/,$dem);
|
||||
|
@ -68,18 +68,18 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_for_mime");
|
||||
$self->register_eval_rule("check_for_mime_html");
|
||||
$self->register_eval_rule("check_for_mime_html_only");
|
||||
$self->register_eval_rule("check_mime_multipart_ratio");
|
||||
$self->register_eval_rule("check_msg_parse_flags");
|
||||
$self->register_eval_rule("check_for_ascii_text_illegal");
|
||||
$self->register_eval_rule("check_abundant_unicode_ratio");
|
||||
$self->register_eval_rule("check_for_faraway_charset");
|
||||
$self->register_eval_rule("check_for_uppercase");
|
||||
$self->register_eval_rule("check_ma_non_text");
|
||||
$self->register_eval_rule("check_base64_length");
|
||||
$self->register_eval_rule("check_qp_ratio");
|
||||
$self->register_eval_rule("check_for_mime", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_for_mime_html", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_for_mime_html_only", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_mime_multipart_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_msg_parse_flags", $Mail::SpamAssassin::Conf::TYPE_HEADER_EVALS);
|
||||
$self->register_eval_rule("check_for_ascii_text_illegal", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_abundant_unicode_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_for_faraway_charset", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_for_uppercase", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_ma_non_text", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_base64_length", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule("check_qp_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -89,8 +89,7 @@ sub new {
|
||||
sub are_more_high_bits_set {
|
||||
my ($self, $str) = @_;
|
||||
|
||||
# TODO: I suspect a tr// trick may be faster here
|
||||
my $numhis = () = ($str =~ /[\200-\377]/g);
|
||||
my $numhis = $str =~ tr/\x00-\x7F//c; # number of non-ASCII chars
|
||||
my $numlos = length($str) - $numhis;
|
||||
|
||||
($numlos <= $numhis && $numhis > 3);
|
||||
@ -182,7 +181,7 @@ sub check_for_mime {
|
||||
|
||||
$self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
|
||||
return 0 unless exists $pms->{$test};
|
||||
return $pms->{$test};
|
||||
return $pms->{$test} ? 1 : 0;
|
||||
}
|
||||
|
||||
# any text/html MIME part
|
||||
@ -232,19 +231,19 @@ sub _check_mime_header {
|
||||
$pms->{mime_body_text_count}++;
|
||||
}
|
||||
|
||||
if ($cte =~ /base64/) {
|
||||
if (index($cte, 'base64') >= 0) {
|
||||
$pms->{mime_base64_count}++;
|
||||
}
|
||||
elsif ($cte =~ /quoted-printable/) {
|
||||
elsif (index($cte, 'quoted-printable') >= 0) {
|
||||
$pms->{mime_qp_count}++;
|
||||
}
|
||||
|
||||
if ($cd && $cd =~ /attachment/) {
|
||||
if ($cd && index($cd, 'attachment') >= 0) {
|
||||
$pms->{mime_attachment}++;
|
||||
}
|
||||
|
||||
if ($ctype =~ /^text/ &&
|
||||
$cte =~ /base64/ &&
|
||||
index($cte, 'base64') >= 0 &&
|
||||
(!$charset || $charset =~ /(?:us-ascii|ansi_x3\.4-1968|iso-ir-6|ansi_x3\.4-1986|iso_646\.irv:1991|ascii|iso646-us|us|ibm367|cp367|csascii)/) &&
|
||||
!($cd && $cd =~ /^(?:attachment|inline)/))
|
||||
{
|
||||
@ -366,9 +365,9 @@ sub _check_attachments {
|
||||
|
||||
$part++;
|
||||
$part_type[$part] = $ctype;
|
||||
$part_bytes[$part] = 0 if $cd !~ /attachment/;
|
||||
$part_bytes[$part] = 0 if index($cd, 'attachment') == -1;
|
||||
|
||||
my $cte_is_base64 = $cte =~ /base64/i;
|
||||
my $cte_is_base64 = index($cte, 'base64') >= 0;
|
||||
my $previous = '';
|
||||
foreach (@{$p->raw()}) {
|
||||
|
||||
@ -385,12 +384,12 @@ sub _check_attachments {
|
||||
# if ($pms->{mime_html_no_charset} && $ctype eq 'text/html' && defined $charset) {
|
||||
# $pms->{mime_html_no_charset} = 0;
|
||||
# }
|
||||
if ($pms->{mime_multipart_alternative} && $cd !~ /attachment/ &&
|
||||
if ($pms->{mime_multipart_alternative} && index($cd, 'attachment') == -1 &&
|
||||
($ctype eq 'text/plain' || $ctype eq 'text/html')) {
|
||||
$part_bytes[$part] += length;
|
||||
}
|
||||
|
||||
if ($where != 1 && $cte eq "quoted-printable" && ! /^SPAM: /) {
|
||||
if ($where != 1 && $cte eq "quoted-printable" && index($_, 'SPAM: ') != 0) {
|
||||
# RFC 5322: Each line SHOULD be no more than 78 characters,
|
||||
# excluding the CRLF.
|
||||
# RFC 2045: The Quoted-Printable encoding REQUIRES that
|
||||
@ -415,9 +414,11 @@ sub _check_attachments {
|
||||
# }
|
||||
|
||||
# count excessive QP bytes
|
||||
if (index($_, '=') != -1) {
|
||||
if (index($_, '=') >= 0) {
|
||||
## no critic (Perlsecret)
|
||||
# whoever wrote this next line is an evil hacker -- jm
|
||||
my $qp = () = m/=(?:09|3[0-9ABCEF]|[2456][0-9A-F]|7[0-9A-E])/g;
|
||||
## use critic
|
||||
if ($qp) {
|
||||
$qp_count += $qp;
|
||||
# tabs and spaces at end of encoded line are okay. Also, multiple
|
||||
@ -630,7 +631,7 @@ sub get_charset_from_ct_line {
|
||||
sub check_ma_non_text {
|
||||
my($self, $pms) = @_;
|
||||
|
||||
foreach my $map ($pms->{msg}->find_parts(qr@^multipart/alternative$@i)) {
|
||||
foreach my $map ($pms->{msg}->find_parts(qr@^multipart/alternative$@)) {
|
||||
foreach my $p ($map->find_parts(qr/./, 1, 0)) {
|
||||
next if (lc $p->{'type'} eq 'multipart/related');
|
||||
next if (lc $p->{'type'} eq 'application/rtf');
|
||||
|
@ -51,6 +51,23 @@ around the newline character in "folded" headers will be replaced with a single
|
||||
space. Append C<:raw> to the header name to retrieve the raw, undecoded value,
|
||||
including pristine whitespace, instead.
|
||||
|
||||
=item tflags NAME_OF_RULE range=x-y
|
||||
|
||||
Match only from specific MIME parts, indexed in the order they are parsed.
|
||||
Part 1 = main message headers. Part 2 = next part etc.
|
||||
|
||||
range=1 (match only main headers, not any subparts)
|
||||
range=2- (match any subparts, but not the main headers)
|
||||
range=-3 (match only first three parts, including main headers)
|
||||
range=2-3 (match only first two subparts)
|
||||
|
||||
=item tflags NAME_OF_RULE concat
|
||||
|
||||
Concatenate all headers from all mime parts (possible range applied) into a
|
||||
single string for matching. This allows matching headers across multiple
|
||||
parts with single regex. Normally pattern is tested individually for
|
||||
different mime parts.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
@ -151,6 +168,9 @@ sub set_config {
|
||||
$self->{parser}->add_test($rulename, $evalfn."()",
|
||||
$Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
# Support named regex captures
|
||||
$self->{parser}->parse_captures($rulename, $rec);
|
||||
|
||||
# evalfn/rulename safe, sanitized by $RULENAME_RE
|
||||
my $evalcode = '
|
||||
sub Mail::SpamAssassin::Plugin::MIMEHeader::'.$evalfn.' {
|
||||
@ -178,37 +198,104 @@ sub set_config {
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub eval_hook_called {
|
||||
my ($pobj, $scanner, $rulename) = @_;
|
||||
my ($pobj, $pms, $rulename) = @_;
|
||||
|
||||
my $rule = $scanner->{conf}->{mimeheader_tests}->{$rulename};
|
||||
my $conf = $pms->{conf};
|
||||
my $rule = $conf->{mimeheader_tests}->{$rulename};
|
||||
my $hdr = $rule->{hdr};
|
||||
my $negated = $rule->{negated};
|
||||
my $if_unset = $rule->{if_unset};
|
||||
my $pattern = $rule->{pattern};
|
||||
my $tflags = $conf->{tflags}->{$rulename}||'';
|
||||
|
||||
|
||||
my $getraw;
|
||||
my $getraw = 0;
|
||||
if ($hdr =~ s/:raw$//) {
|
||||
$getraw = 1;
|
||||
} else {
|
||||
$getraw = 0;
|
||||
}
|
||||
|
||||
foreach my $p ($scanner->{msg}->find_parts(qr/./)) {
|
||||
my $range_min = 0;
|
||||
my $range_max = 1000;
|
||||
if ($tflags =~ /(?:^|\s)range=(\d+)?(-)?(\d+)?(?:\s|$)/) {
|
||||
if (defined $1 && defined $2 && defined $3) {
|
||||
$range_min = $1;
|
||||
$range_max = $3;
|
||||
}
|
||||
elsif (defined $1 && defined $2) {
|
||||
$range_min = $1;
|
||||
}
|
||||
elsif (defined $2 && defined $3) {
|
||||
$range_max = $3;
|
||||
}
|
||||
elsif (defined $1) {
|
||||
$range_min = $range_max = $1;
|
||||
}
|
||||
}
|
||||
|
||||
my $multiple = $tflags =~ /\bmultiple\b/;
|
||||
my $concat = $tflags =~ /\bconcat\b/;
|
||||
my $maxhits = $tflags =~ /\bmaxhits=(\d+)\b/ ? $1 :
|
||||
$multiple ? 1000 : 1;
|
||||
my $cval = '';
|
||||
|
||||
my $idx = 0;
|
||||
foreach my $p ($pms->{msg}->find_parts(qr/./)) {
|
||||
$idx++;
|
||||
last if $idx > $range_max;
|
||||
next if $idx < $range_min;
|
||||
|
||||
my $val;
|
||||
if ($getraw) {
|
||||
if ($hdr eq 'ALL') {
|
||||
$val = $p->get_all_headers($getraw, 0);
|
||||
} elsif ($getraw) {
|
||||
$val = $p->raw_header($hdr);
|
||||
} else {
|
||||
$val = $p->get_header($hdr);
|
||||
}
|
||||
$val = $if_unset if !defined $val;
|
||||
$val = $rule->{if_unset} if !defined $val;
|
||||
|
||||
if ($val =~ $pattern) {
|
||||
return ($negated ? 0 : 1);
|
||||
if ($concat) {
|
||||
$val .= "\n" unless $val =~ /\n$/;
|
||||
$cval .= $val;
|
||||
next;
|
||||
}
|
||||
|
||||
if (_check($pms, $rulename, $val, $pattern, $negated, $maxhits, "part $idx")) {
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
return ($negated ? 1 : 0);
|
||||
if ($concat) {
|
||||
if (_check($pms, $rulename, $cval, $pattern, $negated, $maxhits, 'concat')) {
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
if ($negated) {
|
||||
dbg("mimeheader: ran rule $rulename ======> got hit: \"<negative match>\"");
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub _check {
|
||||
my ($pms, $rulename, $value, $pattern, $negated, $maxhits, $desc) = @_;
|
||||
|
||||
my $hits = 0;
|
||||
my %captures;
|
||||
while ($value =~ /$pattern/gp) {
|
||||
last if $negated;
|
||||
if (%-) {
|
||||
foreach my $cname (keys %-) {
|
||||
push @{$captures{$cname}}, grep { $_ ne "" } @{$-{$cname}};
|
||||
}
|
||||
}
|
||||
my $match = defined ${^MATCH} ? ${^MATCH} : "<negative match>";
|
||||
$pms->got_hit($rulename, '', ruletype => 'eval');
|
||||
dbg("mimeheader: ran rule $rulename ======> got hit: \"$match\" ($desc)");
|
||||
last if ++$hits >= $maxhits;
|
||||
}
|
||||
$pms->set_captures(\%captures) if %captures;
|
||||
return $hits;
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
@ -224,4 +311,10 @@ sub finish_tests {
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
sub has_all_header { 1 } # Supports ALL header query (Bug 5582)
|
||||
sub has_tflags_range { 1 } # Supports tflags range=x-y
|
||||
sub has_tflags_concat { 1 } # Supports tflags concat
|
||||
sub has_tflags_multiple { 1 } # Supports tflags multiple
|
||||
sub has_capture_rules { 1 } # Supports named regex captures (Bug 7992)
|
||||
|
||||
1;
|
||||
|
File diff suppressed because it is too large
Load Diff
@ -66,6 +66,20 @@ sub check_start {
|
||||
}
|
||||
}
|
||||
|
||||
sub check_cleanup {
|
||||
my ($self, $params) = @_;
|
||||
my $pms = $params->{permsgstatus};
|
||||
my $scoresptr = $pms->{conf}->{scores};
|
||||
|
||||
# Force all body rules ready for meta rules. Need to do it here in
|
||||
# cleanup, because the body is scanned per line instead of per rule
|
||||
if ($pms->{conf}->{skip_body_rules}) {
|
||||
foreach (keys %{$pms->{conf}->{skip_body_rules}}) {
|
||||
$pms->rule_ready($_, 1) if $scoresptr->{$_};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
1;
|
||||
@ -96,13 +110,16 @@ sub do_one_line_body_tests {
|
||||
|
||||
if (($conf->{tflags}->{$rulename}||'') =~ /\bmultiple\b/)
|
||||
{
|
||||
$sub .= '
|
||||
my $hitsptr = $self->{tests_already_hit};
|
||||
';
|
||||
# support multiple matches
|
||||
my ($max) = $conf->{tflags}->{$rulename} =~ /\bmaxhits=(\d+)\b/;
|
||||
$max = untaint_var($max);
|
||||
if ($max) {
|
||||
$sub .= '
|
||||
if (exists $self->{tests_already_hit}->{q{'.$rulename.'}}) {
|
||||
return 0 if $self->{tests_already_hit}->{q{'.$rulename.'}} >= '.$max.';
|
||||
if ($hitsptr->{q{'.$rulename.'}}) {
|
||||
return 0 if $hitsptr->{q{'.$rulename.'}} >= '.$max.';
|
||||
}
|
||||
';
|
||||
}
|
||||
@ -111,17 +128,17 @@ sub do_one_line_body_tests {
|
||||
my $lref = \$line;
|
||||
pos $$lref = 0;
|
||||
'.$self->hash_line_for_rule($pms, $rulename).'
|
||||
while ($$lref =~ /$qrptr->{q{'.$rulename.'}}/go) {
|
||||
while ($$lref =~ /$qrptr->{q{'.$rulename.'}}/gop) {
|
||||
$self->got_hit(q{'.$rulename.'}, "BODY: ", ruletype => "one_line_body");
|
||||
'. $self->hit_rule_plugin_code($pms, $rulename, "one_line_body", "") . '
|
||||
'. ($max? 'last if $self->{tests_already_hit}->{q{'.$rulename.'}} >= '.$max.';' : '') . '
|
||||
'. ($max? 'last if $hitsptr->{q{'.$rulename.'}} >= '.$max.';' : '') . '
|
||||
}
|
||||
';
|
||||
|
||||
} else {
|
||||
$sub .= '
|
||||
'.$self->hash_line_for_rule($pms, $rulename).'
|
||||
if ($line =~ /$qrptr->{q{'.$rulename.'}}/o) {
|
||||
if ($line =~ /$qrptr->{q{'.$rulename.'}}/op) {
|
||||
$self->got_hit(q{'.$rulename.'}, "BODY: ", ruletype => "one_line_body");
|
||||
'. $self->hit_rule_plugin_code($pms, $rulename, "one_line_body", "return 1") . '
|
||||
}
|
||||
@ -129,6 +146,11 @@ sub do_one_line_body_tests {
|
||||
|
||||
}
|
||||
|
||||
# Make sure rule is marked ready for meta rules
|
||||
$sub .= '
|
||||
$self->rule_ready(q{'.$rulename.'}, 1);
|
||||
';
|
||||
|
||||
return if ($opts{doing_user_rules} &&
|
||||
!$self->is_user_rule_sub($rulename.'_one_line_body_test'));
|
||||
|
||||
|
@ -130,6 +130,27 @@ This plugin helps detected spam using attached PDF files
|
||||
body RULENAME eval:pdf_is_empty_body(<bytes>)
|
||||
bytes: maximum byte count to allow and still consider it empty
|
||||
|
||||
pdf_image_to_text_ratio()
|
||||
|
||||
body RULENAME eval:pdf_image_to_text_ratio(<min>,<max>)
|
||||
Ratio calculated as body_length / total_image_area
|
||||
min: minimum ratio
|
||||
max: maximum ratio
|
||||
|
||||
pdf_image_size_exact()
|
||||
|
||||
body RULENAME eval:pdf_image_size_exact(<h>,<w>)
|
||||
h: image height is exactly h
|
||||
w: image width is exactly w
|
||||
|
||||
pdf_image_size_range()
|
||||
|
||||
body RULENAME eval:pdf_image_size_range(<minh>,<minw>,[<maxh>],[<maxw>])
|
||||
minh: image height is atleast minh
|
||||
minw: image width is atleast minw
|
||||
maxh: (optional) image height is no more than maxh
|
||||
maxw: (optional) image width is no more than maxw
|
||||
|
||||
NOTE: See the ruleset for more examples that are not documented here.
|
||||
|
||||
=back
|
||||
@ -145,9 +166,8 @@ use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp);
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
use Digest::MD5 qw(md5_hex);
|
||||
use MIME::QuotedPrint;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
@ -161,89 +181,115 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("pdf_count");
|
||||
$self->register_eval_rule ("pdf_image_count");
|
||||
$self->register_eval_rule ("pdf_pixel_coverage");
|
||||
$self->register_eval_rule ("pdf_image_size_exact");
|
||||
$self->register_eval_rule ("pdf_image_size_range");
|
||||
$self->register_eval_rule ("pdf_named");
|
||||
$self->register_eval_rule ("pdf_name_regex");
|
||||
$self->register_eval_rule ("pdf_image_to_text_ratio");
|
||||
$self->register_eval_rule ("pdf_match_md5");
|
||||
$self->register_eval_rule ("pdf_match_fuzzy_md5");
|
||||
$self->register_eval_rule ("pdf_match_details");
|
||||
$self->register_eval_rule ("pdf_is_encrypted");
|
||||
$self->register_eval_rule ("pdf_is_empty_body");
|
||||
$self->register_eval_rule ("pdf_count", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_image_count", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_pixel_coverage", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_image_size_exact", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_image_size_range", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_named", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_name_regex", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_image_to_text_ratio", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_match_md5", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_match_fuzzy_md5", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_match_details", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_is_encrypted", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
$self->register_eval_rule ("pdf_is_empty_body", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
# lower priority for add_uri_detail_list to work
|
||||
$self->register_method_priority ("parsed_metadata", -1);
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my %get_details = (
|
||||
'pdf' => sub {
|
||||
my ($self, $pms, $part) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
my $type = $part->{'type'} || 'base64';
|
||||
my $data = '';
|
||||
# initialize
|
||||
$pms->{pdfinfo}->{count_pdf} = 0;
|
||||
$pms->{pdfinfo}->{count_pdf_images} = 0;
|
||||
|
||||
if ($type eq 'quoted-printable') {
|
||||
$data = decode_qp($data); # use QuotedPrint->decode_qp
|
||||
}
|
||||
else {
|
||||
$data = $part->decode(); # just use built in base64 decoder
|
||||
my @parts = $pms->{msg}->find_parts(qr@^(image|application)/(pdf|octet\-stream)$@, 1);
|
||||
my $part_count = scalar @parts;
|
||||
|
||||
dbg("pdfinfo: Identified $part_count possible mime parts that need checked for PDF content");
|
||||
|
||||
foreach my $p (@parts) {
|
||||
my $type = $p->{type} || '';
|
||||
my $name = $p->{name} || '';
|
||||
|
||||
dbg("pdfinfo: found part, type=$type file=$name");
|
||||
|
||||
# filename must end with .pdf, or application type can be pdf
|
||||
# sometimes windows muas will wrap a pdf up inside a .dat file
|
||||
# v0.8 - Added .fdf phoney PDF detection
|
||||
next unless ($name =~ /\.[fp]df$/i || $type =~ m@/pdf$@);
|
||||
|
||||
_get_pdf_details($pms, $p);
|
||||
$pms->{pdfinfo}->{count_pdf}++;
|
||||
}
|
||||
|
||||
my $index = substr($data, 0, 8);
|
||||
_set_tag($pms, 'PDFCOUNT', $pms->{pdfinfo}->{count_pdf});
|
||||
_set_tag($pms, 'PDFIMGCOUNT', $pms->{pdfinfo}->{count_pdf_images});
|
||||
}
|
||||
|
||||
return unless ($index =~ /.PDF\-(\d\.\d)/);
|
||||
sub _get_pdf_details {
|
||||
my ($pms, $part) = @_;
|
||||
|
||||
my $data = $part->decode();
|
||||
|
||||
# Remove UTF-8 BOM
|
||||
$data =~ s/^\xef\xbb\xbf//;
|
||||
|
||||
# Search magic in first 1024 bytes
|
||||
if ($data !~ /^.{0,1024}\%PDF\-(\d\.\d)/s) {
|
||||
dbg("pdfinfo: PDF magic header not found, invalid file?");
|
||||
return;
|
||||
}
|
||||
my $version = $1;
|
||||
$self->_set_tag($pms, 'PDFVERSION', $version);
|
||||
_set_tag($pms, 'PDFVERSION', $version);
|
||||
# dbg("pdfinfo: pdf version = $version");
|
||||
|
||||
my ($height, $width, $fuzzy_data, $pdf_tags);
|
||||
my ($producer, $created, $modified, $title, $creator, $author) = ('unknown','0','0','untitled','unknown','unknown');
|
||||
my ($md5, $fuzzy_md5) = ('', '');
|
||||
my ($fuzzy_data, $pdf_tags);
|
||||
my ($md5, $fuzzy_md5) = ('','');
|
||||
my ($total_height, $total_width, $total_area, $line_count) = (0,0,0,0);
|
||||
|
||||
my $name = $part->{'name'} || '';
|
||||
$self->_set_tag($pms, 'PDFNAME', $name);
|
||||
my $name = $part->{name} || '';
|
||||
_set_tag($pms, 'PDFNAME', $name);
|
||||
# store the file name so we can check pdf_named() or pdf_name_match() later.
|
||||
$pms->{pdfinfo}->{names_pdf}->{$name} = 1 if $name;
|
||||
|
||||
my $no_more_fuzzy = 0;
|
||||
my $got_image = 0;
|
||||
my $encrypted = 0;
|
||||
my %uris;
|
||||
|
||||
while($data =~ /([^\n]+)/g) {
|
||||
while ($data =~ /([^\n]+)/g) {
|
||||
# dbg("pdfinfo: line=$1");
|
||||
my $line = $1;
|
||||
|
||||
$line_count++;
|
||||
|
||||
# lines containing high bytes will have no data we need, so save some cycles
|
||||
next if ($line =~ /[\x80-\xff]/);
|
||||
|
||||
if (!$no_more_fuzzy && $line_count < 70) {
|
||||
if (!$no_more_fuzzy && ++$line_count < 70) {
|
||||
if ($line !~ m/^\%/ && $line !~ m/^\/(?:Height|Width|(?:(?:Media|Crop)Box))/ && $line !~ m/^\d+\s+\d+\s+\d+\s+\d+\s+\d+\s+\d+\s+cm$/) {
|
||||
$line =~ s/\s+$//; # strip off whitespace at end.
|
||||
$fuzzy_data .= $line;
|
||||
}
|
||||
}
|
||||
|
||||
if ($line =~ m/^\/([A-Za-z]+)/) {
|
||||
$pdf_tags .= $1;
|
||||
}
|
||||
|
||||
$got_image=1 if ($line =~ m/\/Image/);
|
||||
$encrypted=1 if ($line =~ m/^\/Encrypt/);
|
||||
|
||||
# once we hit the first stream, we stop collecting data for fuzzy md5
|
||||
$no_more_fuzzy = 1 if ($line =~ m/stream/);
|
||||
$no_more_fuzzy = 1 if index($line, 'stream') >= 0;
|
||||
}
|
||||
|
||||
$got_image = 1 if index($line, '/Image') >= 0;
|
||||
if (!$encrypted && index($line, '/Encrypt') == 0) {
|
||||
# store encrypted flag.
|
||||
$encrypted = $pms->{pdfinfo}->{encrypted} = 1;
|
||||
}
|
||||
|
||||
# From a v1.3 pdf
|
||||
# [12234] dbg: pdfinfo: line=630 0 0 149 0 0 cm
|
||||
# [12234] dbg: pdfinfo: line=/Width 630
|
||||
# [12234] dbg: pdfinfo: line=/Height 149
|
||||
if ($got_image) {
|
||||
my ($width, $height);
|
||||
if ($line =~ /^(\d+)\s+\d+\s+\d+\s+(\d+)\s+\d+\s+\d+\s+cm$/) {
|
||||
$width = $1;
|
||||
$height = $2;
|
||||
@ -258,21 +304,38 @@ my %get_details = (
|
||||
$width = $1;
|
||||
$height = $2;
|
||||
}
|
||||
}
|
||||
|
||||
# did pdf contain image data?
|
||||
if ($got_image && $width && $height) {
|
||||
if ($width && $height) {
|
||||
$no_more_fuzzy = 1;
|
||||
my $area = $width * $height;
|
||||
$total_height += $height;
|
||||
$total_width += $width;
|
||||
$total_area += $area;
|
||||
$pms->{pdfinfo}->{dems_pdf}->{"${height}x${width}"} = 1;
|
||||
$pms->{'pdfinfo'}->{"count_pdf_images"} ++;
|
||||
dbg("pdfinfo: Found image in PDF ".($name ? $name : '')." - $height x $width pixels ($area pixels sq.)");
|
||||
$self->_set_tag($pms, 'PDFIMGDIM', "${height}x${width}");
|
||||
$height=0; $width=0; # reset and check for next image
|
||||
$got_image = 0;
|
||||
$pms->{pdfinfo}->{count_pdf_images}++;
|
||||
dbg("pdfinfo: Found image in PDF $name: $height x $width pixels ($area pixels sq.)");
|
||||
_set_tag($pms, 'PDFIMGDIM', "${height}x${width}");
|
||||
$got_image = $height = $width = 0; # reset and check for next image
|
||||
}
|
||||
}
|
||||
|
||||
#
|
||||
# Triage - expecting / to be found for rest of the checks
|
||||
#
|
||||
next unless index($line, '/') >= 0;
|
||||
|
||||
if ($line =~ m/^\/([A-Za-z]+)/) {
|
||||
$pdf_tags .= $1;
|
||||
}
|
||||
|
||||
# XXX some pdf have uris but are stored inside binary data
|
||||
if (keys %uris < 20 && $line =~ /(?:\/S\s{0,2}\/URI\s{0,2}|^\s*)\/URI\s{0,2}( \( .*? (?<!\\) \) | < [^>]* > )/x) {
|
||||
my $location = _parse_string($1);
|
||||
next unless index($location, '.') > 0; # ignore some binary mess
|
||||
if (!exists $uris{$location}) {
|
||||
$uris{$location} = 1;
|
||||
dbg("pdfinfo: found URI: $location");
|
||||
$pms->add_uri_detail_list($location);
|
||||
}
|
||||
}
|
||||
|
||||
# [5310] dbg: pdfinfo: line=<</Producer(GPL Ghostscript 8.15)
|
||||
@ -283,380 +346,231 @@ my %get_details = (
|
||||
# [5310] dbg: pdfinfo: line=/Author(colet)>>endobj
|
||||
# or all on same line inside xml - v1.6+
|
||||
# <</CreationDate(D:20070226165054-06'00')/Creator( Adobe Photoshop CS2 Windows)/Producer(Adobe Photoshop for Windows -- Image Conversion Plug-in)/ModDate(D:20070226165100-06'00')>>
|
||||
|
||||
if ($line =~ /\/Producer\s?\(([^\)\\]+)/) {
|
||||
$producer = $1;
|
||||
# Or hex values
|
||||
# /Creator<FEFF005700720069007400650072>
|
||||
if ($line =~ /\/Author\s{0,2}( \( .*? (?<!\\) \) | < [^>]* > )/x) {
|
||||
my $author = _parse_string($1);
|
||||
dbg("pdfinfo: found property Author=$author");
|
||||
$pms->{pdfinfo}->{details}->{author}->{$author} = 1;
|
||||
_set_tag($pms, 'PDFAUTHOR', $author);
|
||||
}
|
||||
if ($line =~ /\/CreationDate\s?\(D\:(\d+)/) {
|
||||
$created = $1;
|
||||
if ($line =~ /\/Creator\s{0,2}( \( .*? (?<!\\) \) | < [^>]* > )/x) {
|
||||
my $creator = _parse_string($1);
|
||||
dbg("pdfinfo: found property Creator=$creator");
|
||||
$pms->{pdfinfo}->{details}->{creator}->{$creator} = 1;
|
||||
_set_tag($pms, 'PDFCREATOR', $creator);
|
||||
}
|
||||
if ($line =~ /\/ModDate\s?\(D\:(\d+)/) {
|
||||
$modified = $1;
|
||||
if ($line =~ /\/CreationDate\s{0,2}\(D\:(\d+)/) {
|
||||
my $created = _parse_string($1);
|
||||
dbg("pdfinfo: found property Created=$created");
|
||||
$pms->{pdfinfo}->{details}->{created}->{$created} = 1;
|
||||
}
|
||||
if ($line =~ /\/Title\s?\(([^\)\\]+)/) {
|
||||
$title = $1;
|
||||
# Title=\376\377\000w\000w\000n\000g
|
||||
# Title=wwng
|
||||
$title =~ s/\\\d{3}//g;
|
||||
if ($line =~ /\/ModDate\s{0,2}\(D\:(\d+)/) {
|
||||
my $modified = _parse_string($1);
|
||||
dbg("pdfinfo: found property Modified=$modified");
|
||||
$pms->{pdfinfo}->{details}->{modified}->{$modified} = 1;
|
||||
}
|
||||
if ($line =~ /\/Creator\s?\(([^\)\\]+)/) {
|
||||
$creator = $1;
|
||||
if ($line =~ /\/Producer\s{0,2}( \( .*? (?<!\\) \) | < [^>]* > )/x) {
|
||||
my $producer = _parse_string($1);
|
||||
dbg("pdfinfo: found property Producer=$producer");
|
||||
$pms->{pdfinfo}->{details}->{producer}->{$producer} = 1;
|
||||
_set_tag($pms, 'PDFPRODUCER', $producer);
|
||||
}
|
||||
if ($line =~ /\/Author\s?\(([^\)]+)/) {
|
||||
$author = $1;
|
||||
# Author=\376\377\000H\000P\000_\000A\000d\000m\000i\000n\000i\000s\000t\000r\000a\000t\000o\000r
|
||||
# Author=HP_Administrator
|
||||
$author =~ s/\\\d{3}//g;
|
||||
if ($line =~ /\/Title\s{0,2}( \( .*? (?<!\\) \) | < [^>]* > )/x) {
|
||||
my $title = _parse_string($1);
|
||||
dbg("pdfinfo: found property Title=$title");
|
||||
$pms->{pdfinfo}->{details}->{title}->{$title} = 1;
|
||||
_set_tag($pms, 'PDFTITLE', $title);
|
||||
}
|
||||
}
|
||||
|
||||
# store the file name so we can check pdf_named() or pdf_name_match() later.
|
||||
$pms->{pdfinfo}->{names_pdf}->{$name} = 1 if $name;
|
||||
|
||||
# store encrypted flag.
|
||||
$pms->{pdfinfo}->{encrypted} = $encrypted;
|
||||
|
||||
# if we had multiple images in the pdf, we need to store the total HxW as well.
|
||||
# If it was a single Image PDF, then this value will already be in the hash.
|
||||
$pms->{pdfinfo}->{dems_pdf}->{"${total_height}x${total_width}"} = 1 if ($total_height && $total_width);;
|
||||
$pms->{pdfinfo}->{dems_pdf}->{"${total_height}x${total_width}"} = 1 if ($total_height && $total_width);
|
||||
|
||||
if ($total_area) {
|
||||
$pms->{pdfinfo}->{pc_pdf} = $total_area;
|
||||
$self->_set_tag($pms, 'PDFIMGAREA', $total_area);
|
||||
dbg("pdfinfo: Filename=$name Total HxW: $total_height x $total_width ($total_area area)") if ($total_area);
|
||||
_set_tag($pms, 'PDFIMGAREA', $total_area);
|
||||
dbg("pdfinfo: Total HxW: $total_height x $total_width ($total_area area)");
|
||||
}
|
||||
|
||||
dbg("pdfinfo: Filename=$name Title=$title Author=$author Producer=$producer Created=$created Modified=$modified");
|
||||
|
||||
$md5 = uc(md5_hex($data)) if $data;
|
||||
$fuzzy_md5 = uc(md5_hex($fuzzy_data)) if $fuzzy_data;
|
||||
my $tags_md5;
|
||||
my $tags_md5 = '';
|
||||
$tags_md5 = uc(md5_hex($pdf_tags)) if $pdf_tags;
|
||||
|
||||
dbg("pdfinfo: MD5 results for ".($name ? $name : '')." - md5=".($md5 ? $md5 : '')." fuzzy1=".($fuzzy_md5 ? $fuzzy_md5 : '')." fuzzy2=".($tags_md5 ? $tags_md5 : ''));
|
||||
dbg("pdfinfo: MD5 results for $name: md5=$md5 fuzzy1=$fuzzy_md5 fuzzy2=$tags_md5");
|
||||
|
||||
# we dont need tags for these.
|
||||
$pms->{pdfinfo}->{details}->{created} = $created if $created;
|
||||
$pms->{pdfinfo}->{details}->{modified} = $modified if $modified;
|
||||
|
||||
if ($producer) {
|
||||
$pms->{pdfinfo}->{details}->{producer} = $producer if $producer;
|
||||
$self->_set_tag($pms, 'PDFPRODUCER', $producer);
|
||||
}
|
||||
if ($title) {
|
||||
$pms->{pdfinfo}->{details}->{title} = $title;
|
||||
$self->_set_tag($pms, 'PDFTITLE', $title);
|
||||
}
|
||||
if ($creator) {
|
||||
$pms->{pdfinfo}->{details}->{creator} = $creator;
|
||||
$self->_set_tag($pms, 'PDFCREATOR', $creator);
|
||||
}
|
||||
if ($author) {
|
||||
$pms->{pdfinfo}->{details}->{author} = $author;
|
||||
$self->_set_tag($pms, 'PDFAUTHOR', $author);
|
||||
}
|
||||
if ($md5) {
|
||||
$pms->{pdfinfo}->{md5}->{$md5} = 1;
|
||||
$self->_set_tag($pms, 'PDFMD5', $fuzzy_md5);
|
||||
_set_tag($pms, 'PDFMD5', $fuzzy_md5);
|
||||
}
|
||||
if ($fuzzy_md5) {
|
||||
$pms->{pdfinfo}->{fuzzy_md5}->{$fuzzy_md5} = 1;
|
||||
$self->_set_tag($pms, 'PDFMD5FUZZY1', $fuzzy_md5);
|
||||
_set_tag($pms, 'PDFMD5FUZZY1', $fuzzy_md5);
|
||||
}
|
||||
if ($tags_md5) {
|
||||
$pms->{pdfinfo}->{fuzzy_md5}->{$tags_md5} = 1;
|
||||
$self->_set_tag($pms, 'PDFMD5FUZZY2', $tags_md5);
|
||||
_set_tag($pms, 'PDFMD5FUZZY2', $tags_md5);
|
||||
}
|
||||
},
|
||||
}
|
||||
|
||||
);
|
||||
|
||||
# ----------------------------------------
|
||||
sub _parse_string {
|
||||
local $_ = shift;
|
||||
# Anything inside < > is hex encoded
|
||||
if (/^</) {
|
||||
# Might contain whitespace so search all hex values
|
||||
my $str = '';
|
||||
$str .= pack("H*", $1) while (/([0-9A-Fa-f]{2})/g);
|
||||
$_ = $str;
|
||||
# Handle/strip UTF-16 (in ultra-naive way for now)
|
||||
s/\x00//g if (s/^(?:\xfe\xff|\xff\xfe)//);
|
||||
} else {
|
||||
s/^\(//; s/\)$//;
|
||||
# Decode octals
|
||||
# Author=\376\377\000H\000P\000_\000A\000d\000m\000i\000n\000i\000s\000t\000r\000a\000t\000o\000r
|
||||
s/(?<!\\)\\([0-3][0-7][0-7])/pack("C",oct($1))/ge;
|
||||
# Handle/strip UTF-16 (in ultra-naive way for now)
|
||||
s/\x00//g if (s/^(?:\xfe\xff|\xff\xfe)//);
|
||||
# Unescape some stuff like \\ \( \)
|
||||
# Title(Foo \(bar\))
|
||||
s/\\([()\\])/$1/g;
|
||||
}
|
||||
# Limit to some sane length
|
||||
return substr($_, 0, 256);
|
||||
}
|
||||
|
||||
sub _set_tag {
|
||||
my ($pms, $tag, $value) = @_;
|
||||
|
||||
my ($self, $pms, $tag, $value) = @_;
|
||||
|
||||
dbg("pdfinfo: set_tag called for $tag $value");
|
||||
return unless ($tag && $value);
|
||||
return unless defined $value && $value ne '';
|
||||
dbg("pdfinfo: set_tag called for $tag: $value");
|
||||
|
||||
if (exists $pms->{tag_data}->{$tag}) {
|
||||
$pms->{tag_data}->{$tag} .= " $value"; # append value
|
||||
# Limit to some sane length
|
||||
if (length($pms->{tag_data}->{$tag}) < 2048) {
|
||||
$pms->{tag_data}->{$tag} .= ' '.$value; # append value
|
||||
}
|
||||
}
|
||||
else {
|
||||
$pms->{tag_data}->{$tag} = $value;
|
||||
}
|
||||
}
|
||||
|
||||
# ----------------------------------------
|
||||
|
||||
sub _find_pdf_mime_parts {
|
||||
my ($self,$pms) = @_;
|
||||
|
||||
# bail early if message does not have pdf parts
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
|
||||
# initialize
|
||||
$pms->{'pdfinfo'}->{"pc_pdf"} = 0;
|
||||
$pms->{'pdfinfo'}->{"count_pdf"} = 0;
|
||||
$pms->{'pdfinfo'}->{"count_pdf_images"} = 0;
|
||||
|
||||
my @parts = $pms->{msg}->find_parts(qr@^(image|application)/(pdf|octet\-stream)$@, 1);
|
||||
my $part_count = scalar @parts;
|
||||
|
||||
dbg("pdfinfo: Identified $part_count possible mime parts that need checked for PDF content");
|
||||
|
||||
# cache this so we can easily bail
|
||||
$pms->{'pdfinfo'}->{'no_parts'} = 1 unless $part_count;
|
||||
|
||||
foreach my $p (@parts) {
|
||||
my $type = $p->{'type'} =~ m@/([\w\-]+)$@;
|
||||
my $name = $p->{'name'} || '';
|
||||
|
||||
my $cte = lc( $p->get_header('content-transfer-encoding') || '' );
|
||||
|
||||
dbg("pdfinfo: found part, type=".($type ? $type : '')." file=".($name ? $name : '')." cte=".($cte ? $cte : '')."");
|
||||
|
||||
# make sure its a cte we support
|
||||
next unless ($cte =~ /^(?:base64|quoted\-printable)$/);
|
||||
|
||||
# filename must end with .pdf, or application type can be pdf
|
||||
# sometimes windows muas will wrap a pdf up inside a .dat file
|
||||
# v0.8 - Added .fdf phoney PDF detection
|
||||
next unless ($name =~ /\.[fp]df$/ || $type eq 'pdf');
|
||||
|
||||
# if we get this far, make sure type is pdf for sure (not octet-stream or anything else)
|
||||
$type='pdf';
|
||||
|
||||
if ($type && exists $get_details{$type}) {
|
||||
$get_details{$type}->($self, $pms, $p);
|
||||
$pms->{'pdfinfo'}->{"count_$type"} ++;
|
||||
}
|
||||
}
|
||||
|
||||
$self->_set_tag($pms, 'PDFCOUNT', $pms->{'pdfinfo'}->{"count_pdf"});
|
||||
$self->_set_tag($pms, 'PDFIMGCOUNT', $pms->{'pdfinfo'}->{"count_pdf_images"});
|
||||
|
||||
}
|
||||
|
||||
# ----------------------------------------
|
||||
|
||||
sub pdf_named {
|
||||
my ($self,$pms,$body,$name) = @_;
|
||||
return unless (defined $name);
|
||||
my ($self, $pms, $body, $name) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
return 0 unless defined $name;
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"names_pdf"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"names_pdf"}->{$name});
|
||||
return 1 if exists $pms->{pdfinfo}->{names_pdf}->{$name};
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_name_regex {
|
||||
my ($self,$pms,$body,$re) = @_;
|
||||
return unless (defined $re);
|
||||
my ($self, $pms, $body, $regex) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
return 0 unless defined $regex;
|
||||
return 0 unless exists $pms->{pdfinfo}->{names_pdf};
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"names_pdf"});
|
||||
|
||||
my ($rec, $err) = compile_regexp($re, 2);
|
||||
my ($rec, $err) = compile_regexp($regex, 2);
|
||||
if (!$rec) {
|
||||
info("pdfinfo: invalid regexp '$re': $err");
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
warn "pdfinfo: invalid regexp for $rulename '$regex': $err";
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $hit = 0;
|
||||
foreach my $name (keys %{$pms->{'pdfinfo'}->{"names_pdf"}}) {
|
||||
foreach my $name (keys %{$pms->{pdfinfo}->{names_pdf}}) {
|
||||
if ($name =~ $rec) {
|
||||
dbg("pdfinfo: pdf_name_regex hit on $name");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_is_encrypted {
|
||||
my ($self,$pms,$body) = @_;
|
||||
my ($self, $pms, $body) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return $pms->{'pdfinfo'}->{'encrypted'};
|
||||
return $pms->{pdfinfo}->{encrypted} ? 1 : 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_count {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless defined $min;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"count_pdf"});
|
||||
return result_check($min, $max, $pms->{'pdfinfo'}->{"count_pdf"});
|
||||
my ($self, $pms, $body, $min, $max) = @_;
|
||||
|
||||
return _result_check($min, $max, $pms->{pdfinfo}->{count_pdf});
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_count {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless defined $min;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"count_pdf_images"});
|
||||
return result_check($min, $max, $pms->{'pdfinfo'}->{"count_pdf_images"});
|
||||
my ($self, $pms, $body, $min, $max) = @_;
|
||||
|
||||
return _result_check($min, $max, $pms->{pdfinfo}->{count_pdf_images});
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_pixel_coverage {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless (defined $min);
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
|
||||
# dbg("pdfinfo: pc_$type: $min, ".($max ? $max:'').", $type, ".$pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
return result_check($min, $max, $pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
return _result_check($min, $max, $pms->{pdfinfo}->{pc_pdf});
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_to_text_ratio {
|
||||
my ($self,$pms,$body,$min,$max) = @_;
|
||||
return unless (defined $min && defined $max);
|
||||
my ($self, $pms, $body, $min, $max) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"pc_pdf"});
|
||||
return 0 unless defined $max;
|
||||
return 0 unless $pms->{pdfinfo}->{pc_pdf};
|
||||
|
||||
# depending on how you call this eval (body vs rawbody),
|
||||
# the $textlen will differ.
|
||||
my $textlen = length(join('',@$body));
|
||||
my $textlen = length(join('', @$body));
|
||||
return 0 unless $textlen;
|
||||
|
||||
return 0 unless ( $textlen > 0 && exists $pms->{'pdfinfo'}->{"pc_pdf"} && $pms->{'pdfinfo'}->{"pc_pdf"} > 0);
|
||||
|
||||
my $ratio = $textlen / $pms->{'pdfinfo'}->{"pc_pdf"};
|
||||
my $ratio = $textlen / $pms->{pdfinfo}->{pc_pdf};
|
||||
dbg("pdfinfo: image ratio=$ratio, min=$min max=$max");
|
||||
return result_check($min, $max, $ratio, 1);
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
return _result_check($min, $max, $ratio, 1);
|
||||
}
|
||||
|
||||
sub pdf_is_empty_body {
|
||||
my ($self,$pms,$body,$min) = @_;
|
||||
my ($self, $pms, $body, $min) = @_;
|
||||
|
||||
return 0 unless $pms->{pdfinfo}->{count_pdf};
|
||||
$min ||= 0; # default to 0 bytes
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless $pms->{'pdfinfo'}->{"count_pdf"};
|
||||
|
||||
# check for cached result
|
||||
return 1 if $pms->{'pdfinfo'}->{"no_body_text"};
|
||||
|
||||
shift @$body; # shift body array removes line #1 -> subject line.
|
||||
|
||||
my $bytes = 0;
|
||||
my $textlen = length(join('',@$body));
|
||||
my $idx = 0;
|
||||
foreach my $line (@$body) {
|
||||
next unless ($line =~ m/\S/);
|
||||
next if ($line =~ m/^Subject/);
|
||||
next if $idx++ == 0; # skip subject line
|
||||
next unless $line =~ /\S/;
|
||||
$bytes += length($line);
|
||||
# no hit if minimum already exceeded
|
||||
return 0 if $bytes > $min;
|
||||
}
|
||||
|
||||
dbg("pdfinfo: is_empty_body = $bytes bytes");
|
||||
|
||||
if ($bytes == 0 || ($bytes <= $min)) {
|
||||
$pms->{'pdfinfo'}->{"no_body_text"} = 1;
|
||||
dbg("pdfinfo: pdf_is_empty_body matched ($bytes <= $min)");
|
||||
return 1;
|
||||
}
|
||||
|
||||
# cache it and return 0
|
||||
$pms->{'pdfinfo'}->{"no_body_text"} = 0;
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_size_exact {
|
||||
my ($self,$pms,$body,$height,$width) = @_;
|
||||
return unless (defined $height && defined $width);
|
||||
my ($self, $pms, $body, $height, $width) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
return 0 unless defined $width;
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"dems_pdf"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"dems_pdf"}->{"${height}x${width}"});
|
||||
return 1 if exists $pms->{pdfinfo}->{dems_pdf}->{"${height}x${width}"};
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_image_size_range {
|
||||
my ($self,$pms,$body,$minh,$minw,$maxh,$maxw) = @_;
|
||||
return unless (defined $minh && defined $minw);
|
||||
my ($self, $pms, $body, $minh, $minw, $maxh, $maxw) = @_;
|
||||
|
||||
# make sure we have image data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
return 0 unless defined $minw;
|
||||
return 0 unless exists $pms->{pdfinfo}->{dems_pdf};
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"dems_pdf"});
|
||||
|
||||
foreach my $dem ( keys %{$pms->{'pdfinfo'}->{"dems_pdf"}}) {
|
||||
my ($h,$w) = split(/x/,$dem);
|
||||
foreach my $dem (keys %{$pms->{pdfinfo}->{dems_pdf}}) {
|
||||
my ($h, $w) = split(/x/, $dem);
|
||||
next if ($h < $minh); # height less than min height
|
||||
next if ($w < $minw); # width less than min width
|
||||
next if (defined $maxh && $h > $maxh); # height more than max height
|
||||
next if (defined $maxw && $w > $maxw); # width more than max width
|
||||
|
||||
# if we make it here, we have a match
|
||||
return 1;
|
||||
}
|
||||
@ -664,88 +578,54 @@ sub pdf_image_size_range {
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_match_md5 {
|
||||
my ($self, $pms, $body, $md5) = @_;
|
||||
|
||||
my ($self,$pms,$body,$md5) = @_;
|
||||
return unless defined $md5;
|
||||
return 0 unless defined $md5;
|
||||
|
||||
my $uc_md5 = uc($md5); # uppercase matches only
|
||||
|
||||
# make sure we have pdf data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"md5"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"md5"}->{$uc_md5});
|
||||
return 1 if exists $pms->{pdfinfo}->{md5}->{uc $md5};
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_match_fuzzy_md5 {
|
||||
my ($self, $pms, $body, $md5) = @_;
|
||||
|
||||
my ($self,$pms,$body,$md5) = @_;
|
||||
return unless defined $md5;
|
||||
return 0 unless defined $md5;
|
||||
|
||||
my $uc_md5 = uc($md5); # uppercase matches only
|
||||
|
||||
# make sure we have pdf data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{"fuzzy_md5"});
|
||||
return 1 if (exists $pms->{'pdfinfo'}->{"fuzzy_md5"}->{$uc_md5});
|
||||
return 1 if exists $pms->{pdfinfo}->{fuzzy_md5}->{uc $md5};
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub pdf_match_details {
|
||||
my ($self, $pms, $body, $detail, $regex) = @_;
|
||||
return unless ($detail && $regex);
|
||||
|
||||
# make sure we have pdf data read in.
|
||||
if (!exists $pms->{'pdfinfo'}) {
|
||||
$self->_find_pdf_mime_parts($pms);
|
||||
}
|
||||
|
||||
return 0 if (exists $pms->{'pdfinfo'}->{'no_parts'});
|
||||
return 0 unless (exists $pms->{'pdfinfo'}->{'details'});
|
||||
|
||||
my $check_value = $pms->{pdfinfo}->{details}->{$detail};
|
||||
return unless $check_value;
|
||||
return 0 unless defined $regex;
|
||||
return 0 unless exists $pms->{pdfinfo}->{details}->{$detail};
|
||||
|
||||
my ($rec, $err) = compile_regexp($regex, 2);
|
||||
if (!$rec) {
|
||||
info("pdfinfo: invalid regexp '$regex': $err");
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
warn "pdfinfo: invalid regexp for $rulename '$regex': $err";
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($check_value =~ $rec) {
|
||||
dbg("pdfinfo: pdf_match_details $detail $regex matches $check_value");
|
||||
foreach (keys %{$pms->{pdfinfo}->{details}->{$detail}}) {
|
||||
if ($_ =~ $rec) {
|
||||
dbg("pdfinfo: pdf_match_details $detail ($regex) match: $_");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
sub result_check {
|
||||
sub _result_check {
|
||||
my ($min, $max, $value, $nomaxequal) = @_;
|
||||
return 0 unless defined $value;
|
||||
return 0 if ($value < $min);
|
||||
return 0 if (defined $max && $value > $max);
|
||||
return 0 if (defined $nomaxequal && $nomaxequal && $value == $max);
|
||||
return 0 unless defined $min && defined $value;
|
||||
return 0 if $value < $min;
|
||||
return 0 if defined $max && $value > $max;
|
||||
return 0 if defined $nomaxequal && $nomaxequal && $value == $max;
|
||||
return 1;
|
||||
}
|
||||
|
||||
# -----------------------------------------
|
||||
|
||||
1;
|
||||
|
||||
|
@ -21,8 +21,10 @@ package Mail::SpamAssassin::Plugin::PhishTag;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
use Errno qw(EBADF);
|
||||
use Mail::SpamAssassin;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
@ -220,7 +222,7 @@ PhishTag - SpamAssassin plugin for redirecting links in incoming emails.
|
||||
=head1 DESCRIPTION
|
||||
|
||||
PhishTag enables administrators to rewrite links in emails that trigger certain
|
||||
tests, preferably anti-phishing blacklist tests. The plugin will inhibit the
|
||||
tests, preferably anti-phishing blocklist tests. The plugin will inhibit the
|
||||
blocking of a portion of the emails that trigger the test by SpamAssassin, and
|
||||
let them pass to the users' inbox after the rewrite. It is useful in providing
|
||||
training to email users about company policies and general email usage.
|
||||
|
@ -1,6 +1,6 @@
|
||||
#
|
||||
# Author: Giovanni Bechis <gbechis@apache.org>
|
||||
# Copyright 2018,2019 Giovanni Bechis
|
||||
# Copyright 2018,2020 Giovanni Bechis
|
||||
#
|
||||
# <@LICENSE>
|
||||
# Licensed to the Apache Software Foundation (ASF) under one or more
|
||||
@ -31,6 +31,7 @@ Mail::SpamAssassin::Plugin::Phishing - check uris against phishing feed
|
||||
ifplugin Mail::SpamAssassin::Plugin::Phishing
|
||||
phishing_openphish_feed /etc/mail/spamassassin/openphish-feed.txt
|
||||
phishing_phishtank_feed /etc/mail/spamassassin/phishtank-feed.csv
|
||||
phishing_phishstats_feed /etc/mail/spamassassin/phishstats-feed.csv
|
||||
body URI_PHISHING eval:check_phishing()
|
||||
describe URI_PHISHING Url match phishing in feed
|
||||
endif
|
||||
@ -38,21 +39,25 @@ Mail::SpamAssassin::Plugin::Phishing - check uris against phishing feed
|
||||
=head1 DESCRIPTION
|
||||
|
||||
This plugin finds uris used in phishing campaigns detected by
|
||||
OpenPhish or PhishTank feeds.
|
||||
OpenPhish, PhishTank or PhishStats feeds.
|
||||
|
||||
The Openphish free feed is updated every 6 hours and can be downloaded from
|
||||
https://openphish.com/feed.txt.
|
||||
The Premium Openphish feed is not currently supported.
|
||||
|
||||
The PhishTank free feed is updated every 1 hours and can be downloaded from
|
||||
http://data.phishtank.com/data/online-valid.csv.
|
||||
To avoid download limits a registration is required.
|
||||
|
||||
The PhishStats feed is updated every 90 minutes and can be downloaded from
|
||||
https://phishstats.info/phish_score.csv.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::Phishing;
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
my $VERSION = 1.1;
|
||||
|
||||
use Errno qw(EBADF);
|
||||
@ -61,7 +66,7 @@ use Mail::SpamAssassin::PerMsgStatus;
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub dbg { Mail::SpamAssassin::Plugin::dbg ("Phishing: @_"); }
|
||||
sub dbg { my $msg = shift; Mail::SpamAssassin::Plugin::dbg("Phishing: $msg", @_); }
|
||||
|
||||
sub new {
|
||||
my ($class, $mailsa) = @_;
|
||||
@ -71,7 +76,7 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsa->{conf});
|
||||
$self->register_eval_rule("check_phishing");
|
||||
$self->register_eval_rule("check_phishing", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -81,14 +86,98 @@ sub set_config {
|
||||
my @cmds;
|
||||
push(@cmds, {
|
||||
setting => 'phishing_openphish_feed',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
}
|
||||
);
|
||||
|
||||
=head1 ADMIN PREFERENCES
|
||||
|
||||
The following options can be used in site-wide (C<local.cf>)
|
||||
configuration files to customize how the module handles phishing uris
|
||||
|
||||
=cut
|
||||
|
||||
=over 4
|
||||
|
||||
=item phishing_openphish_feed
|
||||
|
||||
Absolute path of the downloaded OpenPhish datafeed.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
push(@cmds, {
|
||||
setting => 'phishing_phishtank_feed',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
}
|
||||
);
|
||||
|
||||
=over 4
|
||||
|
||||
=item phishing_phishtank_feed
|
||||
|
||||
Absolute path of the downloaded PhishTank datafeed.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
push(@cmds, {
|
||||
setting => 'phishing_uri_noparam',
|
||||
is_admin => 1,
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
}
|
||||
);
|
||||
|
||||
=over 4
|
||||
|
||||
=item phishing_uri_noparam ( 0 | 1 ) (default: 0)
|
||||
|
||||
If this option is set uri parameters will not be take into consideration
|
||||
when parsing the phishing uris datafeed.
|
||||
If this option is enabled and the url without parameters is "generic"
|
||||
(like https://www.kisa.link/url_redirector.php?url=...) the url will be
|
||||
skipped.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
push(@cmds, {
|
||||
setting => 'phishing_phishstats_feed',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
}
|
||||
);
|
||||
|
||||
=over 4
|
||||
|
||||
=item phishing_phishstats_feed
|
||||
|
||||
Absolute path of the downloaded PhishStats datafeed.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
push(@cmds, {
|
||||
setting => 'phishing_phishstats_minscore',
|
||||
is_admin => 1,
|
||||
default => 6,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
}
|
||||
);
|
||||
|
||||
=over 4
|
||||
|
||||
=item phishing_phishstats_minscore ( 0 - 10 ) (default: 6)
|
||||
|
||||
Minimum score to take into consideration for phishing uris downloaded
|
||||
from PhishStats datafeed.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
@ -100,7 +189,8 @@ sub finish_parsing_end {
|
||||
sub _read_configfile {
|
||||
my ($self) = @_;
|
||||
my $conf = $self->{main}->{registryboundaries}->{conf};
|
||||
my @phtank_ln;
|
||||
my (@phtank_ln, @phstats_ln);
|
||||
my $stripped_cluri;
|
||||
|
||||
local *F;
|
||||
if ( defined($conf->{phishing_openphish_feed}) && ( -f $conf->{phishing_openphish_feed} ) ) {
|
||||
@ -109,10 +199,14 @@ sub _read_configfile {
|
||||
chomp;
|
||||
#lines that start with pound are comments
|
||||
next if(/^\s*\#/);
|
||||
$stripped_cluri = $_;
|
||||
if ( $conf->{phishing_uri_noparam} eq 1 ) {
|
||||
$stripped_cluri =~ s/\?.*//;
|
||||
}
|
||||
my $phishdomain = $self->{main}->{registryboundaries}->uri_to_domain($_);
|
||||
if ( defined $phishdomain ) {
|
||||
push @{$self->{PHISHING}->{$_}->{phishdomain}}, $phishdomain;
|
||||
push @{$self->{PHISHING}->{$_}->{phishinfo}->{$phishdomain}}, "OpenPhish";
|
||||
push @{$self->{PHISHING}->{$stripped_cluri}->{phishdomain}}, $phishdomain;
|
||||
push @{$self->{PHISHING}->{$stripped_cluri}->{phishinfo}->{$phishdomain}}, "OpenPhish";
|
||||
}
|
||||
}
|
||||
|
||||
@ -133,11 +227,14 @@ sub _read_configfile {
|
||||
|
||||
@phtank_ln = split(/,/, $_);
|
||||
$phtank_ln[1] =~ s/\"//g;
|
||||
|
||||
$stripped_cluri = $phtank_ln[1];
|
||||
if ( $conf->{phishing_uri_noparam} eq 1 ) {
|
||||
$stripped_cluri =~ s/\?.*//;
|
||||
}
|
||||
my $phishdomain = $self->{main}->{registryboundaries}->uri_to_domain($phtank_ln[1]);
|
||||
if ( defined $phishdomain ) {
|
||||
push @{$self->{PHISHING}->{$phtank_ln[1]}->{phishdomain}}, $phishdomain;
|
||||
push @{$self->{PHISHING}->{$phtank_ln[1]}->{phishinfo}->{$phishdomain}}, "PhishTank";
|
||||
push @{$self->{PHISHING}->{$stripped_cluri}->{phishdomain}}, $phishdomain;
|
||||
push @{$self->{PHISHING}->{$stripped_cluri}->{phishinfo}->{$phishdomain}}, "PhishTank";
|
||||
}
|
||||
}
|
||||
|
||||
@ -146,6 +243,40 @@ sub _read_configfile {
|
||||
: die "error reading config file: $!";
|
||||
close(F) or die "error closing config file: $!";
|
||||
}
|
||||
|
||||
if ( defined($conf->{phishing_phishstats_feed}) && (-f $conf->{phishing_phishstats_feed} ) ) {
|
||||
open(F, '<', $conf->{phishing_phishstats_feed});
|
||||
for ($!=0; <F>; $!=0) {
|
||||
#skip first line
|
||||
next if ( $. eq 1);
|
||||
chomp;
|
||||
#lines that start with pound are comments
|
||||
next if(/^\s*\#/);
|
||||
|
||||
# CSV: Date,Score,URL,IP
|
||||
@phstats_ln = split(/,/, $_);
|
||||
$phstats_ln[1] =~ s/\"//g;
|
||||
$phstats_ln[2] =~ s/\"//g;
|
||||
if ( $conf->{phishing_phishstats_minscore} >= $phstats_ln[1] ) {
|
||||
next;
|
||||
}
|
||||
$stripped_cluri = $phstats_ln[2];
|
||||
if ( $conf->{phishing_uri_noparam} eq 1 ) {
|
||||
$stripped_cluri =~ s/\?.*//;
|
||||
}
|
||||
my $phishdomain = $self->{main}->{registryboundaries}->uri_to_domain($phstats_ln[2]);
|
||||
if ( defined $phishdomain ) {
|
||||
push @{$self->{PHISHING}->{$stripped_cluri}->{phishdomain}}, $phishdomain;
|
||||
push @{$self->{PHISHING}->{$stripped_cluri}->{phishinfo}->{$phishdomain}}, "PhishStats";
|
||||
}
|
||||
}
|
||||
|
||||
defined $_ || $!==0 or
|
||||
$!==EBADF ? dbg("PHISHING: error reading config file: $!")
|
||||
: die "error reading config file: $!";
|
||||
close(F) or die "error closing config file: $!";
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
sub check_phishing {
|
||||
@ -153,10 +284,11 @@ sub check_phishing {
|
||||
|
||||
my $feedname;
|
||||
my $domain;
|
||||
my $stripped_cluri;
|
||||
my $dcnt;
|
||||
|
||||
my $uris = $pms->get_uri_detail_list();
|
||||
|
||||
my $rulename = $pms->get_current_eval_rule_name();
|
||||
|
||||
while (my($uri, $info) = each %{$uris}) {
|
||||
# we want to skip mailto: uris
|
||||
next if ($uri =~ /^mailto:/i);
|
||||
@ -166,12 +298,21 @@ sub check_phishing {
|
||||
if (($info->{types}->{a}) || ($info->{types}->{parsed})) {
|
||||
# check url
|
||||
foreach my $cluri (@{$info->{cleaned}}) {
|
||||
if ( exists $self->{PHISHING}->{$cluri} ) {
|
||||
$stripped_cluri = $cluri;
|
||||
if( $self->{main}->{conf}->{phishing_uri_noparam} eq 1 ) {
|
||||
$stripped_cluri =~ s/\?.*//;
|
||||
$dcnt = $stripped_cluri =~ tr/\///;
|
||||
}
|
||||
# If uri without parameters are considered, skip too short uris
|
||||
# like https://www.google.com/url?sa=t&url=http://badsite.com
|
||||
if( ($self->{main}->{conf}->{phishing_uri_noparam} eq 1) && ($dcnt <= 3) ) {
|
||||
next;
|
||||
}
|
||||
if ( exists $self->{PHISHING}->{$stripped_cluri} ) {
|
||||
$domain = $self->{main}->{registryboundaries}->uri_to_domain($cluri);
|
||||
$feedname = $self->{PHISHING}->{$cluri}->{phishinfo}->{$domain}[0];
|
||||
dbg("HIT! $domain [$cluri] found in $feedname feed");
|
||||
$pms->test_log("$feedname ($domain)");
|
||||
$pms->got_hit($rulename, "", ruletype => 'eval');
|
||||
$feedname = $self->{PHISHING}->{$stripped_cluri}->{phishinfo}->{$domain}[0];
|
||||
dbg("HIT! $domain [$stripped_cluri] found in $feedname feed");
|
||||
$pms->test_log("$feedname ($domain)", $rulename);
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
@ -37,13 +37,17 @@ package Mail::SpamAssassin::Plugin::Pyzor;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Timeout;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var untaint_file_path
|
||||
proc_status_ok exit_status_str);
|
||||
use Mail::SpamAssassin::SubProcBackChannel;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var untaint_file_path am_running_on_windows
|
||||
proc_status_ok exit_status_str force_die);
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Storable;
|
||||
use POSIX qw(PIPE_BUF WNOHANG);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub new {
|
||||
@ -64,7 +68,7 @@ sub new {
|
||||
dbg("pyzor: network tests on, attempting Pyzor");
|
||||
}
|
||||
|
||||
$self->register_eval_rule("check_pyzor");
|
||||
$self->register_eval_rule("check_pyzor", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
@ -87,11 +91,27 @@ Whether to use Pyzor, if it is available.
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'use_pyzor',
|
||||
is_admin => 1,
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL
|
||||
});
|
||||
|
||||
=item pyzor_max NUMBER (default: 5)
|
||||
=item pyzor_fork (0|1) (default: 1)
|
||||
|
||||
Instead of running Pyzor synchronously, fork separate process for it and
|
||||
read the results in later (similar to async DNS lookups). Increases
|
||||
throughput. Considered experimental on Windows, where default is 0.
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'pyzor_fork',
|
||||
is_admin => 1,
|
||||
default => am_running_on_windows()?0:1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=item pyzor_count_min NUMBER (default: 5)
|
||||
|
||||
This option sets how often a message's body checksum must have been
|
||||
reported to the Pyzor server before SpamAssassin will consider the Pyzor
|
||||
@ -103,18 +123,69 @@ set this to a relatively low value, e.g. C<5>.
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'pyzor_max',
|
||||
setting => 'pyzor_count_min',
|
||||
is_admin => 1,
|
||||
default => 5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
# Deprecated setting, the name makes no sense!
|
||||
push (@cmds, {
|
||||
setting => 'pyzor_max',
|
||||
is_admin => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
warn("deprecated setting used, change pyzor_max to pyzor_count_min\n");
|
||||
if ($value !~ /^\d+$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{pyzor_count_min} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=item pyzor_welcomelist_min NUMBER (default: 10)
|
||||
|
||||
Previously pyzor_whitelist_min which will work interchangeably until 4.1.
|
||||
|
||||
This option sets how often a message's body checksum must have been
|
||||
welcomelisted to the Pyzor server for SpamAssassin to consider ignoring the
|
||||
result. Final decision is made by pyzor_welcomelist_factor.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'pyzor_welcomelist_min',
|
||||
aliases => ['pyzor_whitelist_min'], # removed in 4.1
|
||||
is_admin => 1,
|
||||
default => 10,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=item pyzor_welcomelist_factor NUMBER (default: 0.2)
|
||||
|
||||
Previously pyzor_whitelist_factor which will work interchangeably until 4.1.
|
||||
|
||||
Ignore Pyzor result if REPORTCOUNT x NUMBER >= pyzor_welcomelist_min.
|
||||
For default setting this means: 50 reports requires 10 welcomelistings.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'pyzor_welcomelist_factor',
|
||||
aliases => ['pyzor_whitelist_factor'], # removed in 4.1
|
||||
is_admin => 1,
|
||||
default => 0.2,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=back
|
||||
|
||||
=head1 ADMINISTRATOR OPTIONS
|
||||
|
||||
=over 4
|
||||
|
||||
=item pyzor_timeout n (default: 3.5)
|
||||
=item pyzor_timeout n (default: 5)
|
||||
|
||||
How many seconds you wait for Pyzor to complete, before scanning continues
|
||||
without the Pyzor results. A numeric value is optionally suffixed by a
|
||||
@ -142,7 +213,7 @@ removing one of them.
|
||||
push (@cmds, {
|
||||
setting => 'pyzor_timeout',
|
||||
is_admin => 1,
|
||||
default => 3.5,
|
||||
default => 5,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_DURATION
|
||||
});
|
||||
|
||||
@ -202,178 +273,339 @@ you should use this, as the current PATH will have been cleared.
|
||||
sub is_pyzor_available {
|
||||
my ($self) = @_;
|
||||
|
||||
my $pyzor = $self->{main}->{conf}->{pyzor_path} || '';
|
||||
unless ($pyzor) {
|
||||
$pyzor = Mail::SpamAssassin::Util::find_executable_in_env_path('pyzor');
|
||||
}
|
||||
my $pyzor = $self->{main}->{conf}->{pyzor_path} ||
|
||||
Mail::SpamAssassin::Util::find_executable_in_env_path('pyzor');
|
||||
|
||||
unless ($pyzor && -x $pyzor) {
|
||||
dbg("pyzor: pyzor is not available: no pyzor executable found");
|
||||
dbg("pyzor: no pyzor executable found");
|
||||
$self->{pyzor_available} = 0;
|
||||
return 0;
|
||||
}
|
||||
|
||||
# remember any found pyzor
|
||||
$self->{main}->{conf}->{pyzor_path} = $pyzor;
|
||||
|
||||
dbg("pyzor: pyzor is available: " . $self->{main}->{conf}->{pyzor_path});
|
||||
dbg("pyzor: pyzor is available: $pyzor");
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub get_pyzor_interface {
|
||||
my ($self) = @_;
|
||||
sub finish_parsing_start {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
if (!$self->{main}->{conf}->{use_pyzor}) {
|
||||
dbg("pyzor: use_pyzor option not enabled, disabling Pyzor");
|
||||
$self->{pyzor_interface} = "disabled";
|
||||
$self->{pyzor_available} = 0;
|
||||
# If forking, hard adjust priority -100 to launch early
|
||||
# Find rulenames from eval_to_rule mappings
|
||||
if ($opts->{conf}->{pyzor_fork}) {
|
||||
foreach (@{$opts->{conf}->{eval_to_rule}->{check_pyzor}}) {
|
||||
dbg("pyzor: adjusting rule $_ priority to -100");
|
||||
$opts->{conf}->{priority}->{$_} = -100;
|
||||
}
|
||||
elsif ($self->is_pyzor_available()) {
|
||||
$self->{pyzor_interface} = "pyzor";
|
||||
$self->{pyzor_available} = 1;
|
||||
}
|
||||
else {
|
||||
dbg("pyzor: no pyzor found, disabling Pyzor");
|
||||
$self->{pyzor_available} = 0;
|
||||
}
|
||||
}
|
||||
|
||||
sub check_pyzor {
|
||||
my ($self, $permsgstatus, $full) = @_;
|
||||
my ($self, $pms, $full) = @_;
|
||||
|
||||
# initialize valid tags
|
||||
$permsgstatus->{tag_data}->{PYZOR} = "";
|
||||
return 0 if !$self->{pyzor_available};
|
||||
return 0 if !$self->{main}->{conf}->{use_pyzor};
|
||||
|
||||
return 0 if $pms->{pyzor_running};
|
||||
$pms->{pyzor_running} = 1;
|
||||
|
||||
return 0 if !$self->is_pyzor_available();
|
||||
|
||||
my $timer = $self->{main}->time_method("check_pyzor");
|
||||
|
||||
$self->get_pyzor_interface();
|
||||
return 0 unless $self->{pyzor_available};
|
||||
# initialize valid tags
|
||||
$pms->{tag_data}->{PYZOR} = '';
|
||||
|
||||
return $self->pyzor_lookup($permsgstatus, $full);
|
||||
# create fulltext tmpfile now (before possible forking)
|
||||
$pms->{pyzor_tmpfile} = $pms->create_fulltext_tmpfile();
|
||||
|
||||
## non-forking method
|
||||
|
||||
if (!$self->{main}->{conf}->{pyzor_fork}) {
|
||||
my @results = $self->pyzor_lookup($pms);
|
||||
return $self->_check_result($pms, \@results);
|
||||
}
|
||||
|
||||
## forking method
|
||||
|
||||
$pms->{pyzor_rulename} = $pms->get_current_eval_rule_name();
|
||||
|
||||
# create socketpair for communication
|
||||
$pms->{pyzor_backchannel} = Mail::SpamAssassin::SubProcBackChannel->new();
|
||||
my $back_selector = '';
|
||||
$pms->{pyzor_backchannel}->set_selector(\$back_selector);
|
||||
eval {
|
||||
$pms->{pyzor_backchannel}->setup_backchannel_parent_pre_fork();
|
||||
} or do {
|
||||
dbg("pyzor: backchannel pre-setup failed: $@");
|
||||
delete $pms->{pyzor_backchannel};
|
||||
return 0;
|
||||
};
|
||||
|
||||
my $pid = fork();
|
||||
if (!defined $pid) {
|
||||
info("pyzor: child fork failed: $!");
|
||||
delete $pms->{pyzor_backchannel};
|
||||
return 0;
|
||||
}
|
||||
if (!$pid) {
|
||||
$0 = "$0 (pyzor)";
|
||||
$SIG{CHLD} = 'DEFAULT';
|
||||
$SIG{PIPE} = 'IGNORE';
|
||||
$SIG{$_} = sub {
|
||||
eval { dbg("pyzor: child process $$ caught signal $_[0]"); };
|
||||
force_die(6); # avoid END and destructor processing
|
||||
} foreach am_running_on_windows()?qw(INT HUP TERM QUIT):qw(INT HUP TERM TSTP QUIT USR1 USR2);
|
||||
dbg("pyzor: child process $$ forked");
|
||||
$pms->{pyzor_backchannel}->setup_backchannel_child_post_fork();
|
||||
my @results = $self->pyzor_lookup($pms);
|
||||
my $backmsg;
|
||||
eval {
|
||||
$backmsg = Storable::freeze(\@results);
|
||||
};
|
||||
if ($@) {
|
||||
dbg("pyzor: child return value freeze failed: $@");
|
||||
force_die(0); # avoid END and destructor processing
|
||||
}
|
||||
if (!syswrite($pms->{pyzor_backchannel}->{parent}, $backmsg)) {
|
||||
dbg("pyzor: child backchannel write failed: $!");
|
||||
}
|
||||
force_die(0); # avoid END and destructor processing
|
||||
}
|
||||
|
||||
$pms->{pyzor_pid} = $pid;
|
||||
|
||||
eval {
|
||||
$pms->{pyzor_backchannel}->setup_backchannel_parent_post_fork($pid);
|
||||
} or do {
|
||||
dbg("pyzor: backchannel post-setup failed: $@");
|
||||
delete $pms->{pyzor_backchannel};
|
||||
return 0;
|
||||
};
|
||||
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
sub pyzor_lookup {
|
||||
my ($self, $permsgstatus, $fulltext) = @_;
|
||||
my @response;
|
||||
my $pyzor_count;
|
||||
my $pyzor_whitelisted;
|
||||
my $timeout = $self->{main}->{conf}->{pyzor_timeout};
|
||||
my ($self, $pms) = @_;
|
||||
|
||||
$pyzor_count = 0;
|
||||
$pyzor_whitelisted = 0;
|
||||
my $pid;
|
||||
|
||||
# use a temp file here -- open2() is unreliable, buffering-wise, under spamd
|
||||
my $tmpf = $permsgstatus->create_fulltext_tmpfile($fulltext);
|
||||
my $conf = $self->{main}->{conf};
|
||||
my $timeout = $conf->{pyzor_timeout};
|
||||
|
||||
# note: not really tainted, this came from system configuration file
|
||||
my $path = untaint_file_path($self->{main}->{conf}->{pyzor_path});
|
||||
my $opts = untaint_var($self->{main}->{conf}->{pyzor_options}) || '';
|
||||
my $path = untaint_file_path($conf->{pyzor_path});
|
||||
my $opts = untaint_var($conf->{pyzor_options}) || '';
|
||||
|
||||
$permsgstatus->enter_helper_run_mode();
|
||||
$pms->enter_helper_run_mode();
|
||||
|
||||
my $pid;
|
||||
my @resp;
|
||||
my $timer = Mail::SpamAssassin::Timeout->new(
|
||||
{ secs => $timeout, deadline => $permsgstatus->{master_deadline} });
|
||||
{ secs => $timeout, deadline => $pms->{master_deadline} });
|
||||
my $err = $timer->run_and_catch(sub {
|
||||
|
||||
local $SIG{PIPE} = sub { die "__brokenpipe__ignore__\n" };
|
||||
|
||||
dbg("pyzor: opening pipe: " . join(' ', $path, $opts, "check", "< $tmpf"));
|
||||
dbg("pyzor: opening pipe: ".
|
||||
join(' ', $path, $opts, "check", "<".$pms->{pyzor_tmpfile}));
|
||||
|
||||
$pid = Mail::SpamAssassin::Util::helper_app_pipe_open(*PYZOR,
|
||||
$tmpf, 1, $path, split(' ', $opts), "check");
|
||||
$pms->{pyzor_tmpfile}, 1, $path, split(' ', $opts), "check");
|
||||
$pid or die "$!\n";
|
||||
|
||||
# read+split avoids a Perl I/O bug (Bug 5985)
|
||||
my($inbuf,$nread,$resp); $resp = '';
|
||||
while ( $nread=read(PYZOR,$inbuf,8192) ) { $resp .= $inbuf }
|
||||
my($inbuf, $nread);
|
||||
my $resp = '';
|
||||
while ($nread = read(PYZOR, $inbuf, 8192)) { $resp .= $inbuf }
|
||||
defined $nread or die "error reading from pipe: $!";
|
||||
@response = split(/^/m, $resp, -1); undef $resp;
|
||||
@resp = split(/^/m, $resp, -1);
|
||||
|
||||
my $errno = 0; close PYZOR or $errno = $!;
|
||||
if (proc_status_ok($?,$errno)) {
|
||||
my $errno = 0;
|
||||
close PYZOR or $errno = $!;
|
||||
if (proc_status_ok($?, $errno)) {
|
||||
dbg("pyzor: [%s] finished successfully", $pid);
|
||||
} elsif (proc_status_ok($?,$errno, 0,1)) { # sometimes it exits with 1
|
||||
dbg("pyzor: [%s] finished: %s", $pid, exit_status_str($?,$errno));
|
||||
} elsif (proc_status_ok($?, $errno, 0, 1)) { # sometimes it exits with 1
|
||||
dbg("pyzor: [%s] finished: %s", $pid, exit_status_str($?, $errno));
|
||||
} else {
|
||||
info("pyzor: [%s] error: %s", $pid, exit_status_str($?,$errno));
|
||||
}
|
||||
|
||||
if (!@response) {
|
||||
# this exact string is needed below
|
||||
warn("no response\n"); # yes, this is possible
|
||||
return;
|
||||
}
|
||||
chomp for @response;
|
||||
|
||||
if ($response[0] =~ /^Traceback/) {
|
||||
warn("internal error, python traceback seen in response: ".
|
||||
join("\\n", @response));
|
||||
} else {
|
||||
dbg("pyzor: got response: ".join("\\n", @response));
|
||||
info("pyzor: [%s] error: %s", $pid, exit_status_str($?, $errno));
|
||||
}
|
||||
|
||||
});
|
||||
|
||||
if (defined(fileno(*PYZOR))) { # still open
|
||||
if ($pid) {
|
||||
if (kill('TERM',$pid)) { dbg("pyzor: killed stale helper [$pid]") }
|
||||
else { dbg("pyzor: killing helper application [$pid] failed: $!") }
|
||||
if (kill('TERM', $pid)) {
|
||||
dbg("pyzor: killed stale helper [$pid]");
|
||||
} else {
|
||||
dbg("pyzor: killing helper application [$pid] failed: $!");
|
||||
}
|
||||
my $errno = 0; close PYZOR or $errno = $!;
|
||||
proc_status_ok($?,$errno)
|
||||
or info("pyzor: [%s] error: %s", $pid, exit_status_str($?,$errno));
|
||||
}
|
||||
$permsgstatus->leave_helper_run_mode();
|
||||
my $errno = 0;
|
||||
close PYZOR or $errno = $!;
|
||||
proc_status_ok($?, $errno)
|
||||
or info("pyzor: [%s] error: %s", $pid, exit_status_str($?, $errno));
|
||||
}
|
||||
|
||||
$pms->leave_helper_run_mode();
|
||||
|
||||
if ($timer->timed_out()) {
|
||||
dbg("pyzor: check timed out after $timeout seconds");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($err) {
|
||||
return ();
|
||||
} elsif ($err) {
|
||||
chomp $err;
|
||||
if ($err eq "__brokenpipe__ignore__") {
|
||||
dbg("pyzor: check failed: broken pipe");
|
||||
} elsif ($err eq "no response") {
|
||||
dbg("pyzor: check failed: no response");
|
||||
} else {
|
||||
warn("pyzor: check failed: $err\n");
|
||||
info("pyzor: check failed: $err");
|
||||
return ();
|
||||
}
|
||||
|
||||
return @resp;
|
||||
}
|
||||
|
||||
sub check_tick {
|
||||
my ($self, $opts) = @_;
|
||||
$self->_check_forked_result($opts->{permsgstatus}, 0);
|
||||
}
|
||||
|
||||
sub check_cleanup {
|
||||
my ($self, $opts) = @_;
|
||||
$self->_check_forked_result($opts->{permsgstatus}, 1);
|
||||
}
|
||||
|
||||
sub _check_forked_result {
|
||||
my ($self, $pms, $finish) = @_;
|
||||
|
||||
return 0 if !$pms->{pyzor_backchannel};
|
||||
return 0 if !$pms->{pyzor_pid};
|
||||
|
||||
my $timer = $self->{main}->time_method("check_pyzor");
|
||||
|
||||
$pms->{pyzor_abort} = $pms->{deadline_exceeded} || $pms->{shortcircuited};
|
||||
|
||||
my $kid_pid = $pms->{pyzor_pid};
|
||||
# if $finish, force waiting for the child
|
||||
my $pid = waitpid($kid_pid, $finish && !$pms->{pyzor_abort} ? 0 : WNOHANG);
|
||||
if ($pid == 0) {
|
||||
#dbg("pyzor: child process $kid_pid not finished yet, trying later");
|
||||
if ($pms->{pyzor_abort}) {
|
||||
dbg("pyzor: bailing out due to deadline/shortcircuit");
|
||||
kill('TERM', $kid_pid);
|
||||
if (waitpid($kid_pid, WNOHANG) == 0) {
|
||||
sleep(1);
|
||||
if (waitpid($kid_pid, WNOHANG) == 0) {
|
||||
dbg("pyzor: child process $kid_pid still alive, KILL");
|
||||
kill('KILL', $kid_pid);
|
||||
waitpid($kid_pid, 0);
|
||||
}
|
||||
}
|
||||
delete $pms->{pyzor_pid};
|
||||
delete $pms->{pyzor_backchannel};
|
||||
}
|
||||
return 0;
|
||||
} elsif ($pid == -1) {
|
||||
# child does not exist?
|
||||
dbg("pyzor: child process $kid_pid already handled?");
|
||||
delete $pms->{pyzor_backchannel};
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach my $one_response (@response) {
|
||||
$pms->rule_ready($pms->{pyzor_rulename}); # mark rule ready for metas
|
||||
|
||||
dbg("pyzor: child process $kid_pid finished, reading results");
|
||||
|
||||
my $backmsg;
|
||||
my $ret = sysread($pms->{pyzor_backchannel}->{latest_kid_fh}, $backmsg, am_running_on_windows()?512:PIPE_BUF);
|
||||
if (!defined $ret || $ret == 0) {
|
||||
dbg("pyzor: could not read result from child: ".($ret == 0 ? 0 : $!));
|
||||
delete $pms->{pyzor_backchannel};
|
||||
return 0;
|
||||
}
|
||||
|
||||
delete $pms->{pyzor_backchannel};
|
||||
|
||||
my $results;
|
||||
eval {
|
||||
$results = Storable::thaw($backmsg);
|
||||
};
|
||||
if ($@) {
|
||||
dbg("pyzor: child return value thaw failed: $@");
|
||||
return;
|
||||
}
|
||||
|
||||
$self->_check_result($pms, $results);
|
||||
}
|
||||
|
||||
sub _check_result {
|
||||
my ($self, $pms, $results) = @_;
|
||||
|
||||
if (!@$results) {
|
||||
dbg("pyzor: no response from server");
|
||||
return 0;
|
||||
}
|
||||
|
||||
my $count = 0;
|
||||
my $count_wl = 0;
|
||||
foreach my $res (@$results) {
|
||||
chomp($res);
|
||||
if ($res =~ /^Traceback/) {
|
||||
info("pyzor: internal error, python traceback seen in response: $res");
|
||||
return 0;
|
||||
}
|
||||
dbg("pyzor: got response: $res");
|
||||
# this regexp is intended to be a little bit forgiving
|
||||
if ($one_response =~ /^\S+\t.*?\t(\d+)\t(\d+)\s*$/) {
|
||||
if ($res =~ /^\S+\t.*?\t(\d+)\t(\d+)\s*$/) {
|
||||
# until pyzor servers can sync their DBs,
|
||||
# sum counts obtained from all servers
|
||||
$pyzor_whitelisted += $2+0;
|
||||
$pyzor_count += $1+0;
|
||||
}
|
||||
else {
|
||||
$count += untaint_var($1)+0; # crazy but needs untainting
|
||||
$count_wl += untaint_var($2)+0;
|
||||
} else {
|
||||
# warn on failures to parse
|
||||
dbg("pyzor: failure to parse response \"$one_response\"");
|
||||
info("pyzor: failure to parse response \"$res\"");
|
||||
}
|
||||
}
|
||||
|
||||
$permsgstatus->set_tag('PYZOR', $pyzor_whitelisted ? "Whitelisted."
|
||||
: "Reported $pyzor_count times.");
|
||||
my $conf = $self->{main}->{conf};
|
||||
|
||||
if ($pyzor_count >= $self->{main}->{conf}->{pyzor_max}) {
|
||||
dbg("pyzor: listed: COUNT=$pyzor_count/$self->{main}->{conf}->{pyzor_max} WHITELIST=$pyzor_whitelisted");
|
||||
my $count_min = $conf->{pyzor_count_min};
|
||||
my $wl_min = $conf->{pyzor_welcomelist_min};
|
||||
|
||||
my $wl_limit = $count_wl >= $wl_min ?
|
||||
$count * $conf->{pyzor_welcomelist_factor} : 0;
|
||||
|
||||
dbg("pyzor: result: COUNT=$count/$count_min WELCOMELIST=$count_wl/$wl_min/%.1f",
|
||||
$wl_limit);
|
||||
$pms->set_tag('PYZOR', "Reported $count times, welcomelisted $count_wl times.");
|
||||
|
||||
# Empty body etc results in same hash, we should skip very large numbers..
|
||||
if ($count >= 1000000 || $count_wl >= 10000) {
|
||||
dbg("pyzor: result exceeded hardcoded limits, ignoring: count/wl 1000000/10000");
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Welcomelisted?
|
||||
if ($wl_limit && $count_wl >= $wl_limit) {
|
||||
dbg("pyzor: message welcomelisted");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if ($count >= $count_min) {
|
||||
if ($conf->{pyzor_fork}) {
|
||||
# forked needs to run got_hit()
|
||||
$pms->got_hit($pms->{pyzor_rulename}, "", ruletype => 'eval');
|
||||
return 0;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
sub plugin_report {
|
||||
my ($self, $options) = @_;
|
||||
|
||||
return unless $self->{pyzor_available};
|
||||
return unless $self->{main}->{conf}->{use_pyzor};
|
||||
return if !$self->{pyzor_available};
|
||||
return if !$self->{main}->{conf}->{use_pyzor};
|
||||
return if $options->{report}->{options}->{dont_report_to_pyzor};
|
||||
return if !$self->is_pyzor_available();
|
||||
|
||||
if (!$options->{report}->{options}->{dont_report_to_pyzor} && $self->is_pyzor_available())
|
||||
{
|
||||
# use temporary file: open2() is unreliable due to buffering under spamd
|
||||
my $tmpf = $options->{report}->create_fulltext_tmpfile($options->{text});
|
||||
if ($self->pyzor_report($options, $tmpf)) {
|
||||
@ -384,8 +616,9 @@ sub plugin_report {
|
||||
else {
|
||||
info("reporter: could not report spam to Pyzor");
|
||||
}
|
||||
$options->{report}->delete_fulltext_tmpfile();
|
||||
}
|
||||
$options->{report}->delete_fulltext_tmpfile($tmpf);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
sub pyzor_report {
|
||||
@ -449,6 +682,9 @@ sub pyzor_report {
|
||||
return 1;
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_fork { 1 }
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
@ -43,11 +43,16 @@ package Mail::SpamAssassin::Plugin::Razor2;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Timeout;
|
||||
use Mail::SpamAssassin::SubProcBackChannel;
|
||||
use Mail::SpamAssassin::Util qw(force_die am_running_on_windows);
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use Storable;
|
||||
use POSIX qw(PIPE_BUF WNOHANG);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
sub new {
|
||||
@ -73,8 +78,8 @@ sub new {
|
||||
}
|
||||
}
|
||||
|
||||
$self->register_eval_rule("check_razor2");
|
||||
$self->register_eval_rule("check_razor2_range");
|
||||
$self->register_eval_rule("check_razor2", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
$self->register_eval_rule("check_razor2_range", $Mail::SpamAssassin::Conf::TYPE_FULL_EVALS);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
@ -93,10 +98,26 @@ Whether to use Razor2, if it is available.
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'use_razor2',
|
||||
is_admin => 1,
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=item razor_fork (0|1) (default: 1)
|
||||
|
||||
Instead of running Razor2 synchronously, fork separate process for it and
|
||||
read the results in later (similar to async DNS lookups). Increases
|
||||
throughput. Considered experimental on Windows, where default is 0.
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'razor_fork',
|
||||
is_admin => 1,
|
||||
default => am_running_on_windows()?0:1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC,
|
||||
});
|
||||
|
||||
=back
|
||||
|
||||
=head1 ADMINISTRATOR SETTINGS
|
||||
@ -195,7 +216,7 @@ sub razor2_access {
|
||||
my $sigs = $rc->compute_sigs($objects)
|
||||
or die "$debug: error in compute_sigs";
|
||||
|
||||
# if mail isn't whitelisted, check it out
|
||||
# if mail isn't welcomelisted, check it out
|
||||
# see 'man razor-whitelist'
|
||||
if ($type ne 'check' || ! $rc->local_check($objects->[0])) {
|
||||
# provide a better error message when servers are unavailable,
|
||||
@ -347,6 +368,8 @@ sub plugin_report {
|
||||
return unless $self->{main}->{conf}->{use_razor2};
|
||||
return if $options->{report}->{options}->{dont_report_to_razor};
|
||||
|
||||
my $timer = $self->{main}->time_method("razor2_report");
|
||||
|
||||
if ($self->razor2_access($options->{text}, 'report', undef)) {
|
||||
$options->{report}->{report_available} = 1;
|
||||
info('reporter: spam reported to Razor');
|
||||
@ -360,6 +383,8 @@ sub plugin_report {
|
||||
sub plugin_revoke {
|
||||
my ($self, $options) = @_;
|
||||
|
||||
my $timer = $self->{main}->time_method("razor2_revoke");
|
||||
|
||||
return unless $self->{razor2_available};
|
||||
return if $self->{main}->{local_tests_only};
|
||||
return unless $self->{main}->{conf}->{use_razor2};
|
||||
@ -375,34 +400,191 @@ sub plugin_revoke {
|
||||
}
|
||||
}
|
||||
|
||||
sub finish_parsing_start {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
# If forking, hard adjust priority -100 to launch early
|
||||
# Find rulenames from eval_to_rule mappings
|
||||
if ($opts->{conf}->{razor_fork}) {
|
||||
foreach (@{$opts->{conf}->{eval_to_rule}->{check_razor2}}) {
|
||||
dbg("razor2: adjusting rule $_ priority to -100");
|
||||
$opts->{conf}->{priority}->{$_} = -100;
|
||||
}
|
||||
foreach (@{$opts->{conf}->{eval_to_rule}->{check_razor2_range}}) {
|
||||
dbg("razor2: adjusting rule $_ priority to -100");
|
||||
$opts->{conf}->{priority}->{$_} = -100;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
sub check_razor2 {
|
||||
my ($self, $permsgstatus, $full) = @_;
|
||||
my ($self, $pms, $full) = @_;
|
||||
|
||||
return $permsgstatus->{razor2_result} if (defined $permsgstatus->{razor2_result});
|
||||
$permsgstatus->{razor2_result} = 0;
|
||||
$permsgstatus->{razor2_cf_score} = { '4' => 0, '8' => 0 };
|
||||
return 0 unless $self->{razor2_available};
|
||||
return 0 unless $self->{main}->{conf}->{use_razor2};
|
||||
|
||||
return unless $self->{razor2_available};
|
||||
return unless $self->{main}->{conf}->{use_razor2};
|
||||
return $pms->{razor2_result} if (defined $pms->{razor2_result});
|
||||
|
||||
return 0 if $pms->{razor2_running};
|
||||
$pms->{razor2_running} = 1;
|
||||
|
||||
my $timer = $self->{main}->time_method("check_razor2");
|
||||
|
||||
my $return;
|
||||
my @results;
|
||||
## non-forking method
|
||||
|
||||
if (!$self->{main}->{conf}->{razor_fork}) {
|
||||
# TODO: check for cache header, set results appropriately
|
||||
|
||||
# do it this way to make it easier to get out the results later from the
|
||||
# netcache plugin
|
||||
($return, @results) =
|
||||
$self->razor2_access($full, 'check', $permsgstatus->{master_deadline});
|
||||
# netcache plugin ... what netcache plugin?
|
||||
(undef, my @results) =
|
||||
$self->razor2_access($full, 'check', $pms->{master_deadline});
|
||||
return $self->_check_result($pms, \@results);
|
||||
}
|
||||
|
||||
## forking method
|
||||
|
||||
$pms->{razor2_rulename} = $pms->get_current_eval_rule_name();
|
||||
|
||||
# create socketpair for communication
|
||||
$pms->{razor2_backchannel} = Mail::SpamAssassin::SubProcBackChannel->new();
|
||||
my $back_selector = '';
|
||||
$pms->{razor2_backchannel}->set_selector(\$back_selector);
|
||||
eval {
|
||||
$pms->{razor2_backchannel}->setup_backchannel_parent_pre_fork();
|
||||
} or do {
|
||||
dbg("razor2: backchannel pre-setup failed: $@");
|
||||
delete $pms->{razor2_backchannel};
|
||||
return 0;
|
||||
};
|
||||
|
||||
my $pid = fork();
|
||||
if (!defined $pid) {
|
||||
info("razor2: child fork failed: $!");
|
||||
delete $pms->{razor2_backchannel};
|
||||
return 0;
|
||||
}
|
||||
if (!$pid) {
|
||||
$0 = "$0 (razor2)";
|
||||
$SIG{CHLD} = 'DEFAULT';
|
||||
$SIG{PIPE} = 'IGNORE';
|
||||
$SIG{$_} = sub {
|
||||
eval { dbg("razor2: child process $$ caught signal $_[0]"); };
|
||||
force_die(6); # avoid END and destructor processing
|
||||
} foreach am_running_on_windows()?qw(INT HUP TERM QUIT):qw(INT HUP TERM TSTP QUIT USR1 USR2);
|
||||
dbg("razor2: child process $$ forked");
|
||||
$pms->{razor2_backchannel}->setup_backchannel_child_post_fork();
|
||||
(undef, my @results) =
|
||||
$self->razor2_access($full, 'check', $pms->{master_deadline});
|
||||
my $backmsg;
|
||||
eval {
|
||||
$backmsg = Storable::freeze(\@results);
|
||||
};
|
||||
if ($@) {
|
||||
dbg("razor2: child return value freeze failed: $@");
|
||||
force_die(0); # avoid END and destructor processing
|
||||
}
|
||||
if (!syswrite($pms->{razor2_backchannel}->{parent}, $backmsg)) {
|
||||
dbg("razor2: child backchannel write failed: $!");
|
||||
}
|
||||
force_die(0); # avoid END and destructor processing
|
||||
}
|
||||
|
||||
$pms->{razor2_pid} = $pid;
|
||||
|
||||
eval {
|
||||
$pms->{razor2_backchannel}->setup_backchannel_parent_post_fork($pid);
|
||||
} or do {
|
||||
dbg("razor2: backchannel post-setup failed: $@");
|
||||
delete $pms->{razor2_backchannel};
|
||||
return 0;
|
||||
};
|
||||
|
||||
return; # return undef for async status
|
||||
}
|
||||
|
||||
sub check_tick {
|
||||
my ($self, $opts) = @_;
|
||||
$self->_check_forked_result($opts->{permsgstatus}, 0);
|
||||
}
|
||||
|
||||
sub check_cleanup {
|
||||
my ($self, $opts) = @_;
|
||||
$self->_check_forked_result($opts->{permsgstatus}, 1);
|
||||
}
|
||||
|
||||
sub _check_forked_result {
|
||||
my ($self, $pms, $finish) = @_;
|
||||
|
||||
return 0 if !$pms->{razor2_backchannel};
|
||||
return 0 if !$pms->{razor2_pid};
|
||||
|
||||
my $timer = $self->{main}->time_method("check_razor2");
|
||||
|
||||
$pms->{razor2_abort} = $pms->{deadline_exceeded} || $pms->{shortcircuited};
|
||||
|
||||
my $kid_pid = $pms->{razor2_pid};
|
||||
# if $finish, force waiting for the child
|
||||
my $pid = waitpid($kid_pid, $finish && !$pms->{razor2_abort} ? 0 : WNOHANG);
|
||||
if ($pid == 0) {
|
||||
#dbg("razor2: child process $kid_pid not finished yet, trying later");
|
||||
if ($pms->{razor2_abort}) {
|
||||
dbg("razor2: bailing out due to deadline/shortcircuit");
|
||||
kill('TERM', $kid_pid);
|
||||
if (waitpid($kid_pid, WNOHANG) == 0) {
|
||||
sleep(1);
|
||||
if (waitpid($kid_pid, WNOHANG) == 0) {
|
||||
dbg("razor2: child process $kid_pid still alive, KILL");
|
||||
kill('KILL', $kid_pid);
|
||||
waitpid($kid_pid, 0);
|
||||
}
|
||||
}
|
||||
delete $pms->{razor2_pid};
|
||||
delete $pms->{razor2_backchannel};
|
||||
}
|
||||
return 0;
|
||||
} elsif ($pid == -1) {
|
||||
# child does not exist?
|
||||
dbg("razor2: child process $kid_pid already handled?");
|
||||
delete $pms->{razor2_backchannel};
|
||||
return 0;
|
||||
}
|
||||
|
||||
$pms->rule_ready($pms->{razor2_rulename}); # mark rule ready for metas
|
||||
|
||||
dbg("razor2: child process $kid_pid finished, reading results");
|
||||
|
||||
my $backmsg;
|
||||
my $ret = sysread($pms->{razor2_backchannel}->{latest_kid_fh}, $backmsg, am_running_on_windows()?512:PIPE_BUF);
|
||||
if (!defined $ret || $ret == 0) {
|
||||
dbg("razor2: could not read result from child: ".($ret == 0 ? 0 : $!));
|
||||
delete $pms->{razor2_backchannel};
|
||||
return 0;
|
||||
}
|
||||
|
||||
delete $pms->{razor2_backchannel};
|
||||
|
||||
my $results;
|
||||
eval {
|
||||
$results = Storable::thaw($backmsg);
|
||||
};
|
||||
if ($@) {
|
||||
dbg("razor2: child return value thaw failed: $@");
|
||||
return;
|
||||
}
|
||||
|
||||
$self->_check_result($pms, $results);
|
||||
}
|
||||
|
||||
sub _check_result {
|
||||
my ($self, $pms, $results) = @_;
|
||||
|
||||
$self->{main}->call_plugins ('process_razor_result',
|
||||
{ results => \@results, permsgstatus => $permsgstatus }
|
||||
{ results => $results, permsgstatus => $pms }
|
||||
);
|
||||
|
||||
foreach my $result (@results) {
|
||||
foreach my $result (@$results) {
|
||||
if (exists $result->{result}) {
|
||||
$permsgstatus->{razor2_result} = $result->{result} if $result->{result};
|
||||
$pms->{razor2_result} = $result->{result} if $result->{result};
|
||||
}
|
||||
elsif ($result->{noresponse}) {
|
||||
dbg('razor2: part=' . $result->{part} . ' noresponse');
|
||||
@ -415,46 +597,86 @@ sub check_razor2 {
|
||||
|
||||
next if $result->{contested};
|
||||
|
||||
my $cf = $permsgstatus->{razor2_cf_score}->{$result->{engine}} || 0;
|
||||
my $cf = $pms->{razor2_cf_score}->{$result->{engine}} || 0;
|
||||
if ($result->{confidence} > $cf) {
|
||||
$permsgstatus->{razor2_cf_score}->{$result->{engine}} = $result->{confidence};
|
||||
$pms->{razor2_cf_score}->{$result->{engine}} = $result->{confidence};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
dbg("razor2: results: spam? " . $permsgstatus->{razor2_result});
|
||||
while(my ($engine, $cf) = each %{$permsgstatus->{razor2_cf_score}}) {
|
||||
$pms->{razor2_result} ||= 0;
|
||||
$pms->{razor2_cf_score} ||= {};
|
||||
|
||||
dbg("razor2: results: spam? " . $pms->{razor2_result});
|
||||
while(my ($engine, $cf) = each %{$pms->{razor2_cf_score}}) {
|
||||
dbg("razor2: results: engine $engine, highest cf score: $cf");
|
||||
}
|
||||
|
||||
return $permsgstatus->{razor2_result};
|
||||
if ($self->{main}->{conf}->{razor_fork}) {
|
||||
# forked needs to run got_hit()
|
||||
if ($pms->{razor2_rulename} && $pms->{razor2_result}) {
|
||||
$pms->got_hit($pms->{razor2_rulename}, "", ruletype => 'eval');
|
||||
}
|
||||
# forked needs to run range callbacks
|
||||
if ($pms->{razor2_range_callbacks}) {
|
||||
foreach (@{$pms->{razor2_range_callbacks}}) {
|
||||
$self->check_razor2_range($pms, '', @$_);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return $pms->{razor2_result};
|
||||
}
|
||||
|
||||
# Check the cf value of a given message and return if it's within the
|
||||
# given range
|
||||
sub check_razor2_range {
|
||||
my ($self, $permsgstatus, $body, $engine, $min, $max) = @_;
|
||||
my ($self, $pms, $body, $engine, $min, $max, $rulename) = @_;
|
||||
|
||||
# If Razor2 isn't available, or the general test is disabled, don't
|
||||
# continue.
|
||||
return unless $self->{razor2_available};
|
||||
return unless $self->{main}->{conf}->{use_razor2};
|
||||
return unless $self->{main}->{conf}->{scores}->{'RAZOR2_CHECK'};
|
||||
return 0 unless $self->{razor2_available};
|
||||
return 0 unless $self->{main}->{conf}->{use_razor2};
|
||||
|
||||
# If Razor2 hasn't been checked yet, go ahead and run it.
|
||||
unless (defined $permsgstatus->{razor2_result}) {
|
||||
$self->check_razor2($permsgstatus, $body);
|
||||
# Check if callback overriding rulename
|
||||
if (!defined $rulename) {
|
||||
$rulename = $pms->get_current_eval_rule_name();
|
||||
}
|
||||
|
||||
if ($pms->{razor2_abort}) {
|
||||
$pms->rule_ready($rulename); # mark rule ready for metas
|
||||
return;
|
||||
}
|
||||
|
||||
# If forked, call back later unless results are in
|
||||
if ($self->{main}->{conf}->{razor_fork}) {
|
||||
if (!defined $pms->{razor2_result}) {
|
||||
dbg("razor2: delaying check_razor2_range call for $rulename");
|
||||
# array matches check_razor2_range() argument order
|
||||
push @{$pms->{razor2_range_callbacks}},
|
||||
[$engine, $min, $max, $rulename];
|
||||
return; # return undef for async status
|
||||
}
|
||||
} else {
|
||||
# If Razor2 hasn't been checked yet, go ahead and run it.
|
||||
# (only if we are non-forking.. forking will handle these in
|
||||
# callbacks)
|
||||
if (!$pms->{razor2_running}) {
|
||||
$self->check_razor2($pms, $body);
|
||||
}
|
||||
}
|
||||
|
||||
$pms->rule_ready($rulename); # mark rule ready for metas
|
||||
|
||||
my $cf = 0;
|
||||
if ($engine) {
|
||||
$cf = $permsgstatus->{razor2_cf_score}->{$engine};
|
||||
return unless defined $cf;
|
||||
$cf = $pms->{razor2_cf_score}->{$engine};
|
||||
return 0 unless defined $cf;
|
||||
}
|
||||
else {
|
||||
# If no specific engine was given to the rule, find the highest cf
|
||||
# determined and use that
|
||||
while(my ($engine, $ecf) = each %{$permsgstatus->{razor2_cf_score}}) {
|
||||
while(my ($engine, $ecf) = each %{$pms->{razor2_cf_score}}) {
|
||||
if ($ecf > $cf) {
|
||||
$cf = $ecf;
|
||||
}
|
||||
@ -462,13 +684,20 @@ sub check_razor2_range {
|
||||
}
|
||||
|
||||
if ($cf >= $min && $cf <= $max) {
|
||||
$permsgstatus->test_log(sprintf("cf: %3d", $cf));
|
||||
my $cf_str = sprintf("cf: %3d", $cf);
|
||||
$pms->test_log($cf_str, $rulename);
|
||||
if ($self->{main}->{conf}->{razor_fork}) {
|
||||
$pms->got_hit($rulename, "", ruletype => 'eval');
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
return;
|
||||
return 0;
|
||||
}
|
||||
|
||||
# Version features
|
||||
sub has_fork { 1 }
|
||||
|
||||
1;
|
||||
|
||||
=back
|
||||
|
@ -53,10 +53,8 @@ Following metadata headers and tags are added:
|
||||
|
||||
=head1 REQUIREMENT
|
||||
|
||||
This plugin requires the GeoIP2, Geo::IP, IP::Country::DB_File or
|
||||
IP::Country::Fast module from CPAN.
|
||||
For backward compatibility IP::Country::Fast is used as fallback if no db_type
|
||||
is specified in the config file.
|
||||
This plugin uses Mail::SpamAssassin::GeoDB and requires a module supported
|
||||
by it, for example MaxMind::DB::Reader (GeoIP2).
|
||||
|
||||
=cut
|
||||
|
||||
@ -64,7 +62,6 @@ package Mail::SpamAssassin::Plugin::RelayCountry;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:ip);
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
@ -72,6 +69,11 @@ use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $db;
|
||||
my $dbv6;
|
||||
my $db_info; # will hold database info
|
||||
my $db_type; # will hold database type
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
@ -82,282 +84,39 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
# we need GeoDB country
|
||||
$self->{main}->{geodb_wanted}->{country} = 1;
|
||||
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
=head1 USER PREFERENCES
|
||||
|
||||
The following options can be used in both site-wide (C<local.cf>) and
|
||||
user-specific (C<user_prefs>) configuration files to customize how
|
||||
SpamAssassin handles incoming email messages.
|
||||
|
||||
=over 4
|
||||
|
||||
=item country_db_type STRING
|
||||
|
||||
This option tells SpamAssassin which type of Geo database to use.
|
||||
Valid database types are GeoIP, GeoIP2, DB_File and Fast.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'country_db_type',
|
||||
default => "GeoIP",
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value !~ /^(?:GeoIP|GeoIP2|DB_File|Fast)$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{country_db_type} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
=over 4
|
||||
|
||||
=item country_db_path STRING
|
||||
|
||||
This option tells SpamAssassin where to find MaxMind GeoIP2 or IP::Country::DB_File database.
|
||||
|
||||
If not defined, GeoIP2 default search includes:
|
||||
/usr/local/share/GeoIP/GeoIP2-Country.mmdb
|
||||
/usr/share/GeoIP/GeoIP2-Country.mmdb
|
||||
/var/lib/GeoIP/GeoIP2-Country.mmdb
|
||||
/usr/local/share/GeoIP/GeoLite2-Country.mmdb
|
||||
/usr/share/GeoIP/GeoLite2-Country.mmdb
|
||||
/var/lib/GeoIP/GeoLite2-Country.mmdb
|
||||
(and same paths again for -City.mmdb, which also has country functionality)
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'country_db_path',
|
||||
default => "",
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if (!defined $value || !length $value) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
if (!-e $value) {
|
||||
info("config: country_db_path \"$value\" is not accessible");
|
||||
$self->{country_db_path} = $value;
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{country_db_path} = $value;
|
||||
}
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'geoip2_default_db_path',
|
||||
default => [
|
||||
'/usr/local/share/GeoIP/GeoIP2-Country.mmdb',
|
||||
'/usr/share/GeoIP/GeoIP2-Country.mmdb',
|
||||
'/var/lib/GeoIP/GeoIP2-Country.mmdb',
|
||||
'/usr/local/share/GeoIP/GeoLite2-Country.mmdb',
|
||||
'/usr/share/GeoIP/GeoLite2-Country.mmdb',
|
||||
'/var/lib/GeoIP/GeoLite2-Country.mmdb',
|
||||
'/usr/local/share/GeoIP/GeoIP2-City.mmdb',
|
||||
'/usr/share/GeoIP/GeoIP2-City.mmdb',
|
||||
'/var/lib/GeoIP/GeoIP2-City.mmdb',
|
||||
'/usr/local/share/GeoIP/GeoLite2-City.mmdb',
|
||||
'/usr/share/GeoIP/GeoLite2-City.mmdb',
|
||||
'/var/lib/GeoIP/GeoLite2-City.mmdb',
|
||||
],
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRINGLIST,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value eq '') {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
push(@{$self->{geoip2_default_db_path}}, split(/\s+/, $value));
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
sub get_country {
|
||||
my ($self, $ip, $db, $dbv6, $country_db_type) = @_;
|
||||
my $cc;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
my $IPV4_ADDRESS = IPV4_ADDRESS;
|
||||
|
||||
# Private IPs will always be returned as '**'
|
||||
if ($ip =~ /^$IP_PRIVATE$/o) {
|
||||
$cc = "**";
|
||||
}
|
||||
elsif ($country_db_type eq "GeoIP") {
|
||||
if ($ip =~ /^$IPV4_ADDRESS$/o) {
|
||||
$cc = $db->country_code_by_addr($ip);
|
||||
} elsif (defined $dbv6) {
|
||||
$cc = $dbv6->country_code_by_addr_v6($ip);
|
||||
}
|
||||
}
|
||||
elsif ($country_db_type eq "GeoIP2") {
|
||||
my ($country, $country_rec);
|
||||
eval {
|
||||
if (index($db->metadata()->description()->{en}, 'City') != -1) {
|
||||
$country = $db->city( ip => $ip );
|
||||
} else {
|
||||
$country = $db->country( ip => $ip );
|
||||
}
|
||||
$country_rec = $country->country();
|
||||
$cc = $country_rec->iso_code();
|
||||
1;
|
||||
} or do {
|
||||
$@ =~ s/\s+Trace begun.*//s;
|
||||
dbg("metadata: RelayCountry: GeoIP2 failed: $@");
|
||||
}
|
||||
}
|
||||
elsif ($country_db_type eq "DB_File") {
|
||||
if ($ip =~ /^$IPV4_ADDRESS$/o ) {
|
||||
$cc = $db->inet_atocc($ip);
|
||||
} else {
|
||||
$cc = $db->inet6_atocc($ip);
|
||||
}
|
||||
}
|
||||
elsif ($country_db_type eq "Fast") {
|
||||
$cc = $db->inet_atocc($ip);
|
||||
}
|
||||
|
||||
$cc ||= 'XX';
|
||||
|
||||
return $cc;
|
||||
}
|
||||
|
||||
sub extract_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
my $db;
|
||||
my $dbv6;
|
||||
my $db_info; # will hold database info
|
||||
my $db_type; # will hold database type
|
||||
return if $self->{relaycountry_disabled};
|
||||
|
||||
my $country_db_type = $opts->{conf}->{country_db_type};
|
||||
my $country_db_path = $opts->{conf}->{country_db_path};
|
||||
|
||||
if ($country_db_type eq "GeoIP") {
|
||||
eval {
|
||||
require Geo::IP;
|
||||
$db = Geo::IP->open_type(Geo::IP->GEOIP_COUNTRY_EDITION, Geo::IP->GEOIP_STANDARD);
|
||||
die "GeoIP.dat not found" unless $db;
|
||||
# IPv6 requires version Geo::IP 1.39+ with GeoIP C API 1.4.7+
|
||||
if (Geo::IP->VERSION >= 1.39 && Geo::IP->api eq 'CAPI') {
|
||||
$dbv6 = Geo::IP->open_type(Geo::IP->GEOIP_COUNTRY_EDITION_V6, Geo::IP->GEOIP_STANDARD);
|
||||
if (!$dbv6) {
|
||||
dbg("metadata: RelayCountry: GeoIP: IPv6 support not enabled, GeoIPv6.dat not found");
|
||||
}
|
||||
} else {
|
||||
dbg("metadata: RelayCountry: GeoIP: IPv6 support not enabled, versions Geo::IP 1.39, GeoIP C API 1.4.7 required");
|
||||
}
|
||||
$db_info = sub { return "Geo::IP IPv4: " . ($db->database_info || '?')." / IPv6: ".($dbv6 ? $dbv6->database_info || '?' : '?') };
|
||||
1;
|
||||
} or do {
|
||||
# Fallback to IP::Country::Fast
|
||||
dbg("metadata: RelayCountry: GeoIP: GeoIP.dat not found, trying IP::Country::Fast as fallback");
|
||||
$country_db_type = "Fast";
|
||||
}
|
||||
}
|
||||
elsif ($country_db_type eq "GeoIP2") {
|
||||
if (!$country_db_path) {
|
||||
# Try some default locations
|
||||
foreach (@{$opts->{conf}->{geoip2_default_db_path}}) {
|
||||
if (-f $_) {
|
||||
$country_db_path = $_;
|
||||
last;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (-f $country_db_path) {
|
||||
eval {
|
||||
require GeoIP2::Database::Reader;
|
||||
$db = GeoIP2::Database::Reader->new(
|
||||
file => $country_db_path,
|
||||
locales => [ 'en' ]
|
||||
);
|
||||
die "unknown error" unless $db;
|
||||
$db_info = sub {
|
||||
my $m = $db->metadata();
|
||||
return "GeoIP2 ".$m->description()->{en}." / ".localtime($m->build_epoch());
|
||||
};
|
||||
1;
|
||||
} or do {
|
||||
# Fallback to IP::Country::Fast
|
||||
$@ =~ s/\s+Trace begun.*//s;
|
||||
dbg("metadata: RelayCountry: GeoIP2: ${country_db_path} load failed: $@, trying IP::Country::Fast as fallback");
|
||||
$country_db_type = "Fast";
|
||||
}
|
||||
} else {
|
||||
# Fallback to IP::Country::Fast
|
||||
my $err = $country_db_path ?
|
||||
"$country_db_path not found" : "database not found from default locations";
|
||||
dbg("metadata: RelayCountry: GeoIP2: $err, trying IP::Country::Fast as fallback");
|
||||
$country_db_type = "Fast";
|
||||
}
|
||||
}
|
||||
elsif ($country_db_type eq "DB_File") {
|
||||
if (-f $country_db_path) {
|
||||
eval {
|
||||
require IP::Country::DB_File;
|
||||
$db = IP::Country::DB_File->new($country_db_path);
|
||||
die "unknown error" unless $db;
|
||||
$db_info = sub { return "IP::Country::DB_File ".localtime($db->db_time()); };
|
||||
1;
|
||||
} or do {
|
||||
# Fallback to IP::Country::Fast
|
||||
dbg("metadata: RelayCountry: DB_File: ${country_db_path} load failed: $@, trying IP::Country::Fast as fallback");
|
||||
$country_db_type = "Fast";
|
||||
}
|
||||
} else {
|
||||
# Fallback to IP::Country::Fast
|
||||
dbg("metadata: RelayCountry: DB_File: ${country_db_path} not found, trying IP::Country::Fast as fallback");
|
||||
$country_db_type = "Fast";
|
||||
}
|
||||
if (!$self->{main}->{geodb} ||
|
||||
!$self->{main}->{geodb}->can('country')) {
|
||||
dbg("metadata: RelayCountry: plugin disabled, GeoDB country not available");
|
||||
$self->{relaycountry_disabled} = 1;
|
||||
return;
|
||||
}
|
||||
|
||||
if ($country_db_type eq "Fast") {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
eval {
|
||||
require IP::Country::Fast;
|
||||
$db = IP::Country::Fast->new();
|
||||
$db_info = sub { return "IP::Country::Fast ".localtime($db->db_time()); };
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("metadata: RelayCountry: failed to load 'IP::Country::Fast', skipping: $eval_stat");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
if (!$db) {
|
||||
return 1;
|
||||
}
|
||||
|
||||
dbg("metadata: RelayCountry: Using database: ".$db_info->());
|
||||
my $msg = $opts->{msg};
|
||||
my $geodb = $self->{main}->{geodb};
|
||||
|
||||
my @cc_untrusted;
|
||||
foreach my $relay (@{$msg->{metadata}->{relays_untrusted}}) {
|
||||
my $ip = $relay->{ip};
|
||||
my $cc = $self->get_country($ip, $db, $dbv6, $country_db_type);
|
||||
my $cc = $geodb->get_country($ip);
|
||||
push @cc_untrusted, $cc;
|
||||
}
|
||||
|
||||
my @cc_external;
|
||||
foreach my $relay (@{$msg->{metadata}->{relays_external}}) {
|
||||
my $ip = $relay->{ip};
|
||||
my $cc = $self->get_country($ip, $db, $dbv6, $country_db_type);
|
||||
my $cc = $geodb->get_country($ip);
|
||||
push @cc_external, $cc;
|
||||
}
|
||||
|
||||
@ -369,7 +128,7 @@ sub extract_metadata {
|
||||
}
|
||||
if ($found_auth) {
|
||||
my $ip = $relay->{ip};
|
||||
my $cc = $self->get_country($ip, $db, $dbv6, $country_db_type);
|
||||
my $cc = $geodb->get_country($ip);
|
||||
push @cc_auth, $cc;
|
||||
}
|
||||
}
|
||||
@ -377,7 +136,7 @@ sub extract_metadata {
|
||||
my @cc_all;
|
||||
foreach my $relay (@{$msg->{metadata}->{relays_internal}}, @{$msg->{metadata}->{relays_external}}) {
|
||||
my $ip = $relay->{ip};
|
||||
my $cc = $self->get_country($ip, $db, $dbv6, $country_db_type);
|
||||
my $cc = $geodb->get_country($ip);
|
||||
push @cc_all, $cc;
|
||||
}
|
||||
|
||||
@ -400,8 +159,6 @@ sub extract_metadata {
|
||||
$msg->put_metadata("X-Relay-Countries-All", $ccstr);
|
||||
dbg("metadata: X-Relay-Countries-All: $ccstr");
|
||||
$pms->set_tag("RELAYCOUNTRYALL", @cc_all == 1 ? $cc_all[0] : \@cc_all);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
1;
|
||||
|
@ -28,6 +28,8 @@ use re 'taint';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $IPV4_ADDRESS = IPV4_ADDRESS;
|
||||
|
||||
# constructor: register the eval rule
|
||||
sub new {
|
||||
my $class = shift;
|
||||
@ -39,17 +41,17 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# the important bit!
|
||||
$self->register_eval_rule("check_for_numeric_helo");
|
||||
$self->register_eval_rule("check_for_illegal_ip");
|
||||
$self->register_eval_rule("check_all_trusted");
|
||||
$self->register_eval_rule("check_no_relays");
|
||||
$self->register_eval_rule("check_relays_unparseable");
|
||||
$self->register_eval_rule("check_for_sender_no_reverse");
|
||||
$self->register_eval_rule("check_for_from_domain_in_received_headers");
|
||||
$self->register_eval_rule("check_for_forged_received_trail");
|
||||
$self->register_eval_rule("check_for_forged_received_ip_helo");
|
||||
$self->register_eval_rule("helo_ip_mismatch");
|
||||
$self->register_eval_rule("check_for_no_rdns_dotcom_helo");
|
||||
$self->register_eval_rule("check_for_numeric_helo"); # type does not matter
|
||||
$self->register_eval_rule("check_for_illegal_ip"); # type does not matter
|
||||
$self->register_eval_rule("check_all_trusted"); # type does not matter
|
||||
$self->register_eval_rule("check_no_relays"); # type does not matter
|
||||
$self->register_eval_rule("check_relays_unparseable"); # type does not matter
|
||||
$self->register_eval_rule("check_for_sender_no_reverse"); # type does not matter
|
||||
$self->register_eval_rule("check_for_from_domain_in_received_headers", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule("check_for_forged_received_trail"); # type does not matter
|
||||
$self->register_eval_rule("check_for_forged_received_ip_helo"); # type does not matter
|
||||
$self->register_eval_rule("helo_ip_mismatch"); # type does not matter
|
||||
$self->register_eval_rule("check_for_no_rdns_dotcom_helo"); # type does not matter
|
||||
|
||||
return $self;
|
||||
}
|
||||
@ -72,7 +74,7 @@ sub hostname_to_domain {
|
||||
}
|
||||
}
|
||||
|
||||
sub _helo_forgery_whitelisted {
|
||||
sub _helo_forgery_welcomelisted {
|
||||
my ($helo, $rdns) = @_;
|
||||
if ($helo eq 'msn.com' && $rdns eq 'hotmail.com') { return 1; }
|
||||
0;
|
||||
@ -84,12 +86,10 @@ sub check_for_numeric_helo {
|
||||
my $rcvd = $pms->{relays_untrusted_str};
|
||||
|
||||
if ($rcvd) {
|
||||
my $IP_ADDRESS = IPV4_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
local $1;
|
||||
# no re "strict"; # since perl 5.21.8: Ranges of ASCII printables...
|
||||
if ($rcvd =~ /\bhelo=($IP_ADDRESS)(?=[\000-\040,;\[()<>]|\z)/i # Bug 5878
|
||||
&& $1 !~ /$IP_PRIVATE/) {
|
||||
if ($rcvd =~ /\bhelo=($IPV4_ADDRESS)(?=[\000-\040,;\[()<>]|\z)/i # Bug 5878
|
||||
&& $1 !~ IS_IP_PRIVATE) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
@ -108,23 +108,21 @@ sub check_for_illegal_ip {
|
||||
# due to bug in pure IPv6 address regular expression
|
||||
sub helo_ip_mismatch {
|
||||
my ($self, $pms) = @_;
|
||||
my $IP_ADDRESS = IPV4_ADDRESS;
|
||||
my $IP_PRIVATE = IP_PRIVATE;
|
||||
|
||||
for my $relay (@{$pms->{relays_untrusted}}) {
|
||||
# is HELO usable?
|
||||
next unless ($relay->{helo} =~ m/^$IP_ADDRESS$/ &&
|
||||
$relay->{helo} !~ /$IP_PRIVATE/);
|
||||
next unless ($relay->{helo} =~ IS_IPV4_ADDRESS &&
|
||||
$relay->{helo} !~ IS_IP_PRIVATE);
|
||||
# compare HELO with IP
|
||||
return 1 if ($relay->{ip} =~ m/^$IP_ADDRESS$/ &&
|
||||
$relay->{ip} !~ m/$IP_PRIVATE/ &&
|
||||
return 1 if ($relay->{ip} =~ IS_IPV4_ADDRESS &&
|
||||
$relay->{ip} !~ IS_IP_PRIVATE &&
|
||||
$relay->{helo} ne $relay->{ip} &&
|
||||
# different IP is okay if in same /24
|
||||
$relay->{helo} =~ /^(\d+\.\d+\.\d+\.)/ &&
|
||||
index($relay->{ip}, $1) != 0);
|
||||
}
|
||||
|
||||
0;
|
||||
return 0;
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
@ -145,7 +143,7 @@ sub check_no_relays {
|
||||
|
||||
sub check_relays_unparseable {
|
||||
my ($self, $pms) = @_;
|
||||
return $pms->{num_relays_unparseable};
|
||||
return $pms->{num_relays_unparseable} ? 1 : 0;
|
||||
}
|
||||
|
||||
# Check if the apparent sender (in the last received header) had
|
||||
@ -215,7 +213,7 @@ sub check_for_from_domain_in_received_headers {
|
||||
sub check_for_no_rdns_dotcom_helo {
|
||||
my ($self, $pms) = @_;
|
||||
if (!exists $pms->{no_rdns_dotcom_helo}) { $self->_check_received_helos($pms); }
|
||||
return $pms->{no_rdns_dotcom_helo};
|
||||
return $pms->{no_rdns_dotcom_helo} ? 1 : 0;
|
||||
}
|
||||
|
||||
# Bug 1133
|
||||
@ -346,7 +344,7 @@ sub _check_for_forged_received {
|
||||
# allow private IP addrs here, could be a legit screwup
|
||||
if ($hclassb && $fclassb &&
|
||||
$hclassb ne $fclassb &&
|
||||
!($hlo =~ /$IP_PRIVATE/o))
|
||||
$hlo !~ IS_IP_PRIVATE)
|
||||
{
|
||||
dbg2("eval: forged-HELO: massive mismatch on IP-addr HELO: '$hlo' != '$fip'");
|
||||
$pms->{mismatch_ip_helo}++;
|
||||
@ -357,7 +355,7 @@ sub _check_for_forged_received {
|
||||
my $prev = $from[$i-1];
|
||||
if (defined($prev) && $i > 0
|
||||
&& $prev =~ /^\w+(?:[\w.-]+\.)+\w+$/
|
||||
&& $by ne $prev && !_helo_forgery_whitelisted($by, $prev))
|
||||
&& $by ne $prev && !_helo_forgery_welcomelisted($by, $prev))
|
||||
{
|
||||
dbg2("eval: forged-HELO: mismatch on from: '$prev' != '$by'");
|
||||
$pms->{mismatch_from}++;
|
||||
|
@ -49,7 +49,6 @@ SpamAssassin; it is not guaranteed to work with other versions of SpamAssassin.
|
||||
|
||||
package Mail::SpamAssassin::Plugin::ReplaceTags;
|
||||
|
||||
use Mail::SpamAssassin;
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(compile_regexp qr_to_string);
|
||||
@ -173,12 +172,15 @@ sub finish_parsing_end {
|
||||
# do the actual replacement
|
||||
my ($rec, $err) = compile_regexp($re, 0);
|
||||
if (!$rec) {
|
||||
info("replacetags: regexp compilation failed '$re': $err");
|
||||
info("replacetags: regexp compilation failed for $rule: '$re': $err");
|
||||
next;
|
||||
}
|
||||
$conf->{test_qrs}->{$rule} = $rec;
|
||||
#dbg("replacetags: replaced $rule: '$origre' => '$re'");
|
||||
if (would_log('dbg','replacetags') > 1) {
|
||||
dbg("replacetags: replaced $rule: '$origre' => '$re'");
|
||||
} else {
|
||||
dbg("replacetags: replaced $rule");
|
||||
}
|
||||
} else {
|
||||
dbg("replacetags: nothing was replaced in $rule");
|
||||
}
|
||||
|
@ -68,15 +68,15 @@ resident set size
|
||||
|
||||
package Mail::SpamAssassin::Plugin::ResourceLimits;
|
||||
|
||||
use Mail::SpamAssassin::Plugin ();
|
||||
use Mail::SpamAssassin::Logger ();
|
||||
use Mail::SpamAssassin::Util ();
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use BSD::Resource qw(RLIMIT_RSS RLIMIT_AS RLIMIT_CPU);
|
||||
use constant HAS_BSD_RESOURCE =>
|
||||
eval 'use BSD::Resource qw(RLIMIT_CPU RLIMIT_RSS RLIMIT_AS); 1;';
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
@ -86,58 +86,62 @@ sub new {
|
||||
|
||||
$class = ref($class) || $class;
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless( $self, $class );
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config( $mailsaobject->{conf} );
|
||||
if (!HAS_BSD_RESOURCE) {
|
||||
warn "ResourceLimits not used, required module BSD::Resource missing\n";
|
||||
}
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
return $self;
|
||||
}
|
||||
|
||||
sub set_config {
|
||||
my ( $self, $conf ) = @_;
|
||||
my @cmds = ();
|
||||
my ($self, $conf) = @_;
|
||||
my @cmds;
|
||||
|
||||
push(
|
||||
@cmds,
|
||||
{
|
||||
push(@cmds, {
|
||||
setting => 'resource_limit_mem',
|
||||
is_admin => 1,
|
||||
default => '0',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
}
|
||||
);
|
||||
});
|
||||
|
||||
push(
|
||||
@cmds,
|
||||
{
|
||||
push(@cmds, {
|
||||
setting => 'resource_limit_cpu',
|
||||
is_admin => 1,
|
||||
default => '0',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
}
|
||||
);
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands( \@cmds );
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
if (HAS_BSD_RESOURCE) { eval '
|
||||
sub spamd_child_init {
|
||||
my ($self) = @_;
|
||||
|
||||
my $conf = $self->{main}->{conf};
|
||||
|
||||
# Set CPU Resource limits if they were specified.
|
||||
Mail::SpamAssassin::Util::dbg("resourcelimitplugin: In spamd_child_init");
|
||||
Mail::SpamAssassin::Util::dbg( "resourcelimitplugin: cpu limit: " . $self->{main}->{conf}->{resource_limit_cpu} );
|
||||
if ( $self->{main}->{conf}->{resource_limit_cpu} ) {
|
||||
BSD::Resource::setrlimit( RLIMIT_CPU, $self->{main}->{conf}->{resource_limit_cpu}, $self->{main}->{conf}->{resource_limit_cpu} )
|
||||
|| info("resourcelimitplugin: Unable to set RLIMIT_CPU");
|
||||
dbg("resourcelimits: cpu limit: " . $conf->{resource_limit_cpu});
|
||||
if ($conf->{resource_limit_cpu}) {
|
||||
BSD::Resource::setrlimit( RLIMIT_CPU,
|
||||
$conf->{resource_limit_cpu}, $conf->{resource_limit_cpu} )
|
||||
or info("resourcelimits: Unable to set RLIMIT_CPU");
|
||||
}
|
||||
|
||||
# Set Resource limits if they were specified.
|
||||
Mail::SpamAssassin::Util::dbg( "resourcelimitplugin: mem limit: " . $self->{main}->{conf}->{resource_limit_mem} );
|
||||
if ( $self->{main}->{conf}->{resource_limit_mem} ) {
|
||||
BSD::Resource::setrlimit( RLIMIT_RSS, $self->{main}->{conf}->{resource_limit_mem}, $self->{main}->{conf}->{resource_limit_mem} )
|
||||
|| info("resourcelimitplugin: Unable to set RLIMIT_RSS");
|
||||
BSD::Resource::setrlimit( RLIMIT_AS, $self->{main}->{conf}->{resource_limit_mem}, $self->{main}->{conf}->{resource_limit_mem} )
|
||||
|| info("resourcelimitplugin: Unable to set RLIMIT_AS");
|
||||
dbg("resourcelimits: mem limit: " . $conf->{resource_limit_mem});
|
||||
if ($conf->{resource_limit_mem}) {
|
||||
BSD::Resource::setrlimit( RLIMIT_RSS,
|
||||
$conf->{resource_limit_mem}, $conf->{resource_limit_mem} )
|
||||
or info("resourcelimits: Unable to set RLIMIT_RSS");
|
||||
BSD::Resource::setrlimit( RLIMIT_AS,
|
||||
$conf->{resource_limit_mem}, $conf->{resource_limit_mem} )
|
||||
or info("resourcelimits: Unable to set RLIMIT_AS");
|
||||
}
|
||||
}
|
||||
'; }
|
||||
|
||||
1;
|
||||
|
@ -10,6 +10,8 @@ Mail::SpamAssassin::Plugin::Reuse - For reusing old rule hits during a mass-chec
|
||||
|
||||
reuse NETWORK_RULE [ NETWORK_RULE_OLD_NAME ]
|
||||
|
||||
run_reuse_tests_only 0/1
|
||||
|
||||
endif
|
||||
|
||||
=head1 DESCRIPTION
|
||||
@ -18,6 +20,15 @@ The purpose of this plugin is to work in conjunction with B<mass-check
|
||||
--reuse> to map rules hit in input messages to rule hits in the
|
||||
mass-check output.
|
||||
|
||||
run_reuse_tests_only 1 is special option for spamassassin/spamd use.
|
||||
Only reuse flagged tests will be run. It will also _enable_ network/DNS
|
||||
lookups. This is mainly intended for fast mass processing of corpus
|
||||
messages, so they can be properly reused later. For example:
|
||||
spamd --pre="loadmodule Mail::SpamAssassin::Plugin::Reuse" \
|
||||
--pre="run_reuse_tests_only 1" ...
|
||||
Such dedicated spamd could be scripted to add X-Spam-Status header to
|
||||
messages efficiently.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::Reuse;
|
||||
@ -25,12 +36,16 @@ package Mail::SpamAssassin::Plugin::Reuse;
|
||||
# use bytes;
|
||||
use strict;
|
||||
use warnings;
|
||||
use re 'taint';
|
||||
|
||||
use Mail::SpamAssassin::Conf;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Constants qw(:sa);
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
|
||||
my $RULENAME_RE = RULENAME_RE;
|
||||
|
||||
# constructor
|
||||
sub new {
|
||||
my $invocant = shift;
|
||||
@ -42,7 +57,7 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
$self->set_config($samain->{conf});
|
||||
# make sure we run last (or close) of the finish_parsing_start since
|
||||
# make sure we run last (or close) of the finish_parsing_end since
|
||||
# we need all other rules to be defined
|
||||
$self->register_method_priority("finish_parsing_start", 100);
|
||||
return $self;
|
||||
@ -56,28 +71,35 @@ sub set_config {
|
||||
# e.g.
|
||||
# reuse NET_TEST_V1 NET_TEST_V0
|
||||
|
||||
push (@cmds, { setting => 'reuse',
|
||||
push (@cmds, {
|
||||
setting => 'reuse',
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_HASH_KEY_VALUE,
|
||||
code => sub {
|
||||
my ($conf, $key, $value, $line) = @_;
|
||||
|
||||
if ($value !~ /\s*(\w+)(?:\s+(\w+(?:\s+\w+)*))?\s*$/) {
|
||||
if ($value !~ /^\s*(${RULENAME_RE})(?:\s+(${RULENAME_RE}(?:\s+${RULENAME_RE})*))?\s*$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
|
||||
my $new_name = $1;
|
||||
my @old_names = ($new_name);
|
||||
if ($2) {
|
||||
push @old_names, split (' ', $2);
|
||||
if (defined $2) {
|
||||
push @old_names, split (/\s+/, $2);
|
||||
}
|
||||
|
||||
dbg("reuse: read rule, old: @old_names new: $new_name");
|
||||
dbg("reuse: read rule, old: %s new: %s", join(' ', @old_names), $new_name);
|
||||
|
||||
foreach my $old (@old_names) {
|
||||
push @{$conf->{reuse_tests}->{$new_name}}, $old;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
}});
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'run_reuse_tests_only',
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
@ -86,11 +108,27 @@ sub finish_parsing_start {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $conf = $opts->{conf};
|
||||
my $tflags = $conf->{tflags};
|
||||
|
||||
dbg("reuse: finish_parsing_start called");
|
||||
while (my($rulename,$tfl) = each %{$tflags}) {
|
||||
if ($tfl =~ /\bnet\b/ && !exists $conf->{reuse_tests}->{$rulename}) {
|
||||
dbg("reuse: forcing reuse of net rule $rulename");
|
||||
push @{$conf->{reuse_tests}->{$rulename}}, $rulename;
|
||||
}
|
||||
}
|
||||
|
||||
return 0 if (!exists $conf->{reuse_tests});
|
||||
|
||||
if ($conf->{run_reuse_tests_only}) {
|
||||
# simply delete all rules not reuse
|
||||
foreach (keys %{$conf->{tests}}) {
|
||||
if (!defined $conf->{reuse_tests}->{$_}) {
|
||||
delete $conf->{tests}->{$_};
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
foreach my $rule_name (keys %{$conf->{reuse_tests}}) {
|
||||
|
||||
# If the rule does not exist, add a new EMPTY test, set default score
|
||||
@ -100,7 +138,7 @@ sub finish_parsing_start {
|
||||
}
|
||||
if (!exists $conf->{scores}->{$rule_name}) {
|
||||
my $set_score = ($rule_name =~/^T_/) ? 0.01 : 1.0;
|
||||
$set_score = -$set_score if ( ($conf->{tflags}->{$rule_name}||'') =~ /\bnice\b/ );
|
||||
$set_score = -$set_score if ( ($tflags->{$rule_name}||'') =~ /\bnice\b/ );
|
||||
foreach my $ss (0..3) {
|
||||
$conf->{scoreset}->[$ss]->{$rule_name} = $set_score;
|
||||
}
|
||||
@ -108,7 +146,7 @@ sub finish_parsing_start {
|
||||
|
||||
# Figure out when to add any hits -- grab priority and "stage"
|
||||
my $priority = $conf->{priority}->{$rule_name} || 0;
|
||||
my $stage = $self->_get_stage_from_rule($opts->{conf}, $rule_name);
|
||||
my $stage = $self->_get_stage_from_rule($conf, $rule_name);
|
||||
$conf->{reuse_tests_order}->{$rule_name} = [ $priority, $stage ];
|
||||
|
||||
}
|
||||
@ -118,6 +156,10 @@ sub check_start {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $conf = $pms->{conf};
|
||||
my $scoreset = $conf->{scoreset};
|
||||
|
||||
return 0 if $conf->{run_reuse_tests_only};
|
||||
|
||||
# Can we reuse?
|
||||
my $msg = $pms->get_message();
|
||||
@ -130,30 +172,34 @@ sub check_start {
|
||||
|
||||
# now go through the rules and priorities and figure out which ones
|
||||
# need to be disabled
|
||||
foreach my $rule (keys %{$pms->{conf}->{reuse_tests}}) {
|
||||
foreach my $rule (keys %{$conf->{reuse_tests}}) {
|
||||
|
||||
dbg("reuse: looking at rule $rule");
|
||||
my ($priority, $stage) = @{$pms->{conf}->{reuse_tests_order}->{$rule}};
|
||||
my ($priority, $stage) = @{$conf->{reuse_tests_order}->{$rule}};
|
||||
|
||||
# score set could change after check_start but before we add hits,
|
||||
# so we need to disable the rule in all sets
|
||||
my @dis;
|
||||
foreach my $ss (0..3) {
|
||||
if (exists $pms->{conf}->{scoreset}->[$ss]->{$rule}) {
|
||||
dbg("reuse: disabling rule $rule in score set $ss");
|
||||
if (exists $scoreset->[$ss]->{$rule}) {
|
||||
$pms->{reuse_old_scores}->{$rule}->[$ss] =
|
||||
$pms->{conf}->{scoreset}->[$ss]->{$rule};
|
||||
$pms->{conf}->{scoreset}->[$ss]->{$rule} = 0;
|
||||
$scoreset->[$ss]->{$rule};
|
||||
$scoreset->[$ss]->{$rule} = 0;
|
||||
push @dis, $ss;
|
||||
}
|
||||
}
|
||||
dbg("reuse: disabling rule $rule in score sets %s",
|
||||
join(',', @dis)) if @dis;
|
||||
|
||||
# now, check for hits
|
||||
OLD: foreach my $old_test (@{$pms->{conf}->{reuse_tests}->{$rule}}) {
|
||||
dbg("reuse: looking for rule $old_test");
|
||||
foreach my $old_test (@{$conf->{reuse_tests}->{$rule}}) {
|
||||
if ($old_hash->{$old_test}) {
|
||||
push @{$pms->{reuse_hits_to_add}->{"$priority $stage"}}, $rule;
|
||||
dbg("reuse: rule $rule hit, will add at priority $priority, stage " .
|
||||
"$stage");
|
||||
last OLD;
|
||||
last;
|
||||
} else {
|
||||
# Make sure rule is marked ready for meta rules
|
||||
$pms->rule_ready($rule);
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -163,11 +209,15 @@ sub check_end {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
my $pms = $opts->{permsgstatus};
|
||||
my $conf = $pms->{conf};
|
||||
my $scoreset = $conf->{scoreset};
|
||||
|
||||
return 0 if $conf->{run_reuse_tests_only};
|
||||
|
||||
foreach my $disabled_rule (keys %{$pms->{reuse_old_scores}}) {
|
||||
foreach my $ss (0..3) {
|
||||
next unless exists $pms->{conf}->{scoreset}->[$ss]->{$disabled_rule};
|
||||
$pms->{conf}->{scoreset}->[$ss]->{$disabled_rule} =
|
||||
next unless exists $scoreset->[$ss]->{$disabled_rule};
|
||||
$scoreset->[$ss]->{$disabled_rule} =
|
||||
$pms->{reuse_old_scores}->{$disabled_rule}->[$ss];
|
||||
}
|
||||
}
|
||||
@ -178,8 +228,11 @@ sub check_end {
|
||||
sub start_rules {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
return $self->_add_hits($opts->{permsgstatus}, $opts->{priority},
|
||||
$opts->{ruletype});
|
||||
my $pms = $opts->{permsgstatus};
|
||||
|
||||
return 0 if $pms->{conf}->{run_reuse_tests_only};
|
||||
|
||||
return $self->_add_hits($pms, $opts->{priority}, $opts->{ruletype});
|
||||
}
|
||||
|
||||
sub _add_hits {
|
||||
@ -215,7 +268,7 @@ my %type_to_stage = (
|
||||
$Mail::SpamAssassin::Conf::TYPE_URI_EVALS => "eval",
|
||||
$Mail::SpamAssassin::Conf::TYPE_META_TESTS => "meta",
|
||||
$Mail::SpamAssassin::Conf::TYPE_RBL_EVALS => "eval",
|
||||
);
|
||||
);
|
||||
|
||||
sub _get_stage_from_rule {
|
||||
my ($self, $conf, $rule) = @_;
|
||||
@ -239,4 +292,4 @@ sub _get_stage_from_rule {
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
1;
|
||||
|
@ -143,6 +143,16 @@ sub setup_test_set_pri {
|
||||
dbg "zoom: skipping rule $name, ReplaceTags";
|
||||
next;
|
||||
}
|
||||
# ignore regex capture rules
|
||||
if ($conf->{capture_rules}->{$name}) {
|
||||
dbg "zoom: skipping rule $name, regex capture";
|
||||
next;
|
||||
}
|
||||
# ignore regex capture template rules
|
||||
if ($conf->{capture_template_rules}->{$name}) {
|
||||
dbg "zoom: skipping rule $name, regex capture template";
|
||||
next;
|
||||
}
|
||||
|
||||
# we have the rule, and its regexp matches. zero out the body
|
||||
# rule, so that the module can do the work instead
|
||||
@ -197,6 +207,11 @@ sub check_rules_at_priority {
|
||||
$self->{one_line_body}->check_rules_at_priority($params);
|
||||
}
|
||||
|
||||
sub check_cleanup {
|
||||
my ($self, $params) = @_;
|
||||
$self->{one_line_body}->check_cleanup($params);
|
||||
}
|
||||
|
||||
###########################################################################
|
||||
|
||||
sub run_body_fast_scan {
|
||||
|
@ -29,6 +29,12 @@ This plugin checks a message against Sender Policy Framework (SPF)
|
||||
records published by the domain owners in DNS to fight email address
|
||||
forgery and make it easier to identify spams.
|
||||
|
||||
It's recommended to use MTA filter (pypolicyd-spf / spf-engine etc), so this
|
||||
plugin can reuse the Received-SPF and/or Authentication-Results header results as is.
|
||||
Otherwise throughput could suffer, DNS lookups done by this plugin are not
|
||||
asynchronous.
|
||||
Those headers will also help when SpamAssassin is not able to correctly detect EnvelopeFrom.
|
||||
|
||||
=cut
|
||||
|
||||
package Mail::SpamAssassin::Plugin::SPF;
|
||||
@ -53,22 +59,25 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule ("check_for_spf_pass");
|
||||
$self->register_eval_rule ("check_for_spf_neutral");
|
||||
$self->register_eval_rule ("check_for_spf_none");
|
||||
$self->register_eval_rule ("check_for_spf_fail");
|
||||
$self->register_eval_rule ("check_for_spf_softfail");
|
||||
$self->register_eval_rule ("check_for_spf_permerror");
|
||||
$self->register_eval_rule ("check_for_spf_temperror");
|
||||
$self->register_eval_rule ("check_for_spf_helo_pass");
|
||||
$self->register_eval_rule ("check_for_spf_helo_neutral");
|
||||
$self->register_eval_rule ("check_for_spf_helo_none");
|
||||
$self->register_eval_rule ("check_for_spf_helo_fail");
|
||||
$self->register_eval_rule ("check_for_spf_helo_softfail");
|
||||
$self->register_eval_rule ("check_for_spf_helo_permerror");
|
||||
$self->register_eval_rule ("check_for_spf_helo_temperror");
|
||||
$self->register_eval_rule ("check_for_spf_whitelist_from");
|
||||
$self->register_eval_rule ("check_for_def_spf_whitelist_from");
|
||||
$self->register_eval_rule ("check_for_spf_pass", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_neutral", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_none", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_fail", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_softfail", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_permerror", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_temperror", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_pass", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_neutral", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_none", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_fail", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_softfail", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_permerror", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_helo_temperror", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_welcomelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_spf_whitelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS); # removed in 4.1
|
||||
$self->register_eval_rule ("check_for_def_spf_welcomelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
$self->register_eval_rule ("check_for_def_spf_whitelist_from", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS); # removed in 4.1
|
||||
$self->register_eval_rule ("check_spf_skipped_noenvfrom", $Mail::SpamAssassin::Conf::TYPE_HEAD_EVALS);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
@ -85,37 +94,43 @@ sub set_config {
|
||||
|
||||
=over 4
|
||||
|
||||
=item whitelist_from_spf user@example.com
|
||||
=item welcomelist_from_spf user@example.com
|
||||
|
||||
Works similarly to whitelist_from, except that in addition to matching
|
||||
a sender address, a check against the domain's SPF record must pass.
|
||||
The first parameter is an address to whitelist, and the second is a string
|
||||
to match the relay's rDNS.
|
||||
Previously whitelist_from_spf which will work interchangeably until 4.1.
|
||||
|
||||
Just like whitelist_from, multiple addresses per line, separated by spaces,
|
||||
are OK. Multiple C<whitelist_from_spf> lines are also OK.
|
||||
Works similarly to welcomelist_from, except that in addition to matching a
|
||||
sender address, a check against the domain's SPF record must pass. The
|
||||
first parameter is an address to welcomelist, and the second is a string to
|
||||
match the relay's rDNS.
|
||||
|
||||
The headers checked for whitelist_from_spf addresses are the same headers
|
||||
Just like welcomelist_from, multiple addresses per line, separated by
|
||||
spaces, are OK. Multiple C<welcomelist_from_spf> lines are also OK.
|
||||
|
||||
The headers checked for welcomelist_from_spf addresses are the same headers
|
||||
used for SPF checks (Envelope-From, Return-Path, X-Envelope-From, etc).
|
||||
|
||||
Since this whitelist requires an SPF check to be made, network tests must be
|
||||
Since this welcomelist requires an SPF check to be made, network tests must be
|
||||
enabled. It is also required that your trust path be correctly configured.
|
||||
See the section on C<trusted_networks> for more info on trust paths.
|
||||
|
||||
e.g.
|
||||
|
||||
whitelist_from_spf joe@example.com fred@example.com
|
||||
whitelist_from_spf *@example.com
|
||||
welcomelist_from_spf joe@example.com fred@example.com
|
||||
welcomelist_from_spf *@example.com
|
||||
|
||||
=item def_whitelist_from_spf user@example.com
|
||||
=item def_welcomelist_from_spf user@example.com
|
||||
|
||||
Same as C<whitelist_from_spf>, but used for the default whitelist entries
|
||||
in the SpamAssassin distribution. The whitelist score is lower, because
|
||||
Previously def_whitelist_from_spf which will work interchangeably until 4.1.
|
||||
|
||||
Same as C<welcomelist_from_spf>, but used for the default welcomelist entries
|
||||
in the SpamAssassin distribution. The welcomelist score is lower, because
|
||||
these are often targets for spammer spoofing.
|
||||
|
||||
=item unwhitelist_from_spf user@example.com
|
||||
=item unwelcomelist_from_spf user@example.com
|
||||
|
||||
Used to remove a C<whitelist_from_spf> or C<def_whitelist_from_spf> entry.
|
||||
Previously unwhitelist_from_spf which will work interchangeably until 4.1.
|
||||
|
||||
Used to remove a C<welcomelist_from_spf> or C<def_welcomelist_from_spf> entry.
|
||||
The specified email address has to match exactly the address previously used.
|
||||
|
||||
Useful for removing undesired default entries from a distributed configuration
|
||||
@ -124,17 +139,20 @@ by a local or site-specific configuration or by C<user_prefs>.
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'whitelist_from_spf',
|
||||
setting => 'welcomelist_from_spf',
|
||||
aliases => ['whitelist_from_spf'], # removed in 4.1
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'def_whitelist_from_spf',
|
||||
setting => 'def_welcomelist_from_spf',
|
||||
aliases => ['def_whitelist_from_spf'], # removed in 4.1
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST
|
||||
});
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'unwhitelist_from_spf',
|
||||
setting => 'unwelcomelist_from_spf',
|
||||
aliases => ['unwhitelist_from_spf'], # removed in 4.1
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
@ -144,9 +162,9 @@ by a local or site-specific configuration or by C<user_prefs>.
|
||||
unless ($value =~ /^(?:\S+(?:\s+\S+)*)$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
$self->{parser}->remove_from_addrlist('whitelist_from_spf',
|
||||
$self->{parser}->remove_from_addrlist('welcomelist_from_spf',
|
||||
split (/\s+/, $value));
|
||||
$self->{parser}->remove_from_addrlist('def_whitelist_from_spf',
|
||||
$self->{parser}->remove_from_addrlist('def_welcomelist_from_spf',
|
||||
split (/\s+/, $value));
|
||||
}
|
||||
});
|
||||
@ -173,38 +191,6 @@ days, weeks).
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_DURATION
|
||||
});
|
||||
|
||||
=item do_not_use_mail_spf (0|1) (default: 0)
|
||||
|
||||
By default the plugin will try to use the Mail::SPF module for SPF checks if
|
||||
it can be loaded. If Mail::SPF cannot be used the plugin will fall back to
|
||||
using the legacy Mail::SPF::Query module if it can be loaded.
|
||||
|
||||
Use this option to stop the plugin from using Mail::SPF and cause it to try to
|
||||
use Mail::SPF::Query instead.
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'do_not_use_mail_spf',
|
||||
is_admin => 1,
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
=item do_not_use_mail_spf_query (0|1) (default: 0)
|
||||
|
||||
As above, but instead stop the plugin from trying to use Mail::SPF::Query and
|
||||
cause it to only try to use Mail::SPF.
|
||||
|
||||
=cut
|
||||
|
||||
push(@cmds, {
|
||||
setting => 'do_not_use_mail_spf_query',
|
||||
is_admin => 1,
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
=item ignore_received_spf_header (0|1) (default: 0)
|
||||
|
||||
By default, to avoid unnecessary DNS lookups, the plugin will try to use the
|
||||
@ -250,10 +236,23 @@ working downwards until results are successfully parsed.
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
# Deprecated since 4.0.0, leave for backwards compatibility
|
||||
push(@cmds, {
|
||||
setting => 'do_not_use_mail_spf',
|
||||
is_admin => 1,
|
||||
default => 0,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
push(@cmds, {
|
||||
setting => 'do_not_use_mail_spf_query',
|
||||
is_admin => 1,
|
||||
default => 1,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_BOOL,
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
|
||||
=item has_check_for_spf_errors
|
||||
|
||||
Adds capability check for "if can()" for check_for_spf_permerror, check_for_spf_temperror, check_for_spf_helo_permerror and check_for_spf_helo_permerror
|
||||
@ -262,23 +261,39 @@ Adds capability check for "if can()" for check_for_spf_permerror, check_for_spf_
|
||||
|
||||
sub has_check_for_spf_errors { 1 }
|
||||
|
||||
=item has_check_spf_skipped_noenvfrom
|
||||
|
||||
Adds capability check for "if can()" for check_spf_skipped_noenvfrom
|
||||
|
||||
=cut
|
||||
|
||||
sub has_check_spf_skipped_noenvfrom { 1 }
|
||||
|
||||
sub parsed_metadata {
|
||||
my ($self, $opts) = @_;
|
||||
|
||||
$self->_get_sender($opts->{permsgstatus});
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
# SPF support
|
||||
sub check_for_spf_pass {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
$scanner->{spf_pass};
|
||||
return $scanner->{spf_pass} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_neutral {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
$scanner->{spf_neutral};
|
||||
return $scanner->{spf_neutral} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_none {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
$scanner->{spf_none};
|
||||
return $scanner->{spf_none} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_fail {
|
||||
@ -287,43 +302,43 @@ sub check_for_spf_fail {
|
||||
if ($scanner->{spf_failure_comment}) {
|
||||
$scanner->test_log ($scanner->{spf_failure_comment});
|
||||
}
|
||||
$scanner->{spf_fail};
|
||||
return $scanner->{spf_fail} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_softfail {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
$scanner->{spf_softfail};
|
||||
return $scanner->{spf_softfail} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_permerror {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
$scanner->{spf_permerror};
|
||||
return $scanner->{spf_permerror} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_temperror {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
$scanner->{spf_temperror};
|
||||
return $scanner->{spf_temperror} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_pass {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 1) unless $scanner->{spf_helo_checked};
|
||||
$scanner->{spf_helo_pass};
|
||||
return $scanner->{spf_helo_pass} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_neutral {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 1) unless $scanner->{spf_helo_checked};
|
||||
$scanner->{spf_helo_neutral};
|
||||
return $scanner->{spf_helo_neutral} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_none {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 1) unless $scanner->{spf_helo_checked};
|
||||
$scanner->{spf_helo_none};
|
||||
return $scanner->{spf_helo_none} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_fail {
|
||||
@ -332,38 +347,60 @@ sub check_for_spf_helo_fail {
|
||||
if ($scanner->{spf_helo_failure_comment}) {
|
||||
$scanner->test_log ($scanner->{spf_helo_failure_comment});
|
||||
}
|
||||
$scanner->{spf_helo_fail};
|
||||
return $scanner->{spf_helo_fail} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_softfail {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 1) unless $scanner->{spf_helo_checked};
|
||||
$scanner->{spf_helo_softfail};
|
||||
return $scanner->{spf_helo_softfail} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_permerror {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 1) unless $scanner->{spf_helo_checked};
|
||||
$scanner->{spf_helo_permerror};
|
||||
return $scanner->{spf_helo_permerror} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_helo_temperror {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf ($scanner, 1) unless $scanner->{spf_helo_checked};
|
||||
$scanner->{spf_helo_temperror};
|
||||
return $scanner->{spf_helo_temperror} ? 1 : 0;
|
||||
}
|
||||
|
||||
sub check_for_spf_whitelist_from {
|
||||
=over 4
|
||||
|
||||
=item check_spf_skipped_noenvfrom
|
||||
|
||||
Checks if SPF checks have been skipped because EnvelopeFrom cannot be determined.
|
||||
|
||||
=back
|
||||
|
||||
=cut
|
||||
|
||||
sub check_spf_skipped_noenvfrom {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_spf_whitelist($scanner) unless $scanner->{spf_whitelist_from_checked};
|
||||
$scanner->{spf_whitelist_from};
|
||||
$self->_check_spf ($scanner, 0) unless $scanner->{spf_checked};
|
||||
if (!exists $scanner->{spf_sender}) {
|
||||
return 1;
|
||||
} else {
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
sub check_for_def_spf_whitelist_from {
|
||||
sub check_for_spf_welcomelist_from {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_def_spf_whitelist($scanner) unless $scanner->{def_spf_whitelist_from_checked};
|
||||
$scanner->{def_spf_whitelist_from};
|
||||
$self->_check_spf_welcomelist($scanner) unless $scanner->{spf_welcomelist_from_checked};
|
||||
return $scanner->{spf_welcomelist_from} ? 1 : 0;
|
||||
}
|
||||
*check_for_spf_whitelist_from = \&check_for_spf_welcomelist_from; # removed in 4.1
|
||||
|
||||
sub check_for_def_spf_welcomelist_from {
|
||||
my ($self, $scanner) = @_;
|
||||
$self->_check_def_spf_welcomelist($scanner) unless $scanner->{def_spf_welcomelist_from_checked};
|
||||
return $scanner->{def_spf_welcomelist_from} ? 1 : 0;
|
||||
}
|
||||
*check_for_def_spf_whitelist_from = \&check_for_def_spf_welcomelist_from; # removed in 4.1
|
||||
|
||||
sub _check_spf {
|
||||
my ($self, $scanner, $ishelo) = @_;
|
||||
@ -387,7 +424,7 @@ sub _check_spf {
|
||||
$scanner->{checked_for_received_spf_header} = 1;
|
||||
dbg("spf: checking to see if the message has a Received-SPF header that we can use");
|
||||
|
||||
my @internal_hdrs = split("\n", $scanner->get('ALL-INTERNAL'));
|
||||
my @internal_hdrs = $scanner->get('ALL-INTERNAL');
|
||||
unless ($scanner->{conf}->{use_newest_received_spf_header}) {
|
||||
# look for the LAST (earliest in time) header, it'll be the most accurate
|
||||
@internal_hdrs = reverse(@internal_hdrs);
|
||||
@ -459,7 +496,7 @@ sub _check_spf {
|
||||
dbg("spf: could not parse result from existing Received-SPF header");
|
||||
}
|
||||
|
||||
} elsif ($hdr =~ /^Authentication-Results:.*;\s*SPF\s*=\s*([^;]*)/i) {
|
||||
} elsif ($hdr =~ /^(?:Arc\-)?Authentication-Results:.*;\s*SPF\s*=\s*([^;]*)/i) {
|
||||
dbg("spf: found an Authentication-Results header added by an internal host: $hdr");
|
||||
|
||||
# RFC 5451 header parser - added by D. Stussy 2010-09-09:
|
||||
@ -524,8 +561,6 @@ sub _check_spf {
|
||||
unless (defined $self->{has_mail_spf}) {
|
||||
my $eval_stat;
|
||||
eval {
|
||||
die("Mail::SPF disabled by admin setting\n") if $scanner->{conf}->{do_not_use_mail_spf};
|
||||
|
||||
require Mail::SPF;
|
||||
if (!defined $Mail::SPF::VERSION || $Mail::SPF::VERSION < 2.001) {
|
||||
die "Mail::SPF 2.001 or later required, this is ".
|
||||
@ -547,43 +582,18 @@ sub _check_spf {
|
||||
dbg("spf: using Mail::SPF for SPF checks");
|
||||
$self->{has_mail_spf} = 1;
|
||||
} else {
|
||||
# strip the @INC paths... users are going to see it and think there's a problem even though
|
||||
# we're going to fall back to Mail::SPF::Query (which will display the same paths if it fails)
|
||||
$eval_stat =~ s#^Can't locate Mail/SPFd.pm in \@INC .*#Can't locate Mail/SPFd.pm#;
|
||||
dbg("spf: cannot load Mail::SPF module or create Mail::SPF::Server object: $eval_stat");
|
||||
dbg("spf: attempting to use legacy Mail::SPF::Query module instead");
|
||||
|
||||
undef $eval_stat;
|
||||
eval {
|
||||
die("Mail::SPF::Query disabled by admin setting\n") if $scanner->{conf}->{do_not_use_mail_spf_query};
|
||||
|
||||
require Mail::SPF::Query;
|
||||
if (!defined $Mail::SPF::Query::VERSION || $Mail::SPF::Query::VERSION < 1.996) {
|
||||
die "Mail::SPF::Query 1.996 or later required, this is ".
|
||||
(defined $Mail::SPF::Query::VERSION ? $Mail::SPF::Query::VERSION : 'unknown')."\n";
|
||||
}
|
||||
1;
|
||||
} or do {
|
||||
$eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
};
|
||||
|
||||
if (!defined($eval_stat)) {
|
||||
dbg("spf: using Mail::SPF::Query for SPF checks");
|
||||
$self->{has_mail_spf} = 0;
|
||||
} else {
|
||||
dbg("spf: cannot load Mail::SPF::Query module: $eval_stat");
|
||||
dbg("spf: one of Mail::SPF or Mail::SPF::Query is required for SPF checks, SPF checks disabled");
|
||||
dbg("spf: cannot load Mail::SPF: module: $eval_stat");
|
||||
dbg("spf: Mail::SPF is required for SPF checks, SPF checks disabled");
|
||||
$self->{no_spf_module} = 1;
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
# skip SPF checks if the A/MX records are nonexistent for the From
|
||||
# domain, anyway, to avoid crappy messages from slowing us down
|
||||
# (bug 3016)
|
||||
return if $scanner->check_for_from_dns();
|
||||
# TODO: this will only work if the queries are ready before SPF, so never?
|
||||
return if $scanner->{sender_host_fail} && $scanner->{sender_host_fail} == 2;
|
||||
|
||||
if ($ishelo) {
|
||||
# SPF HELO-checking variant
|
||||
@ -609,7 +619,7 @@ sub _check_spf {
|
||||
$scanner->{spf_failure_comment} = undef;
|
||||
}
|
||||
|
||||
my $lasthop = $self->_get_relay($scanner);
|
||||
my $lasthop = $scanner->{relays_external}->[0];
|
||||
if (!defined $lasthop) {
|
||||
dbg("spf: no suitable relay for spf use found, skipping SPF%s check",
|
||||
$ishelo ? '-helo' : '');
|
||||
@ -618,7 +628,6 @@ sub _check_spf {
|
||||
|
||||
my $ip = $lasthop->{ip}; # always present
|
||||
my $helo = $lasthop->{helo}; # could be missing
|
||||
$scanner->{sender} = '' unless $scanner->{sender_got};
|
||||
|
||||
if ($ishelo) {
|
||||
unless ($helo) {
|
||||
@ -627,19 +636,17 @@ sub _check_spf {
|
||||
}
|
||||
dbg("spf: checking HELO (helo=$helo, ip=$ip)");
|
||||
} else {
|
||||
$self->_get_sender($scanner) unless $scanner->{sender_got};
|
||||
|
||||
# TODO: we're supposed to use the helo domain as the sender identity (for
|
||||
# mfrom checks) if the sender is the null sender, however determining that
|
||||
# it's the null sender, and not just a failure to get the envelope isn't
|
||||
# exactly trivial... so for now we'll just skip the check
|
||||
|
||||
if (!$scanner->{sender}) {
|
||||
if (!$scanner->{spf_sender}) {
|
||||
# we already dbg'd that we couldn't get an Envelope-From and can't do SPF
|
||||
return;
|
||||
}
|
||||
dbg("spf: checking EnvelopeFrom (helo=%s, ip=%s, envfrom=%s)",
|
||||
($helo ? $helo : ''), $ip, $scanner->{sender});
|
||||
($helo ? $helo : ''), $ip, $scanner->{spf_sender});
|
||||
}
|
||||
|
||||
# this test could probably stand to be more strict, but try to test
|
||||
@ -657,11 +664,8 @@ sub _check_spf {
|
||||
|
||||
my ($result, $comment, $text, $err);
|
||||
|
||||
# use Mail::SPF if it was available, otherwise use the legacy Mail::SPF::Query
|
||||
if ($self->{has_mail_spf}) {
|
||||
|
||||
# TODO: currently we won't get to here for a mfrom check with a null sender
|
||||
my $identity = $ishelo ? $helo : ($scanner->{sender}); # || $helo);
|
||||
my $identity = $ishelo ? $helo : ($scanner->{spf_sender}); # || $helo);
|
||||
|
||||
unless ($identity) {
|
||||
dbg("spf: cannot determine %s identity, skipping %s SPF check",
|
||||
@ -685,61 +689,21 @@ sub _check_spf {
|
||||
|
||||
my $timeout = $scanner->{conf}->{spf_timeout};
|
||||
|
||||
my $timer = Mail::SpamAssassin::Timeout->new(
|
||||
my $timer_spf = Mail::SpamAssassin::Timeout->new(
|
||||
{ secs => $timeout, deadline => $scanner->{master_deadline} });
|
||||
$err = $timer->run_and_catch(sub {
|
||||
|
||||
$err = $timer_spf->run_and_catch(sub {
|
||||
my $query = $self->{spf_server}->process($request);
|
||||
|
||||
$result = $query->code;
|
||||
$comment = $query->authority_explanation if $query->can("authority_explanation");
|
||||
$text = $query->text;
|
||||
|
||||
});
|
||||
|
||||
|
||||
} else {
|
||||
|
||||
if (!$helo) {
|
||||
dbg("spf: cannot get HELO, cannot use Mail::SPF::Query, consider installing Mail::SPF");
|
||||
return;
|
||||
}
|
||||
|
||||
# TODO: if we start doing checks on the null sender using the helo domain
|
||||
# be sure to fix this so that it uses the correct sender identity
|
||||
my $query;
|
||||
eval {
|
||||
$query = Mail::SPF::Query->new (ip => $ip,
|
||||
sender => $scanner->{sender},
|
||||
helo => $helo,
|
||||
debug => 0,
|
||||
trusted => 0);
|
||||
1;
|
||||
} or do {
|
||||
my $eval_stat = $@ ne '' ? $@ : "errno=$!"; chomp $eval_stat;
|
||||
dbg("spf: cannot create Mail::SPF::Query object: $eval_stat");
|
||||
return;
|
||||
};
|
||||
|
||||
my $timeout = $scanner->{conf}->{spf_timeout};
|
||||
|
||||
my $timer = Mail::SpamAssassin::Timeout->new(
|
||||
{ secs => $timeout, deadline => $scanner->{master_deadline} });
|
||||
$err = $timer->run_and_catch(sub {
|
||||
|
||||
($result, $comment) = $query->result();
|
||||
|
||||
});
|
||||
|
||||
} # end of differences between Mail::SPF and Mail::SPF::Query
|
||||
|
||||
if ($err) {
|
||||
chomp $err;
|
||||
warn("spf: lookup failed: $err\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
$result ||= 'timeout'; # bug 5077
|
||||
$comment ||= '';
|
||||
$comment =~ s/\s+/ /gs; # no newlines please
|
||||
@ -774,138 +738,131 @@ sub _check_spf {
|
||||
}
|
||||
}
|
||||
|
||||
dbg("spf: query for $scanner->{sender}/$ip/$helo: result: $result, comment: $comment, text: $text");
|
||||
}
|
||||
|
||||
sub _get_relay {
|
||||
my ($self, $scanner) = @_;
|
||||
|
||||
# dos: first external relay, not first untrusted
|
||||
return $scanner->{relays_external}->[0];
|
||||
if ($ishelo) {
|
||||
dbg("spf: query for $ip/$helo: result: $result, comment: $comment, text: $text");
|
||||
} else {
|
||||
dbg("spf: query for $scanner->{spf_sender}/$ip/$helo: result: $result, comment: $comment, text: $text");
|
||||
}
|
||||
}
|
||||
|
||||
sub _get_sender {
|
||||
my ($self, $scanner) = @_;
|
||||
my $sender;
|
||||
|
||||
$scanner->{sender_got} = 1;
|
||||
$scanner->{sender} = '';
|
||||
|
||||
my $relay = $self->_get_relay($scanner);
|
||||
my $relay = $scanner->{relays_external}->[0];
|
||||
if (defined $relay) {
|
||||
$sender = $relay->{envfrom};
|
||||
my $sender = $relay->{envfrom};
|
||||
if (defined $sender) {
|
||||
dbg("spf: found EnvelopeFrom '$sender' in first external Received header");
|
||||
$scanner->{spf_sender} = lc $sender;
|
||||
} else {
|
||||
dbg("spf: EnvelopeFrom not found in first external Received header");
|
||||
}
|
||||
}
|
||||
|
||||
if ($sender) {
|
||||
dbg("spf: found Envelope-From in first external Received header");
|
||||
}
|
||||
else {
|
||||
if (!exists $scanner->{spf_sender}) {
|
||||
# We cannot use the env-from data, since it went through 1 or more relays
|
||||
# since the untrusted sender and they may have rewritten it.
|
||||
if ($scanner->{num_relays_trusted} > 0 && !$scanner->{conf}->{always_trust_envelope_sender}) {
|
||||
dbg("spf: relayed through one or more trusted relays, cannot use header-based Envelope-From, skipping");
|
||||
return;
|
||||
}
|
||||
|
||||
# we can (apparently) use whatever the current Envelope-From was,
|
||||
if ($scanner->{num_relays_trusted} > 0 &&
|
||||
!$scanner->{conf}->{always_trust_envelope_sender}) {
|
||||
dbg("spf: relayed through one or more trusted relays, ".
|
||||
"cannot use header-based EnvelopeFrom");
|
||||
} else {
|
||||
# we can (apparently) use whatever the current EnvelopeFrom was,
|
||||
# from the Return-Path, X-Envelope-From, or whatever header.
|
||||
# it's better to get it from Received though, as that is updated
|
||||
# hop-by-hop.
|
||||
$sender = $scanner->get("EnvelopeFrom:addr");
|
||||
}
|
||||
|
||||
if (!$sender) {
|
||||
dbg("spf: cannot get Envelope-From, cannot use SPF");
|
||||
return; # avoid setting $scanner->{sender} to undef
|
||||
}
|
||||
|
||||
return $scanner->{sender} = lc $sender;
|
||||
}
|
||||
|
||||
sub _check_spf_whitelist {
|
||||
my ($self, $scanner) = @_;
|
||||
|
||||
$scanner->{spf_whitelist_from_checked} = 1;
|
||||
$scanner->{spf_whitelist_from} = 0;
|
||||
|
||||
# if we've already checked for an SPF PASS and didn't get it don't waste time
|
||||
# checking to see if the sender address is in the spf whitelist
|
||||
if ($scanner->{spf_checked} && !$scanner->{spf_pass}) {
|
||||
dbg("spf: whitelist_from_spf: already checked spf and didn't get pass, skipping whitelist check");
|
||||
return;
|
||||
}
|
||||
|
||||
$self->_get_sender($scanner) unless $scanner->{sender_got};
|
||||
|
||||
unless ($scanner->{sender}) {
|
||||
dbg("spf: spf_whitelist_from: could not find usable envelope sender");
|
||||
return;
|
||||
}
|
||||
|
||||
$scanner->{spf_whitelist_from} = $self->_wlcheck($scanner,'whitelist_from_spf');
|
||||
if (!$scanner->{spf_whitelist_from}) {
|
||||
$scanner->{spf_whitelist_from} = $self->_wlcheck($scanner, 'whitelist_auth');
|
||||
}
|
||||
|
||||
# if the message doesn't pass SPF validation, it can't pass an SPF whitelist
|
||||
if ($scanner->{spf_whitelist_from}) {
|
||||
if ($self->check_for_spf_pass($scanner)) {
|
||||
dbg("spf: whitelist_from_spf: $scanner->{sender} is in user's WHITELIST_FROM_SPF and passed SPF check");
|
||||
my $sender = ($scanner->get("EnvelopeFrom:addr"))[0];
|
||||
if (defined $sender) {
|
||||
dbg("spf: found EnvelopeFrom '$sender' from header");
|
||||
$scanner->{spf_sender} = lc $sender;
|
||||
} else {
|
||||
dbg("spf: whitelist_from_spf: $scanner->{sender} is in user's WHITELIST_FROM_SPF but failed SPF check");
|
||||
$scanner->{spf_whitelist_from} = 0;
|
||||
dbg("spf: EnvelopeFrom header not found");
|
||||
}
|
||||
} else {
|
||||
dbg("spf: whitelist_from_spf: $scanner->{sender} is not in user's WHITELIST_FROM_SPF");
|
||||
}
|
||||
}
|
||||
|
||||
if (!exists $scanner->{spf_sender}) {
|
||||
dbg("spf: cannot get EnvelopeFrom, cannot use SPF by DNS");
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_def_spf_whitelist {
|
||||
sub _check_spf_welcomelist {
|
||||
my ($self, $scanner) = @_;
|
||||
|
||||
$scanner->{def_spf_whitelist_from_checked} = 1;
|
||||
$scanner->{def_spf_whitelist_from} = 0;
|
||||
$scanner->{spf_welcomelist_from_checked} = 1;
|
||||
$scanner->{spf_welcomelist_from} = 0;
|
||||
|
||||
# if we've already checked for an SPF PASS and didn't get it don't waste time
|
||||
# checking to see if the sender address is in the spf whitelist
|
||||
# checking to see if the sender address is in the spf welcomelist
|
||||
if ($scanner->{spf_checked} && !$scanner->{spf_pass}) {
|
||||
dbg("spf: def_spf_whitelist_from: already checked spf and didn't get pass, skipping whitelist check");
|
||||
dbg("spf: welcomelist_from_spf: already checked spf and didn't get pass, skipping welcomelist check");
|
||||
return;
|
||||
}
|
||||
|
||||
$self->_get_sender($scanner) unless $scanner->{sender_got};
|
||||
|
||||
unless ($scanner->{sender}) {
|
||||
dbg("spf: def_spf_whitelist_from: could not find usable envelope sender");
|
||||
if (!$scanner->{spf_sender}) {
|
||||
dbg("spf: spf_welcomelist_from: no EnvelopeFrom available for welcomelist check");
|
||||
return;
|
||||
}
|
||||
|
||||
$scanner->{def_spf_whitelist_from} = $self->_wlcheck($scanner,'def_whitelist_from_spf');
|
||||
if (!$scanner->{def_spf_whitelist_from}) {
|
||||
$scanner->{def_spf_whitelist_from} = $self->_wlcheck($scanner, 'def_whitelist_auth');
|
||||
}
|
||||
$scanner->{spf_welcomelist_from} =
|
||||
$self->_wlcheck($scanner, 'welcomelist_from_spf') ||
|
||||
$self->_wlcheck($scanner, 'welcomelist_auth');
|
||||
|
||||
# if the message doesn't pass SPF validation, it can't pass an SPF whitelist
|
||||
if ($scanner->{def_spf_whitelist_from}) {
|
||||
# if the message doesn't pass SPF validation, it can't pass an SPF welcomelist
|
||||
if ($scanner->{spf_welcomelist_from}) {
|
||||
if ($self->check_for_spf_pass($scanner)) {
|
||||
dbg("spf: def_whitelist_from_spf: $scanner->{sender} is in DEF_WHITELIST_FROM_SPF and passed SPF check");
|
||||
dbg("spf: welcomelist_from_spf: $scanner->{spf_sender} is in user's WELCOMELIST_FROM_SPF and passed SPF check");
|
||||
} else {
|
||||
dbg("spf: def_whitelist_from_spf: $scanner->{sender} is in DEF_WHITELIST_FROM_SPF but failed SPF check");
|
||||
$scanner->{def_spf_whitelist_from} = 0;
|
||||
dbg("spf: welcomelist_from_spf: $scanner->{spf_sender} is in user's WELCOMELIST_FROM_SPF but failed SPF check");
|
||||
$scanner->{spf_welcomelist_from} = 0;
|
||||
}
|
||||
} else {
|
||||
dbg("spf: def_whitelist_from_spf: $scanner->{sender} is not in DEF_WHITELIST_FROM_SPF");
|
||||
dbg("spf: welcomelist_from_spf: $scanner->{spf_sender} is not in user's WELCOMELIST_FROM_SPF");
|
||||
}
|
||||
}
|
||||
|
||||
sub _check_def_spf_welcomelist {
|
||||
my ($self, $scanner) = @_;
|
||||
|
||||
$scanner->{def_spf_welcomelist_from_checked} = 1;
|
||||
$scanner->{def_spf_welcomelist_from} = 0;
|
||||
|
||||
# if we've already checked for an SPF PASS and didn't get it don't waste time
|
||||
# checking to see if the sender address is in the spf welcomelist
|
||||
if ($scanner->{spf_checked} && !$scanner->{spf_pass}) {
|
||||
dbg("spf: def_spf_welcomelist_from: already checked spf and didn't get pass, skipping welcomelist check");
|
||||
return;
|
||||
}
|
||||
|
||||
if (!$scanner->{spf_sender}) {
|
||||
dbg("spf: def_spf_welcomelist_from: could not find usable envelope sender");
|
||||
return;
|
||||
}
|
||||
|
||||
$scanner->{def_spf_welcomelist_from} =
|
||||
$self->_wlcheck($scanner, 'def_welcomelist_from_spf') ||
|
||||
$self->_wlcheck($scanner, 'def_welcomelist_auth');
|
||||
|
||||
# if the message doesn't pass SPF validation, it can't pass an SPF welcomelist
|
||||
if ($scanner->{def_spf_welcomelist_from}) {
|
||||
if ($self->check_for_spf_pass($scanner)) {
|
||||
dbg("spf: def_welcomelist_from_spf: $scanner->{spf_sender} is in DEF_WELCOMELIST_FROM_SPF and passed SPF check");
|
||||
} else {
|
||||
dbg("spf: def_welcomelist_from_spf: $scanner->{spf_sender} is in DEF_WELCOMELIST_FROM_SPF but failed SPF check");
|
||||
$scanner->{def_spf_welcomelist_from} = 0;
|
||||
}
|
||||
} else {
|
||||
dbg("spf: def_welcomelist_from_spf: $scanner->{spf_sender} is not in DEF_WELCOMELIST_FROM_SPF");
|
||||
}
|
||||
}
|
||||
|
||||
sub _wlcheck {
|
||||
my ($self, $scanner, $param) = @_;
|
||||
if (defined ($scanner->{conf}->{$param}->{$scanner->{sender}})) {
|
||||
if (defined ($scanner->{conf}->{$param}->{$scanner->{spf_sender}})) {
|
||||
return 1;
|
||||
} else {
|
||||
study $scanner->{sender}; # study is a no-op since perl 5.16.0
|
||||
foreach my $regexp (values %{$scanner->{conf}->{$param}}) {
|
||||
if ($scanner->{sender} =~ qr/$regexp/i) {
|
||||
if ($scanner->{spf_sender} =~ $regexp) {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
@ -58,7 +58,7 @@ sub new {
|
||||
my $self = $class->SUPER::new($mailsaobject);
|
||||
bless ($self, $class);
|
||||
|
||||
$self->register_eval_rule("check_shortcircuit");
|
||||
$self->register_eval_rule("check_shortcircuit"); # type does not matter
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
return $self;
|
||||
@ -88,6 +88,14 @@ that.
|
||||
To override a test that uses shortcircuiting, you can set the classification
|
||||
type to C<off>.
|
||||
|
||||
Note that DNS and other network lookups are launched when SA reaches
|
||||
priority -100. If you want to shortcircuit scanning before any network
|
||||
queries are sent, you need to set lower than -100 priority to any such rule,
|
||||
like -200 as in the examples below.
|
||||
|
||||
Shortcircuited test will be automatically set to priority -200, but only if
|
||||
the original priority is unchanged at default 0.
|
||||
|
||||
=over 4
|
||||
|
||||
=item on
|
||||
@ -99,7 +107,7 @@ shortcircuited. This would allow you, for example, to define a rule such as
|
||||
body TEST /test/
|
||||
describe TEST test rule that scores barely over spam threshold
|
||||
score TEST 5.5
|
||||
priority TEST -100
|
||||
priority TEST -200
|
||||
shortcircuit TEST on
|
||||
|
||||
The result of a message hitting the above rule would be a final score of 5.5,
|
||||
@ -113,11 +121,11 @@ Disables shortcircuiting on said rule.
|
||||
|
||||
Shortcircuit the rule using a set of defaults; override the default score of
|
||||
this rule with the score from C<shortcircuit_spam_score>, set the
|
||||
C<noautolearn> tflag, and set priority to C<-100>. In other words,
|
||||
C<noautolearn> tflag, and set priority to C<-200>. In other words,
|
||||
equivalent to:
|
||||
|
||||
shortcircuit TEST on
|
||||
priority TEST -100
|
||||
priority TEST -200
|
||||
score TEST 100
|
||||
tflags TEST noautolearn
|
||||
|
||||
@ -125,11 +133,11 @@ equivalent to:
|
||||
|
||||
Shortcircuit the rule using a set of defaults; override the default score of
|
||||
this rule with the score from C<shortcircuit_ham_score>, set the C<noautolearn>
|
||||
and C<nice> tflags, and set priority to C<-100>. In other words, equivalent
|
||||
and C<nice> tflags, and set priority to C<-200>. In other words, equivalent
|
||||
to:
|
||||
|
||||
shortcircuit TEST on
|
||||
priority TEST -100
|
||||
priority TEST -200
|
||||
score TEST -100
|
||||
tflags TEST noautolearn nice
|
||||
|
||||
@ -141,23 +149,22 @@ to:
|
||||
setting => 'shortcircuit',
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
my ($rule,$type);
|
||||
unless (defined $value && $value !~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
if ($value =~ /^(\S+)\s+(\S+)$/) {
|
||||
$rule=$1;
|
||||
$type=$2;
|
||||
} else {
|
||||
local($1,$2);
|
||||
unless ($value =~ /^(\w+)\s+(\w+)$/) {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
my ($rule, $type) = ($1, $2);
|
||||
|
||||
if ($type =~ m/^(?:spam|ham)$/) {
|
||||
if ($type eq "ham" || $type eq "spam") {
|
||||
dbg("shortcircuit: adding $rule using abbreviation $type");
|
||||
|
||||
# set the defaults:
|
||||
$self->{shortcircuit}->{$rule} = $type;
|
||||
$self->{priority}->{$rule} = -100;
|
||||
# don't override existing priority unless it's default 0
|
||||
$self->{priority}->{$rule} ||= -200;
|
||||
|
||||
my $tf = $self->{tflags}->{$rule};
|
||||
$self->{tflags}->{$rule} = ($tf ? $tf." " : "") .
|
||||
@ -227,7 +234,7 @@ sub hit_rule {
|
||||
my $rule = $params->{rulename};
|
||||
|
||||
# don't s/c if we're linting
|
||||
return if ($scan->{lint_rules});
|
||||
return if ($self->{main}->{lint_rules});
|
||||
|
||||
# don't s/c if we're in compile_now()
|
||||
return if ($self->{am_compiling});
|
||||
@ -256,6 +263,8 @@ sub hit_rule {
|
||||
$scscore = $score;
|
||||
}
|
||||
|
||||
$scan->{shortcircuited} = 1;
|
||||
|
||||
# bug 5256: if we short-circuit, don't do auto-learning
|
||||
$scan->{disable_auto_learning} = 1;
|
||||
$scan->got_hit('SHORTCIRCUIT', '', score => $scscore);
|
||||
@ -306,6 +315,6 @@ sub compile_now_finish {
|
||||
|
||||
=head1 SEE ALSO
|
||||
|
||||
C<http://issues.apache.org/SpamAssassin/show_bug.cgi?id=3109>
|
||||
C<https://issues.apache.org/SpamAssassin/show_bug.cgi?id=3109>
|
||||
|
||||
=cut
|
||||
|
@ -29,13 +29,13 @@ SpamCop is a service for reporting spam. SpamCop determines the origin
|
||||
of unwanted email and reports it to the relevant Internet service
|
||||
providers. By reporting spam, you have a positive impact on the
|
||||
problem. Reporting unsolicited email also helps feed spam filtering
|
||||
systems, including, but not limited to, the SpamCop blacklist used in
|
||||
systems, including, but not limited to, the SpamCop blocklist used in
|
||||
SpamAssassin as a DNSBL.
|
||||
|
||||
Note that spam reports sent by this plugin to SpamCop each include the
|
||||
entire spam message.
|
||||
|
||||
See http://www.spamcop.net/ for more information about SpamCop.
|
||||
See https://www.spamcop.net/ for more information about SpamCop.
|
||||
|
||||
=cut
|
||||
|
||||
@ -43,13 +43,13 @@ package Mail::SpamAssassin::Plugin::SpamCop;
|
||||
|
||||
use Mail::SpamAssassin::Plugin;
|
||||
use Mail::SpamAssassin::Logger;
|
||||
use Mail::SpamAssassin::Util qw(untaint_var);
|
||||
use IO::Socket;
|
||||
use strict;
|
||||
use warnings;
|
||||
# use bytes;
|
||||
use re 'taint';
|
||||
|
||||
use constant HAS_NET_DNS => eval { require Net::DNS; };
|
||||
use constant HAS_NET_SMTP => eval { require Net::SMTP; };
|
||||
|
||||
our @ISA = qw(Mail::SpamAssassin::Plugin);
|
||||
@ -63,7 +63,7 @@ sub new {
|
||||
bless ($self, $class);
|
||||
|
||||
# are network tests enabled?
|
||||
if (!$mailsaobject->{local_tests_only} && HAS_NET_DNS && HAS_NET_SMTP) {
|
||||
if (!$mailsaobject->{local_tests_only} && HAS_NET_SMTP) {
|
||||
$self->{spamcop_available} = 1;
|
||||
dbg("reporter: network tests on, attempting SpamCop");
|
||||
}
|
||||
@ -114,7 +114,7 @@ guess will be used as the From: address in SpamCop reports.
|
||||
=item spamcop_to_address user@example.com (default: generic reporting address)
|
||||
|
||||
Your customized SpamCop report submission address. You need to obtain
|
||||
this address by registering at C<http://www.spamcop.net/>. If this is
|
||||
this address by registering at C<https://www.spamcop.net/>. If this is
|
||||
not set, SpamCop reports will go to a generic reporting address for
|
||||
SpamAssassin users and your reports will probably have less weight in
|
||||
the SpamCop system.
|
||||
@ -153,6 +153,35 @@ size that SpamCop will accept at the time of release.
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_NUMERIC
|
||||
});
|
||||
|
||||
=item spamcop_relayhost server:port (default: direct connection to SpamCop)
|
||||
|
||||
Direct connection to SpamCop servers (port 587) is used for report
|
||||
submission by default. If this is undesirable or blocked by local firewall
|
||||
policies, you can specify a local SMTP relayhost to forward reports.
|
||||
Relayhost should be configured to not scan the report, for example by using
|
||||
a separate submission port. SSL or authentication is not supported.
|
||||
|
||||
=cut
|
||||
|
||||
push (@cmds, {
|
||||
setting => 'spamcop_relayhost',
|
||||
default => undef,
|
||||
type => $Mail::SpamAssassin::Conf::CONF_TYPE_STRING,
|
||||
code => sub {
|
||||
my ($self, $key, $value, $line) = @_;
|
||||
if ($value =~ /^(\S+):(\d{2,5})$/) {
|
||||
$self->{spamcop_relayhost} = untaint_var($1);
|
||||
$self->{spamcop_relayport} = untaint_var($2);
|
||||
}
|
||||
elsif ($value =~ /^$/) {
|
||||
return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
|
||||
}
|
||||
else {
|
||||
return $Mail::SpamAssassin::Conf::INVALID_VALUE;
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
$conf->{parser}->register_commands(\@cmds);
|
||||
}
|
||||
|
||||
@ -257,17 +286,31 @@ EOM
|
||||
|
||||
# send message
|
||||
my $failure;
|
||||
my $mx = $head{To};
|
||||
my $hello = Mail::SpamAssassin::Util::fq_hostname() || $from;
|
||||
$mx =~ s/.*\@//;
|
||||
$hello =~ s/.*\@//;
|
||||
for my $rr (Net::DNS::mx($mx)) {
|
||||
my $exchange = Mail::SpamAssassin::Util::untaint_hostname($rr->exchange);
|
||||
next unless $exchange;
|
||||
my $smtp;
|
||||
if ($smtp = Net::SMTP->new($exchange,
|
||||
|
||||
my @mxs;
|
||||
if ($options->{report}->{conf}->{spamcop_relayhost}) {
|
||||
push @mxs, $options->{report}->{conf}->{spamcop_relayhost};
|
||||
} else {
|
||||
my $mx = $head{To};
|
||||
$mx =~ s/.*\@//;
|
||||
foreach my $rr (Net::DNS::mx($mx)) {
|
||||
if (defined $rr->exchange) {
|
||||
push @mxs, Mail::SpamAssassin::Util::untaint_hostname($rr->exchange);
|
||||
}
|
||||
}
|
||||
if (!@mxs) {
|
||||
warn("reporter: failed to resolve SpamCop MX servers\n");
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
my $port = $options->{report}->{conf}->{spamcop_relayport} || 587;
|
||||
|
||||
for my $exchange (@mxs) {
|
||||
if (my $smtp = Net::SMTP->new($exchange,
|
||||
Hello => $hello,
|
||||
Port => 587,
|
||||
Port => $port,
|
||||
Timeout => 10))
|
||||
{
|
||||
if ($smtp->mail($from) && smtp_dbg("FROM $from", $smtp) &&
|
||||
|
@ -87,8 +87,8 @@ sub new {
|
||||
}
|
||||
}
|
||||
|
||||
$self->register_eval_rule("check_language");
|
||||
$self->register_eval_rule("check_body_8bits");
|
||||
$self->register_eval_rule("check_language"); # type does not matter
|
||||
$self->register_eval_rule("check_body_8bits", $Mail::SpamAssassin::Conf::TYPE_BODY_EVALS);
|
||||
|
||||
$self->set_config($mailsaobject->{conf});
|
||||
|
||||
@ -114,7 +114,7 @@ confidence. In that case, no action is taken.
|
||||
|
||||
The rule C<UNWANTED_LANGUAGE_BODY> is triggered if none of the languages
|
||||
detected are in the "ok" list. Note that this is the only effect of the
|
||||
"ok" list. It does not act as a whitelist against any other form of spam
|
||||
"ok" list. It does not act as a welcomelist against any other form of spam
|
||||
scanning.
|
||||
|
||||
In your configuration, you must use the two or three letter language
|
||||
@ -406,7 +406,7 @@ sub load_models {
|
||||
# create language ngram maps once
|
||||
for (@lm) {
|
||||
# look for end delimiter
|
||||
if (/^0 (.+)/) {
|
||||
if (index($_, '0 ') == 0 && /^0 (.+)/) {
|
||||
$ngram->{"language"} = $1;
|
||||
push(@nm, $ngram);
|
||||
# reset for next language
|
||||
@ -449,7 +449,14 @@ sub classify {
|
||||
$p += exists($ngram->{$_}) ? abs($ngram->{$_} - $i) : $maxp;
|
||||
$i++;
|
||||
}
|
||||
$results{$language} = $p;
|
||||
# Most latin1 languages have xx and xx.utf8 alternatives (those which
|
||||
# don't have should be named xx.utf-8). Always strip .utf8 from name,
|
||||
# it will not be accurate as matching will depend on normalize_charset
|
||||
# and mail encoding. Keep track of the best score for alternatives.
|
||||
$language = $short if index($language, '.utf8') > 0;
|
||||
if (!exists $results{$language} || $results{$language} > $p) {
|
||||
$results{$language} = $p
|
||||
}
|
||||
}
|
||||
my @results = sort { $results{$a} <=> $results{$b} } keys %results;
|
||||
|
||||
@ -459,7 +466,11 @@ sub classify {
|
||||
my @results_tag;
|
||||
foreach (@results[0..19]) {
|
||||
last unless defined $_;
|
||||
if($best != 0) {
|
||||
push @results_tag, sprintf "%s:%s(%.02f)", $_, $results{$_}, $results{$_} / $best;
|
||||
} else {
|
||||
push @results_tag, sprintf "%s:%s(unknown)", $_, $results{$_};
|
||||
}
|
||||
}
|
||||
$opts->{permsgstatus}->set_tag('TEXTCATRESULTS', join(' ', @results_tag));
|
||||
|
||||
@ -539,11 +550,14 @@ sub extract_metadata {
|
||||
|
||||
my $body = $msg->get_rendered_body_text_array();
|
||||
$body = join("\n", @{$body});
|
||||
$body =~ s/^Subject://i;
|
||||
|
||||
# Strip subject prefixes, enhances results
|
||||
$body =~ s/^(?:[a-z]{2,12}:\s*){1,10}//i;
|
||||
|
||||
# Strip anything that looks like url or email, enhances results
|
||||
$body =~ s{https?://\S+}{ }gs;
|
||||
$body =~ s{\S+?\@[a-zA-Z]\S+}{ }gs;
|
||||
$body =~ s/https?(?:\:\/\/|://|%3A%2F%2F)\S{1,1024}/ /gs;
|
||||
$body =~ s/\S{1,64}?\@[a-zA-Z]\S{1,128}/ /gs;
|
||||
$body =~ s/\bwww\.\S{1,128}/ /gs;
|
||||
|
||||
my $len = length($body);
|
||||
# truncate after 10k; that should be plenty to classify it
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue
Block a user