AntiSpam Detection - “X-RMX-Spam” header content
AntiSpam Detection, part of Retarus Email Security, calculates a spam probability for every Inbound email. Based on this probability, you as an administrator can define actions to be carried out for the email, e.g., adding a subject tag, storing the email in the user’s quarantine, or discarding it. For detailed information about the available settings, refer to Administrator Manual - Email Security.
The goal of this manual is to give you a better understanding of how the spam probability is calculated by analyzing the value of the custom “X-RMX-Spam” header added to every Inbound email.
The following chapters provide general information about the structure of the header field’s value, as well as a list of the most common patterns found there together with their meaning, since not all of them are self-explanatory.
Using this documentation, you will be able to understand the most common reasons for a higher spam probability, answer your end users’ questions about the spam rating, and, where necessary, take action to avoid a high spam probability rating for desired Inbound emails.
Also, the reasons for an email being classified as a “newsletter” are included in this document.
This manual does not cover all patterns and their meaning, only the most common ones. The listed patterns indicate the reason for the spam probability, but they cannot reveal the exact phrases or words that caused a pattern to match. The patterns and phrases are updated constantly, and normally only a combination of multiple characteristics adds a certain percentage to the final result.
If you need more detailed information in a specific case, the Retarus analysts have to investigate it individually. In these cases, please contact Retarus Support.
Please note that Retarus may change the structure and content of the X-RMX-Spam header without prior notice. Retarus’ liability for any information contained in this manual is excluded, if and to the extent not arising from Retarus’ willful misconduct or gross negligence.
Spam header information and spam probability calculation
General information
For every Inbound email that is processed by the Retarus AntiSpam Detection, the custom header field X-RMX-Spam is inserted in the mail header. Its value contains the final spam probability, as well as the characteristics that have been taken into consideration for calculating the score (and the classification as “newsletter” if applicable).
Structure of the X-RMX-Spam header
Example header
X-RMX-Spam: Gauge=XXXXXXXXXIIIIII, Probability=96%, Report='SPF_SOFTFAIL! 2.500,
NO_MESSAGE_ID! 1.600, RDNS_SUSP! 1.500, FORGED_FROM_GMAIL! 1.000, DATE_MISSING!
0.800, NO_REAL_NAME! 0.200, RMX_CAMPAIGN2_R3+ 0, RMX_CAMPAIGN4_R3+ 0,
RMX_RQ_100_LOW+ 0, RDNS_SUSP_FORGED_FROM 3.5, SUBJ_1WORD 0.1, HTML_00_01 0.05,
HTML_00_10 0.05, MIME_TEXT_ONLY_MP_MIXED 0.05, BODYTEXTP_SIZE_3000_LESS 0,
BODYTEXTP_SIZE_400_LESS 0, BODY_SIZE_1000_LESS 0, BODY_SIZE_100_199 0,
BODY_SIZE_2000_LESS 0, BODY_SIZE_5000_LESS 0, BODY_SIZE_7000_LESS 0,
NO_CTA_URI_FOUND 0, NO_URI_FOUND 0, NO_URI_HTTPS 0, RDNS_NXDOMAIN 0,
RDNS_SUSP_GENERIC 0, SMALL_BODY 0, __BODY_NO_MAILTO 0, __CT 0,
__CTYPE_HAS_BOUNDARY 0, __CTYPE_MULTIPART 0, __CTYPE_MULTIPART_MIXED 0,
__FRAUD_WEBMAIL 0, __FRAUD_WEBMAIL_FROM 0, __FROM_GMAIL 0, __HAS_FROM 0,
__MIME_TEXT_ONLY 0, __MIME_TEXT_P 0, __MIME_TEXT_P1 0, __MIME_TEXT_P2 0,
__MIME_VERSION 0, __NO_HTML_TAG_RAW 0, __PHISH_SPEAR_STRUCTURE_1 0,
__TO_MALFORMED_2 0, __TO_NO_NAME 0, __bl.spamcop.net_NOTLISTED,
__dnsbl.sorbs.net_NOTLISTED, __ix.dnsbl.manitu.net_NOTLISTED'
The X-RMX-Spam header value consists of different parts (see example above):
Probability: Calculated spam probability between 0 and 100%, including percentage points added or subtracted due to the authentication results of the email (settings in myEAS portal for DKIM/SPF check). “Probability” is the value that is used by Retarus AntiSpam Detection and compared to the configured thresholds in order to decide about passing the filter, tagging the subject or putting an email in quarantine.
Gauge: Legacy part of the header. “Gauge” shows the calculated spam probability in a graphical format loosely based on Roman numerals: one “X” for every 10 percentage points and one “I” for each remaining single percentage point. In some cases, “Gauge” differs from “Probability”, because the authentication results (settings in myEAS portal for DKIM/SPF check) are not taken into consideration for the “Gauge”. Therefore, “Probability” remains the relevant value for further processing.
Report: List of all “rules” of Retarus AntiSpam Detection that have been matched. A “rule” normally consists of multiple characteristics/components/patterns identified in the email.
Rules starting with underscores are “sub-rules” that do not add a spam percentage on their own, but have been identified and are taken into consideration for the spam probability calculation in combination with other sub-rules.
Numbers after a rule signify the weight of this rule (=combined patterns) used for the calculation of the spam probability. The numbers are not converted 1:1 to the spam probability percentage, but these weights are summed up to a “score” which is then used for the conversion (see the calculation of the spam probability below). Sub-rules have a “0” because they do not count on their own.
‘+’ and ‘!’ characters after a pattern carry internal information about custom or adapted rules, but have no effect on the calculation of the spam probability.
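The parts described above can also be read programmatically. The following Python sketch is an illustration only and not a Retarus tool: the names `parse_x_rmx_spam` and `Rule` as well as the regular expressions are assumptions derived from the single example header in this manual, and they may need adjusting because the header format can change without notice.

```python
import re
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str                # rule name without the trailing marker
    marker: str              # "+", "!" or "" (internal information only)
    weight: Optional[float]  # None when the entry carries no number


def parse_x_rmx_spam(value: str) -> dict:
    """Split an X-RMX-Spam header value into gauge, probability and rules."""
    # Folded header lines are joined by collapsing all whitespace runs.
    value = " ".join(value.split())

    gauge = re.search(r"Gauge=([XI]*)", value)
    probability = re.search(r"Probability=(\d+)%", value)
    report = re.search(r"Report='([^']*)'", value)

    rules = []
    for entry in (report.group(1).split(",") if report else []):
        parts = entry.split()
        if not parts:
            continue
        name, marker = parts[0], ""
        if name[-1] in "+!":
            name, marker = name[:-1], name[-1]
        weight = float(parts[1]) if len(parts) > 1 else None
        rules.append(Rule(name, marker, weight))

    gauge_text = gauge.group(1) if gauge else ""
    return {
        "gauge": gauge_text,
        # One "X" per 10 percentage points, one "I" per single point.
        "gauge_percent": 10 * gauge_text.count("X") + gauge_text.count("I"),
        "probability": int(probability.group(1)) if probability else None,
        "rules": rules,
    }
```

Entries without a number, such as `__bl.spamcop.net_NOTLISTED`, get a weight of `None` in this sketch, and sub-rules can be recognized by a name that starts with underscores.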
Calculation of the spam probability
The weights of all matching rules (see descriptions above) are summed up to a “score”, and this score is used to calculate the final spam probability.
As a rule of thumb, the score can be converted into the spam probability by multiplying it by 10; the exact conversion uses the following formula:
PROBABILITY = 1/(1 + exp(-(SCORE-5)/2))
In the example header above, the individual rule weights are added up to a score of 11.35, which is translated into a spam probability of 96%.
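The calculation for the example header can be reproduced with a few lines of Python. This is a minimal sketch: the function name `spam_probability` and the list `example_weights` are chosen here for illustration, and the list contains only the non-zero weights from the example header in this manual.

```python
import math


def spam_probability(score: float) -> float:
    """Formula from this manual: 1 / (1 + exp(-(score - 5) / 2))."""
    return 1.0 / (1.0 + math.exp(-(score - 5.0) / 2.0))


# Non-zero rule weights from the example header, in order of appearance.
example_weights = [2.5, 1.6, 1.5, 1.0, 0.8, 0.2, 3.5, 0.1, 0.05, 0.05, 0.05]

score = sum(example_weights)
print(f"score = {score:.2f}")                           # score = 11.35
print(f"probability = {spam_probability(score):.0%}")   # probability = 96%
```

According to the formula, a score of 5 corresponds to a probability of exactly 50%, which matches the rule of thumb; for higher scores the curve flattens and approaches 100% without exceeding it.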
List of common patterns (rules)
In the following table, the most common rules mentioned in the X-RMX-Spam header value are explained.
Rule | Description | Score |
---|---|---|
BLOCKED_GW_IP | IP of relay is blocked - IP-BlockerDB | +7.5 |
BOUNCE_AUTORESP | Contains content that suggests the message is an autoresponder | 0.0 |
BOUNCE_GENERIC | Message contains content that suggests it is a bounce/autoresponder. Meta: BOUNCE_CHALLENGE || BOUNCE_NDR || BOUNCE_SPAM || BOUNCE_AUTORESP | 0.0 |
HTML_x_y (e.g., HTML_50_70) | Message consists of x-y% HTML code. | +0.05 to +0.4 |
DOMAIN_OBFU_DOT | Body text contains an obfuscated domain from known TLDs | +1.5 |
FRAUD_HIGH_X3 | Likely to be a fraud message, as it contains at least three different highly probable fraud phrases (e.g., phrases pushing for an urgent reply or related to lottery, loans, payments etc.). | +3.0 |
FRAUD_HIGH_X3_WEBMAIL | Likely to be a fraud message coming from a webmail system. | +1.2 |
FRAUD_X3_LARGE_BODY | Mail body size larger than most fraud emails. | -1.8 |
IMGSPAM_TABLE | Image spam wrapped in <a> and <td> tags (> 10 images) | +1.5 |
IMGSPAM_TABLE_1 | Image spam wrapped in <a> and <td> tags (1 image) | +0.2 |
IMGSPAM_TABLE_2_3 | Image spam wrapped in <a> and <td> tags (2 to 3 images) | +0.25 |
IMGSPAM_TABLE_4_6 | Image spam wrapped in <a> and <td> tags (4 to 6 images) | +0.7 |
IMGSPAM_TABLE_7 | Image spam wrapped in <a> and <td> tags (> 7 images) | +1.0 |
IN_REP_TO | Found an “In-Reply-To” header. | -0.6 |
KNOWN_SPAM_PARAGRAPH | Message is blocklisted | +8.0 |
KNOWN_ENTERTAINMENT_CAMPAIGN | Wants your time/money in exchange for entertainment | +8.0 |
KNOWN_FINANCIAL_CAMPAIGN | All about money | +8.0 |
KNOWN_HEALTH_CAMPAIGN | Pushes health products | +8.0 |
KNOWN_INTERNET_CAMPAIGN | Pushes internet services or products | +8.0 |
KNOWN_MTA_TFX | Trusted MTA according to Traffix | 0.0 |
KNOWN_SPAM_ATTACHMENTSIG | Known attachment content | +8.0 |
KNOWN_SPAM_CONTENT | Message content is blocklisted | +8.0 |
KNOWN_SPAM_EXCELSIG | Known excel content | +8.0 |
KNOWN_SPAM_GIFSIG | Known GIF content | +8.0 |
KNOWN_SPAM_GIFSIG_ATTACHED | Known attached GIF | +8.0 |
KNOWN_SPAM_HTML_JPEGSIG | Known HTML and JPEG content | +8.0 |
KNOWN_SPAM_IMAGESIG | Known HTML and GIF content | +8.0 |
KNOWN_SPAM_PDFSIG | Known PDF content | +8.0 |
KNOWN_SPAM_PNGSIG | Known PNG content | +8.0 |
KNOWN_SPAM_RARSIG | Known RAR content | +8.0 |
KNOWN_SPAM_RARSIG_EXT | Known RAR content (by extension) | +8.0 |
KNOWN_SPAM_URLRAW | Known spam URL (raw format) | +8.0 |
KNOWN_SPAM_ZIPSIG | Known ZIP content | +8.0 |
KNOWN_SPAM_ZIPSIG_EXT | Known ZIP content (by extension) | +8.0 |
MIME_BOUND_NEXTPART | Spam tool pattern in MIME boundary | +1.2 |
OBFUSCATION | Possible text obfuscation | 0.0 |
PHISH_TRUSTED_RDNS | Relayed via a known trusted phish target | -2.0 |
REFERENCES | Has a valid-looking “References” header. | -1.0 |
RELAY_IN_NIXSPAM | Source IP is listed at NiXSpam; see http://dnsbl.manitu.net. | -3.8 |
RELAY_IN_RETARUS_HOST_BL1 | Source IP is listed in one of the Retarus IP Blocklists. | varies |
RELAY_IN_SORBS | Source IP is listed at dnsbl.sorbs.net | +3.2 |
RELAY_IN_SPAMCOP_NET | Source IP is listed at bl.spamcop.net | +3.0 |
RETURN_RECEIPT | Contains a "read receipt" header | -0.5 |
RMX_AUTHENTICATION_RESULTS | Spam probability score added or subtracted due to the SPF or DKIM customer settings in myEAS portal. | varies |
RMX_RQ_0 | recipient quality (valid recipients transmitted from an IP address in a defined timeframe) of 0% | +6.0 |
RMX_RQ_1_5 | recipient quality 1-5% | +5.75 |
RMX_RQ_6_10 | recipient quality 6-10% | +5.5 |
RMX_RQ_11_15 | recipient quality 11-15% | +5.25 |
RMX_RQ_16_20 | recipient quality 16-20% | +5.0 |
RMX_RQ_21_25 | recipient quality 21-25% | +4.75 |
RMX_RQ_26_30 | recipient quality 26-30% | +4.5 |
RMX_RQ_31_35 | recipient quality 31-35% | +4.25 |
RMX_RQ_36_40 | recipient quality 36-40% | +4.0 |
RMX_RQ_41_45 | recipient quality 41-45% | +3.75 |
RMX_RQ_46_50 | recipient quality 46-50% | +3.5 |
RMX_RQ_51_55 | recipient quality 51-55% | +3.3 |
RMX_RQ_56_60 | recipient quality 56-60% | +3.0 |
RMX_RQ_61_65 | recipient quality 61-65% | +2.8 |
RMX_RQ_66_70 | recipient quality 66-70% | +2.0 |
RMX_RQ_71_75 | recipient quality 71-75% | +1.5 |
RMX_RQ_76_80 | recipient quality 76-80% | +1.0 |
RMX_RQ_81_85 | recipient quality 81-85% | +0.8 |
RMX_RQ_86_90 | recipient quality 86-90% | +0.5 |
RMX_RQ_91_95 | recipient quality 91-95% | 0.0 |
RMX_RQ_96_99 | recipient quality 96-99% | 0.0 |
RMX_RQ_100_LOW | recipient quality 100% and fewer than 10 valid recipients transmitted from an IP address in a defined timeframe | 0.0 |
RMX_RQ_100 | recipient quality 100% and 10 or more valid recipients transmitted from an IP address in a defined timeframe | -0.75 |
RMX_RQ_100_HIGH | recipient quality 100% and between 1,000 and 9,999 valid recipients transmitted from an IP address in a defined timeframe | -0.75 |
RMX_RQ_100_VERY_HIGH | recipient quality 100% and more than 10,000 valid recipients transmitted from an IP address in a defined timeframe | -1.5 |
SPF_ERROR | SPF (Sender Policy Framework) check returned an error | 0.0 |
SPF_FAIL | SPF (Sender Policy Framework) check failed. “RMX_AUTHENTICATION_RESULTS” rule is inserted as well if spam probability percentage points have been added due to customer settings in myEAS portal. | 0.0 (spam probability percentage points may be added directly using the configuration options in myEAS portal) |
SPF_NEUTRAL | SPF (Sender Policy Framework) check returned "neutral" | +1.0 |
SPF_NONE | No SPF (Sender Policy Framework) record found | 0.0 |
SPF_PASS | SPF (Sender Policy Framework) check passed successfully. “RMX_AUTHENTICATION_RESULTS” rule is inserted as well if spam probability percentage points have been added due to customer settings in myEAS portal. | 0.0 (spam probability percentage points may be added directly using the configuration options in myEAS portal) |
SPF_SOFTFAIL | SPF (Sender Policy Framework) check soft-failed | +2.5 |
SPF_UNKNOWN | SPF (Sender Policy Framework) check returned "unknown" | 0.0 |
SUSP_ENV_FROM | Suspicious SMTP/Envelope From sender address | +4.5 |
SXL_ATTACHMENT_SIG | Contains a known bad attachment (SXL lookup) | +8.0 |
SXL_BODY_SIG | Contains known spam content (SXL lookup) | +8.0 |
SXL_IP_SPAM | Received via a known spam network (SXL lookup) | +8.0 |
SXL_IP_TFX_SS | Received via a known spam source (SXL lookup) | +8.0 |
SXL_IP_TFX_WM | Received via a known whitelisted mail server (SXL lookup) | 0.0 |
SXL_JPEG_HTML_SIG | Contains a known spam JPEG/HTML combination | +8.0 |
SXL_PARA_SIG | Contains a known spam paragraph (SXL lookup) | +8.0 |
SXL_PDF_SIG | Contains a known spam PDF attachment (SXL lookup) | +8.0 |
SXL_PNG_SIG | Contains a known spam PNG image (SXL lookup) | +8.0 |
SXL_RAR_SIG | Contains a known bad RAR attachment (SXL lookup) | +8.0 |
SXL_URI | Contains a known spam URL (SXL lookup) | +7.0 |
SXL_URI_LAB | Contains a known spam URL (SXL lookup) | +7.0 |
SXL_URI_NEW | Contains a recently registered domain name (SXL lookup) | +2.2 |
Newsletter Rules
If any one of the following rules matches, the email is classified as “newsletter”.
Rule | Description |
---|---|
RMX_NEWSL_CONTENT | Known newsletter content |
RMX_NEWSL_DISPLAY_BROWSER | Known newsletter phrase “Display in browser” |
RMX_NEWSL_FROM | Known newsletter sending address |
RMX_NEWSL_HEADERS | Typical header for newsletters, list server |
RMX_NEWSL_HTML | Known newsletter HTML |
RMX_NEWSL_INVITATION | Known newsletter phrase with invitation |
RMX_NEWSL_SIGNOFF | Known newsletter sign-off phrase |
RMX_NEWSL_SUBJECT | Known newsletter subject |
RMX_NEWSL_SUBSCRIBE | Known newsletter subscribe phrase (in German/English) |
RMX_NEWSL_UNSUBSCRIBE | Known newsletter unsubscribe phrase (in German/English) |
RMX_NEWSL_URL | Known newsletter URL |
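
When you evaluate headers with a script, the newsletter classification can be checked against the rule names in the table above. The following Python sketch is an assumption-based illustration: the names `NEWSLETTER_RULES` and `is_newsletter` are hypothetical, and the set contains only the rules listed in this manual, so it may be incomplete.

```python
# Newsletter rule names as listed in the table above (may be incomplete).
NEWSLETTER_RULES = frozenset({
    "RMX_NEWSL_CONTENT",
    "RMX_NEWSL_DISPLAY_BROWSER",
    "RMX_NEWSL_FROM",
    "RMX_NEWSL_HEADERS",
    "RMX_NEWSL_HTML",
    "RMX_NEWSL_INVITATION",
    "RMX_NEWSL_SIGNOFF",
    "RMX_NEWSL_SUBJECT",
    "RMX_NEWSL_SUBSCRIBE",
    "RMX_NEWSL_UNSUBSCRIBE",
    "RMX_NEWSL_URL",
})


def is_newsletter(rule_names) -> bool:
    """Return True if at least one matched rule is a listed newsletter rule."""
    # Trailing '+' and '!' markers are stripped before the comparison.
    return any(name.rstrip("+!") in NEWSLETTER_RULES for name in rule_names)
```

For example, `is_newsletter(["SPF_NONE", "RMX_NEWSL_UNSUBSCRIBE"])` evaluates to `True`, because one of the listed rules is present.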