AntiSpam Detection - “X-RMX-Spam” header content

AntiSpam Detection, which is part of Retarus Email Security, calculates a spam probability for every Inbound email. Based on this spam probability, you as an administrator can define actions to be carried out for the email, e.g. adding a subject tag, storing the email in the user’s quarantine, or discarding it. For detailed information about the available settings, refer to the Administrator Manual - Email Security.

The goal of this manual is to give you a better understanding of how the spam probability is calculated by analyzing the value of the custom “X-RMX-Spam” header added to every Inbound email.

The following chapters provide general information about the structure of the header field’s value and list the most common patterns found there together with their meaning, as not all of them are self-explanatory.

Using this documentation, you will be able to understand the most common reasons for a higher spam probability, answer your end users’ questions about the spam rating and, where needed, take action to avoid a high spam probability rating for certain desired Inbound emails.

Also, the reasons for an email being classified as a “newsletter” are included in this document.

This manual does not cover all patterns and their meaning, only the most common ones. The listed patterns indicate the reason for the spam probability, but they do not reveal the exact phrases or words that caused a pattern to match. The patterns and phrases are updated constantly, and normally only a combination of multiple characteristics adds a certain percentage to the final result.

If you need more detailed information in a specific case, the Retarus analysts have to investigate it individually. In this case, please contact Retarus Support.

Please note that the structure and content of the X-RMX-Spam header may be changed by Retarus without prior notice. Retarus’ liability for any information contained in this manual is excluded, if and to the extent not arising from Retarus’ willful misconduct or gross negligence.

Spam header information and spam probability calculation

General information

For every Inbound email that is processed by the Retarus AntiSpam Detection, the custom header field X-RMX-Spam is inserted in the mail header. Its value contains the final spam probability, as well as the characteristics that have been taken into consideration for calculating the score (and the classification as “newsletter” if applicable).

Structure of the X-RMX-Spam header

Example header

X-RMX-Spam: Gauge=XXXXXXXXXIIIIII, Probability=96%, Report='SPF_SOFTFAIL! 2.500,
NO_MESSAGE_ID! 1.600, RDNS_SUSP! 1.500, FORGED_FROM_GMAIL! 1.000, DATE_MISSING!
0.800, NO_REAL_NAME! 0.200, RMX_CAMPAIGN2_R3+ 0, RMX_CAMPAIGN4_R3+ 0,
RMX_RQ_100_LOW+ 0, RDNS_SUSP_FORGED_FROM 3.5, SUBJ_1WORD 0.1, HTML_00_01 0.05,
HTML_00_10 0.05, MIME_TEXT_ONLY_MP_MIXED 0.05, BODYTEXTP_SIZE_3000_LESS 0,
BODYTEXTP_SIZE_400_LESS 0, BODY_SIZE_1000_LESS 0, BODY_SIZE_100_199 0,
BODY_SIZE_2000_LESS 0, BODY_SIZE_5000_LESS 0, BODY_SIZE_7000_LESS 0,
NO_CTA_URI_FOUND 0, NO_URI_FOUND 0, NO_URI_HTTPS 0, RDNS_NXDOMAIN 0,
RDNS_SUSP_GENERIC 0, SMALL_BODY 0, __BODY_NO_MAILTO 0, __CT 0,
__CTYPE_HAS_BOUNDARY 0, __CTYPE_MULTIPART 0, __CTYPE_MULTIPART_MIXED 0,
__FRAUD_WEBMAIL 0, __FRAUD_WEBMAIL_FROM 0, __FROM_GMAIL 0, __HAS_FROM 0,
__MIME_TEXT_ONLY 0, __MIME_TEXT_P 0, __MIME_TEXT_P1 0, __MIME_TEXT_P2 0,
__MIME_VERSION 0, __NO_HTML_TAG_RAW 0, __PHISH_SPEAR_STRUCTURE_1 0,
__TO_MALFORMED_2 0, __TO_NO_NAME 0, __bl.spamcop.net_NOTLISTED,
__dnsbl.sorbs.net_NOTLISTED, __ix.dnsbl.manitu.net_NOTLISTED'

The X-RMX-Spam header value consists of different parts (see example above):

Probability: Calculated spam probability between 0 and 100%, including percentage points added or subtracted due to the authentication results of the email (settings in myEAS portal for DKIM/SPF check). “Probability” is the value that Retarus AntiSpam Detection compares to the configured thresholds to decide whether an email passes the filter, gets a subject tag, or is placed in quarantine.
Gauge: Legacy part of the header. “Gauge” shows the calculated spam probability in a graphical format loosely based on Roman numerals: one “X” for every 10 percentage points and one “I” for each remaining percentage point. In some cases, “Gauge” differs from “Probability” because the authentication results (settings in myEAS portal for DKIM/SPF check) are not taken into consideration for the “Gauge”. Therefore, “Probability” remains the relevant value for further processing.
Report: List of all “rules” of Retarus AntiSpam Detection that have been matched. A “rule” normally consists of multiple characteristics/components/patterns identified in the email.
- Rules starting with underscores are “sub-rules” that do not add a spam percentage on their own, but have been identified and are taken into consideration for the spam probability calculation in combination with other sub-rules.
- Numbers after a rule signify the weight of this rule (=combined patterns) used for the calculation of the spam probability. The numbers are not converted 1:1 to the spam probability percentage, but these weights are summed up to a “score” which is then used for the conversion (see the calculation of the spam probability below). Sub-rules have a “0” because they do not count on their own.
- ‘+’ and ‘!’ characters after a rule name carry internal information about custom or adapted rules, but have no effect on the calculation of the spam probability.
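
The structure described above can be taken apart with a few lines of Python. The following is only a minimal sketch based on the example header in this manual: the names parse_x_rmx_spam, gauge_to_percent and Rule are hypothetical helpers, not part of any Retarus tool, and because the header format may change without notice, the regular expressions are assumptions that may need adjusting.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str      # rule name without a trailing '+' or '!'
    weight: float  # 0.0 if the entry carries no number
    flag: str      # '+', '!' or '' (internal marker, irrelevant for the score)


def parse_x_rmx_spam(value: str) -> dict:
    """Split an X-RMX-Spam header value into gauge, probability and rules."""
    # Unfold the header: join continuation lines with single spaces.
    value = " ".join(value.split())
    gauge = re.search(r"Gauge=([XI]*)", value)
    prob = re.search(r"Probability=(\d+)%", value)
    report = re.search(r"Report='(.*)'", value)

    rules = []
    if report:
        for entry in report.group(1).split(","):
            parts = entry.split()
            if not parts:
                continue
            raw_name = parts[0]
            flag = raw_name[-1] if raw_name[-1] in "+!" else ""
            weight = float(parts[1]) if len(parts) > 1 else 0.0
            rules.append(Rule(raw_name.rstrip("+!"), weight, flag))

    return {
        "gauge": gauge.group(1) if gauge else None,
        "probability": int(prob.group(1)) if prob else None,
        "rules": rules,
    }


def gauge_to_percent(gauge: str) -> int:
    """One 'X' per 10 percentage points, one 'I' per remaining point."""
    return 10 * gauge.count("X") + gauge.count("I")


# Shortened version of the example header from this manual.
sample = (
    "Gauge=XXXXXXXXXIIIIII, Probability=96%, Report='SPF_SOFTFAIL! 2.500,\n"
    " RMX_RQ_100_LOW+ 0, SUBJ_1WORD 0.1, __CT 0, __bl.spamcop.net_NOTLISTED'"
)
parsed = parse_x_rmx_spam(sample)
sub_rules = [r.name for r in parsed["rules"] if r.name.startswith("__")]
```

In this sketch, sub-rules are recognized by their leading underscores, and entries without a number (such as the blocklist lookups at the end of the example) are given a weight of 0.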

Calculation of the spam probability

The weights of all matching rules (see the descriptions above) are summed up to a “score”, which is then used to calculate the final spam probability.

As a rule of thumb, the score can be converted into the spam probability by multiplying it by 10; in reality, the following formula is used (its result is a fraction between 0 and 1 that is expressed as a percentage):

PROBABILITY = 1/(1 + exp(-(SCORE-5)/2))

In the example header above, the individual rule weights are added up to a score of 11.35, which is translated into a spam probability of 96%.
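
The calculation for the example header can be reproduced with a short Python sketch. The function name spam_probability is a hypothetical helper, and rounding to the nearest whole percent is an assumption; it happens to reproduce the value shown in the example, but this manual does not state how the percentage is rounded.

```python
import math


def spam_probability(score: float) -> float:
    """Formula from this manual: 1 / (1 + exp(-(score - 5) / 2))."""
    return 1.0 / (1.0 + math.exp(-(score - 5.0) / 2.0))


# Non-zero rule weights taken from the example header in this manual.
example_weights = [2.5, 1.6, 1.5, 1.0, 0.8, 0.2, 3.5, 0.1, 0.05, 0.05, 0.05]

score = sum(example_weights)
percent = round(spam_probability(score) * 100)  # rounding is an assumption

print(f"score = {score:.2f}, probability = {percent}%")
```

A score of 5 yields exactly 50% with this formula, which is also the point where the rule of thumb (score multiplied by 10) and the formula agree.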

List of common patterns (rules)

In the following table, the most common rules mentioned in the X-RMX-Spam header value are explained.

Rule	Description	Score
BLOCKED_GW_IP	IP of relay is blocked - IP-BlockerDB	+7.5
BOUNCE_AUTORESP	Contains content that suggests the message is an Auto Responder	0.0
BOUNCE_GENERIC	Message contains content that suggests it is a bounce/autoresponder. Meta: BOUNCE_CHALLENGE || BOUNCE_NDR || BOUNCE_SPAM || BOUNCE_AUTORESP	0.0
HTML_x_y (e.g., HTML_50_70)	Message consists of x-y% HTML code.	+0.05 to +0.4
DOMAIN_OBFU_DOT	Body text contains an obfuscated domain from known TLDs	+1.5
FRAUD_HIGH_X3	Likely to be a fraud message, as it contains at least three different highly probable fraud phrases (e.g., phrases pushing for an urgent reply or related to lottery, loans, payments etc.).	+3.0
FRAUD_HIGH_X3_WEBMAIL	Likely to be a fraud message coming from a webmail system.	+1.2
FRAUD_X3_LARGE_BODY	Mail body size larger than most fraud emails.	-1.8
IMGSPAM_TABLE	Image spam wrapped in <a> and <td> tags (> 10 images)	+1.5
IMGSPAM_TABLE_1	Image spam wrapped in <a> and <td> tags (1 image)	+0.2
IMGSPAM_TABLE_2_3	Image spam wrapped in <a> and <td> tags (2 to 3 images)	+0.25
IMGSPAM_TABLE_4_6	Image spam wrapped in <a> and <td> tags (4 to 6 images)	+0.7
IMGSPAM_TABLE_7	Image spam wrapped in <a> and <td> tags (> 7 images)	+1.0
IN_REP_TO	Found an “In-Reply-To” header. Meta: (_IN_REP_TO && !_MANY_USER_AGENTS)	-0.6
KNOWN_SPAM_PARAGRAPH	Message is blocklisted	+8.0
KNOWN_ENTERTAINMENT_CAMPAIGN	Wants your time/money in exchange for entertainment	+8.0
KNOWN_FINANCIAL_CAMPAIGN	All about money	+8.0
KNOWN_HEALTH_CAMPAIGN	Pushes health products	+8.0
KNOWN_INTERNET_CAMPAIGN	Pushes internet services or products	+8.0
KNOWN_MTA_TFX	Trusted MTA according to Traffix	0.0
KNOWN_SPAM_ATTACHMENTSIG	Known attachment content	+8.0
KNOWN_SPAM_CONTENT	Message content is blocklisted	+8.0
KNOWN_SPAM_EXCELSIG	Known excel content	+8.0
KNOWN_SPAM_GIFSIG	Known GIF content	+8.0
KNOWN_SPAM_GIFSIG_ATTACHED	Known attached GIF	+8.0
KNOWN_SPAM_HTML_JPEGSIG	Known HTML and JPEG content	+8.0
KNOWN_SPAM_IMAGESIG	Known HTML and GIF content	+8.0
KNOWN_SPAM_PDFSIG	Known PDF content	+8.0
KNOWN_SPAM_PNGSIG	Known PNG content	+8.0
KNOWN_SPAM_RARSIG	Known RAR content	+8.0
KNOWN_SPAM_RARSIG_EXT	Known RAR content (by extension)	+8.0
KNOWN_SPAM_URLRAW	Known spam URL (raw format)	+8.0
KNOWN_SPAM_ZIPSIG	Known ZIP content	+8.0
KNOWN_SPAM_ZIPSIG_EXT	Known ZIP content (by extension)	+8.0
MIME_BOUND_NEXTPART	Spam tool pattern in MIME boundary. Meta: (_NEXTPART_ALL && !_NEXTPART_NORMAL)	+1.2
OBFUSCATION	Possible text obfuscation. Meta: __HIDDEN_HTML_CONTENT || __HTML_SHORT_STR_X10 || __HTML_STYLE_DEF_HIDDEN || __HTML_STYLE_DEF_HIDDEN_AC || __GENERAL_PUNCTUATION || __LETTER_HEX_MIX	0.0
PHISH_TRUSTED_RDNS	Relayed via a known trusted phish target	-2.0
REFERENCES	Has a valid-looking “References” header. Meta: (_REFERENCES && !_MANY_USER_AGENTS)	-1.0
RELAY_IN_NIXSPAM	Source IP is listed at NiXSpam; see http://dnsbl.manitu.net.	-3.8
RELAY_IN_RETARUS_HOST_BL1	Source IP is listed in one of the Retarus IP Blocklists.	varies, e.g. +5.0
RELAY_IN_SORBS	Source IP is listed at dnsbl.sorbs.net	+3.2
RELAY_IN_SPAMCOP_NET	Source IP is listed at bl.spamcop.net	+3.0
RETURN_RECEIPT	Contains a "read rcpt" header. Meta: __RETURN_RECEIPT_TO || __NOTIFICATION_TO	-0.5
RMX_AUTHENTICATION_RESULTS	Spam probability score added or subtracted due to the SPF or DKIM customer settings in myEAS portal.	varies, e.g. +6.0 when 60% spam probability has been added.
RMX_RQ_0	recipient quality (valid recipients transmitted from an IP address in a defined timeframe) of 0%	+6.0
RMX_RQ_1_5	recipient quality 1-5%	+5.75
RMX_RQ_6_10	recipient quality 6-10%	+5.5
RMX_RQ_11_15	recipient quality 11-15%	+5.25
RMX_RQ_16_20	recipient quality 16-20%	+5.0
RMX_RQ_21_25	recipient quality 21-25%	+4.75
RMX_RQ_26_30	recipient quality 26-30%	+4.5
RMX_RQ_31_35	recipient quality 31-35%	+4.25
RMX_RQ_36_40	recipient quality 36-40%	+4.0
RMX_RQ_41_45	recipient quality 41-45%	+3.75
RMX_RQ_46_50	recipient quality 46-50%	+3.5
RMX_RQ_51_55	recipient quality 51-55%	+3.3
RMX_RQ_56_60	recipient quality 56-60%	+3.0
RMX_RQ_61_65	recipient quality 61-65%	+2.8
RMX_RQ_66_70	recipient quality 66-70%	+2.0
RMX_RQ_71_75	recipient quality 71-75%	+1.5
RMX_RQ_76_80	recipient quality 76-80%	+1.0
RMX_RQ_81_85	recipient quality 81-85%	+0.8
RMX_RQ_86_90	recipient quality 86-90%	+0.5
RMX_RQ_91_95	recipient quality 91-95%	0.0
RMX_RQ_96_99	recipient quality 96-99%	0.0
RMX_RQ_100_LOW	recipient quality 100% and fewer than 10 valid recipients transmitted from an IP address in a defined timeframe	0.0
RMX_RQ_100	recipient quality 100% and 10 or more valid recipients transmitted from an IP address in a defined timeframe	-0.75
RMX_RQ_100_HIGH	recipient quality 100% and between 1,000 and 9,999 valid recipients transmitted from an IP address in a defined timeframe	-0.75
RMX_RQ_100_VERY_HIGH	recipient quality 100% and more than 10,000 valid recipients transmitted from an IP address in a defined timeframe	-1.5
SPF_ERROR	SPF (Sender Policy Framework) check returned an error	0.0
SPF_FAIL	SPF (Sender Policy Framework) check failed. “RMX_AUTHENTICATION_RESULTS” rule is inserted as well if spam probability percentage points have been added due to customer settings in myEAS portal.	0.0 (spam probability percentage points may be added directly using the configuration options in myEAS portal)
SPF_NEUTRAL	SPF (Sender Policy Framework) check returned "neutral"	+1.0
SPF_NONE	No SPF (Sender Policy Framework) record found	0.0
SPF_PASS	SPF (Sender Policy Framework) check passed successfully. “RMX_AUTHENTICATION_RESULTS” rule is inserted as well if spam probability percentage points have been added due to customer settings in myEAS portal.	0.0 (spam probability percentage points may be added directly using the configuration options in myEAS portal)
SPF_SOFTFAIL	SPF (Sender Policy Framework) check soft-failed	+2.5
SPF_UNKNOWN	SPF (Sender Policy Framework) check returned "unknown"	0.0
SUSP_ENV_FROM	Suspicious SMTP/Envelope From sender address	+4.5
SXL_ATTACHMENT_SIG	Contains a known bad attachment (SXL lookup)	+8.0
SXL_BODY_SIG	Contains known spam content (SXL lookup)	+8.0
SXL_IP_SPAM	Received via a known spam network (SXL lookup)	+8.0
SXL_IP_TFX_SS	Received via a known spam source (SXL lookup)	+8.0
SXL_IP_TFX_WM	Received via a known whitelisted mail server (SXL lookup)	0.0
SXL_JPEG_HTML_SIG	Contains a known spam JPEG/HTML combination	+8.0
SXL_PARA_SIG	Contains a known spam paragraph (SXL lookup)	+8.0
SXL_PDF_SIG	Contains a known spam PDF attachment (SXL lookup)	+8.0
SXL_PNG_SIG	Contains a known spam PNG image (SXL lookup)	+8.0
SXL_RAR_SIG	Contains a known bad RAR attachment (SXL lookup)	+8.0
SXL_URI	Contains a known spam URL (SXL lookup)	+7.0
SXL_URI_LAB	Contains a known spam URL (SXL lookup)	+7.0
SXL_URI_NEW	Contains a recently registered domain name (SXL lookup)	+2.2
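
When answering an end user’s question about a rating, it can help to list only the rules that actually carry a weight, sorted by their contribution. The following Python sketch illustrates this under stated assumptions: RULE_DESCRIPTIONS is a hand-copied excerpt of the table above that you would have to extend yourself, and explain is a hypothetical helper, not a Retarus function.

```python
# Hand-copied excerpt of the rule table in this manual; extend as needed.
RULE_DESCRIPTIONS = {
    "SPF_SOFTFAIL": "SPF (Sender Policy Framework) check soft-failed",
    "RELAY_IN_SORBS": "Source IP is listed at dnsbl.sorbs.net",
    "IN_REP_TO": "Found an In-Reply-To header",
}


def explain(rules):
    """Return (name, weight, description) for every rule with a non-zero
    weight, largest weight first. `rules` is a list of (name, weight) pairs."""
    weighted = [(name, weight) for name, weight in rules if weight != 0]
    weighted.sort(key=lambda item: item[1], reverse=True)
    return [
        (name, weight, RULE_DESCRIPTIONS.get(name, "not in this excerpt"))
        for name, weight in weighted
    ]


# Illustrative input, not taken from a real email.
matched = [
    ("SPF_SOFTFAIL", 2.5),
    ("__CT", 0.0),
    ("IN_REP_TO", -0.6),
    ("RELAY_IN_SORBS", 3.2),
]
for name, weight, description in explain(matched):
    print(f"{weight:+.2f}  {name}: {description}")
```

Rules with a negative weight end up at the bottom of the list, which makes it easy to see which characteristics lowered the score.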

Newsletter Rules

If any one of the following rules matches, the email is classified as “newsletter”.

Rule	Description
RMX_NEWSL_CONTENT, RMX_NEWSL_CONTENT_DE/EN/ES/FR/IT	Known newsletter content (in German/English/Spanish/French/Italian)
RMX_NEWSL_DISPLAY_BROWSER, RMX_NEWSL_DISPLAY_BROWSER_DE/EN	Known newsletter phrase “Display in browser” (in German/English)
RMX_NEWSL_FROM	Known newsletter sending address
RMX_NEWSL_HEADERS	Typical header for newsletters, list server
RMX_NEWSL_HTML	Known newsletter HTML
RMX_NEWSL_INVITATION, RMX_NEWSL_INVITATION_ASIAN	Known newsletter phrase with invitation, Known newsletter phrase with invitation from Asia
RMX_NEWSL_SIGNOFF	Known newsletter sign-off phrase
RMX_NEWSL_SUBJECT	Known newsletter subject
RMX_NEWSL_SUBSCRIBE, RMX_NEWSL_SUBSCRIBE_DE/EN	Known newsletter subscribe phrase (in German/English)
RMX_NEWSL_UNSUBSCRIBE, RMX_NEWSL_UNSUBSCRIBE_DE/EN	Known newsletter unsubscribe phrase (in German/English)
RMX_NEWSL_URL	Known newsletter URL
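
Since every rule in the table above starts with RMX_NEWSL_, a newsletter check on the matched rule names can be sketched as follows. Matching on the prefix is an assumption derived from the listed names, and is_newsletter is a hypothetical helper; rule names are expected without a trailing ‘+’ or ‘!’.

```python
def is_newsletter(rule_names) -> bool:
    """True if any matched rule name starts with the RMX_NEWSL_ prefix
    (assumption: all newsletter rules share this prefix, as in the table)."""
    return any(name.startswith("RMX_NEWSL_") for name in rule_names)


# Illustrative rule names, not taken from a real email.
print(is_newsletter(["SPF_PASS", "RMX_NEWSL_UNSUBSCRIBE_EN", "__CT"]))
print(is_newsletter(["SPF_SOFTFAIL", "SUBJ_1WORD"]))
```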