Peeking into Email Validation Techniques

Oct 7, 2019

Published by

ISAAC KIM


From regular expression conformity to today’s sophisticated tools, email validation has gone through a few different eras. Before looking at the history of validation techniques and best practices, it’s essential to understand the basics.

What Is Email Validation and Why Is It Important?

Email validation is a method for verifying that a recipient’s address exists and accepts messages. Some techniques take validation a step further by confirming that the address belongs to the intended person and that the recipient wants to receive such communications. During the process, you’ll eliminate risky and invalid addresses from your list and protect your reputation as a sender with Internet Service Providers (ISPs).

Taking care to validate user emails before sending marketing and transactional messages also increases the chances that your communications route directly to a reader’s inbox. In addition, campaign metrics are more accurate and provide deeper insights for future data-driven decisions.

A long time ago in an era far, far away… it started with Syntax Validation

Checking an email address for syntactic correctness is the simplest form of email validation. The core elements of a valid email address are the local part, the @ symbol, and the domain, which ends in a top-level domain (.com, .org, etc.). To standardize the various syntaxes, specifications called Requests for Comments (RFCs) were published to define which characters are acceptable in the local and domain parts. These RFCs eventually became quite extensive and created the need for open-source libraries that validate email syntax in many languages.
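As a minimal sketch of what a syntax check looks like, the Python snippet below tests an address against a deliberately simplified pattern. The function name has_valid_syntax and the pattern itself are assumptions made for illustration; the pattern covers common addresses only and does not implement the full RFC grammar.

```python
import re

# Deliberately simplified pattern: it covers common addresses only and is
# NOT a full implementation of the RFC grammar (no quoted local parts,
# comments, or internationalized domains).
_LOCAL = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_ADDRESS = re.compile(rf"{_LOCAL}@(?:{_LABEL}\.)+[A-Za-z]{{2,63}}")


def has_valid_syntax(address: str) -> bool:
    """Return True if the whole address matches the simplified pattern."""
    return _ADDRESS.fullmatch(address) is not None
```

Note that a check like this only confirms the shape of the address. A misspelled but well-formed address such as heyitsme@gmsil.com still passes.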

Another vital component of syntax validation is catching apparent misspellings, such as heyitsme@gmsil.com. Proofing your existing list for these types of errors remains a best practice.
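One possible way to catch misspellings like this is fuzzy matching against a list of well-known domains. The sketch below uses Python’s standard difflib module; the KNOWN_DOMAINS list, the suggest_correction name, and the 0.8 similarity cutoff are all assumptions chosen for illustration, and a real tool would work from a far larger list.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical short list; a real tool would use a much larger one.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_correction(address: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a corrected address if the domain is a near miss of a
    known domain; return None if there is nothing to suggest."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None  # not shaped like an address at all
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None  # already a known domain, nothing to fix
    matches = get_close_matches(domain, KNOWN_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None


print(suggest_correction("heyitsme@gmsil.com"))  # heyitsme@gmail.com
```

A suggestion like this is best shown to the user for confirmation rather than applied silently, since an unfamiliar domain is not necessarily a typo.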

Validation SMTP Command and The Attack of the Spammers

Recognizing the need for help with validating email addresses, Internet Service Providers (ISPs) started to build in email address validation functionality. Thus, “VRFY” (also known as Verify) was built as an SMTP command that let senders ask a receiving mail server whether an email address was valid. VRFY was meant to bring peace and order to the galactic Internet, but it soon fell into the hands of the dark side: spammers. After wide-scale abuse of this functionality, ISP administrators disabled VRFY, leaving email address validation in disarray.

SMTP Ping (The Spammer Menace)

After the fall of VRFY, senders creatively devised SMTP Ping, a different method for checking whether an email address was valid. SMTP Ping checks with a remote mail server to see if an address is alive: the sender connects to the ISP’s mail server (Gmail’s, for example) as if it were about to send an email, then abruptly cuts the conversation short without sending anything.

Typically, the conversation held in the connection between the sending mail server and the receiving ISP mail server would look like this:

In some scenarios, the ISP could provide feedback like this instead:

With SMTP Ping, senders would ask to send mail to the specified email address, read the ISP’s response, and then cut the conversation short. This became a way to probe an ISP to see whether its receiving mail server considered an email address valid or invalid, with some degree of confidence.
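To make the idea of reading the response concrete (not as a recommendation to probe servers), the sketch below maps a reply line onto a rough verdict. It relies only on the standard SMTP convention that the first digit of a reply code signals success (2), a temporary failure (4), or a permanent failure (5). The function name classify_rcpt_reply is hypothetical, and the sample reply lines are invented for illustration, not real logs from any ISP.

```python
def classify_rcpt_reply(reply_line: str) -> str:
    """Map a server reply to a recipient request onto a rough verdict."""
    code = reply_line.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        return "unparseable"
    if code[0] == "2":
        return "accepted"           # server accepted the recipient for now
    if code[0] == "4":
        return "temporary failure"  # inconclusive: the server says try later
    if code[0] == "5":
        return "rejected"           # server refused the recipient
    return "unexpected"


# Invented sample replies, for illustration only.
for line in ("250 2.1.5 OK",
             "550 5.1.1 User unknown",
             "451 4.7.1 Please try again later"):
    print(line, "->", classify_rcpt_reply(line))
```

Even in this toy form, the verdicts are only as trustworthy as the server’s willingness to answer honestly at this stage of the conversation.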

The Dark Side of SMTP Ping

ISPs consider SMTP Ping, also known as the broken handshake, to be spammer behavior. They can easily tell that you’re just checking the validity of email addresses by looking at the conversation patterns: calling in and hanging up repeatedly, with few or no messages actually sent, shows up in their mail server logs. After the history with SMTP VRFY, this type of behavior is now known to be spammy, and ISPs are cracking down on it hard. Microsoft, for example, considers this type of practice to be malicious, and Hotmail treats SMTP Ping as evidence of a directory harvest attack. SMTP Ping attempts in progress will typically trigger a hard block on all connections from the sending IP address. ISPs dislike SMTP Ping, and so do blacklist operators. Keep it up, and you’ll almost surely end up getting blacklisted.

In addition, many ISPs delay recipient validation until after the server conversation ends, so the server accepts the address without confirming that it is valid: a false positive. Some ISPs also protect recipients with greylisting. Under greylisting, the receiving server temporarily rejects email from unknown senders and accepts it only when the sender retries later. As a result, SMTP Pings will return an error message even on valid email addresses: a false negative.

Long story short, it’s a really bad practice, and it’s unreliable.

May the Verifications Be With You — Best Practices for Email Validation

There are various steps you can take to ensure your approach to validating emails is robust, including:

Get proactive with typos: Consider adding an auto-complete tool to your email entry field with popular domains, such as @gmail.com. Allowing users to select from the preformatted entries reduces the potential for a typo.
Go for the double opt-in: Sending a confirmation email that requires users to verify their address satisfies regulations in certain markets and validates the address at the same time.
Take advantage of technology: Various email verification tools offer an easy, affordable way to maximize communication potential and maintain friendly relationships with ISPs.
Establish a routine: While it may be tempting to validate emails only when executing a bulk-send campaign, scheduling validation regularly may be the better option. Consider how often, and from where, addresses join your list: the more ways you capture contact information, the more potential for invalid addresses.
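The double opt-in step in the list above can be sketched as a signed confirmation token: the sender emails a link containing the token, and the address counts as verified only when the token comes back intact. This is one possible design, assuming an HMAC-based token. The names SECRET_KEY, make_confirmation_token, and is_confirmed are hypothetical, and a production version would also want token expiry.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real deployment would load a secret from
# secure configuration instead of hard-coding it.
SECRET_KEY = b"replace-with-a-real-secret"


def make_confirmation_token(address: str, key: bytes = SECRET_KEY) -> str:
    """Create a token to embed in the confirmation link sent to the address."""
    return hmac.new(key, address.encode("utf-8"), hashlib.sha256).hexdigest()


def is_confirmed(address: str, token: str, key: bytes = SECRET_KEY) -> bool:
    """Check that the token returned by the user matches the address."""
    expected = make_confirmation_token(address, key)
    # Compare as bytes, in constant time, to avoid leaking timing information.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

Because the token is derived from the address and a secret, nothing needs to be stored until the user actually clicks the link.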

Why Use Email Validation Tools?

Many reasons exist for using email validation tools (EVs), including:

It saves you time: EVs rely on automated processes for validation that reduce the time associated with manual look-ups.
It saves you money: A clean email list with valid addresses reduces bounces and increases ROI by cutting unnecessary spending.
It keeps you off the blocklist: Using EVs lowers spam complaints that can lead to message and sender blocking.
It improves deliverability: Readers need to receive your emails before interacting with them, so deliverability is a priority for promotional and transactional communications.
It offers real-time results: Being able to validate immediately across different points of the buyer’s process supplies you with higher-quality contact information.

A New Hope: Data-Driven

Rather than rely on SMTP Ping, there’s a different, data-driven approach that does not make enemies of ISPs. Email addresses can be validated by looking them up against a large data set of event data (hard bounces, deliveries, and engagement) combined with syntax validation, typo detection, DNS queries for valid domains, and quality checks for free, role-based, and disposable email addresses. This method relies heavily on the depth and breadth of the data the email validation tool or service is built upon, instead of depending on the ISP to return a specific response. You may not want to judge Master Yoda based on his size, but you’ll want to judge an email address validation tool by its data size.
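The quality checks mentioned above (free, role-based, and disposable addresses) can be sketched as simple lookups. The snippet below covers only that slice of the approach; event data and DNS queries are not modeled. The domain and local-part lists, along with the names QualityFlags and quality_flags, are hypothetical stand-ins for the much larger data a real service would rely on.

```python
from dataclasses import dataclass

# Tiny hypothetical lists, for illustration only.
FREE_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}
DISPOSABLE_DOMAINS = {"disposable.example"}
ROLE_LOCAL_PARTS = {"admin", "info", "sales", "support", "postmaster"}


@dataclass
class QualityFlags:
    is_free: bool
    is_disposable: bool
    is_role_based: bool


def quality_flags(address: str) -> QualityFlags:
    """Flag an address as free, disposable, and/or role-based."""
    local, _, domain = address.rpartition("@")
    local, domain = local.lower(), domain.lower()
    return QualityFlags(
        is_free=domain in FREE_DOMAINS,
        is_disposable=domain in DISPOSABLE_DOMAINS,
        is_role_based=local in ROLE_LOCAL_PARTS,
    )
```

Flags like these describe the quality of an address rather than its validity, so they are most useful alongside the bounce and delivery data described above.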

SparkPost’s Recipient Validation is built on top of its large email data footprint: SparkPost sends more than 37% of the world’s B2C and B2B email. Our data science team has thoroughly analyzed billions of email bounce and delivery events. Our findings show that a single hard bounce isn’t enough to conclude that you shouldn’t send to an address. Using our data footprint, we constantly update our list of recipients and our algorithms to capture the true validity of a hard bounce, and we analyze all related email events to best answer the question: Can you deliver to this email address?

As we continue to build and iterate upon our Recipient Validation, our goal is to make ours the most dependable and fastest validation tool on the market. Rumor has it our Recipient Validation will be able to make the Kessel Run in less than 12 parsecs, or at least something along those lines…