SMTP XOAUTH2 deep dive
Google and Microsoft are both promoting the use of XOAUTH2 for the SMTP, IMAP and POP protocols. Let's take a look at how this works within the SMTP protocol. POP and IMAP work almost identically.
How does e-mail work (simplified version)
First, look at what happens when you compose an email in your mail client (such as Thunderbird or Outlook). After you click the send button, the mail client connects to the configured mail server using the SMTP protocol. Within the SMTP session the mail client sends your username and password to authenticate you. If your credentials are correct, the mail server allows the client to hand over the e-mail message.
The e-mail message is stored on the mail server in a queue. Another part of the mail server, a worker, watches the queue. When it finds an item, it checks the destination of the message. It looks up the MX records for the recipient's domain in DNS to learn where messages for that domain must be delivered. For gmail.com, for example, the MX records indicate that mail must be delivered to the server gmail-smtp-in.l.google.com.
Once the worker knows which server to contact, it initiates an SMTP session with that server. Yes, that's correct: the same protocol that is used between your mail client and the mail server is also used between mail servers. This time, however, there is no authentication step. It simply tells the other server "I have a message for a domain that you are responsible for".
The destination server will accept the message and check it for spam and viruses. After that the message is placed in the target user's mailbox.
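A domain usually publishes more than one MX record, each with a preference value, and the sending server tries the host with the lowest value first. The sketch below only shows that ordering step. The actual DNS query is left out because Python's standard library has no MX resolver, and the records and host names are made up for illustration.

```python
from typing import NamedTuple


class MxRecord(NamedTuple):
    preference: int  # lower value = tried first
    exchange: str    # host name of the receiving mail server


def delivery_order(records: list[MxRecord]) -> list[str]:
    """Return the MX hosts in the order a sending server would try them."""
    return [r.exchange for r in sorted(records, key=lambda r: r.preference)]


# Hypothetical DNS answer for the domain bob.example.tld.
records = [
    MxRecord(20, "backup.bob.example.tld"),
    MxRecord(10, "mx.bob.example.tld"),
]
print(delivery_order(records))
```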
A quick look at the protocol
Below is a very simple SMTP protocol conversation for delivering an email. We use Alice as the mail server that wants to deliver a message to another mail server, which we call Bob.
- Alice sets up a connection to Bob -
Bob: 220 bob.example.tld ESMTP server
Alice: HELO alice.example.tld
Bob: 250 Hello!
Alice: MAIL FROM: <user@alice.example.tld>
Bob: 250 Ok
Alice: RCPT TO: <user@bob.example.tld>
Bob: 250 Ok
Alice: DATA
Bob: 354 End data with <CR><LF>.<CR><LF>
Alice: From: "Alice" <user@alice.example.tld>
Alice: To: "Bob" <user@bob.example.tld>
Alice: Date: Wed, 18 Jun 2025 12:31:15 GMT
Alice: Subject: Hello world!
Alice:
Alice: Hi Bob, this is Alice, I want to say hello!
Alice:
Alice: .
Alice:
Bob: 250 Ok: queued as 1234
Alice: QUIT
Bob: 221 Bye
The protocol starts with the servers greeting each other using the HELO command. After that, the MAIL FROM and RCPT TO commands state who the message is from and who it is for. Then the DATA command is given and the server accepts everything until it sees a line containing only a period (.).
The message starts with the mail headers, which contain metadata about the message. It looks as if they repeat information that was already given in the SMTP commands, but they serve a different purpose. This becomes clear when one message is sent to multiple recipients, for example recipients whose mailboxes are on different servers. The SMTP conversation is then carried out separately for each recipient. So if a message is sent to Bob and Charlie, the HELO, MAIL FROM, RCPT TO and DATA commands are given for the message to Bob and after that again for Charlie. The only difference is in the RCPT TO line. The header section in the DATA part of both messages contains both recipients in the To line. This allows Bob and Charlie to see that they were both addressed in the same email.
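Because a line with only a period ends the DATA section, a message body that itself contains such a line needs special care. SMTP solves this by having the sender add an extra period in front of every line that starts with one, and the receiver strips it again. A minimal sketch of the sending side, where the function name encode_data is made up for this example:

```python
def encode_data(message: str) -> bytes:
    """Prepare message text for the DATA phase of an SMTP session."""
    lines = message.splitlines()
    # Double a leading period so the line cannot be mistaken for the end marker.
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    # Lines end with CR LF; the final "." on its own line ends the data.
    return ("\r\n".join(stuffed) + "\r\n.\r\n").encode("ascii")


print(encode_data("Hi Bob\n.\nBye"))
```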
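The difference between the SMTP commands (often called the envelope) and the headers can be sketched in a few lines of Python. This is only an illustration: the function build_deliveries and the Charlie address are invented for the example. It builds the message once, with both recipients in the To header, and then produces one envelope per recipient.

```python
from email.message import EmailMessage


def build_deliveries(sender: str, recipients: list[str], subject: str, body: str):
    """Return one (mail_from, rcpt_to, data) tuple per recipient."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)  # header: lists everyone addressed
    msg["Subject"] = subject
    msg.set_content(body)
    data = msg.as_string()
    # Envelope: one RCPT TO per delivery; the message data stays identical.
    return [(sender, rcpt, data) for rcpt in recipients]


deliveries = build_deliveries(
    "user@alice.example.tld",
    ["bob@bob.example.tld", "charlie@charlie.example.tld"],  # hypothetical
    "Hello world!",
    "Hi Bob and Charlie, this is Alice!",
)
for mail_from, rcpt_to, _data in deliveries:
    print(f"MAIL FROM:<{mail_from}>  RCPT TO:<{rcpt_to}>")
```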
Do not allow sending in the wild
The previous example showed a conversation between servers. As we already know, the same protocol is also used for client-to-server communication. However, we don't want just any client to be able to ask any server to send an email, because that would make spamming far too easy.
So when a client wants to hand a message to a server for later delivery, we add an extra authentication step. Because AUTH is an SMTP extension, the client greets the server with EHLO instead of HELO.
- Alice sets up a connection to Bob -
Bob: 220 bob.example.tld ESMTP server
Alice: EHLO alice.example.tld
Bob: 250 Hello!
Alice: AUTH PLAIN YWxpY2VAZXhhbXBsZS50bGQAYWxpY2VAZXhhbXBsZS50bGQAc3VwZXJzZWN1cmU=
Bob: 235 2.7.0 Authentication successful
Alice: MAIL FROM: ....
The AUTH command is followed by the authentication mechanism and the authentication data. PLAIN is a widely used mechanism. With PLAIN the authentication data is a simple base64 string containing the username (twice) and the password, separated by the ASCII NUL character (\0 or chr(0)). The username is given twice to make it possible to access a delegated mailbox: the first value is the identity to act as, the second is the identity that logs in.
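To make the format concrete, here is a small sketch that builds the AUTH PLAIN value from the transcript above and decodes it again. The function names are made up for this example. Note how easily the password comes back out: base64 is an encoding, not encryption.

```python
import base64


def auth_plain(act_as: str, login: str, password: str) -> str:
    """Build the base64 argument for AUTH PLAIN."""
    raw = "\x00".join([act_as, login, password]).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_auth_plain(value: str) -> list[str]:
    """Reverse the encoding: anyone who sees the value can do this."""
    return base64.b64decode(value).decode("utf-8").split("\x00")


token = auth_plain("alice@example.tld", "alice@example.tld", "supersecure")
print("AUTH PLAIN " + token)
print(decode_auth_plain(token))
```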
Is this secure?
The SMTP protocol as shown so far runs without encryption. So when somebody intercepts the communication, all information is visible, including the base64-encoded login data, which can be decoded trivially.
To prevent this, SMTPS can be used. With SMTPS the entire protocol is encapsulated in an SSL/TLS session.
More popular is STARTTLS. The connection initially starts without encryption. At some point the client (in our case Alice) sends a STARTTLS command. This indicates that the connection will be encrypted with SSL/TLS from that point on. When this is done before the AUTH command is given, the username and password are sent encrypted. Most servers require STARTTLS before they accept an AUTH command.
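The use of STARTTLS makes it possible to support both encrypted and unencrypted communication over the same server port, which is useful for communication between SMTP servers. However, it also introduces new attack vectors: an attacker who can tamper with the unencrypted start of the connection can, for example, hide the server's STARTTLS support and force a downgrade to plain text. Technologies like DANE can be used to mitigate these risks.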
- Alice sets up a connection to Bob -
Bob: 220 bob.example.tld ESMTP server
Alice: EHLO alice.example.tld
Bob: 250 Hello!
Alice: STARTTLS
Bob: 220 Go ahead
- TLS negotiation and handshake -
- Communication continues encrypted -
Alice: AUTH PLAIN YWxpY2VAZXhhbXBsZS50bGQAYWxpY2VAZXhhbXBsZS50bGQAc3VwZXJzZWN1cmU=
Bob: 235 2.7.0 Authentication successful
Alice: MAIL FROM: ....
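In a script, the same conversation could look roughly like the sketch below, using Python's smtplib. The function names, host name and credentials are placeholders, and port 587 is assumed as the usual submission port. Nothing is sent when the block is run as-is, since it only defines the functions.

```python
import smtplib
import ssl
from email.message import EmailMessage


def make_tls_context() -> ssl.SSLContext:
    """TLS settings: verify the server certificate and refuse old TLS versions."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def submit(host: str, username: str, password: str,
           message: EmailMessage, port: int = 587) -> None:
    """Connect, upgrade to TLS, authenticate and hand over one message."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.ehlo()                                # greet the server
        smtp.starttls(context=make_tls_context())  # STARTTLS + TLS handshake
        smtp.ehlo()                                # greet again, now encrypted
        smtp.login(username, password)             # uses a mechanism the server offers, such as PLAIN
        smtp.send_message(message)                 # MAIL FROM, RCPT TO, DATA


# Example call (placeholders, not executed here):
# submit("bob.example.tld", "alice@example.tld", "supersecure", message)
```

Because login() is only called after starttls() succeeds, the credentials are never sent over the unencrypted part of the connection.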
Is AUTH PLAIN dangerous?
Some big companies (hi Google, hi Microsoft) have decided that the PLAIN method is no longer secure. Whether this is true depends on the use case. When dealing with a mail client used by a real user, 2FA/MFA technologies are a great additional layer of protection. However, setting up each mail client with a unique password per client per user also mitigates many security risks.
In cases where the sender is not a real user but an application, for example machine-to-user communication (applications that send newsletters or password resets, or copiers that send scans to users), 2FA/MFA is not an option. Setting strong passwords should be secure enough, and IP filtering is often a better security measure than choosing XOAUTH2 authentication.
Let's have a look at XOAUTH2
Now that we understand how the SMTP protocol works and that it supports PLAIN auth and encrypted connections, it is time to have a closer look at XOAUTH2. The XOAUTH2 authentication method does not begin inside the SMTP protocol but with an HTTPS call to the token endpoint of an IAM/IdP authorization server. This token request is part of the OAuth2 protocol. Depending on the chosen OAuth2 flow, multiple requests and user interaction may be required. The appropriate flow depends on the specific use case. Below is a brief description of the two most suitable flows for this scenario.
Authorization code flow is ideal when a real user is authenticating. The mail client redirects the user to the authorization server, where all necessary authentication steps are performed, such as entering a username and password, SMS verification, or using a smart card or FIDO key. The method of authentication is controlled by the authorization server. After the authentication is completed, the user is redirected back to the mail client with a code. The mail client can exchange this code for an access token with the token request.
Client credentials flow is intended for machines and applications (also called service principals) that authenticate themselves. This is done by calling the token endpoint directly with a client ID and secret.
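As an illustration of the client credentials flow, the sketch below prepares the token request and reads the access token from a response. The token URL, client ID, secret and scope are placeholders, since every authorization server uses its own values, and the sample response is invented. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str,
                        client_secret: str, scope: str) -> urllib.request.Request:
    """Prepare the HTTPS POST for the OAuth2 client credentials flow."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def extract_access_token(response_body: bytes) -> str:
    """Pick the access token out of the JSON answer of the token endpoint."""
    return json.loads(response_body)["access_token"]


request = build_token_request(
    "https://idp.example.tld/oauth2/token",  # hypothetical token endpoint
    "my-client-id", "my-client-secret", "smtp.send",  # hypothetical values
)
# urllib.request.urlopen(request) would send it; that step is left out here.
sample = b'{"token_type": "Bearer", "expires_in": 3600, "access_token": "eyJ..."}'
print(extract_access_token(sample))
```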
Both flows result in an access token, which we need for the XOAUTH2 authentication. As with the AUTH PLAIN method, a base64 string is constructed. The format of the string before base64 is applied is user=<mailbox>\1auth=Bearer <access token>\1\1. The <mailbox> is replaced by the sending mailbox (this can be the user's own mailbox or a mailbox that the user has delegated access to). The \1 represents the non-printable ASCII start-of-heading control character (byte value 1, also written ^A).
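Building this string takes only a few lines. The sketch below follows the format described above. The function name and the token value "abc" are placeholders; a real access token is much longer.

```python
import base64


def xoauth2_string(mailbox: str, access_token: str) -> str:
    """Build the base64 argument for AUTH XOAUTH2."""
    # \x01 is the start-of-heading control character (byte value 1).
    raw = f"user={mailbox}\x01auth=Bearer {access_token}\x01\x01"
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


encoded = xoauth2_string("alice@example.tld", "abc")
print("AUTH XOAUTH2 " + encoded)
print(base64.b64decode(encoded))
```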
- Alice sets up a connection to Bob -
Bob: 220 bob.example.tld ESMTP server
Alice: EHLO alice.example.tld
Bob: 250 Hello!
Alice: STARTTLS
Bob: 220 Go ahead
- TLS negotiation and handshake -
- Communication continues encrypted -
Alice: AUTH XOAUTH2 dXNlcj1hbGljZUBleGFtcGxlLnRsZAFhdXRoPWV5SjBlWEFpT2lKS1YxUWlMQ0p1YjI1alpTSTZJbFJMUlVoa1JrOHdUVGRJYTFSamFtVjFaVmhwYTJ0cE1URkpPLi4uLi4uLi4uLi4ualoyOVdqMWdQSjdvLTNkeFg2SnVDenpvbG9xc0lUNG5XWW9sVEZ5TjZEd1RXRGpqMnBYR19JY0owdyIBAQ==
Bob: 235 2.7.0 Authentication successful
Alice: MAIL FROM: ....
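Python's smtplib has no built-in XOAUTH2 support, but its generic auth() method accepts a custom mechanism. The sketch below shows one way this could be wired up. The function names are made up, port 587 is assumed, and the empty reply to a server challenge follows what Google and Microsoft describe for failed logins; other servers may behave differently. Nothing is sent when the block is run as-is.

```python
import smtplib
import ssl
from email.message import EmailMessage


def make_xoauth2_authobject(mailbox: str, access_token: str):
    """Return a callback that smtplib's auth() can use for XOAUTH2."""
    def authobject(challenge=None):
        if challenge is not None:
            # The server answered with an error description instead of 235;
            # an empty reply lets it finish with its failure code.
            return ""
        # smtplib applies the base64 encoding itself, so return the raw string.
        return f"user={mailbox}\x01auth=Bearer {access_token}\x01\x01"
    return authobject


def submit_with_xoauth2(host: str, mailbox: str, access_token: str,
                        message: EmailMessage, port: int = 587) -> None:
    """Connect, upgrade to TLS, authenticate with a token and send one message."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.ehlo()
        smtp.starttls(context=ssl.create_default_context())
        smtp.ehlo()
        smtp.auth("XOAUTH2", make_xoauth2_authobject(mailbox, access_token))
        smtp.send_message(message)
```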
That's how XOAUTH2 works. It's not easy, but it's not extremely difficult either.