A crash course in SASL and DIGEST-MD5 for XMPP

Introduction

XMPP, as specified in RFC 3920, requires support for the SASL DIGEST-MD5 mechanism in order to authenticate clients. The DIGEST-MD5 RFC (RFC 2831) is difficult to follow in places; however, the functionality that a client is required to implement in order to authenticate to a DIGEST-MD5-aware server is minimal.

It is the goal of this document to show the basics of SASL and DIGEST-MD5. Hopefully this will be enough to get SASL newcomers up and running.

Note that this document is not intended to be any kind of authoritative documentation on XMPP, SASL or DIGEST-MD5. If you’re in doubt, read the relevant specs. This document assumes that you’ll never want to use the channel integrity and encryption features that DIGEST-MD5 provides.

Utility code

Clients need a few utility functions available:
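The steps later in this document rely on Base64 encoding and decoding, MD5 hashing (both raw 16-octet and 32-hex-digit output), and a random client nonce. Here is a minimal Python sketch of such helpers, using only the standard library. The function names are made up for illustration and are not taken from any XMPP library.

```python
import base64
import hashlib
import secrets


def b64_decode(payload: str) -> str:
    """Decode Base64 text, dropping whitespace that XML indentation may add."""
    compact = "".join(payload.split())
    return base64.b64decode(compact).decode("utf-8")


def b64_encode(text: str) -> str:
    """Encode text as Base64, returned as an ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def md5_raw(data: bytes) -> bytes:
    """Return the 16-octet MD5 digest."""
    return hashlib.md5(data).digest()


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest as 32 lowercase hex digits."""
    return hashlib.md5(data).hexdigest()


def make_cnonce(num_bytes: int = 16) -> str:
    """Generate a random client nonce (hex form is an arbitrary choice here)."""
    return secrets.token_hex(num_bytes)
```

The nonce length and its hex encoding are arbitrary choices for this sketch; any sufficiently random string should serve.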

Startup

After the XML stream has been set up, the server will send a list of the stream features that it supports. Among these will be a list of the supported SASL mechanisms:

<stream:features>
  <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
    <mechanism>DIGEST-MD5</mechanism>
    <mechanism>PLAIN</mechanism>
    <mechanism>KERBEROS_V4</mechanism>
  </mechanisms>
</stream:features>

If DIGEST-MD5 is not listed, then the rest of this document will be fairly useless. However, it's reasonable to assume that the majority of servers out there will have DIGEST-MD5 available, since XMPP requires servers to implement it, and it has no known security holes.
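Checking the advertised mechanisms can be sketched in Python roughly as follows. This assumes the features element has already been pulled out of the stream as a standalone string; the `xmlns:stream` declaration is added here only so the fragment parses on its own (on a real stream it is declared on the stream root). The function names are illustrative.

```python
import xml.etree.ElementTree as ET

SASL_NS = "urn:ietf:params:xml:ns:xmpp-sasl"

# Standalone copy of the features element shown above.
FEATURES = """
<stream:features xmlns:stream='http://etherx.jabber.org/streams'>
  <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
    <mechanism>DIGEST-MD5</mechanism>
    <mechanism>PLAIN</mechanism>
    <mechanism>KERBEROS_V4</mechanism>
  </mechanisms>
</stream:features>
"""


def offered_mechanisms(features_xml):
    """Return the SASL mechanism names in the order the server listed them."""
    root = ET.fromstring(features_xml.strip())
    tag = "{%s}mechanism" % SASL_NS
    return [el.text.strip() for el in root.iter(tag) if el.text]


def supports_digest_md5(features_xml):
    return "DIGEST-MD5" in offered_mechanisms(features_xml)


print(offered_mechanisms(FEATURES))
```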

To initiate the authentication exchange, the client sends a SASL authentication request, selecting DIGEST-MD5 as the desired mechanism:

    <auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='DIGEST-MD5'/>

Step one - challenge from server

The server will respond by sending a challenge, something like this:

<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
    cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
    xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz
</challenge>

The contents of this challenge are encoded using Base64, and might look like this when decoded:

realm="cataclysm.cx",nonce="OA6MG9tEQGm2hh",qop="auth",charset=utf-8,algorithm=md5-sess
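As a rough illustration, the decoding step might look like this in Python. The only subtlety is that the Base64 text may be wrapped and indented inside the XML element, so whitespace is removed first. `decode_challenge` is a name invented for this sketch.

```python
import base64

# Text content of the <challenge/> element from the example above.
CHALLENGE_BODY = """
    cmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIi
    xxb3A9ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz
"""


def decode_challenge(body):
    """Strip line breaks and indentation, then Base64-decode to text."""
    compact = "".join(body.split())
    return base64.b64decode(compact).decode("utf-8")


decoded = decode_challenge(CHALLENGE_BODY)
print(decoded)
```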

The challenge can then be parsed out into a number of comma-separated key-value pairs (though not all require values, as indicated below). Brief descriptions are as follows:
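In the example challenge the directives are realm, nonce, qop, charset and algorithm. Here is a minimal sketch of parsing such a string in Python, assuming every directive has the form key=value. Values may be quoted, and a quoted value can itself contain commas, so a plain split on "," is not quite enough. `parse_directives` is a hypothetical helper name.

```python
import re

# key = "quoted value" or bare token, followed by a comma or the end of input.
_DIRECTIVE = re.compile(
    r'\s*([A-Za-z][A-Za-z0-9-]*)\s*=\s*'
    r'(?:"((?:[^"\\]|\\.)*)"|([^,]*))\s*(?:,|$)'
)


def parse_directives(text):
    """Parse 'k1="v1",k2=v2' into a dict keyed by lowercase directive name.

    If a directive is repeated, the last occurrence wins in this sketch.
    """
    directives = {}
    pos = 0
    while pos < len(text):
        match = _DIRECTIVE.match(text, pos)
        if match is None:
            raise ValueError("malformed directive at offset %d" % pos)
        key, quoted, bare = match.groups()
        if quoted is not None:
            # Undo backslash escapes inside a quoted string.
            value = re.sub(r"\\(.)", r"\1", quoted)
        else:
            value = bare.strip()
        directives[key.lower()] = value
        pos = match.end()
    return directives


challenge = parse_directives(
    'realm="cataclysm.cx",nonce="OA6MG9tEQGm2hh",qop="auth",'
    'charset=utf-8,algorithm=md5-sess'
)
print(challenge["realm"], challenge["nonce"])
```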

Step two - response from client

The client is required to respond with the appropriate credentials (Base64 encoded, of course). An example response might be:

<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
    dXNlcm5hbWU9InJvYiIscmVhbG09ImNhdGFjbHlzbS5jeCIsbm9uY2U9Ik
    9BNk1HOXRFUUdtMmhoIixjbm9uY2U9Ik9BNk1IWGg2VnFUclJrIixuYz0w
    MDAwMDAwMSxxb3A9YXV0aCxkaWdlc3QtdXJpPSJ4bXBwL2NhdGFjbHlzbS
    5jeCIscmVzcG9uc2U9ZDM4OGRhZDkwZDRiYmQ3NjBhMTUyMzIxZjIxNDNh
    ZjcsY2hhcnNldD11dGYtOCxhdXRoemlkPSJyb2JAY2F0YWNseXNtLmN4L2
    15UmVzb3VyY2Ui
</response>

The decoded form of this is:

username="rob",realm="cataclysm.cx",nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",nc=00000001,qop=auth,digest-uri="xmpp/cataclysm.cx",response=d388dad90d4bbd760a152321f2143af7,charset=utf-8,authzid="rob@cataclysm.cx/myResource"

The meanings of the values described here are as follows:
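The example response carries the directives username, realm, nonce, cnonce, nc, qop, digest-uri, response, charset and authzid. Here is a hedged Python sketch of assembling and encoding such a string, with the response value passed in as an already computed hex string. The function names and the directive order (copied from the example) are choices made for this sketch. The code comments summarise a few directives as RFC 2831 describes them.

```python
import base64


def quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_response(username, realm, nonce, cnonce, digest_uri,
                   response_hex, authzid=None, nc=1):
    parts = [
        "username=" + quote(username),
        "realm=" + quote(realm),
        "nonce=" + quote(nonce),        # echoed back exactly as the server sent it
        "cnonce=" + quote(cnonce),      # random string chosen by the client
        "nc=%08x" % nc,                 # nonce count, 8 hex digits, 1 on first use
        "qop=auth",                     # authentication only, no integrity/encryption
        "digest-uri=" + quote(digest_uri),  # service type and host, e.g. xmpp/host
        "response=" + response_hex,
        "charset=utf-8",
    ]
    if authzid is not None:
        parts.append("authzid=" + quote(authzid))
    return ",".join(parts)


def encode_payload(text):
    """Base64-encode the directive string for the <response/> element."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


text = build_response(
    "rob", "cataclysm.cx", "OA6MG9tEQGm2hh", "OA6MHXh6VqTrRk",
    "xmpp/cataclysm.cx", "d388dad90d4bbd760a152321f2143af7",
    authzid="rob@cataclysm.cx/myResource",
)
print(encode_payload(text))
```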

Computing the response value

This is where the magic happens. The value of the response directive is computed as follows:

  1. Create a string of the form “username:realm:password”. Call this string X.
  2. Compute the 16 octet MD5 hash of X. Call the result Y.
  3. Create a string of the form “Y:nonce:cnonce:authzid”. Call this string A1.
  4. Create a string of the form “AUTHENTICATE:digest-uri”. Call this string A2.
  5. Compute the 32 hex digit MD5 hash of A1. Call the result HA1.
  6. Compute the 32 hex digit MD5 hash of A2. Call the result HA2.
  7. Create a string of the form “HA1:nonce:nc:cnonce:qop:HA2”. Call this string KD.
  8. Compute the 32 hex digit MD5 hash of KD. Call the result Z.

The resultant string Z should be sent to the server as the value of the “response” directive.
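The eight steps above can be sketched in Python as follows. Two things here are assumptions rather than part of the steps: strings are encoded as UTF-8 before hashing, and the authzid part of A1 is left out when no authzid is supplied (which is how RFC 2831 treats a missing authzid). The function name is illustrative.

```python
import hashlib


def compute_response(username, realm, password, nonce, cnonce, digest_uri,
                     authzid=None, nc="00000001", qop="auth"):
    # Steps 1-2: X = "username:realm:password", Y = raw 16-octet MD5 of X.
    x = ("%s:%s:%s" % (username, realm, password)).encode("utf-8")
    y = hashlib.md5(x).digest()

    # Step 3: A1 = Y:nonce:cnonce[:authzid]. Y is binary, so A1 is bytes.
    a1 = y + (":%s:%s" % (nonce, cnonce)).encode("utf-8")
    if authzid is not None:
        a1 += b":" + authzid.encode("utf-8")

    # Step 4: A2 = "AUTHENTICATE:digest-uri".
    a2 = ("AUTHENTICATE:%s" % digest_uri).encode("utf-8")

    # Steps 5-6: 32 hex digit hashes of A1 and A2.
    ha1 = hashlib.md5(a1).hexdigest()
    ha2 = hashlib.md5(a2).hexdigest()

    # Steps 7-8: KD = "HA1:nonce:nc:cnonce:qop:HA2", Z = hex MD5 of KD.
    kd = "%s:%s:%s:%s:%s:%s" % (ha1, nonce, nc, cnonce, qop, ha2)
    return hashlib.md5(kd.encode("utf-8")).hexdigest()


z = compute_response(
    "rob", "cataclysm.cx", "not-the-real-password",
    "OA6MG9tEQGm2hh", "OA6MHXh6VqTrRk", "xmpp/cataclysm.cx",
    authzid="rob@cataclysm.cx/myResource",
)
print(z)
```

The password in the usage lines is a placeholder, since the example in this document does not state one. The printed value is therefore not expected to match the response shown earlier.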

Step three - challenge from server

The server will now authenticate the client based on the presented credentials. If the authentication failed, the server will return:

<failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>

If it was successful, the server will send a final challenge:

<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
    cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA==
</challenge>

This challenge, when decoded, will contain a single directive “rspauth”. This directive is not useful for the purposes of this document, and may be safely ignored.
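For completeness, pulling the rspauth value out of the final challenge might look like this in Python. The sketch only decodes and extracts the value; it does not verify it, in keeping with the scope of this document. `extract_rspauth` is a made-up name.

```python
import base64

# Text content of the final <challenge/> element from the example above.
FINAL_CHALLENGE = "cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZA=="


def extract_rspauth(body):
    """Decode the final challenge and return the value of its rspauth directive."""
    decoded = base64.b64decode("".join(body.split())).decode("utf-8")
    name, _, value = decoded.partition("=")
    if name.strip() != "rspauth":
        raise ValueError("expected an rspauth directive, got %r" % decoded)
    return value.strip()


print(extract_rspauth(FINAL_CHALLENGE))
```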

Step four - response from client

The client should respond with an empty response:

<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>

Step five - result from server

At this point, the server should inform the client of successful authentication, like so:

<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>
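A client has to tell the server's SASL replies apart at each step. One way to sketch that in Python is shown below, assuming each reply has already been extracted from the stream as a standalone XML string. The element names challenge, success and failure are the ones the XMPP SASL namespace defines for server replies; the function name and the "other" fallback are inventions of this sketch.

```python
import xml.etree.ElementTree as ET

SASL_NS = "urn:ietf:params:xml:ns:xmpp-sasl"


def classify_sasl_reply(stanza_xml):
    """Return 'challenge', 'success', 'failure', or 'other' for anything else."""
    element = ET.fromstring(stanza_xml)
    prefix = "{%s}" % SASL_NS
    if not element.tag.startswith(prefix):
        return "other"
    name = element.tag[len(prefix):]
    return name if name in ("challenge", "success", "failure") else "other"


print(classify_sasl_reply("<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>"))
```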

Starting a session

The connection is now authenticated. The client must now request an IM session by sending the following:

<iq type='set' id='sess_1'>
  <session xmlns='urn:ietf:params:xml:ns:xmpp-session'/>
</iq>

Assuming that your username/realm has been allowed to start a session as the authzid that was specified during the SASL handshake, the session start request will succeed:

<iq type='result' id='sess_1'/>
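Matching the reply to the request can be sketched in Python like so, again assuming the reply is available as a standalone XML string. The check looks only at the type and id attributes; `session_started` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def session_started(reply_xml, request_id):
    """True if the reply is an <iq/> of type 'result' answering request_id."""
    iq = ET.fromstring(reply_xml)
    # Drop any namespace part so the check works with or without one.
    local_name = iq.tag.rsplit("}", 1)[-1]
    return (
        local_name == "iq"
        and iq.get("type") == "result"
        and iq.get("id") == request_id
    )


print(session_started("<iq type='result' id='sess_1'/>", "sess_1"))
```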

The session then continues as normal - get roster, send presence, etc.