A Replay-Resistant Canonicalization for DomainKeys Identified Mail (DKIM)

DomainKeys Identified Mail (DKIM) provides a digital signature mechanism for Internet messages, allowing a domain name owner to affix its domain name to a message in a way that can be cryptographically validated. [RFC4686] presents the original threat model DKIM was meant to address, and the environment in which it was expected to work.

Notably, DKIM decoupled itself from the transport of the message. In theory, it should be possible to validate a signature whether a message is in situ (i.e., in an inbox on disk), in transit between mail servers, or being retrieved through a mailbox access protocol. In particular, this means a DKIM signature can validate irrespective of what is in the SMTP envelope containing it, or even when there is no envelope to consider.

As a consequence, a message and its signature can be re-sent to anyone simply by changing the set of recipients in the envelope and passing the message back to a Mail Transfer Agent (MTA). As the message itself is unaltered, any DKIM signature(s) on it will continue to validate. This is a form of replay attack, and it relies for its success on the perceived value (i.e., reputation) of the domain(s) named in the signature(s).

This document describes a mechanism by which a signature and a message are coupled to the envelope recipient set, such that successful replays to other recipient sets are not possible, as the signature will no longer validate.

Several terms used in this document are based on their definitions in [RFC5598]. The term "envelope recipient" is, using the notation proposed in that document, an RFC5321.RcptTo address.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.

This section describes a new DKIM canonicalization that can be used by a signer to ensure that a signature will only validate for a specific recipient set. It is essentially the same as the "relaxed" header canonicalization method described in Section 3.4.2 of [RFC6376], except that it also includes the envelope recipient(s) in the canonicalized form, which is then fed to the signing algorithm.

DKIM signers and verifiers to date have had no reason to be interested in any aspect of the envelope used to transport a message. This canonicalization is not possible without that context being available, which may prove to be a challenge in some operating environments. It also makes it impossible to validate a DKIM signature that uses this algorithm in a context where no envelope exists, such as when retrieving a message from a mailbox.

The "no-replay-relaxed" header canonicalization algorithm MUST apply the following steps in order:

1.  Execute the "relaxed" header canonicalization algorithm as described in Section 3.4.2 of [RFC6376].

2.  Collect the set of envelope recipients.

3.  Sort them in lexical ASCII order.

4.  Format the list by concatenating them all in this sorted order, separated by CRLF strings (ASCII 13 followed by ASCII 10), and with the last one terminated by a CRLF.

5.  Prepend this list to the canonicalized form.

The signing and verifying processes defined for DKIM are otherwise unmodified.
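The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, header selection through the "h=" tag and the special handling of the DKIM-Signature field itself are left out, duplicate recipients are assumed to collapse into one entry (one reading of "set"), and addresses are sorted exactly as given with no case folding, since the text does not call for any.

```python
import re


def relaxed_header(name: str, value: str) -> bytes:
    """Rough rendering of "relaxed" header canonicalization for one field."""
    unfolded = re.sub(r"\r\n(?=[ \t])", "", value)           # unfold continuation lines
    collapsed = re.sub(r"[ \t]+", " ", unfolded).strip(" ")  # squeeze and trim whitespace
    return f"{name.strip().lower()}:{collapsed}\r\n".encode("ascii")


def recipient_prefix(rcpts) -> bytes:
    """Steps 2-4: collect, sort in ASCII order, terminate each entry with CRLF."""
    ordered = sorted({r.encode("ascii") for r in rcpts})     # assumption: duplicates collapse
    return b"".join(r + b"\r\n" for r in ordered)


def no_replay_relaxed(headers, rcpts) -> bytes:
    """Step 1 followed by step 5: recipient list prepended to the canonicalized form."""
    canon = b"".join(relaxed_header(n, v) for n, v in headers)
    return recipient_prefix(rcpts) + canon


if __name__ == "__main__":
    hdrs = [("Subject", "  Hello \t world  "), ("From", "msk@example.net")]
    print(no_replay_relaxed(hdrs, ["bob@example.com", "alice@example.com"]))
```

Sorting byte strings rather than text keeps the ordering purely numeric by octet value, which is what "ASCII order" is taken to mean here.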

Consider the following SMTP transaction, wherein "C" denotes something sent by an SMTP client, "S" denotes something sent by an SMTP server, and terminating CRLFs in both directions are omitted:

   C: MAIL FROM:<msk@example.net>
   S: 250 Sender OK
   C: RCPT TO:<bob@example.com>
   S: 250 Recipient OK
   C: RCPT TO:<alice@example.com>
   S: 250 Recipient OK
   C: DATA
   S: 354 Go ahead
   C: [message header omitted]
   C: [message body omitted]
   C: .
   S: 250 Message delivered

The canonicalization described above would operate the same way as the "relaxed" header canonicalization would, except that the content fed to the hash algorithm would be preceded by:

   alice@example.com<CR><LF>
   bob@example.com<CR><LF>
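As a rough check of this example, the following Python sketch pulls the recipients out of a copy of the transcript and builds the prepended octets. The transcript parsing is purely illustrative and an assumption of this sketch: a real signer or verifier would take the recipient set from the MTA's own envelope state, and would presumably count only recipients the server accepted.

```python
import re

# Copy of the client/server exchange above, up to the DATA command.
TRANSCRIPT = """\
C: MAIL FROM:<msk@example.net>
S: 250 Sender OK
C: RCPT TO:<bob@example.com>
S: 250 Recipient OK
C: RCPT TO:<alice@example.com>
S: 250 Recipient OK
C: DATA
"""


def envelope_recipients(transcript: str) -> list:
    """Return the addresses named in client RCPT TO commands, in arrival order."""
    pattern = re.compile(r"^C: RCPT TO:<([^>]+)>", re.IGNORECASE | re.MULTILINE)
    return pattern.findall(transcript)


rcpts = envelope_recipients(TRANSCRIPT)
prefix = b"".join(r.encode("ascii") + b"\r\n" for r in sorted(rcpts))
print(prefix)  # b'alice@example.com\r\nbob@example.com\r\n'
```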

Use of this canonicalization guarantees that a signature will not verify unless the message is sent to exactly the same set of envelope recipients as was present in the envelope when the message was prepared for signing. The fact that the recipient set is sorted allows verifiers to tolerate any reordering of the envelope that may be done in transit. However, if any original recipient is removed, or any new recipient added, the signature will not validate because the content passed to the hash step at the verifier will differ from the content hashed by the signer. Thus, in the replay scenario described above, the signature no longer validates.

If the ability to validate a signature from storage (without an envelope) needs to be preserved, the signer can still add a second signature using some other header canonicalization that does not need the envelope context to verify. This, however, requires the verifier to understand when it is appropriate to use which signature.

Note, however, that this canonicalization is fragile in the modern Internet message ecosystem. Some scenarios that will yield false negatives with this method are described below.
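The behavior described here can be illustrated with a short Python sketch. It is a simplification under stated assumptions: a SHA-256 digest of the canonicalized input stands in for the full DKIM signature check, CANON_HEADERS is a made-up placeholder for the output of the "relaxed" step, and the addresses other than those in the earlier example are invented for illustration.

```python
import hashlib

# Placeholder for already-canonicalized header fields (hypothetical content).
CANON_HEADERS = b"from:msk@example.net\r\nsubject:hello\r\n"


def digest(rcpts) -> str:
    """Hash the sorted, CRLF-terminated recipient list followed by the headers."""
    prefix = b"".join(r.encode("ascii") + b"\r\n" for r in sorted(rcpts))
    return hashlib.sha256(prefix + CANON_HEADERS).hexdigest()


signed = digest(["bob@example.com", "alice@example.com"])
reordered = digest(["alice@example.com", "bob@example.com"])
replayed = digest(["victim@example.org"])
widened = digest(["alice@example.com", "bob@example.com", "carol@example.com"])

print(signed == reordered)  # True: envelope reordering is tolerated
print(signed == replayed)   # False: a replay to a new recipient changes the input
print(signed == widened)    # False: adding a recipient changes the input
```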

If a receiving MTA notes that one of the envelope recipients refers to a mailbox in a domain for which it has administrative authority, but which is known to be an alias, it may rewrite that recipient address into its canonical form. For instance, if a receiving MTA is officially known as the mail server for "example.com", but also accepts mail for its users when addressed to "example.net", it may alter the latter address in the envelope to refer to its canonical name. This alters the recipient list, and thus alters the content passed to the hash algorithm when validating the signature, leading to a failure.
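A small Python sketch of this failure mode follows. The alias table and helper names are hypothetical, and the rewrite rule (swap the domain, keep the local part) is only one plausible form such rewriting could take.

```python
# Hypothetical alias table: mail for example.net is delivered as example.com.
ALIASES = {"example.net": "example.com"}


def prefix(rcpts) -> bytes:
    """Sorted, CRLF-terminated recipient list as prepended by the canonicalization."""
    return b"".join(r.encode("ascii") + b"\r\n" for r in sorted(rcpts))


def rewrite(addr: str) -> str:
    """Replace an alias domain with its canonical name, leaving the local part alone."""
    local, _, domain = addr.rpartition("@")
    return f"{local}@{ALIASES.get(domain.lower(), domain)}"


at_signing = ["alice@example.net"]
at_verifier = [rewrite(a) for a in at_signing]

print(at_verifier)                                # ['alice@example.com']
print(prefix(at_signing) == prefix(at_verifier))  # False: hash input no longer matches
```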

If a message contains envelope recipients at domains served by separate MTAs, SMTP [RFC5321] compels the handling MTA to split the message, creating two envelopes containing identical content. The first of these will be addressed to one recipient and sent on its way; the second will be addressed to the other and sent via its own route. Upon arrival at either DKIM verifier, the recipient list has effectively been altered since signing. This alters the content passed to the hash algorithm when validating the signature, leading to a failure.

This can be avoided by arranging that no envelope ever has more than a single recipient, but doing so renders useless an important "common factoring" feature of SMTP. In the case of a mailing list server that may need to distribute a single message to a very large number of recipients, this method can impose significant compute or storage costs.
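The splitting problem can be sketched in Python as follows. Grouping recipients by domain is an assumption made for brevity (real routing decisions follow MX lookups and local policy), and the addresses and helper names are hypothetical.

```python
from collections import defaultdict


def prefix(rcpts) -> bytes:
    """Sorted, CRLF-terminated recipient list as prepended by the canonicalization."""
    return b"".join(r.encode("ascii") + b"\r\n" for r in sorted(rcpts))


def split_by_domain(rcpts) -> dict:
    """Approximate envelope splitting: one outgoing envelope per recipient domain."""
    groups = defaultdict(list)
    for addr in rcpts:
        groups[addr.rpartition("@")[2].lower()].append(addr)
    return dict(groups)


signed_for = ["alice@example.com", "bob@example.org"]
envelopes = split_by_domain(signed_for)

for domain, group in envelopes.items():
    # Each verifier sees only its own subset, so the prepended list differs.
    print(domain, prefix(group) == prefix(signed_for))  # False for both domains
```

Signing once per single-recipient envelope avoids the mismatch, at the cost of one signing operation per recipient rather than one per message.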

IANA is asked to make the following entry in the "DKIM-Signature Canonicalization Header" sub-registry of the "DKIM Parameters" registry group:

   Type:       no-replay-relaxed
   Reference:  [this document]
   Status:     active

All of the security considerations of [RFC6376] apply when this canonicalization is in use.

A signer that is forced to generate independently signed messages for each recipient, in a situation where large recipient lists are common, could be exploited to mount a denial-of-service attack, simply because the work done per message is amplified.

The loss of the ability to verify messages signed using this canonicalization from their mailboxes will have unknown security impact.