< draft-ietf-tsvwg-ecn-tunnel-10.txt | rfc6040.txt | |||
---|---|---|---|---|
Transport Area Working Group B. Briscoe | Internet Engineering Task Force (IETF) B. Briscoe | |||
Internet-Draft BT | Request for Comments: 6040 BT | |||
Updates: 3168, 4301, 4774 August 26, 2010 | Updates: 3168, 4301, 4774 November 2010 | |||
(if approved) | Category: Standards Track | |||
Intended status: Standards Track | ISSN: 2070-1721 | |||
Expires: February 27, 2011 | ||||
Tunnelling of Explicit Congestion Notification | Tunnelling of Explicit Congestion Notification | |||
draft-ietf-tsvwg-ecn-tunnel-10 | ||||
Abstract | Abstract | |||
This document redefines how the explicit congestion notification | This document redefines how the explicit congestion notification | |||
(ECN) field of the IP header should be constructed on entry to and | (ECN) field of the IP header should be constructed on entry to and | |||
exit from any IP in IP tunnel. On encapsulation it updates RFC3168 | exit from any IP-in-IP tunnel. On encapsulation, it updates RFC 3168 | |||
to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec | to bring all IP-in-IP tunnels (v4 or v6) into line with RFC 4301 | |||
ECN processing. On decapsulation it updates both RFC3168 and RFC4301 | IPsec ECN processing. On decapsulation, it updates both RFC 3168 and | |||
to add new behaviours for previously unused combinations of inner and | RFC 4301 to add new behaviours for previously unused combinations of | |||
outer header. The new rules ensure the ECN field is correctly | inner and outer headers. The new rules ensure the ECN field is | |||
propagated across a tunnel whether it is used to signal one or two | correctly propagated across a tunnel whether it is used to signal one | |||
severity levels of congestion, whereas before only one severity level | or two severity levels of congestion; whereas before, only one | |||
was supported. Tunnel endpoints can be updated in any order without | severity level was supported. Tunnel endpoints can be updated in any | |||
affecting pre-existing uses of the ECN field, thus ensuring backward | order without affecting pre-existing uses of the ECN field, thus | |||
compatibility. Nonetheless, operators wanting to support two | ensuring backward compatibility. Nonetheless, operators wanting to | |||
severity levels (e.g. for pre-congestion notification--PCN) can | support two severity levels (e.g., for pre-congestion notification -- | |||
require compliance with this new specification. A thorough analysis | PCN) can require compliance with this new specification. A thorough | |||
of the reasoning for these changes and the implications is included. | analysis of the reasoning for these changes and the implications is | |||
In the unlikely event that the new rules do not meet a specific need, | included. In the unlikely event that the new rules do not meet a | |||
RFC4774 gives guidance on designing alternate ECN semantics and this | specific need, RFC 4774 gives guidance on designing alternate ECN | |||
document extends that to include tunnelling issues. | semantics, and this document extends that to include tunnelling | |||
issues. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at http://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 5741. | ||||
This Internet-Draft will expire on February 27, 2011. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
http://www.rfc-editor.org/info/rfc6040. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 11 | 1. Introduction ....................................................4 | |||
1.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 1.1. Scope ......................................................5 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 2. Terminology .....................................................6 | |||
3. Summary of Pre-Existing RFCs . . . . . . . . . . . . . . . . . 14 | 3. Summary of Pre-Existing RFCs ....................................7 | |||
3.1. Encapsulation at Tunnel Ingress . . . . . . . . . . . . . 14 | 3.1. Encapsulation at Tunnel Ingress ............................7 | |||
3.2. Decapsulation at Tunnel Egress . . . . . . . . . . . . . . 15 | 3.2. Decapsulation at Tunnel Egress .............................8 | |||
4. New ECN Tunnelling Rules . . . . . . . . . . . . . . . . . . . 16 | 4. New ECN Tunnelling Rules ........................................9 | |||
4.1. Default Tunnel Ingress Behaviour . . . . . . . . . . . . . 17 | 4.1. Default Tunnel Ingress Behaviour ..........................10 | |||
4.2. Default Tunnel Egress Behaviour . . . . . . . . . . . . . 17 | 4.2. Default Tunnel Egress Behaviour ...........................10 | |||
4.3. Encapsulation Modes . . . . . . . . . . . . . . . . . . . 19 | 4.3. Encapsulation Modes .......................................12 | |||
4.4. Single Mode of Decapsulation . . . . . . . . . . . . . . . 21 | 4.4. Single Mode of Decapsulation ..............................14 | |||
5. Updates to Earlier RFCs . . . . . . . . . . . . . . . . . . . 22 | 5. Updates to Earlier RFCs ........................................15 | |||
5.1. Changes to RFC4301 ECN processing . . . . . . . . . . . . 22 | 5.1. Changes to RFC 4301 ECN Processing ........................15 | |||
5.2. Changes to RFC3168 ECN processing . . . . . . . . . . . . 22 | 5.2. Changes to RFC 3168 ECN Processing ........................16 | |||
5.3. Motivation for Changes . . . . . . . . . . . . . . . . . . 23 | 5.3. Motivation for Changes ....................................17 | |||
5.3.1. Motivation for Changing Encapsulation . . . . . . . . 24 | 5.3.1. Motivation for Changing Encapsulation ..............17 | |||
5.3.2. Motivation for Changing Decapsulation . . . . . . . . 25 | 5.3.2. Motivation for Changing Decapsulation ..............18 | |||
6. Backward Compatibility . . . . . . . . . . . . . . . . . . . . 27 | 6. Backward Compatibility .........................................21 | |||
6.1. Non-Issues Updating Decapsulation . . . . . . . . . . . . 27 | 6.1. Non-Issues Updating Decapsulation .........................21 | |||
6.2. Non-Update of RFC4301 IPsec Encapsulation . . . . . . . . 28 | 6.2. Non-Update of RFC 4301 IPsec Encapsulation ................21 | |||
6.3. Update to RFC3168 Encapsulation . . . . . . . . . . . . . 28 | 6.3. Update to RFC 3168 Encapsulation ..........................22 | |||
7. Design Principles for Alternate ECN Tunnelling Semantics . . . 29 | 7. Design Principles for Alternate ECN Tunnelling Semantics .......22 | |||
8. IANA Considerations (to be removed on publication): . . . . . 31 | 8. Security Considerations ........................................24 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 31 | 9. Conclusions ....................................................26 | |||
10. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 32 | 10. Acknowledgements ..............................................26 | |||
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 33 | 11. References ....................................................27 | |||
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | 11.1. Normative References .....................................27 | |||
12.1. Normative References . . . . . . . . . . . . . . . . . . . 34 | 11.2. Informative References ...................................27 | |||
12.2. Informative References . . . . . . . . . . . . . . . . . . 34 | Appendix A. Early ECN Tunnelling RFCs ............................29 | |||
Appendix A. Early ECN Tunnelling RFCs . . . . . . . . . . . . . . 35 | Appendix B. Design Constraints ...................................29 | |||
Appendix B. Design Constraints . . . . . . . . . . . . . . . . . 36 | B.1. Security Constraints ......................................29 | |||
B.1. Security Constraints . . . . . . . . . . . . . . . . . . . 36 | B.2. Control Constraints .......................................31 | |||
B.2. Control Constraints . . . . . . . . . . . . . . . . . . . 38 | B.3. Management Constraints ....................................32 | |||
B.3. Management Constraints . . . . . . . . . . . . . . . . . . 39 | Appendix C. Contribution to Congestion across a Tunnel ...........33 | |||
Appendix C. Contribution to Congestion across a Tunnel . . . . . 40 | ||||
Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) | Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) | |||
Outer . . . . . . . . . . . . . . . . . . . . . . . . 41 | Outer ................................................34 | |||
Appendix E. Open Issues . . . . . . . . . . . . . . . . . . . . . 42 | Appendix E. Open Issues ..........................................35 | |||
Request to the RFC Editor (to be removed on publication): | ||||
In the RFC index, RFC3168 should be identified as an update to | ||||
RFC2003. RFC4301 should be identified as an update to RFC3168. | ||||
Changes from previous drafts (to be removed by the RFC Editor) | ||||
Full text differences between IETF draft versions are available at | ||||
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and | ||||
between earlier individual draft versions at | ||||
<http://www.briscoe.net/pubs.html#ecn-tunnel> | ||||
From ietf-09 to ietf-10 (current): | ||||
* Editorial changes: | ||||
+ Clarified couple of sentences in Introduction and one in | ||||
section 6.3 to distinguish whether the terms 'RFC3168' & | ||||
'RFC4301' refer to implementations or documents. | ||||
+ Corrected garbled sentence in the introduction about | ||||
backward compatibility. | ||||
+ Made it clear that 'drop' in Fig 2, Fig 4 and the following | ||||
para is an action, not a codepoint. | ||||
+ In sections 5.1 & 5.2, specifically identified the updated | ||||
sections of RFC3168 & RFC4301. | ||||
+ Avoided describing compatibility mode as 'optional' at the | ||||
end of section 5.2 where it should have said 'not always | ||||
obligatory' instead, because in section 4 compatibility mode | ||||
is normatively defined as obligatory in some circumstances | ||||
(rather than always optional). | ||||
+ Added RFC5659 as informative reference on pseudowires and | ||||
clarified only some pseudowires might be relevant examples. | ||||
+ Deleted "The views expressed here are those of the author | ||||
only." in the acknowledgements. | ||||
+ Fixed a few nits. | ||||
From ietf-08 to ietf-09: Added change log entry for -07 to -08 that | ||||
was previously omitted. | ||||
* Changes to standards action text: | ||||
+ Added RFC4774 to 'Updates:' header (the draft always has | ||||
extended the advice in RFC4774 (BCP124) which said very | ||||
little about tunnels. The GENART reviewer merely pointed | ||||
out that the header did not highlight this fact.) | ||||
* Editorial changes: | ||||
+ Abstract: s/providing backward compatibility./thus ensuring | ||||
backward compatibility./ | ||||
+ Moved PCN-related text motivating changes to decapsulation | ||||
from "Default Tunnel Egress Behaviour" (Section 4.2) to | ||||
"Motivation for Changing Decapsulation" (Section 5.3.2) | ||||
where it was merged with existing similar text. | ||||
+ In the non-normative Design Principles avoided using words | ||||
in lower case where they were in contexts that might make | ||||
them confusable with upper case RFC2119 normative language. | ||||
+ Added Stephen Hanna and Ben Campbell to acks and corrected | ||||
spelling of Agarwal. | ||||
+ Deleted endnote discussing corner case with IKEv2 manual | ||||
keying (identified as "to be removed before publication | ||||
following SecDir review"). | ||||
+ Deleted Appendices D & E on why existing ingress & egress | ||||
tunnelling behavour impede PCN and the endnotes that | ||||
referred to them (identified as "to be removed before | ||||
publication"). | ||||
+ Various minor corrections pointed out by reviewers. | ||||
From ietf-07 to ietf-08: | ||||
* Changes to standards actions: | ||||
+ Section 4: Changed non-RFC2119 phrase 'NOT RECOMMENDED' to | ||||
'SHOULD be avoided', wrt alternate ECN tunnelling schemes. | ||||
+ Section 4.2: Used upper-case in 'Alarms SHOULD be rate- | ||||
limited'. | ||||
+ Section 7: Made bullet #1 in the decapsulation guidelines | ||||
for alternate schemes more precise. Also changed any upper- | ||||
case keywords in this informative section to lower case. | ||||
* Editorial changes: | ||||
+ Changed copyright notice to allow for pre-5378 material. | ||||
+ Shifted supporting text intended for deletion on publication | ||||
into editorial comments. | ||||
+ Explained how to read the decapsulation matrices in their | ||||
captions. | ||||
+ Minor clarifications throughout. | ||||
From ietf-06 to ietf-07: | ||||
* Emphasised that this is the opposite of a fork in the RFC | ||||
series. | ||||
* Altered Section 5 to focus on updates to implementations of | ||||
earlier RFCs, rather than on updates to the text of the RFCs. | ||||
* Removed potential loop-holes in normative text that | ||||
implementers might have used to claim compliance without | ||||
implementing normal mode. Highlighted the deliberate | ||||
distinction between "MUST implement" and "SHOULD use" normal | ||||
mode. | ||||
* Added question for Security Directorate reviewers on whether to | ||||
mention a corner-case concerning manual keying of IPsec | ||||
tunnels. | ||||
* Minor clarifications, updated references and updated acks. | ||||
* Marked two appendices about PCN motivations for removal before | ||||
publication. | ||||
From ietf-05 to ietf-06: | ||||
* Minor textual clarifications and corrections. | ||||
From ietf-04 to ietf-05: | ||||
* Functional changes: | ||||
+ Section 4.2: ECT(1) outer with Not-ECT inner: reverted to | ||||
forwarding as Not-ECT (as in RFC3168 & RFC4301), rather than | ||||
dropping. | ||||
+ Altered rationale in bullet 3 of Section 5.3.2 to justify | ||||
this. | ||||
+ Distinguished alarms for dangerous and invalid combinations | ||||
and allowed combinations that are valid in some tunnel | ||||
configurations but dangerous in others to be alarmed at the | ||||
discretion of the implementer and/or operator. | ||||
+ Altered advice on designing alternate ECN tunnelling | ||||
semantics to reflect the above changes. | ||||
* Textual changes: | ||||
+ Changed "Future non-default schemes" to "Alternate ECN | ||||
Tunnelling Semantics" throughout. | ||||
+ Cut down Appendix D and Appendix E for brevity. | ||||
+ A number of clarifying edits & updated refs. | ||||
From ietf-03 to ietf-04: | ||||
* Functional changes: none | ||||
* Structural changes: | ||||
+ Added "Open Issues" appendix | ||||
* Textual changes: | ||||
+ Section title: "Changes from Earlier RFCs" -> "Updates to | ||||
Earlier RFCs" | ||||
+ Emphasised that change on decap to previously unused | ||||
combinations will propagate PCN encoding. | ||||
+ Acknowledged additional reviewers and updated references | ||||
From ietf-02 to ietf-03: | ||||
* Functional changes: | ||||
+ Corrected errors in recap of previous RFCs, which wrongly | ||||
stated the different decapsulation behaviours of RFC3168 & | ||||
RFC4301 with a Not-ECT inner header. This also required | ||||
corrections to the "Changes from Earlier RFCs" and the | ||||
Motivations for these changes. | ||||
+ Mandated that any future standards action SHOULD NOT use the | ||||
ECT(0) codepoint as an indication of congestion, without | ||||
giving strong reasons. | ||||
+ Added optional alarm when decapsulating ECT(1) outer, | ||||
ECT(0), but noted it would need to be disabled for | ||||
2-severity level congestion (e.g. PCN). | ||||
* Structural changes: | ||||
+ Removed Document Roadmap which merely repeated the Contents | ||||
(previously Section 1.2). | ||||
+ Moved "Changes from Earlier RFCs" (Section 5) before | ||||
Section 6 on Backward Compatibility and internally organised | ||||
both by RFC, rather than by ingress then egress. | ||||
+ Moved motivation for changing existing RFCs (Section 5.3) to | ||||
after the changes are specified. | ||||
+ Moved informative "Design Principles for Future Non-Default | ||||
Schemes" after all the normative sections. | ||||
+ Added Appendix A on early history of ECN tunnelling RFCs. | ||||
+ Removed specialist appendix on "Relative Placement of | ||||
Tunnelling and In-Path Load Regulation" (Appendix D in the | ||||
-02 draft) | ||||
+ Moved and updated specialist text on "Compromise on Decap | ||||
with ECT(1) Inner and ECT(0) Outer" from Security | ||||
Considerations to Appendix D | ||||
* Textual changes: | ||||
+ Simplified vocabulary for non-native-english speakers | ||||
+ Simplified Introduction and defined regularly used terms in | ||||
an expanded Terminology section. | ||||
+ More clearly distinguished statically configured tunnels | ||||
from dynamic tunnel endpoint discovery, before explaining | ||||
operating modes. | ||||
+ Simplified, cut-down and clarified throughout | ||||
+ Updated references. | ||||
From ietf-01 to ietf-02: | ||||
* Scope reduced from any encapsulation of an IP packet to solely | ||||
IP in IP tunnelled encapsulation. Consequently changed title | ||||
and removed whole section 'Design Guidelines for New | ||||
Encapsulations of Congestion Notification' (to be included in a | ||||
future companion informational document). | ||||
* Included a new normative decapsulation rule for ECT(0) inner | ||||
and ECT(1) outer that had previously only been outlined in the | ||||
non-normative appendix 'Comprehensive Decapsulation Rules'. | ||||
Consequently: | ||||
+ The Introduction has been completely re-written to motivate | ||||
this change to decapsulation along with the existing change | ||||
to encapsulation. | ||||
+ The tentative text in the appendix that first proposed this | ||||
change has been split between normative standards text in | ||||
Section 4 and Appendix D, which explains specifically why | ||||
this change would streamline PCN. New text on the logic of | ||||
the resulting decap rules added. | ||||
* If inner/outer is Not-ECT/ECT(0), changed decapsulation to | ||||
propagate Not-ECT rather than drop the packet; and added | ||||
reasoning. | ||||
* Considerably restructured: | ||||
+ "Design Constraints" analysis moved to an appendix | ||||
(Appendix B); | ||||
+ Added Section 3 to summarise relevant existing RFCs; | ||||
+ Structured Section 4 and Section 6 into subsections. | ||||
+ Added tables to sections on old and new rules, for precision | ||||
and comparison. | ||||
+ Moved Section 7 on Design Principles to the end of the | ||||
section specifying the new default normative tunnelling | ||||
behaviour. Rewritten and shifted text on identifiers and | ||||
in-path load regulators to Appendix B.1 [deleted in revision | ||||
-03]. | ||||
From ietf-00 to ietf-01: | ||||
* Identified two additional alarm states in the decapsulation | ||||
rules (Figure 4) if ECT(X) in outer and inner contradict each | ||||
other. | ||||
* Altered Comprehensive Decapsulation Rules (Appendix D) so that | ||||
ECT(0) in the outer no longer overrides ECT(1) in the inner. | ||||
Used the term 'Comprehensive' instead of 'Ideal'. And | ||||
considerably updated the text in this appendix. | ||||
* Added Appendix D.1 (removed again in a later revision) to weigh | ||||
up the various ways the Comprehensive Decapsulation Rules might | ||||
be introduced. This replaces the previous contradictory | ||||
statements saying complex backwards compatibility interactions | ||||
would be introduced while also saying there would be no | ||||
backwards compatibility issues. | ||||
* Updated references. | ||||
From briscoe-01 to ietf-00: | ||||
* Re-wrote Appendix C giving much simpler technique to measure | ||||
contribution to congestion across a tunnel. | ||||
* Added discussion of backward compatibility of the ideal | ||||
decapsulation scheme in Appendix D | ||||
* Updated references. Minor corrections & clarifications | ||||
throughout. | ||||
From briscoe-00 to briscoe-01: | ||||
* Related everything conceptually to the uniform and pipe models | ||||
of RFC2983 on Diffserv Tunnels, and completely removed the | ||||
dependence of tunnelling behaviour on the presence of any in- | ||||
path load regulation by using the [1 - Before] [2 - Outer] | ||||
function placement concepts from RFC2983; | ||||
* Added specific cases where the existing standards limit new | ||||
proposals, particularly Appendix E; | ||||
* Added sub-structure to Introduction (Need for Rationalisation, | ||||
Roadmap), added new Introductory subsection on "Scope" and | ||||
improved clarity; | ||||
* Added Design Guidelines for New Encapsulations of Congestion | ||||
Notification; | ||||
* Considerably clarified the Backward Compatibility section | ||||
(Section 6); | ||||
* Considerably extended the Security Considerations section | ||||
(Section 9); | ||||
* Summarised the primary rationale much better in the | ||||
conclusions; | ||||
* Added numerous extra acknowledgements; | ||||
* Added Appendix E. "Why resetting CE on encapsulation harms | ||||
PCN", Appendix C. "Contribution to Congestion across a Tunnel" | ||||
and Appendix D. "Ideal Decapsulation Rules"; | ||||
* Re-wrote Appendix B [deleted in a later revision], explaining | ||||
how tunnel encapsulation no longer depends on in-path load- | ||||
regulation (changed title from "In-path Load Regulation" to | ||||
"Non-Dependence of Tunnelling on In-path Load Regulation"), but | ||||
explained how an in-path load regulation function must be | ||||
carefully placed with respect to tunnel encapsulation (in a new | ||||
sub-section entitled "Dependence of In-Path Load Regulation on | ||||
Tunnelling"). | ||||
1. Introduction | 1. Introduction | |||
Explicit congestion notification (ECN [RFC3168]) allows a forwarding | Explicit congestion notification (ECN [RFC3168]) allows a forwarding | |||
element (e.g. a router) to notify the onset of congestion without | element (e.g., a router) to notify the onset of congestion without | |||
having to drop packets. Instead it can explicitly mark a proportion | having to drop packets. Instead, it can explicitly mark a proportion | |||
of packets in the 2-bit ECN field in the IP header (Table 1 recaps | of packets in the two-bit ECN field in the IP header (Table 1 recaps | |||
the ECN codepoints). | the ECN codepoints). | |||
The outer header of an IP packet can encapsulate one or more IP | The outer header of an IP packet can encapsulate one or more IP | |||
headers for tunnelling. A forwarding element using ECN to signify | headers for tunnelling. A forwarding element using ECN to signify | |||
congestion will only mark the immediately visible outer IP header. | congestion will only mark the immediately visible outer IP header. | |||
When a tunnel decapsulator later removes this outer header, it | When a tunnel decapsulator later removes this outer header, it | |||
follows rules to propagate congestion markings by combining the ECN | follows rules to propagate congestion markings by combining the ECN | |||
fields of the inner and outer IP header into one outgoing IP header. | fields of the inner and outer IP header into one outgoing IP header. | |||
This document updates those rules for IPsec [RFC4301] and non-IPsec | This document updates those rules for IPsec [RFC4301] and non-IPsec | |||
[RFC3168] tunnels to add new behaviours for previously unused | [RFC3168] tunnels to add new behaviours for previously unused | |||
combinations of inner and outer header. It also updates the ingress | combinations of inner and outer headers. It also updates the ingress | |||
behaviour of RFC3168 tunnels to match that of RFC4301 tunnels. | behaviour of RFC 3168 tunnels to match that of RFC 4301 tunnels. | |||
Tunnel endpoints complying with the updated rules will be backward | Tunnel endpoints complying with the updated rules will be backward | |||
compatible when interworking with tunnel endpoints complying with | compatible when interworking with tunnel endpoints complying with RFC | |||
RFC4301, RFC3168 or any earlier specification. | 4301, RFC 3168, or any earlier specification. | |||
When ECN and its tunnelling was defined in RFC3168, only the minimum | When ECN and its tunnelling was defined in RFC 3168, only the minimum | |||
necessary changes to the ECN field were propagated through tunnel | necessary changes to the ECN field were propagated through tunnel | |||
endpoints--just enough for the basic ECN mechanism to work. This was | endpoints -- just enough for the basic ECN mechanism to work. This | |||
due to concerns that the ECN field might be toggled to communicate | was due to concerns that the ECN field might be toggled to | |||
between a secure site and someone on the public Internet--a covert | communicate between a secure site and someone on the public Internet | |||
channel. This was because a mutable field like ECN cannot be | -- a covert channel. This was because a mutable field like ECN | |||
protected by IPsec's integrity mechanisms--it has to be able to | cannot be protected by IPsec's integrity mechanisms -- it has to be | |||
change as it traverses the Internet. | able to change as it traverses the Internet. | |||
Nonetheless, the latest IPsec architecture [RFC4301] considered a | Nonetheless, the latest IPsec architecture [RFC4301] considered a | |||
bandwidth limit of 2 bits per packet on a covert channel to be a | bandwidth limit of two bits per packet on a covert channel to be a | |||
manageable risk. Therefore, for simplicity, an RFC4301 ingress | manageable risk. Therefore, for simplicity, an RFC 4301 ingress | |||
copied the whole ECN field to encapsulate a packet. RFC4301 | copied the whole ECN field to encapsulate a packet. RFC 4301 | |||
dispensed with the two modes of RFC3168, one which partially copied | dispensed with the two modes of RFC 3168, one which partially copied | |||
the ECN field, and the other which blocked all propagation of ECN | the ECN field, and the other which blocked all propagation of ECN | |||
changes. | changes. | |||
Unfortunately, this entirely reasonable sequence of standards actions | Unfortunately, this entirely reasonable sequence of standards actions | |||
resulted in a perverse outcome; non-IPsec tunnels (RFC3168) blocked | resulted in a perverse outcome; non-IPsec tunnels (RFC 3168) blocked | |||
the 2-bit covert channel, while IPsec tunnels (RFC4301) did not--at | the two-bit covert channel, while IPsec tunnels (RFC 4301) did not -- | |||
least not at the ingress. At the egress, both IPsec and non-IPsec | at least not at the ingress. At the egress, both IPsec and non-IPsec | |||
tunnels still partially restricted propagation of the full ECN field. | tunnels still partially restricted propagation of the full ECN field. | |||
The trigger for the changes in this document was the introduction of | The trigger for the changes in this document was the introduction of | |||
pre-congestion notification (PCN [RFC5670]) to the IETF standards | pre-congestion notification (PCN [RFC5670]) to the IETF Standards | |||
track. PCN needs the ECN field to be copied at a tunnel ingress and | Track. PCN needs the ECN field to be copied at a tunnel ingress and | |||
it needs four states of congestion signalling to be propagated at the | it needs four states of congestion signalling to be propagated at the | |||
egress, but pre-existing tunnels only propagate three in the ECN | egress, but pre-existing tunnels only propagate three in the ECN | |||
field. | field. | |||
This document draws on currently unused (CU) combinations of inner | This document draws on currently unused (CU) combinations of inner | |||
and outer headers to add tunnelling of four-state congestion | and outer headers to add tunnelling of four-state congestion | |||
signalling to RFC3168 and RFC4301. Operators of tunnels who | signalling to RFC 3168 and RFC 4301. Operators of tunnels who | |||
specifically want to support four states can require that all their | specifically want to support four states can require that all their | |||
tunnels comply with this specification. However, this is not a fork | tunnels comply with this specification. However, this is not a fork | |||
in the RFC series. It is an update that can be deployed first by | in the RFC series. It is an update that can be deployed first by | |||
those that need it, and subsequently by all tunnel endpoint | those that need it, and subsequently by all tunnel endpoint | |||
implementations (RFC4301, RFC3168, RFC2481, RFC2401, RFC2003), which | implementations (RFC 4301, RFC 3168, RFC 2481, RFC 2401, RFC 2003), | |||
can safely be updated to this new specification as part of general | which can safely be updated to this new specification as part of | |||
code maintenance. This will gradually add support for four | general code maintenance. This will gradually add support for four | |||
congestion states to the Internet. Existing three state schemes will | congestion states to the Internet. Existing three state schemes will | |||
continue to work as before. | continue to work as before. | |||
In fact, this document is the opposite of a fork. At the same time | In fact, this document is the opposite of a fork. At the same time | |||
as supporting a fourth state, the opportunity has been taken to draw | as supporting a fourth state, the opportunity has been taken to draw | |||
together divergent ECN tunnelling specifications into a single | together divergent ECN tunnelling specifications into a single | |||
consistent behaviour, harmonising differences such as perverse covert | consistent behaviour, harmonising differences such as perverse covert | |||
channel treatment. Then any tunnel can be deployed unilaterally, and | channel treatment. Then, any tunnel can be deployed unilaterally, | |||
it will support the full range of congestion control and management | and it will support the full range of congestion control and | |||
schemes without any modes or configuration. Further, any host or | management schemes without any modes or configuration. Further, any | |||
router can expect the ECN field to behave in the same way, whatever | host or router can expect the ECN field to behave in the same way, | |||
type of tunnel might intervene in the path. | whatever type of tunnel might intervene in the path. | |||
1.1. Scope | 1.1. Scope | |||
This document only concerns wire protocol processing of the ECN field | This document only concerns wire protocol processing of the ECN field | |||
at tunnel endpoints and makes no changes or recommendations | at tunnel endpoints and makes no changes or recommendations | |||
concerning algorithms for congestion marking or congestion response. | concerning algorithms for congestion marking or congestion response. | |||
This document specifies common ECN field processing at encapsulation | This document specifies common ECN field processing at encapsulation | |||
and decapsulation for any IP in IP tunnelling, whether IPsec or non- | and decapsulation for any IP-in-IP tunnelling, whether IPsec or non- | |||
IPsec tunnels. It applies irrespective of whether IPv4 or IPv6 is | IPsec tunnels. It applies irrespective of whether IPv4 or IPv6 is | |||
used for either of the inner and outer headers. It applies for | used for either the inner or outer headers. It applies for packets | |||
packets with any destination address type, whether unicast or | with any destination address type, whether unicast or multicast. It | |||
multicast. It applies as the default for all Diffserv per-hop | applies as the default for all Diffserv per-hop behaviours (PHBs), | |||
behaviours (PHBs), unless stated otherwise in the specification of a | unless stated otherwise in the specification of a PHB (but Section 4 | |||
PHB (but Section 4 strongly deprecates such exceptions). It is | strongly deprecates such exceptions). It is intended to be a good | |||
intended to be a good trade off between somewhat conflicting | trade off between somewhat conflicting security, control, and | |||
security, control and management requirements. | management requirements. | |||
[RFC2983] is a comprehensive primer on differentiated services and | [RFC2983] is a comprehensive primer on differentiated services and | |||
tunnels. Given ECN raises similar issues to differentiated services | tunnels. Given ECN raises similar issues to differentiated services | |||
when interacting with tunnels, useful concepts introduced in RFC2983 | when interacting with tunnels, useful concepts introduced in RFC 2983 | |||
are used throughout, with brief recaps of the explanations where | are used throughout, with brief recaps of the explanations where | |||
necessary. | necessary. | |||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
Table 1 recaps the names of the ECN codepoints [RFC3168]. | Table 1 recaps the names of the ECN codepoints [RFC3168]. | |||
+------------------+----------------+---------------------------+ | +------------------+----------------+---------------------------+ | |||
| Binary codepoint | Codepoint name | Meaning | | | Binary codepoint | Codepoint name | Meaning | | |||
+------------------+----------------+---------------------------+ | +------------------+----------------+---------------------------+ | |||
| 00 | Not-ECT | Not ECN-capable transport | | | 00 | Not-ECT | Not ECN-capable transport | | |||
| 01 | ECT(1) | ECN-capable transport | | | 01 | ECT(1) | ECN-capable transport | | |||
| 10 | ECT(0) | ECN-capable transport | | | 10 | ECT(0) | ECN-capable transport | | |||
| 11 | CE | Congestion experienced | | | 11 | CE | Congestion experienced | | |||
+------------------+----------------+---------------------------+ | +------------------+----------------+---------------------------+ | |||
Table 1: Recap of Codepoints of the ECN Field [RFC3168] in the IP | Table 1: Recap of Codepoints of the ECN Field [RFC3168] | |||
Header | in the IP Header | |||
Further terminology used within this document: | Further terminology used within this document: | |||
Encapsulator: The tunnel endpoint function that adds an outer IP | Encapsulator: The tunnel endpoint function that adds an outer IP | |||
header to tunnel a packet (also termed the 'ingress tunnel | header to tunnel a packet (also termed the 'ingress tunnel | |||
endpoint' or just the 'ingress' where the context is clear). | endpoint' or just the 'ingress' where the context is clear). | |||
Decapsulator: The tunnel endpoint function that removes an outer IP | Decapsulator: The tunnel endpoint function that removes an outer IP | |||
header from a tunnelled packet (also termed the 'egress tunnel | header from a tunnelled packet (also termed the 'egress tunnel | |||
endpoint' or just the 'egress' where the context is clear). | endpoint' or just the 'egress' where the context is clear). | |||
skipping to change at page 14, line 37 | skipping to change at page 7, line 16 | |||
outer header to Not-ECT ("00"). | outer header to Not-ECT ("00"). | |||
Resetting ECN: On encapsulation, setting the ECN field of the new | Resetting ECN: On encapsulation, setting the ECN field of the new | |||
outer header to be a copy of the ECN field in the incoming header | outer header to be a copy of the ECN field in the incoming header | |||
except the outer ECN field is set to the ECT(0) codepoint if the | except the outer ECN field is set to the ECT(0) codepoint if the | |||
incoming ECN field is CE. | incoming ECN field is CE. | |||
3. Summary of Pre-Existing RFCs | 3. Summary of Pre-Existing RFCs | |||
This section is informative not normative, as it recaps pre-existing | This section is informative not normative, as it recaps pre-existing | |||
RFCs. Earlier relevant RFCs that were either experimental or | RFCs. Earlier relevant RFCs that were either Experimental or | |||
incomplete with respect to ECN tunnelling (RFC2481, RFC2401 and | incomplete with respect to ECN tunnelling (RFC 2481, RFC 2401, and | |||
RFC2003) are briefly outlined in Appendix A. The question of whether | RFC 2003) are briefly outlined in Appendix A. The question of | |||
tunnel implementations used in the Internet comply with any of these | whether tunnel implementations used in the Internet comply with any | |||
RFCs is not discussed. | of these RFCs is not discussed. | |||
3.1. Encapsulation at Tunnel Ingress | 3.1. Encapsulation at Tunnel Ingress | |||
At the encapsulator, the controversy has been over whether to | At the encapsulator, the controversy has been over whether to | |||
propagate information about congestion experienced on the path so far | propagate information about congestion experienced on the path so far | |||
into the outer header of the tunnel. | into the outer header of the tunnel. | |||
Specifically, RFC3168 says that, if a tunnel fully supports ECN | Specifically, RFC 3168 says that, if a tunnel fully supports ECN | |||
(termed a 'full-functionality' ECN tunnel in [RFC3168]), the | (termed a 'full-functionality' ECN tunnel in [RFC3168]), the | |||
encapsulator must not copy a CE marking from the inner header into | encapsulator must not copy a CE marking from the incoming header into | |||
the outer header that it creates. Instead the encapsulator must set | the outer header that it creates. Instead, the encapsulator must set | |||
the outer header to ECT(0) if the ECN field is marked CE in the | the outer header to ECT(0) if the ECN field is marked CE in the | |||
arriving IP header. We term this 'resetting' a CE codepoint. | arriving IP header. We term this 'resetting' a CE codepoint. | |||
However, the new IPsec architecture in [RFC4301] reverses this rule, | However, the new IPsec architecture in [RFC4301] reverses this rule, | |||
stating that the encapsulator must simply copy the ECN field from the | stating that the encapsulator must simply copy the ECN field from the | |||
incoming header to the outer header. | incoming header to the outer header. | |||
RFC3168 also provided a Limited Functionality mode that turns off ECN | RFC 3168 also provided a Limited Functionality mode that turns off | |||
processing over the scope of the tunnel by setting the outer header | ECN processing over the scope of the tunnel by setting the outer | |||
to Not-ECT ("00"). Then such packets will be dropped to indicate | header to Not-ECT ("00"). Then, such packets will be dropped to | |||
congestion rather than marked with ECN. This is necessary for the | indicate congestion, rather than marked with ECN. This is necessary | |||
ingress to interwork with legacy decapsulators ([RFC2481], [RFC2401] | for the ingress to interwork with legacy decapsulators ([RFC2481], | |||
and [RFC2003]) that do not propagate ECN markings added to the outer | [RFC2401], and [RFC2003]) that do not propagate ECN markings added to | |||
header. Otherwise such legacy decapsulators would throw away | the outer header. Otherwise, such legacy decapsulators would throw | |||
congestion notifications before they reached the transport layer. | away congestion notifications before they reached the transport | |||
layer. | ||||
Neither Limited Functionality mode nor Full Functionality mode are | Neither Limited Functionality mode nor Full Functionality mode are | |||
used by an RFC4301 IPsec encapsulator, which simply copies the | used by an RFC 4301 IPsec encapsulator, which simply copies the | |||
incoming ECN field into the outer header. An earlier key-exchange | incoming ECN field into the outer header. An earlier key-exchange | |||
phase ensures an RFC4301 ingress will not have to interwork with a | phase ensures an RFC 4301 ingress will not have to interwork with a | |||
legacy egress that does not support ECN. | legacy egress that does not support ECN. | |||
These pre-existing behaviours are summarised in Figure 1. | These pre-existing behaviours are summarised in Figure 1. | |||
+-----------------+-----------------------------------------------+ | +-----------------+-----------------------------------------------+ | |||
| Incoming Header | Outgoing Outer Header | | | Incoming Header | Departing Outer Header | | |||
| (also equal to +---------------+---------------+---------------+ | | (also equal to +---------------+---------------+---------------+ | |||
| Outgoing Inner | RFC3168 ECN | RFC3168 ECN | RFC4301 IPsec | | | departing Inner | RFC 3168 ECN | RFC 3168 ECN | RFC 4301 IPsec| | |||
| Header) | Limited | Full | | | | Header) | Limited | Full | | | |||
| | Functionality | Functionality | | | | | Functionality | Functionality | | | |||
+-----------------+---------------+---------------+---------------+ | +-----------------+---------------+---------------+---------------+ | |||
| Not-ECT | Not-ECT | Not-ECT | Not-ECT | | | Not-ECT | Not-ECT | Not-ECT | Not-ECT | | |||
| ECT(0) | Not-ECT | ECT(0) | ECT(0) | | | ECT(0) | Not-ECT | ECT(0) | ECT(0) | | |||
| ECT(1) | Not-ECT | ECT(1) | ECT(1) | | | ECT(1) | Not-ECT | ECT(1) | ECT(1) | | |||
| CE | Not-ECT | ECT(0) | CE | | | CE | Not-ECT | ECT(0) | CE | | |||
+-----------------+---------------+---------------+---------------+ | +-----------------+---------------+---------------+---------------+ | |||
Figure 1: IP in IP Encapsulation: Recap of Pre-existing Behaviours | Figure 1: IP-in-IP Encapsulation: Recap of Pre-Existing Behaviours | |||
3.2. Decapsulation at Tunnel Egress | 3.2. Decapsulation at Tunnel Egress | |||
RFC3168 and RFC4301 specify the decapsulation behaviour summarised in | RFC 3168 and RFC 4301 specify the decapsulation behaviour summarised | |||
Figure 2. The ECN field in the outgoing header is set to the | in Figure 2. The ECN field in the outgoing header is set to the | |||
codepoint at the intersection of the appropriate incoming inner | codepoint at the intersection of the appropriate arriving inner | |||
header (row) and incoming outer header (column). | header (row) and arriving outer header (column). | |||
+---------+------------------------------------------------+ | +---------+------------------------------------------------+ | |||
|Incoming | Incoming Outer Header | | |Arriving | Arriving Outer Header | | |||
| Inner +---------+------------+------------+------------+ | | Inner +---------+------------+------------+------------+ | |||
| Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
RFC3168->| Not-ECT | Not-ECT |Not-ECT |Not-ECT | <drop> | | RFC 3168->| Not-ECT | Not-ECT |Not-ECT |Not-ECT | <drop> | | |||
RFC4301->| Not-ECT | Not-ECT |Not-ECT |Not-ECT |Not-ECT | | RFC 4301->| Not-ECT | Not-ECT |Not-ECT |Not-ECT |Not-ECT | | |||
| ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(0) | CE | | |||
| ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) | ECT(1) | CE | | |||
| CE | CE | CE | CE | CE | | | CE | CE | CE | CE | CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
In pre-existing RFCs, the ECN field in the outgoing header was set to | In pre-existing RFCs, the ECN field in the outgoing header was set to | |||
the codepoint at the intersection of the appropriate incoming inner | the codepoint at the intersection of the appropriate arriving inner | |||
header (row) and incoming outer header (column) , or the packet was | header (row) and arriving outer header (column), or the packet was | |||
dropped where indicated. | dropped where indicated. | |||
Figure 2: IP in IP Decapsulation; Recap of Pre-existing Behaviour | Figure 2: IP in IP Decapsulation; Recap of Pre-Existing Behaviour | |||
The behaviour in the table derives from the logic given in RFC3168 | The behaviour in the table derives from the logic given in RFC 3168 | |||
and RFC4301, briefly recapped as follows: | and RFC 4301, briefly recapped as follows: | |||
o On decapsulation, if the inner ECN field is Not-ECT the outer is | o On decapsulation, if the inner ECN field is Not-ECT the outer is | |||
ignored. RFC3168 (but not RFC4301) also specified that the | ignored. RFC 3168 (but not RFC 4301) also specified that the | |||
decapsulator must drop a packet with a Not-ECT inner and CE in the | decapsulator must drop a packet with a Not-ECT inner and CE in the | |||
outer. | outer. | |||
o In all other cases, if the outer is CE, the outgoing ECN field is | o In all other cases, if the outer is CE, the outgoing ECN field is | |||
set to CE, but otherwise the outer is ignored and the inner is | set to CE; otherwise, the outer is ignored and the inner is used | |||
used for the outgoing ECN field. | for the outgoing ECN field. | |||
Section 9.2.2 of RFC3168 also made it an auditable event for an IPsec | Section 9.2.2 of RFC 3168 also made it an auditable event for an | |||
tunnel "if the ECN Field is changed inappropriately within an IPsec | IPsec tunnel "if the ECN Field is changed inappropriately within an | |||
tunnel...". Inappropriate changes were not specifically enumerated. | IPsec tunnel...". Inappropriate changes were not specifically | |||
RFC4301 did not mention inappropriate ECN changes. | enumerated. RFC 4301 did not mention inappropriate ECN changes. | |||
4. New ECN Tunnelling Rules | 4. New ECN Tunnelling Rules | |||
The standards actions below in Section 4.1 (ingress encapsulation) | The standards actions below in Section 4.1 (ingress encapsulation) | |||
and Section 4.2 (egress decapsulation) define new default ECN tunnel | and Section 4.2 (egress decapsulation) define new default ECN tunnel | |||
processing rules for any IP packet (v4 or v6) with any Diffserv | processing rules for any IP packet (v4 or v6) with any Diffserv | |||
codepoint. | codepoint. | |||
If these defaults do not meet a particular requirement, an alternate | If these defaults do not meet a particular requirement, an alternate | |||
ECN tunnelling scheme can be introduced as part of the definition of | ECN tunnelling scheme can be introduced as part of the definition of | |||
an alternate congestion marking scheme used by a specific Diffserv | an alternate congestion marking scheme used by a specific Diffserv | |||
PHB (see section 5 of [RFC3168] and [RFC4774]). When designing such | PHB (see [RFC4774] and Section 5 of [RFC3168]). When designing such | |||
alternate ECN tunnelling schemes, the principles in Section 7 should | alternate ECN tunnelling schemes, the principles in Section 7 should | |||
be followed. However, alternate ECN tunnelling schemes SHOULD be | be followed. However, alternate ECN tunnelling schemes SHOULD be | |||
avoided whenever possible as the deployment burden of handling | avoided whenever possible as the deployment burden of handling | |||
exceptional PHBs in implementations of all affected tunnels should | exceptional PHBs in implementations of all affected tunnels should | |||
not be underestimated. There is no requirement for a PHB definition | not be underestimated. There is no requirement for a PHB definition | |||
to state anything about ECN tunnelling behaviour if the default | to state anything about ECN tunnelling behaviour if the default | |||
behaviour in the present specification is sufficient. | behaviour in the present specification is sufficient. | |||
4.1. Default Tunnel Ingress Behaviour | 4.1. Default Tunnel Ingress Behaviour | |||
Two modes of encapsulation are defined here; a REQUIRED `normal mode' | Two modes of encapsulation are defined here; a REQUIRED 'normal mode' | |||
and a `compatibility mode', which is for backward compatibility with | and a 'compatibility mode', which is for backward compatibility with | |||
tunnel decapsulators that do not understand ECN. Note that these are | tunnel decapsulators that do not understand ECN. Note that these are | |||
modes of the ingress tunnel endpoint only, not the whole tunnel. | modes of the ingress tunnel endpoint only, not the whole tunnel. | |||
Section 4.3 explains why two modes are necessary and specifies the | Section 4.3 explains why two modes are necessary and specifies the | |||
circumstances in which it is sufficient to solely implement normal | circumstances in which it is sufficient to solely implement normal | |||
mode. | mode. | |||
Whatever the mode, an encapsulator forwards the inner header without | Whatever the mode, an encapsulator forwards the inner header without | |||
changing the ECN field. | changing the ECN field. | |||
In normal mode an encapsulator compliant with this specification MUST | In normal mode, an encapsulator compliant with this specification | |||
construct the outer encapsulating IP header by copying the 2-bit ECN | MUST construct the outer encapsulating IP header by copying the | |||
field of the incoming IP header. In compatibility mode it clears the | two-bit ECN field of the incoming IP header. In compatibility mode, | |||
ECN field in the outer header to the Not-ECT codepoint (the IPv4 | it clears the ECN field in the outer header to the Not-ECT codepoint | |||
header checksum also changes whenever the ECN field is changed). | (the IPv4 header checksum also changes whenever the ECN field is | |||
These rules are tabulated for convenience in Figure 3. | changed). These rules are tabulated for convenience in Figure 3. | |||
+-----------------+-------------------------------+ | +-----------------+-------------------------------+ | |||
| Incoming Header | Outgoing Outer Header | | | Incoming Header | Departing Outer Header | | |||
| (also equal to +---------------+---------------+ | | (also equal to +---------------+---------------+ | |||
| Outgoing Inner | Compatibility | Normal | | | departing Inner | Compatibility | Normal | | |||
| Header) | Mode | Mode | | | Header) | Mode | Mode | | |||
+-----------------+---------------+---------------+ | +-----------------+---------------+---------------+ | |||
| Not-ECT | Not-ECT | Not-ECT | | | Not-ECT | Not-ECT | Not-ECT | | |||
| ECT(0) | Not-ECT | ECT(0) | | | ECT(0) | Not-ECT | ECT(0) | | |||
| ECT(1) | Not-ECT | ECT(1) | | | ECT(1) | Not-ECT | ECT(1) | | |||
| CE | Not-ECT | CE | | | CE | Not-ECT | CE | | |||
+-----------------+---------------+---------------+ | +-----------------+---------------+---------------+ | |||
Figure 3: New IP in IP Encapsulation Behaviours | Figure 3: New IP in IP Encapsulation Behaviours | |||
4.2. Default Tunnel Egress Behaviour | 4.2. Default Tunnel Egress Behaviour | |||
To decapsulate the inner header at the tunnel egress, a compliant | To decapsulate the inner header at the tunnel egress, a compliant | |||
tunnel egress MUST set the outgoing ECN field to the codepoint at the | tunnel egress MUST set the outgoing ECN field to the codepoint at the | |||
intersection of the appropriate incoming inner header (row) and outer | intersection of the appropriate arriving inner header (row) and outer | |||
header (column) in Figure 4 (the IPv4 header checksum also changes | header (column) in Figure 4 (the IPv4 header checksum also changes | |||
whenever the ECN field is changed). There is no need for more than | whenever the ECN field is changed). There is no need for more than | |||
one mode of decapsulation, as these rules cater for all known | one mode of decapsulation, as these rules cater for all known | |||
requirements. | requirements. | |||
+---------+------------------------------------------------+ | +---------+------------------------------------------------+ | |||
|Incoming | Incoming Outer Header | | |Arriving | Arriving Outer Header | | |||
| Inner +---------+------------+------------+------------+ | | Inner +---------+------------+------------+------------+ | |||
| Header | Not-ECT | ECT(0) | ECT(1) | CE | | | Header | Not-ECT | ECT(0) | ECT(1) | CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
| Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)| <drop>(!!!)| | | Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)| <drop>(!!!)| | |||
| ECT(0) | ECT(0) | ECT(0) | ECT(1) | CE | | | ECT(0) | ECT(0) | ECT(0) | ECT(1) | CE | | |||
| ECT(1) | ECT(1) | ECT(1) (!) | ECT(1) | CE | | | ECT(1) | ECT(1) | ECT(1) (!) | ECT(1) | CE | | |||
| CE | CE | CE | CE(!!!)| CE | | | CE | CE | CE | CE(!!!)| CE | | |||
+---------+---------+------------+------------+------------+ | +---------+---------+------------+------------+------------+ | |||
The ECN field in the outgoing header is set to the codepoint at the | The ECN field in the outgoing header is set to the codepoint at the | |||
intersection of the appropriate incoming inner header (row) and | intersection of the appropriate arriving inner header (row) and | |||
incoming outer header (column) , or the packet is dropped where | arriving outer header (column), or the packet is dropped where | |||
indicated. Currently unused combinations are indicated by '(!!!)' or | indicated. Currently unused combinations are indicated by '(!!!)' or | |||
'(!)' | '(!)' | |||
Figure 4: New IP in IP Decapsulation Behaviour | Figure 4: New IP in IP Decapsulation Behaviour | |||
This table for decapsulation behaviour is derived from the following | This table for decapsulation behaviour is derived from the following | |||
logic: | logic: | |||
o If the inner ECN field is Not-ECT the decapsulator MUST NOT | o If the inner ECN field is Not-ECT, the decapsulator MUST NOT | |||
propagate any other ECN codepoint onwards. This is because the | propagate any other ECN codepoint onwards. This is because the | |||
inner Not-ECT marking is set by transports that rely on dropped | inner Not-ECT marking is set by transports that rely on dropped | |||
packets as an indication of congestion and would not understand or | packets as an indication of congestion and would not understand or | |||
respond to any other ECN codepoint [RFC4774]. Specifically: | respond to any other ECN codepoint [RFC4774]. Specifically: | |||
* If the inner ECN field is Not-ECT and the outer ECN field is CE | * If the inner ECN field is Not-ECT and the outer ECN field is | |||
the decapsulator MUST drop the packet. | CE, the decapsulator MUST drop the packet. | |||
* If the inner ECN field is Not-ECT and the outer ECN field is | * If the inner ECN field is Not-ECT and the outer ECN field is | |||
Not-ECT, ECT(0) or ECT(1) the decapsulator MUST forward the | Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the | |||
outgoing packet with the ECN field cleared to Not-ECT. | outgoing packet with the ECN field cleared to Not-ECT. | |||
o In all other cases where the inner supports ECN, the decapsulator | o In all other cases where the inner supports ECN, the decapsulator | |||
MUST set the outgoing ECN field to the more severe marking of the | MUST set the outgoing ECN field to the more severe marking of the | |||
outer and inner ECN fields, where the ranking of severity from | outer and inner ECN fields, where the ranking of severity from | |||
highest to lowest is CE, ECT(1), ECT(0), Not-ECT. This in no way | highest to lowest is CE, ECT(1), ECT(0), Not-ECT. This in no way | |||
precludes cases where ECT(1) and ECT(0) have the same severity; | precludes cases where ECT(1) and ECT(0) have the same severity; | |||
o Certain combinations of inner and outer ECN fields cannot result | o Certain combinations of inner and outer ECN fields cannot result | |||
from any transition in any current or previous ECN tunneling | from any transition in any current or previous ECN tunneling | |||
specification. These currently unused (CU) combinations are | specification. These currently unused (CU) combinations are | |||
indicated in Figure 4 by '(!!!)' or '(!)', where '(!!!)' means the | indicated in Figure 4 by '(!!!)' or '(!)', where '(!!!)' means the | |||
combination is CU and always potentially dangerous, while '(!)' | combination is CU and always potentially dangerous, while '(!)' | |||
means it is CU and possibly dangerous. In these cases, | means it is CU and possibly dangerous. In these cases, | |||
particularly the more dangerous ones, the decapsulator SHOULD log | particularly the more dangerous ones, the decapsulator SHOULD log | |||
the event and MAY also raise an alarm. | the event and MAY also raise an alarm. | |||
Just because the highlighted combinations are currently unused, | Just because the highlighted combinations are currently unused, | |||
does not mean that all the other combinations are always valid. | does not mean that all the other combinations are always valid. | |||
Some are only valid if they have arrived from a particular type of | Some are only valid if they have arrived from a particular type of | |||
legacy ingress, and dangerous otherwise. Therefore an | legacy ingress, and dangerous otherwise. Therefore, an | |||
implementation MAY allow an operator to configure logging and | implementation MAY allow an operator to configure logging and | |||
alarms for such additional header combinations known to be | alarms for such additional header combinations known to be | |||
dangerous or CU for the particular configuration of tunnel | dangerous or CU for the particular configuration of tunnel | |||
endpoints deployed at run-time. | endpoints deployed at run-time. | |||
Alarms SHOULD be rate-limited so that the anomalous combinations | Alarms SHOULD be rate-limited so that the anomalous combinations | |||
will not amplify into a flood of alarm messages. It MUST be | will not amplify into a flood of alarm messages. It MUST be | |||
possible to suppress alarms or logging, e.g. if it becomes | possible to suppress alarms or logging, e.g., if it becomes | |||
apparent that a combination that previously was not used has | apparent that a combination that previously was not used has | |||
started to be used for legitimate purposes such as a new standards | started to be used for legitimate purposes such as a new standards | |||
action. | action. | |||
The above logic allows for ECT(0) and ECT(1) to both represent the | The above logic allows for ECT(0) and ECT(1) to both represent the | |||
same severity of congestion marking (e.g. "not congestion marked"). | same severity of congestion marking (e.g., "not congestion marked"). | |||
But it also allows future schemes to be defined where ECT(1) is a | But it also allows future schemes to be defined where ECT(1) is a | |||
more severe marking than ECT(0), in particular enabling the simplest | more severe marking than ECT(0), in particular, enabling the simplest | |||
possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding] (see | possible encoding for PCN [PCN3in1] (see Section 5.3.2). Treating | |||
Section 5.3.2). Treating ECT(1) as either the same as ECT(0) or as a | ECT(1) as either the same as ECT(0) or as a higher severity level is | |||
higher severity level is explained in the discussion of the ECN nonce | explained in the discussion of the ECN nonce [RFC3540] in Section 8, | |||
[RFC3540] in Section 9, which in turn refers to Appendix D. | which in turn refers to Appendix D. | |||
4.3. Encapsulation Modes | 4.3. Encapsulation Modes | |||
Section 4.1 introduces two encapsulation modes, normal mode and | Section 4.1 introduces two encapsulation modes: normal mode, and | |||
compatibility mode, defining their encapsulation behaviour (i.e. | compatibility mode, defining their encapsulation behaviour (i.e., | |||
header copying or zeroing respectively). Note that these are modes | header copying or zeroing, respectively). Note that these are modes | |||
of the ingress tunnel endpoint only, not the tunnel as a whole. | of the ingress tunnel endpoint only, not the tunnel as a whole. | |||
To comply with this specification, a tunnel ingress MUST at least | To comply with this specification, a tunnel ingress MUST at least | |||
implement `normal mode'. Unless it will never be used with legacy | implement normal mode. Unless it will never be used with legacy | |||
tunnel egress nodes (RFC2003, RFC2401 or RFC2481 or the limited | tunnel egress nodes (RFC 2003, RFC 2401, or RFC 2481 or the limited | |||
functionality mode of RFC3168), an ingress MUST also implement | functionality mode of RFC 3168), an ingress MUST also implement | |||
`compatibility mode' for backward compatibility with tunnel egresses | compatibility mode for backward compatibility with tunnel egresses | |||
that do not propagate explicit congestion notifications [RFC4774]. | that do not propagate explicit congestion notifications [RFC4774]. | |||
We can categorise the way that an ingress tunnel endpoint is paired | We can categorise the way that an ingress tunnel endpoint is paired | |||
with an egress as either static or dynamically discovered: | with an egress as either static or dynamically discovered: | |||
Static: Tunnel endpoints paired together by prior configuration. | Static: Tunnel endpoints paired together by prior configuration. | |||
Some implementations of encapsulator might always be statically | Some implementations of encapsulator might always be statically | |||
deployed, and constrained to never be paired with a legacy | deployed, and constrained to never be paired with a legacy | |||
decapsulator (RFC2003, RFC2401 or RFC2481 or the limited | decapsulator (RFC 2003, RFC 2401 or RFC 2481 or the limited | |||
functionality mode of RFC3168). In such a case, only normal mode | functionality mode of RFC 3168). In such a case, only normal mode | |||
needs to be implemented. | needs to be implemented. | |||
For instance, RFC4301-compatible IPsec tunnel endpoints invariably | For instance, IPsec tunnel endpoints compatible with RFC 4301 | |||
use IKEv2 [RFC4306] for key exchange, which was introduced | invariably use Internet Key Exchange Protocol version 2 (IKEv2) | |||
alongside RFC4301. Therefore both endpoints of an RFC4301 tunnel | [RFC5996] for key exchange, the original specification of which | |||
can be sure that the other end is RFC4301-compatible, because the | was introduced alongside RFC 4301. Therefore, both endpoints of | |||
tunnel is only formed after IKEv2 key management has completed, at | an RFC 4301 tunnel can be sure that the other end is compatible | |||
which point both ends will be RFC4301-compliant by definition. | with RFC 4301, because the tunnel is only formed after IKEv2 key | |||
Therefore an IPsec tunnel ingress does not need compatibility | management has completed, at which point both ends will be | |||
mode, as it will never interact with legacy ECN tunnels. To | compliant with RFC 4301 by definition. Therefore an IPsec tunnel | |||
comply with the present specification, it only needs to implement | ingress does not need compatibility mode, as it will never | |||
the required normal mode, which is identical to the pre-existing | interact with legacy ECN tunnels. To comply with the present | |||
RFC4301 behaviour. | specification, it only needs to implement the required normal | |||
mode, which is identical to the pre-existing RFC 4301 behaviour. | ||||
Dynamic Discovery: Tunnel endpoints paired together by some form of | Dynamic Discovery: Tunnel endpoints paired together by some form of | |||
tunnel endpoint discovery, typically finding an egress on the path | tunnel endpoint discovery, typically finding an egress on the path | |||
taken by the first packet. | taken by the first packet. | |||
This specification does not require or recommend dynamic discovery | This specification does not require or recommend dynamic discovery | |||
and it does not define how dynamic negotiation might be done, but | and it does not define how dynamic negotiation might be done, but | |||
it recognises that proprietary tunnel endpoint discovery protocols | it recognises that proprietary tunnel endpoint discovery protocols | |||
exist. It therefore sets down some constraints on discovery | exist. It therefore sets down some constraints on discovery | |||
protocols to ensure safe interworking. | protocols to ensure safe interworking. | |||
If dynamic tunnel endpoint discovery might pair an ingress with a | If dynamic tunnel endpoint discovery might pair an ingress with a | |||
legacy egress (RFC2003, RFC2401 or RFC2481 or the limited | legacy egress (RFC 2003, RFC 2401, or RFC 2481 or the limited | |||
functionality mode of RFC3168), the ingress MUST implement both | functionality mode of RFC 3168), the ingress MUST implement both | |||
normal and compatibility mode. If the tunnel discovery process is | normal and compatibility mode. If the tunnel discovery process is | |||
arranged to only ever find a tunnel egress that propagates ECN | arranged to only ever find a tunnel egress that propagates ECN | |||
(RFC3168 full functionality mode, RFC4301 or this present | (RFC 3168 full functionality mode, RFC 4301, or this present | |||
specification), then a tunnel ingress can be compliant with the | specification), then a tunnel ingress can be compliant with the | |||
present specification without implementing compatibility mode. | present specification without implementing compatibility mode. | |||
While a compliant tunnel ingress is discovering an egress, it MUST | While a compliant tunnel ingress is discovering an egress, it MUST | |||
send packets in compatibility mode in case the egress it discovers | send packets in compatibility mode in case the egress it discovers | |||
is a legacy egress. If, through the discovery protocol, the | is a legacy egress. If, through the discovery protocol, the | |||
egress indicates that it is compliant with the present | egress indicates that it is compliant with the present | |||
specification, with RFC4301 or with RFC3168 full functionality | specification, with RFC 4301 or with RFC 3168 full functionality | |||
mode, the ingress can switch itself into normal mode. If the | mode, the ingress can switch itself into normal mode. If the | |||
egress denies compliance with any of these or returns an error | egress denies compliance with any of these or returns an error | |||
that implies it does not understand a request to work to any of | that implies it does not understand a request to work to any of | |||
these ECN specifications, the tunnel ingress MUST remain in | these ECN specifications, the tunnel ingress MUST remain in | |||
compatibility mode. | compatibility mode. | |||
If an ingress claims compliance with this specification it MUST NOT | If an ingress claims compliance with this specification, it MUST NOT | |||
permanently disable ECN processing across the tunnel (i.e. only using | permanently disable ECN processing across the tunnel (i.e., only | |||
compatibility mode). It is true that such a tunnel ingress is at | using compatibility mode). It is true that such a tunnel ingress is | |||
least safe with the ECN behaviour of any egress it may encounter, but | at least safe with the ECN behaviour of any egress it may encounter, | |||
it does not meet the central aim of this specification: introducing | but it does not meet the central aim of this specification: | |||
ECN support to tunnels. | introducing ECN support to tunnels. | |||
Instead, if the ingress knows that the egress does support | Instead, if the ingress knows that the egress does support | |||
propagation of ECN (full functionality mode of RFC3168 or RFC4301 or | propagation of ECN (full functionality mode of RFC 3168 or RFC 4301 | |||
the present specification), it SHOULD use normal mode, in order to | or the present specification), it SHOULD use normal mode, in order to | |||
support ECN where possible. Note that this section started by saying | support ECN where possible. Note that this section started by saying | |||
an ingress "MUST implement "normal mode, while it has just said an | an ingress "MUST implement" normal mode, while it has just said an | |||
ingress "SHOULD use" normal mode. This distinction is deliberate, to | ingress "SHOULD use" normal mode. This distinction is deliberate, to | |||
allow the mode to be turned off in exceptional circumstances but to | allow the mode to be turned off in exceptional circumstances but to | |||
ensure all implementations make normal mode available. | ensure all implementations make normal mode available. | |||
Implementation note: If a compliant node is the ingress for multiple | Implementation note: If a compliant node is the ingress for multiple | |||
tunnels, a mode setting will need to be stored for each tunnel | tunnels, a mode setting will need to be stored for each tunnel | |||
ingress. However, if a node is the egress for multiple tunnels, | ingress. However, if a node is the egress for multiple tunnels, | |||
none of the tunnels will need to store a mode setting, because a | none of the tunnels will need to store a mode setting, because a | |||
compliant egress only needs one mode. | compliant egress only needs one mode. | |||
skipping to change at page 21, line 40 | skipping to change at page 14, line 42 | |||
A compliant decapsulator only needs one mode of operation. However, | A compliant decapsulator only needs one mode of operation. However, | |||
if a compliant egress is implemented to be dynamically discoverable, | if a compliant egress is implemented to be dynamically discoverable, | |||
it may need to respond to discovery requests from various types of | it may need to respond to discovery requests from various types of | |||
legacy tunnel ingress. This specification does not define how | legacy tunnel ingress. This specification does not define how | |||
dynamic negotiation might be done by (proprietary) discovery | dynamic negotiation might be done by (proprietary) discovery | |||
protocols, but it sets down some constraints to ensure safe | protocols, but it sets down some constraints to ensure safe | |||
interworking. | interworking. | |||
Through the discovery protocol, a tunnel ingress compliant with the | Through the discovery protocol, a tunnel ingress compliant with the | |||
present specification might ask if the egress is compliant with the | present specification might ask if the egress is compliant with the | |||
present specification, with RFC4301 or with RFC3168 full | present specification, with RFC 4301 or with RFC 3168 full | |||
functionality mode. Or an RFC3168 tunnel ingress might try to | functionality mode. Or an RFC 3168 tunnel ingress might try to | |||
negotiate to use limited functionality or full functionality mode | negotiate to use limited functionality or full functionality mode | |||
[RFC3168]. In all these cases, a decapsulating tunnel egress | [RFC3168]. In all these cases, a decapsulating tunnel egress | |||
compliant with this specification MUST agree to any of these | compliant with this specification MUST agree to any of these | |||
requests, since it will behave identically in all these cases. | requests, since it will behave identically in all these cases. | |||
If no ECN-related mode is requested, a compliant tunnel egress MUST | If no ECN-related mode is requested, a compliant tunnel egress MUST | |||
continue without raising any error or warning, because its egress | continue without raising any error or warning, because its egress | |||
behaviour is compatible with all the legacy ingress behaviours that | behaviour is compatible with all the legacy ingress behaviours that | |||
do not negotiate capabilities. | do not negotiate capabilities. | |||
A compliant tunnel egress SHOULD raise a warning alarm about any | A compliant tunnel egress SHOULD raise a warning alarm about any | |||
requests to enter modes it does not recognise but, for 'forward | requests to enter modes it does not recognise but, for 'forward | |||
compatibility' with standards actions possibly defined after it was | compatibility' with standards actions possibly defined after it was | |||
implemented, it SHOULD continue operating. | implemented, it SHOULD continue operating. | |||
5. Updates to Earlier RFCs | 5. Updates to Earlier RFCs | |||
5.1. Changes to RFC4301 ECN processing | 5.1. Changes to RFC 4301 ECN Processing | |||
Ingress: An RFC4301 IPsec encapsulator is not changed at all by the | Ingress: An RFC 4301 IPsec encapsulator is not changed at all by the | |||
present specification. It uses the normal mode of the present | present specification. It uses the normal mode of the present | |||
specification, which defines packet encapsulation identically to | specification, which defines packet encapsulation identically to | |||
RFC4301. | RFC 4301. | |||
Egress: An RFC4301 egress will need to be updated to the new | Egress: An RFC 4301 egress will need to be updated to the new | |||
decapsulation behaviour in Figure 4, in order to comply with the | decapsulation behaviour in Figure 4, in order to comply with the | |||
present specification. However, the changes are backward | present specification. However, the changes are backward | |||
compatible; combinations of inner and outer that result from any | compatible; combinations of inner and outer that result from any | |||
protocol defined in the RFC series so far are unaffected. Only | protocol defined in the RFC series so far are unaffected. Only | |||
combinations that have never been used have been changed, | combinations that have never been used have been changed, | |||
effectively adding new behaviours to RFC4301 decapsulation without | effectively adding new behaviours to RFC 4301 decapsulation | |||
altering existing behaviours. The following specific updates to | without altering existing behaviours. The following specific | |||
section 5.1.2 of RFC4301 have been made: | updates to Section 5.1.2 of RFC 4301 have been made: | |||
* The outer, not the inner, is propagated when the outer is | * The outer, not the inner, is propagated when the outer is | |||
ECT(1) and the inner is ECT(0); | ECT(1) and the inner is ECT(0); | |||
* A packet with Not-ECT in the inner and an outer of CE is | * A packet with Not-ECT in the inner and an outer of CE is | |||
dropped rather than forwarded as Not-ECT; | dropped rather than forwarded as Not-ECT; | |||
* Certain combinations of inner and outer ECN field have been | * Certain combinations of inner and outer ECN field have been | |||
identified as currently unused. These can trigger logging | identified as currently unused. These can trigger logging | |||
and/or raise alarms. | and/or raise alarms. | |||
Modes: RFC4301 tunnel endpoints do not need modes and are not | Modes: RFC 4301 tunnel endpoints do not need modes and are not | |||
updated by the modes in the present specification. Effectively an | updated by the modes in the present specification. Effectively, | |||
RFC4301 IPsec ingress solely uses the REQUIRED normal mode of | an RFC 4301 IPsec ingress solely uses the REQUIRED normal mode of | |||
encapsulation, which is unchanged from RFC4301 encapsulation. It | encapsulation, which is unchanged from RFC 4301 encapsulation. It | |||
will never need the OPTIONAL compatibility mode as explained in | will never need the OPTIONAL compatibility mode as explained in | |||
Section 4.3. | Section 4.3. | |||
5.2. Changes to RFC3168 ECN processing | 5.2. Changes to RFC 3168 ECN Processing | |||
Ingress: On encapsulation, the new rule in Figure 3 that a normal | Ingress: On encapsulation, the new rule in Figure 3 that a normal | |||
mode tunnel ingress copies any ECN field into the outer header | mode tunnel ingress copies any ECN field into the outer header | |||
updates the full functionality behaviour of an RFC3168 ingress | updates the full functionality behaviour of an RFC 3168 ingress | |||
[RFC3168; section 9.1.1]. Nonetheless, the new compatibility mode | (Section 9.1.1 of [RFC3168]). Nonetheless, the new compatibility | |||
encapsulates packets identically to the limited functionality mode | mode encapsulates packets identically to the limited functionality | |||
of an RFC3168 ingress. | mode of an RFC 3168 ingress. | |||
Egress: An RFC3168 egress will need to be updated to the new | Egress: An RFC 3168 egress will need to be updated to the new | |||
decapsulation behaviour in Figure 4, in order to comply with the | decapsulation behaviour in Figure 4, in order to comply with the | |||
present specification. However, the changes are backward | present specification. However, the changes are backward | |||
compatible; combinations of inner and outer that result from any | compatible; combinations of inner and outer that result from any | |||
protocol defined in the RFC series so far are unaffected. Only | protocol defined in the RFC series so far are unaffected. Only | |||
combinations that have never been used have been changed, | combinations that have never been used have been changed, | |||
effectively adding new behaviours to RFC3168 decapsulation without | effectively adding new behaviours to RFC 3168 decapsulation | |||
altering existing behaviours. The following specific updates to | without altering existing behaviours. The following specific | |||
section 9.1.1 of RFC3168 have been made: | updates to Section 9.1.1 of RFC 3168 have been made: | |||
* The outer, not the inner, is propagated when the outer is | * The outer, not the inner, is propagated when the outer is | |||
ECT(1) and the inner is ECT(0); | ECT(1) and the inner is ECT(0); | |||
* Certain combinations of inner and outer ECN field have been | * Certain combinations of inner and outer ECN field have been | |||
identified as currently unused. These can trigger logging | identified as currently unused. These can trigger logging | |||
and/or raise alarms. | and/or raise alarms. | |||
Modes: An RFC3168 ingress will need to be updated if it is to comply | Modes: An RFC 3168 ingress will need to be updated if it is to | |||
with the present specification, whether or not it implemented the | comply with the present specification, whether or not it | |||
optional full functionality mode of section 9.1.1 of RFC3168. | implemented the optional full functionality mode of Section 9.1.1 | |||
of RFC 3168. | ||||
Section 9.1 of RFC3168 defined a (required) limited functionality | Section 9.1 of RFC 3168 defined a (required) limited functionality | |||
mode and an (optional) full functionality mode for a tunnel. In | mode and an (optional) full functionality mode for a tunnel. In | |||
RFC3168, modes applied to both ends of the tunnel, while in the | RFC 3168, modes applied to both ends of the tunnel, while in the | |||
present specification, modes are only used at the ingress--a | present specification, modes are only used at the ingress -- a | |||
single egress behaviour covers all cases. | single egress behaviour covers all cases. | |||
The normal mode of encapsulation is an update to the encapsulation | The normal mode of encapsulation is an update to the encapsulation | |||
behaviour of the full functionality mode of an RFC3168 ingress. | behaviour of the full functionality mode of an RFC 3168 ingress. | |||
The compatibility mode of encapsulation is identical to the | The compatibility mode of encapsulation is identical to the | |||
encapsulation behaviour of the limited functionality mode of an | encapsulation behaviour of the limited functionality mode of an | |||
RFC3168 ingress, except it is not always obligatory. | RFC 3168 ingress, except it is not always obligatory. | |||
The constraints on how tunnel discovery protocols set modes in | The constraints on how tunnel discovery protocols set modes in | |||
Section 4.3 and Section 4.4 are an update to RFC3168, but they are | Sections 4.3 and 4.4 are an update to RFC 3168, but they are | |||
unlikely to require code changes as they document existing safe | unlikely to require code changes as they document existing safe | |||
practice. | practice. | |||
5.3. Motivation for Changes | 5.3. Motivation for Changes | |||
An overriding goal is to ensure the same ECN signals can mean the | An overriding goal is to ensure the same ECN signals can mean the | |||
same thing whatever tunnels happen to encapsulate an IP packet flow. | same thing whatever tunnels happen to encapsulate an IP packet flow. | |||
This removes gratuitous inconsistency, which otherwise constrains the | This removes gratuitous inconsistency, which otherwise constrains the | |||
available design space and makes it harder to design networks and new | available design space and makes it harder to design networks and new | |||
protocols that work predictably. | protocols that work predictably. | |||
5.3.1. Motivation for Changing Encapsulation | 5.3.1. Motivation for Changing Encapsulation | |||
The normal mode in Section 4 updates RFC3168 to make all IP in IP | The normal mode in Section 4 updates RFC 3168 to make all IP-in-IP | |||
encapsulation of the ECN field consistent--consistent with the way | encapsulation of the ECN field consistent -- consistent with the way | |||
both RFC4301 IPsec [RFC4301] and IP in MPLS or MPLS in MPLS | both RFC 4301 IPsec [RFC4301] and IP-in-MPLS or MPLS-in-MPLS | |||
encapsulation [RFC5129] construct the ECN field. | encapsulation [RFC5129] construct the ECN field. | |||
Compatibility mode has also been defined so that a non-RFC4301 | Compatibility mode has also been defined so that an ingress compliant | |||
ingress can still switch to using drop across a tunnel for backwards | with a version of IPsec prior to RFC 4301 can still switch to using | |||
compatibility with legacy decapsulators that do not propagate ECN | drop across a tunnel for backward compatibility with legacy | |||
correctly. | decapsulators that do not propagate ECN. | |||
The trigger that motivated this update to RFC3168 encapsulation was a | The trigger that motivated this update to RFC 3168 encapsulation was | |||
standards track proposal for pre-congestion notification (PCN | a Standards-Track proposal for pre-congestion notification (PCN | |||
[RFC5670]). PCN excess rate marking only works correctly if the ECN | [RFC5670]). PCN excess-traffic-marking only works correctly if the | |||
field is copied on encapsulation (as in RFC4301 and RFC5129); it does | ECN field is copied on encapsulation (as in RFC 4301 and RFC 5129); | |||
not work if ECN is reset (as in RFC3168). This is because PCN excess | it does not work if ECN is reset (as in RFC 3168). This is because | |||
rate marking depends on the outer header revealing any congestion | PCN excess-traffic-marking depends on the outer header revealing any | |||
experienced so far on the whole path, not just since the last tunnel | congestion experienced so far on the whole path, not just since the | |||
ingress. | last tunnel ingress. | |||
PCN allows a network operator to add flow admission and termination | PCN allows a network operator to add flow admission and termination | |||
for inelastic traffic at the edges of a Diffserv domain, but without | for inelastic traffic at the edges of a Diffserv domain, but without | |||
any per-flow mechanisms in the interior and without the generous | any per-flow mechanisms in the interior and without the generous | |||
provisioning typical of Diffserv, aiming to significantly reduce | provisioning typical of Diffserv, aiming to significantly reduce | |||
costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP | costs. The PCN architecture [RFC5559] states that RFC 3168 IP-in-IP | |||
tunnelling of the ECN field cannot be used for any tunnel ingress in | tunnelling of the ECN field cannot be used for any tunnel ingress in | |||
a PCN domain. Prior to the present specification, this left a stark | a PCN domain. Prior to the present specification, this left a stark | |||
choice between not being able to use PCN for inelastic traffic | choice between not being able to use PCN for inelastic traffic | |||
control or not being able to use the many tunnels already deployed | control or not being able to use the many tunnels already deployed | |||
for Mobile IP, VPNs and so forth. | for Mobile IP, VPNs, and so forth. | |||
The present specification provides a clean solution to this problem, | The present specification provides a clean solution to this problem, | |||
so that network operators who want to use both PCN and tunnels can | so that network operators who want to use both PCN and tunnels can | |||
specify that every tunnel ingress in a PCN region must comply with | specify that every tunnel ingress in a PCN region must comply with | |||
this latest specification. | this latest specification. | |||
Rather than allow tunnel specifications to fragment further into one | Rather than allow tunnel specifications to fragment further into one | |||
for PCN, one for IPsec and one for other tunnels, the opportunity has | for PCN, one for IPsec, and one for other tunnels, the opportunity | |||
been taken to consolidate the diverging specifications back into a | has been taken to consolidate the diverging specifications back into | |||
single tunnelling behaviour. Resetting ECN was originally motivated | a single tunnelling behaviour. Resetting ECN was originally | |||
by a covert channel concern that has been deliberately set aside in | motivated by a covert channel concern that has been deliberately set | |||
RFC4301 IPsec. Therefore the reset behaviour of RFC3168 is an | aside in RFC 4301 IPsec. Therefore, the reset behaviour of RFC 3168 | |||
anomaly that we do not need to keep. Copying ECN on encapsulation is | is an anomaly that we do not need to keep. Copying ECN on | |||
anyway simpler than resetting. So, as more tunnel endpoints comply | encapsulation is simpler than resetting. So, as more tunnel | |||
with this single consistent specification, encapsulation will be | endpoints comply with this single consistent specification, | |||
simpler as well as more predictable. | encapsulation will be simpler as well as more predictable. | |||
Appendix B assesses whether copying rather than resetting CE on | Appendix B assesses whether copying rather than resetting CE on | |||
ingress will cause any unintended side-effects, from the three | ingress will cause any unintended side effects, from the three | |||
perspectives of security, control and management. In summary this | perspectives of security, control, and management. In summary, this | |||
analysis finds that: | analysis finds that: | |||
o From the control perspective either copying or resetting works for | o From the control perspective, either copying or resetting works | |||
existing arrangements, but copying has more potential for | for existing arrangements, but copying has more potential for | |||
simplifying control and resetting breaks at least one proposal | simplifying control and resetting breaks at least one proposal | |||
already on the standards track. | that is already on the Standards Track. | |||
o From the management and monitoring perspective copying is | o From the management and monitoring perspective, copying is | |||
preferable. | preferable. | |||
o From the traffic security perspective (enforcing congestion | o From the traffic security perspective (enforcing congestion | |||
control, mitigating denial of service etc) copying is preferable. | control, mitigating denial of service, etc.), copying is | |||
preferable. | ||||
o From the information security perspective resetting is preferable, | o From the information security perspective, resetting is | |||
but the IETF Security Area now considers copying acceptable given | preferable, but the IETF Security Area now considers copying | |||
the bandwidth of a 2-bit covert channel can be managed. | acceptable given the bandwidth of a two-bit covert channel can be | |||
managed. | ||||
Therefore there are two points against resetting CE on ingress while | Therefore, there are two points against resetting CE on ingress while | |||
copying CE causes no significant harm. | copying CE causes no significant harm. | |||
5.3.2. Motivation for Changing Decapsulation | 5.3.2. Motivation for Changing Decapsulation | |||
The specification for decapsulation in Section 4 fixes three problems | The specification for decapsulation in Section 4 fixes three problems | |||
with the pre-existing behaviours of both RFC3168 and RFC4301: | with the pre-existing behaviours found in both RFC 3168 and RFC 4301: | |||
1. The pre-existing rules prevented the introduction of alternate | 1. The pre-existing rules prevented the introduction of alternate | |||
ECN semantics to signal more than one severity level of | ECN semantics to signal more than one severity level of | |||
congestion [RFC4774], [RFC5559]. The four states of the 2-bit | congestion [RFC4774], [RFC5559]. The four states of the two-bit | |||
ECN field provide room for signalling two severity levels in | ECN field provide room for signalling two severity levels in | |||
addition to not-congested and not-ECN-capable states. But, the | addition to not-congested and not-ECN-capable states. But, the | |||
pre-existing rules assumed that two of the states (ECT(0) and | pre-existing rules assumed that two of the states (ECT(0) and | |||
ECT(1)) are always equivalent. This unnecessarily restricts the | ECT(1)) are always equivalent. This unnecessarily restricts the | |||
use of one of four codepoints (half a bit) in the IP (v4 & v6) | use of one of four codepoints (half a bit) in the IP (v4 and v6) | |||
header. The new rules are designed to work in either case; | header. The new rules are designed to work in either case; | |||
whether ECT(1) is more severe than or equivalent to ECT(0). | whether ECT(1) is more severe than or equivalent to ECT(0). | |||
As explained in Appendix B.1, the original reason for not | As explained in Appendix B.1, the original reason for not | |||
forwarding the outer ECT codepoints was to limit the covert | forwarding the outer ECT codepoints was to limit the covert | |||
channel across a decapsulator to 1 bit per packet. However, now | channel across a decapsulator to 1 bit per packet. However, now | |||
that the IETF Security Area has deemed that a 2-bit covert | that the IETF Security Area has deemed that a two-bit covert | |||
channel through an encapsulator is a manageable risk, the same | channel through an encapsulator is a manageable risk, the same | |||
should be true for a decapsulator. | should be true for a decapsulator. | |||
As well as being useful for general future-proofing, this problem | As well as being useful for general future-proofing, this problem | |||
is immediately pressing for standardisation of pre-congestion | is immediately pressing for standardisation of pre-congestion | |||
notification (PCN), which uses two severity levels of congestion. | notification (PCN), which uses two severity levels of congestion. | |||
If a congested queue used ECT(1) in the outer header to signal | If a congested queue used ECT(1) in the outer header to signal | |||
more severe congestion than ECT(0), the pre-existing | more severe congestion than ECT(0), the pre-existing | |||
decapsulation rules would have thrown away this congestion | decapsulation rules would have thrown away this congestion | |||
signal, preventing tunnelled traffic from ever knowing that it | signal, preventing tunnelled traffic from ever knowing that it | |||
should reduce its load. | should reduce its load. | |||
Before the present specification was written, the PCN working | Before the present specification was written, the PCN working | |||
group had to consider a number of wasteful or convoluted work- | group had to consider a number of wasteful or convoluted work- | |||
rounds to this problem. Without wishing to disparage the | rounds to this problem. Without wishing to disparage the | |||
ingenuity of these work-rounds, none were chosen for the | ingenuity of these work-rounds, none were chosen for the | |||
standards track because they were either somewhat wasteful, | Standards Track because they were either somewhat wasteful, | |||
imprecise or complicated. Instead a baseline PCN encoding was | imprecise, or complicated. Instead, a baseline PCN encoding was | |||
specified [RFC5696] that supported only one severity level of | specified [RFC5696] that supported only one severity level of | |||
congestion but allowed space for these work-rounds as | congestion but allowed space for these work-rounds as | |||
experimental extensions. | experimental extensions. | |||
But by far the simplest approach is that taken by the current | By far the simplest approach is that taken by the current | |||
specification: just to remove the covert channel blockages from | specification: just to remove the covert channel blockages from | |||
tunnelling behaviour--now deemed unnecessary anyway. Then | tunnelling behaviour -- now deemed unnecessary anyway. Then, | |||
network operators that want to support two congestion severity- | network operators that want to support two congestion severity | |||
levels for PCN can specify that every tunnel egress in a PCN | levels for PCN can specify that every tunnel egress in a PCN | |||
region must comply with this latest specification. Having taken | region must comply with this latest specification. Having taken | |||
this step, the simplest possible encoding for PCN with two | this step, the simplest possible encoding for PCN with two | |||
severity levels of congestion [I-D.ietf-pcn-3-in-1-encoding] can | severity levels of congestion [PCN3in1] can be used. | |||
be used. | ||||
Not only does this make two congestion severity-levels available | Not only does this make two congestion severity levels available | |||
for PCN, but also for other potential uses of the extra ECN | for PCN, but also for other potential uses of the extra ECN | |||
codepoint (e.g. [VCP]). | codepoint (e.g., [VCP]). | |||
2. Cases are documented where a middlebox (e.g. a firewall) drops | 2. Cases are documented where a middlebox (e.g., a firewall) drops | |||
packets with header values that were currently unused (CU) when | packets with header values that were currently unused (CU) when | |||
the box was deployed, often on the grounds that anything | the box was deployed, often on the grounds that anything | |||
unexpected might be an attack. This tends to bar future use of | unexpected might be an attack. This tends to bar future use of | |||
CU values. The new decapsulation rules specify optional logging | CU values. The new decapsulation rules specify optional logging | |||
and/or alarms for specific combinations of inner and outer header | and/or alarms for specific combinations of inner and outer | |||
that are currently unused. The aim is to give implementers a | headers that are currently unused. The aim is to give | |||
recourse other than drop if they are concerned about the security | implementers a recourse other than drop if they are concerned | |||
of CU values. It recognises legitimate security concerns about | about the security of CU values. It recognises legitimate | |||
CU values but still eases their future use. If the alarms are | security concerns about CU values, but still eases their future | |||
interpreted as an attack (e.g. by a management system) the | use. If the alarms are interpreted as an attack (e.g., by a | |||
offending packets can be dropped. But alarms can be turned off | management system) the offending packets can be dropped. | |||
if these combinations come into regular use (e.g. through a | However, alarms can be turned off if these combinations come into | |||
future standards action). | regular use (e.g., through a future standards action). | |||
3. While reviewing currently unused combinations of inner and outer, | 3. While reviewing currently unused combinations of inner and outer | |||
the opportunity was taken to define a single consistent behaviour | headers, the opportunity was taken to define a single consistent | |||
for the three cases with a Not-ECT inner header but a different | behaviour for the three cases with a Not-ECT inner header but a | |||
outer. RFC3168 and RFC4301 had diverged in this respect and even | different outer. RFC 3168 and RFC 4301 had diverged in this | |||
their common behaviours had never been justified. | respect and even their common behaviours had never been | |||
justified. | ||||
None of these combinations should result from Internet protocols | None of these combinations should result from Internet protocols | |||
in the RFC series, but future standards actions might put any or | in the RFC series, but future standards actions might put any or | |||
all of them to good use. Therefore it was decided that a | all of them to good use. Therefore, it was decided that a | |||
decapsulator must forward a Not-ECT inner unchanged when the | decapsulator must forward a Not-ECT inner header unchanged when | |||
arriving outer is ECT(0) or ECT(1). But for safety it must drop | the arriving outer header is ECT(0) or ECT(1). For safety, it | |||
a combination of Not-ECT inner and CE outer. Then, if some | must drop a combination of Not-ECT inner and CE outer headers. | |||
unfortunate misconfiguration resulted in a congested router | Then, if some unfortunate misconfiguration resulted in a | |||
marking CE on a packet that was originally Not-ECT, drop would be | congested router marking CE on a packet that was originally | |||
the only appropriate signal for the egress to propagate--the only | Not-ECT, drop would be the only appropriate signal for the egress | |||
signal a non-ECN-capable transport (Not-ECT) would understand. | to propagate -- the only signal a non-ECN-capable transport | |||
(Not-ECT) would understand. | ||||
It may seem contradictory that the same argument has not been | It may seem contradictory that the same argument has not been | |||
applied to the ECT(1) codepoint, given it is being proposed as an | applied to the ECT(1) codepoint, given it is being proposed as an | |||
intermediate level of congestion in a scheme progressing through | intermediate level of congestion in a scheme progressing through | |||
the IETF [I-D.ietf-pcn-3-in-1-encoding]. Instead, a decapsulator | the IETF [PCN3in1]. Instead, a decapsulator must forward a | |||
must forward a Not-ECT inner unchanged when its outer is ECT(1). | Not-ECT inner unchanged when its outer is ECT(1). The rationale | |||
The rationale for not dropping this CU combination is to ensure | for not dropping this CU combination is to ensure it will be | |||
it will be usable if needed in the future. If any | usable if needed in the future. If any misconfiguration led to | |||
misconfiguration led to ECT(1) congestion signals with a Not-ECT | ECT(1) congestion signals with a Not-ECT inner, it would not be | |||
inner, it would not be disastrous for the tunnel egress to | disastrous for the tunnel egress to suppress them, because the | |||
suppress them, because the congestion should then escalate to CE | congestion should then escalate to CE marking, which the egress | |||
marking, which the egress would drop, thus at least preventing | would drop, thus at least preventing congestion collapse. | |||
congestion collapse. | ||||
Problems 2 & 3 alone would not warrant a change to decapsulation, but | Problems 2 and 3 alone would not warrant a change to decapsulation, | |||
it was decided they are worth fixing and making consistent at the | but it was decided they are worth fixing and making consistent at the | |||
same time as decapsulation code is changed to fix problem 1 (two | same time as decapsulation code is changed to fix problem 1 (two | |||
congestion severity-levels). | congestion severity levels). | |||
6. Backward Compatibility | 6. Backward Compatibility | |||
A tunnel endpoint compliant with the present specification is | A tunnel endpoint compliant with the present specification is | |||
backward compatible when paired with any tunnel endpoint compliant | backward compatible when paired with any tunnel endpoint compliant | |||
with any previous tunnelling RFC, whether RFC4301, RFC3168 (see | with any previous tunnelling RFC, whether RFC 4301, RFC 3168 (see | |||
Section 3) or the earlier RFCs summarised in Appendix A (RFC2481, | Section 3), or the earlier RFCs summarised in Appendix A (RFC 2481, | |||
RFC2401 and RFC2003). Each case is enumerated below. | RFC 2401, and RFC 2003). Each case is enumerated below. | |||
6.1. Non-Issues Updating Decapsulation | 6.1. Non-Issues Updating Decapsulation | |||
At the egress, this specification only augments the per-packet | At the egress, this specification only augments the per-packet | |||
calculation of the ECN field (RFC3168 and RFC4301) for combinations | calculation of the ECN field (RFC 3168 and RFC 4301) for combinations | |||
of inner and outer headers that have so far not been used in any IETF | of inner and outer headers that have so far not been used in any IETF | |||
protocols. | protocols. | |||
Therefore, all other things being equal, if an RFC4301 IPsec egress | Therefore, all other things being equal, if an RFC 4301 IPsec egress | |||
is updated to comply with the new rules, it will still interwork with | is updated to comply with the new rules, it will still interwork with | |||
any RFC4301 compliant ingress and the packet outputs will be | any ingress compliant with RFC 4301 and the packet outputs will be | |||
identical to those it would have output before (fully backward | identical to those it would have output before (fully backward | |||
compatible). | compatible). | |||
And, all other things being equal, if an RFC3168 egress is updated to | And, all other things being equal, if an RFC 3168 egress is updated | |||
comply with the same new rules, it will still interwork with any | to comply with the same new rules, it will still interwork with any | |||
ingress complying with any previous specification (both modes of | ingress complying with any previous specification (both modes of RFC | |||
RFC3168, both modes of RFC2481, RFC2401 and RFC2003) and the packet | 3168, both modes of RFC 2481, RFC 2401, and RFC 2003) and the packet | |||
outputs will be identical to those it would have output before (fully | outputs will be identical to those it would have output before (fully | |||
backward compatible). | backward compatible). | |||
A compliant tunnel egress merely needs to implement the one behaviour | A compliant tunnel egress merely needs to implement the one behaviour | |||
in Section 4 with no additional mode or option configuration at the | in Section 4 with no additional mode or option configuration at the | |||
ingress or egress nor any additional negotiation with the ingress. | ingress or egress nor any additional negotiation with the ingress. | |||
The new decapsulation rules have been defined in such a way that | The new decapsulation rules have been defined in such a way that | |||
congestion control will still work safely if any of the earlier | congestion control will still work safely if any of the earlier | |||
versions of ECN processing are used unilaterally at the encapsulating | versions of ECN processing are used unilaterally at the encapsulating | |||
ingress of the tunnel (any of RFC2003, RFC2401, either mode of | ingress of the tunnel (any of RFC 2003, RFC 2401, either mode of RFC | |||
RFC2481, either mode of RFC3168, RFC4301 and this present | 2481, either mode of RFC 3168, RFC 4301, and this present | |||
specification). | specification). | |||
6.2. Non-Update of RFC4301 IPsec Encapsulation | 6.2. Non-Update of RFC 4301 IPsec Encapsulation | |||
An RFC4301 IPsec ingress can comply with this new specification | An RFC 4301 IPsec ingress can comply with this new specification | |||
without any update and it has no need for any new modes, options or | without any update and it has no need for any new modes, options, or | |||
configuration. So, all other things being equal, it will continue to | configuration. So, all other things being equal, it will continue to | |||
interwork identically with any egress it worked with before (fully | interwork identically with any egress it worked with before (fully | |||
backward compatible). | backward compatible). | |||
6.3. Update to RFC3168 Encapsulation | 6.3. Update to RFC 3168 Encapsulation | |||
The encapsulation behaviour of the new normal mode copies the ECN | The encapsulation behaviour of the new normal mode copies the ECN | |||
field whereas an RFC3168 ingress in full functionality mode reset it. | field, whereas an RFC 3168 ingress in full functionality mode reset | |||
However, all other things being equal, if an RFC3168 ingress is | it. However, all other things being equal, if an RFC 3168 ingress is | |||
updated to the present specification, the outgoing packets from any | updated to the present specification, the outgoing packets from any | |||
tunnel egress will still be unchanged. This is because all variants | tunnel egress will still be unchanged. This is because all variants | |||
of tunnelling at either end (RFC4301, both modes of RFC3168, both | of tunnelling at either end (RFC 4301, both modes of RFC 3168, both | |||
modes of RFC2481, RFC2401, RFC2003 and the present specification) | modes of RFC 2481, RFC 2401, RFC 2003, and the present specification) | |||
have always propagated an incoming CE marking through the inner | have always propagated an incoming CE marking through the inner | |||
header and onward into the outgoing header, whether the outer header | header and onward into the outgoing header; whether the outer header | |||
is reset or copied. Therefore, If the tunnel is considered as a | is reset or copied. Therefore, if the tunnel is considered a black | |||
black box, the packets output from any egress will be identical with | box, the packets output from any egress will be identical with or | |||
or without an update to the ingress. Nonetheless, if packets are | without an update to the ingress. Nonetheless, if packets are | |||
observed within the black box (between the tunnel endpoints), CE | observed within the black box (between the tunnel endpoints), CE | |||
markings copied by the updated ingress will be visible within the | markings copied by the updated ingress will be visible within the | |||
black box, whereas they would not have been before. Therefore, the | black box, whereas they would not have been before. Therefore, the | |||
update to encapsulation can be termed 'black-box backwards | update to encapsulation can be termed 'black-box backward compatible' | |||
compatible' (i.e. identical unless you look inside the tunnel). | (i.e., identical unless you look inside the tunnel). | |||
This specification introduces no new backward compatibility issues | This specification introduces no new backward compatibility issues | |||
when a compliant ingress talks with a legacy egress, but it has to | when a compliant ingress talks with a legacy egress, but it has to | |||
provide similar safeguards to those already defined in RFC3168. | provide similar safeguards to those already defined in RFC 3168. RFC | |||
RFC3168 laid down rules to ensure that an RFC3168 ingress turns off | 3168 laid down rules to ensure that an RFC 3168 ingress turns off ECN | |||
ECN (limited functionality mode) if it is paired with a legacy egress | (limited functionality mode) if it is paired with a legacy egress | |||
(RFC 2481, RFC2401 or RFC2003), which would not propagate ECN | (RFC 2481, RFC 2401, or RFC 2003), which would not propagate ECN | |||
correctly. The present specification carries forward those rules | correctly. The present specification carries forward those rules | |||
(Section 4.3). It uses compatibility mode whenever RFC3168 would | (Section 4.3). It uses compatibility mode whenever RFC 3168 would | |||
have used limited functionality mode, and their per-packet behaviours | have used limited functionality mode, and their per-packet behaviours | |||
are identical. Therefore, all other things being equal, an ingress | are identical. Therefore, all other things being equal, an ingress | |||
using the new rules will interwork with any legacy tunnel egress in | using the new rules will interwork with any legacy tunnel egress in | |||
exactly the same way as an RFC3168 ingress (still black-box backward | exactly the same way as an RFC 3168 ingress (still black-box backward | |||
compatible). | compatible). | |||
7. Design Principles for Alternate ECN Tunnelling Semantics | 7. Design Principles for Alternate ECN Tunnelling Semantics | |||
This section is informative not normative. | This section is informative, not normative. | |||
Section 5 of RFC3168 permits the Diffserv codepoint (DSCP)[RFC2474] | Section 5 of RFC 3168 permits the Diffserv codepoint (DSCP)[RFC2474] | |||
to 'switch in' alternative behaviours for marking the ECN field, just | to 'switch in' alternative behaviours for marking the ECN field, just | |||
as it switches in different per-hop behaviours (PHBs) for scheduling. | as it switches in different per-hop behaviours (PHBs) for scheduling. | |||
[RFC4774] gives best current practice for designing such alternative | [RFC4774] gives best current practice for designing such alternative | |||
ECN semantics and very briefly mentions in section 5.4 that | ECN semantics and very briefly mentions in Section 5.4 that | |||
tunnelling needs to be considered. The guidance below complements | tunnelling needs to be considered. The guidance below complements | |||
and extends RFC4774, giving additional guidance on designing any | and extends RFC 4774, giving additional guidance on designing any | |||
alternate ECN semantics that would also require alternate tunnelling | alternate ECN semantics that would also require alternate tunnelling | |||
semantics. | semantics. | |||
The overriding guidance is: "Avoid designing alternate ECN tunnelling | The overriding guidance is: "Avoid designing alternate ECN tunnelling | |||
semantics, if at all possible." If a scheme requires tunnels to | semantics, if at all possible". If a scheme requires tunnels to | |||
implement special processing of the ECN field for certain DSCPs, it | implement special processing of the ECN field for certain DSCPs, it | |||
will be hard to guarantee that every implementer of every tunnel will | will be hard to guarantee that every implementer of every tunnel will | |||
have added the required exception or that operators will have | have added the required exception or that operators will have | |||
ubiquitously deployed the required updates. It is unlikely a single | ubiquitously deployed the required updates. It is unlikely a single | |||
authority is even aware of all the tunnels in a network, which may | authority is even aware of all the tunnels in a network, which may | |||
include tunnels set up by applications between endpoints, or | include tunnels set up by applications between endpoints, or | |||
dynamically created in the network. Therefore it is highly likely | dynamically created in the network. Therefore, it is highly likely | |||
that some tunnels within a network or on hosts connected to it will | that some tunnels within a network or on hosts connected to it will | |||
not implement the required special case. | not implement the required special case. | |||
That said, if a non-default scheme for tunnelling the ECN field is | That said, if a non-default scheme for tunnelling the ECN field is | |||
really required, the following guidelines might prove useful in its | really required, the following guidelines might prove useful in its | |||
design: | design: | |||
On encapsulation in any alternate scheme: | On encapsulation in any alternate scheme: | |||
1. The ECN field of the outer header ought to be cleared to Not- | 1. The ECN field of the outer header ought to be cleared to Not- | |||
ECT ("00") unless it is guaranteed that the corresponding | ECT ("00") unless it is guaranteed that the corresponding | |||
tunnel egress will correctly propagate congestion markings | tunnel egress will correctly propagate congestion markings | |||
introduced across the tunnel in the outer header. | introduced across the tunnel in the outer header. | |||
2. If it has established that ECN will be correctly propagated, | 2. If it has established that ECN will be correctly propagated, | |||
an encapsulator ought to also copy incoming congestion | an encapsulator also ought to copy incoming congestion | |||
notification into the outer header. The general principle | notification into the outer header. The general principle | |||
here is that the outer header should reflect congestion | here is that the outer header should reflect congestion | |||
accumulated along the whole upstream path, not just since the | accumulated along the whole upstream path, not just since the | |||
tunnel ingress (Appendix B.3 on management and monitoring | tunnel ingress (Appendix B.3 on management and monitoring | |||
explains). | explains). | |||
In some circumstances (e.g. PCN [RFC5559] and perhaps some | In some circumstances (e.g., PCN [RFC5559] and perhaps some | |||
pseudowires [RFC5659]), the whole path is divided into | pseudowires [RFC5659]), the whole path is divided into | |||
segments, each with its own congestion notification and | segments, each with its own congestion notification and | |||
feedback loop. In these cases, the function that regulates | feedback loop. In these cases, the function that regulates | |||
load at the start of each segment will need to reset | load at the start of each segment will need to reset | |||
congestion notification for its segment. Often the point | congestion notification for its segment. Often, the point | |||
where congestion notification is reset will also be located at | where congestion notification is reset will also be located at | |||
the start of a tunnel. However, the resetting function can be | the start of a tunnel. However, the resetting function can be | |||
thought of as being applied to packets after the encapsulation | thought of as being applied to packets after the encapsulation | |||
function--two logically separate functions even though they | function -- two logically separate functions even though they | |||
might run on the same physical box. Then the code module | might run on the same physical box. Then, the code module | |||
doing encapsulation can keep to the copying rule and the load | doing encapsulation can keep to the copying rule and the load | |||
regulator module can reset congestion, without any code in | regulator module can reset congestion, without any code in | |||
either module being conditional on whether the other is there. | either module being conditional on whether the other is there. | |||
On decapsulation in any new scheme: | On decapsulation in any alternate scheme: | |||
1. If the arriving inner header is Not-ECT it implies the | 1. If the arriving inner header is Not-ECT, the transport will | |||
transport will not understand other ECN codepoints. If the | not understand other ECN codepoints. If the outer header | |||
outer header carries an explicit congestion marking, the | carries an explicit congestion marking, the alternate scheme | |||
alternate scheme would be expected to drop the packet--the | would be expected to drop the packet -- the only indication of | |||
only indication of congestion the transport will understand. | congestion the transport will understand. If the alternate | |||
If the alternate scheme recommends forwarding rather than | scheme recommends forwarding rather than dropping such a | |||
dropping such a packet, it will need to clearly justify this | packet, it will need to clearly justify this decision. If the | |||
decision. If the inner is Not-ECT and the outer carries any | inner is Not-ECT and the outer carries any other ECN codepoint | |||
other ECN codepoint that does not indicate congestion, the | that does not indicate congestion, the alternate scheme can | |||
alternate scheme can forward the packet, but probably only as | forward the packet, but probably only as Not-ECT. | |||
Not-ECT. | ||||
2. If the arriving inner header is other than Not-ECT, the ECN | 2. If the arriving inner header is one other than Not-ECT, the | |||
field that the alternate decapsulation scheme forwards ought | ECN field that the alternate decapsulation scheme forwards | |||
to reflect the more severe congestion marking of the arriving | ought to reflect the more severe congestion marking of the | |||
inner and outer headers. | arriving inner and outer headers. | |||
3. Any alternate scheme will need to define a behaviour for all | 3. Any alternate scheme will need to define a behaviour for all | |||
combinations of inner and outer headers, even those that would | combinations of inner and outer headers, even those that would | |||
not be expected to result from standards known at the time and | not be expected to result from standards known at the time and | |||
even those that would not be expected from the tunnel ingress | even those that would not be expected from the tunnel ingress | |||
paired with the egress at run-time. Consideration should be | paired with the egress at run-time. Consideration should be | |||
given to logging such unexpected combinations and raising an | given to logging such unexpected combinations and raising an | |||
alarm, particularly if there is a danger that the invalid | alarm, particularly if there is a danger that the invalid | |||
combination implies congestion signals are not being | combination implies congestion signals are not being | |||
propagated correctly. The presence of currently unused | propagated correctly. The presence of currently unused | |||
combinations may represent an attack, but the new scheme | combinations may represent an attack, but the new scheme | |||
should try to define a way to forward such packets, at least | should try to define a way to forward such packets, at least | |||
if a safe outgoing codepoint can be defined. | if a safe outgoing codepoint can be defined. | |||
Raising an alarm allows a management system to decide whether | Raising an alarm allows a management system to decide whether | |||
the anomaly is indeed an attack, in which case it can decide | the anomaly is indeed an attack, in which case it can decide | |||
to drop such packets. This is a preferable approach to hard- | to drop such packets. This is a preferable approach to hard- | |||
coded discard of packets that seem anomalous today, but may be | coded discard of packets that seem anomalous today, but may be | |||
needed tomorrow in future standards actions. | needed tomorrow in future standards actions. | |||
8. IANA Considerations (to be removed on publication): | 8. Security Considerations | |||
This memo includes no request to IANA. | ||||
9. Security Considerations | ||||
Appendix B.1 discusses the security constraints imposed on ECN tunnel | Appendix B.1 discusses the security constraints imposed on ECN tunnel | |||
processing. The new rules for ECN tunnel processing (Section 4) | processing. The new rules for ECN tunnel processing (Section 4) | |||
trade-off between information security (covert channels) and traffic | trade-off between information security (covert channels) and traffic | |||
security (congestion monitoring & control). Ensuring congestion | security (congestion monitoring and control). Ensuring congestion | |||
markings are not lost is itself an aspect of security, because if we | markings are not lost is itself an aspect of security, because if we | |||
allowed congestion notification to be lost, any attempt to enforce a | allowed congestion notification to be lost, any attempt to enforce a | |||
response to congestion would be much harder. | response to congestion would be much harder. | |||
Security issues in unlikely but possible scenarios: | Security issues in unlikely, but possible, scenarios: | |||
Tunnels intersecting Diffserv regions with alternate ECN semantics: | Tunnels intersecting Diffserv regions with alternate ECN semantics: | |||
If alternate congestion notification semantics are defined for a | If alternate congestion notification semantics are defined for a | |||
certain Diffserv PHB, the scope of the alternate semantics might | certain Diffserv PHB, the scope of the alternate semantics might | |||
typically be bounded by the limits of a Diffserv region or | typically be bounded by the limits of a Diffserv region or | |||
regions, as envisaged in [RFC4774] (e.g. the pre-congestion | regions, as envisaged in [RFC4774] (e.g., the pre-congestion | |||
notification architecture [RFC5559]). The inner headers in | notification architecture [RFC5559]). The inner headers in | |||
tunnels crossing the boundary of such a Diffserv region but ending | tunnels crossing the boundary of such a Diffserv region but ending | |||
within the region can potentially leak the external congestion | within the region can potentially leak the external congestion | |||
notification semantics into the region, or leak the internal | notification semantics into the region, or leak the internal | |||
semantics out of the region. [RFC2983] discusses the need for | semantics out of the region. [RFC2983] discusses the need for | |||
Diffserv traffic conditioning to be applied at these tunnel | Diffserv traffic conditioning to be applied at these tunnel | |||
endpoints as if they are at the edge of the Diffserv region. | endpoints as if they are at the edge of the Diffserv region. | |||
Similar concerns apply to any processing or propagation of the ECN | Similar concerns apply to any processing or propagation of the ECN | |||
field at the endpoints of tunnels with one end inside and the | field at the endpoints of tunnels with one end inside and the | |||
other outside the domain. [RFC5559] gives specific advice on this | other outside the domain. [RFC5559] gives specific advice on this | |||
for the PCN case, but other definitions of alternate semantics | for the PCN case, but other definitions of alternate semantics | |||
will need to discuss the specific security implications in each | will need to discuss the specific security implications in each | |||
case. | case. | |||
ECN nonce tunnel coverage: The new decapsulation rules improve the | ECN nonce tunnel coverage: The new decapsulation rules improve the | |||
coverage of the ECN nonce [RFC3540] relative to the previous rules | coverage of the ECN nonce [RFC3540] relative to the previous rules | |||
in RFC3168 and RFC4301. However, nonce coverage is still not | in RFC 3168 and RFC 4301. However, nonce coverage is still not | |||
perfect, as this would have led to a safety problem in another | perfect, as this would have led to a safety problem in another | |||
case. Both are corner-cases, so discussion of the compromise | case. Both are corner-cases, so discussion of the compromise | |||
between them is deferred to Appendix D. | between them is deferred to Appendix D. | |||
Covert channel not turned off: A legacy (RFC3168) tunnel ingress | Covert channel not turned off: A legacy (RFC 3168) tunnel ingress | |||
could ask an RFC3168 egress to turn off ECN processing as well as | could ask an RFC 3168 egress to turn off ECN processing as well as | |||
itself turning off ECN. An egress compliant with the present | itself turning off ECN. An egress compliant with the present | |||
specification will agree to such a request from a legacy ingress, | specification will agree to such a request from a legacy ingress, | |||
but it relies on the ingress always sending Not-ECT in the outer. | but it relies on the ingress always sending Not-ECT in the outer | |||
If the egress receives other ECN codepoints in the outer it will | header. If the egress receives other ECN codepoints in the outer | |||
process them as normal, so it will actually still copy congestion | it will process them as normal, so it will actually still copy | |||
markings from the outer to the outgoing header. Referring for | congestion markings from the outer to the outgoing header. | |||
example to Figure 5 (Appendix B.1), although the tunnel ingress | Referring, for example, to Figure 5 (Appendix B.1), although the | |||
'I' will set all ECN fields in outer headers to Not-ECT, 'M' could | tunnel ingress 'I' will set all ECN fields in outer headers to | |||
still toggle CE or ECT(1) on and off to communicate covertly with | Not-ECT, 'M' could still toggle CE or ECT(1) on and off to | |||
'B', because we have specified that 'E' only has one mode | communicate covertly with 'B', because we have specified that 'E' | |||
regardless of what mode it says it has negotiated. We could have | only has one mode regardless of what mode it says it has | |||
specified that 'E' should have a limited functionality mode and | negotiated. We could have specified that 'E' should have a | |||
check for such behaviour. But we decided not to add the extra | limited functionality mode and check for such behaviour. However, | |||
complexity of two modes on a compliant tunnel egress merely to | we decided not to add the extra complexity of two modes on a | |||
cater for an historic security concern that is now considered | compliant tunnel egress merely to cater for an historic security | |||
manageable. | concern that is now considered manageable. | |||
10. Conclusions | 9. Conclusions | |||
This document allows tunnels to propagate an extra level of | This document allows tunnels to propagate an extra level of | |||
congestion severity. It uses previously unused combinations of inner | congestion severity. It uses previously unused combinations of inner | |||
and outer header to augment the rules for calculating the ECN field | and outer headers to augment the rules for calculating the ECN field | |||
when decapsulating IP packets at the egress of IPsec (RFC4301) and | when decapsulating IP packets at the egress of IPsec (RFC 4301) and | |||
non-IPsec (RFC3168) tunnels. | non-IPsec (RFC 3168) tunnels. | |||
This document also updates the ingress tunnelling encapsulation of | This document also updates the ingress tunnelling encapsulation of | |||
RFC3168 ECN to bring all IP in IP tunnels into line with the new | RFC 3168 ECN to bring all IP-in-IP tunnels into line with the new | |||
behaviour in the IPsec architecture of RFC4301, which copies rather | behaviour in the IPsec architecture of RFC 4301, which copies rather | |||
than resets the ECN field when creating outer headers. | than resets the ECN field when creating outer headers. | |||
The need for both these updated behaviours was triggered by the | The need for both these updated behaviours was triggered by the | |||
introduction of pre-congestion notification (PCN) onto the IETF | introduction of pre-congestion notification (PCN) onto the IETF | |||
standards track. Operators wanting to support PCN or other alternate | Standards Track. Operators wanting to support PCN or other alternate | |||
ECN schemes that use an extra severity level can require that their | ECN schemes that use an extra severity level can require that their | |||
tunnels comply with the present specification. This is not a fork in | tunnels comply with the present specification. This is not a fork in | |||
the RFC series, it is an update that can be deployed first by those | the RFC series, it is an update that can be deployed first by those | |||
that need it, and subsequently by all tunnel endpoint implementations | that need it, and subsequently by all tunnel endpoint implementations | |||
during general code maintenance. It is backward compatible with all | during general code maintenance. It is backward compatible with all | |||
previous tunnelling behaviours, so existing single severity level | previous tunnelling behaviours, so existing single severity level | |||
schemes will continue to work as before, but support for two severity | schemes will continue to work as before, but support for two severity | |||
levels will gradually be added to the Internet. | levels will gradually be added to the Internet. | |||
The new rules propagate changes to the ECN field across tunnel end- | The new rules propagate changes to the ECN field across tunnel | |||
points that previously blocked them to restrict the bandwidth of a | endpoints that previously blocked them to restrict the bandwidth of a | |||
potential covert channel. Limiting the channel's bandwidth to 2 bits | potential covert channel. Limiting the channel's bandwidth to two | |||
per packet is now considered sufficient. | bits per packet is now considered sufficient. | |||
At the same time as removing these legacy constraints, the | At the same time as removing these legacy constraints, the | |||
opportunity has been taken to draw together diverging tunnel | opportunity has been taken to draw together diverging tunnel | |||
specifications into a single consistent behaviour. Then any tunnel | specifications into a single consistent behaviour. Then, any tunnel | |||
can be deployed unilaterally, and it will support the full range of | can be deployed unilaterally, and it will support the full range of | |||
congestion control and management schemes without any modes or | congestion control and management schemes without any modes or | |||
configuration. Further, any host or router can expect the ECN field | configuration. Further, any host or router can expect the ECN field | |||
to behave in the same way, whatever type of tunnel might intervene in | to behave in the same way, whatever type of tunnel might intervene in | |||
the path. This new certainty could enable new uses of the ECN field | the path. This new certainty could enable new uses of the ECN field | |||
that would otherwise be confounded by ambiguity. | that would otherwise be confounded by ambiguity. | |||
11. Acknowledgements | 10. Acknowledgements | |||
Thanks to David Black for his insightful reviews and patient | Thanks to David Black for his insightful reviews and patient | |||
explanations of better ways to think about function placement and | explanations of better ways to think about function placement and | |||
alarms. Thanks to David and to Anil Agarwal for pointing out cases | alarms. Thanks to David and to Anil Agarwal for pointing out cases | |||
where it is safe to forward CU combinations of headers. Also thanks | where it is safe to forward CU combinations of headers. Also, thanks | |||
to Arnaud Jacquet for the idea for Appendix C. Thanks to Gorry | to Arnaud Jacquet for the idea for Appendix C. Thanks to Gorry | |||
Fairhurst, Teco Boot, Michael Menth, Bruce Davie, Toby Moncaster, | Fairhurst, Teco Boot, Michael Menth, Bruce Davie, Toby Moncaster, | |||
Sally Floyd, Alfred Hoenes, Gabriele Corliano, Ingemar Johansson, | Sally Floyd, Alfred Hoenes, Gabriele Corliano, Ingemar Johansson, | |||
Philip Eardley and David Harrington for their thoughts and careful | Philip Eardley, and David Harrington for their thoughts and careful | |||
review comments, and to Stephen Hanna, Ben Campbell and members of | review comments, and to Stephen Hanna, Ben Campbell, and members of | |||
the IESG for respectively conducting the Security Directorate, | the IESG for respectively conducting the Security Directorate, | |||
General Area and IESG reviews. | General Area, and IESG reviews. | |||
Bob Briscoe is partly funded by Trilogy, a research project (ICT- | Bob Briscoe is partly funded by Trilogy, a research project (ICT- | |||
216372) supported by the European Community under its Seventh | 216372) supported by the European Community under its Seventh | |||
Framework Programme. | Framework Programme. | |||
Comments Solicited (to be removed by the RFC Editor): | 11. References | |||
Comments and questions are encouraged and very welcome. They can be | ||||
addressed to the IETF Transport Area working group mailing list | ||||
<tsvwg@ietf.org>, and/or to the authors. | ||||
12. References | ||||
12.1. Normative References | 11.1. Normative References | |||
[RFC2003] Perkins, C., "IP Encapsulation within | [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, | |||
IP", RFC 2003, October 1996. | October 1996. | |||
[RFC2119] Bradner, S., "Key words for use in | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
RFCs to Indicate Requirement Levels", | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
BCP 14, RFC 2119, March 1997. | ||||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
Black, "The Addition of Explicit | of Explicit Congestion Notification (ECN) to IP", | |||
Congestion Notification (ECN) to IP", | RFC 3168, September 2001. | |||
RFC 3168, September 2001. | ||||
[RFC4301] Kent, S. and K. Seo, "Security | [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | |||
Architecture for the Internet | Internet Protocol", RFC 4301, December 2005. | |||
Protocol", RFC 4301, December 2005. | ||||
12.2. Informative References | 11.2. Informative References | |||
[I-D.ietf-pcn-3-in-1-encoding] Briscoe, B., Moncaster, T., and M. | [PCN3in1] Briscoe, B., Moncaster, T., and M. Menth, "Encoding 3 PCN- | |||
Menth, "Encoding 3 PCN-States in the | States in the IP header using a single DSCP", Work | |||
IP header using a single DSCP", | in Progress, July 2010. | |||
draft-ietf-pcn-3-in-1-encoding-03 | ||||
(work in progress), July 2010. | ||||
[RFC2401] Kent, S. and R. Atkinson, "Security | [RFC2401] Kent, S. and R. Atkinson, "Security Architecture for the | |||
Architecture for the Internet | Internet Protocol", RFC 2401, November 1998. | |||
Protocol", RFC 2401, November 1998. | ||||
[RFC2474] Nichols, K., Blake, S., Baker, F., | [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, | |||
and D. Black, "Definition of the | "Definition of the Differentiated Services Field (DS | |||
Differentiated Services Field (DS | Field) in the IPv4 and IPv6 Headers", RFC 2474, | |||
Field) in the IPv4 and IPv6 Headers", | December 1998. | |||
RFC 2474, December 1998. | ||||
[RFC2481] Ramakrishnan, K. and S. Floyd, "A | [RFC2481] Ramakrishnan, K. and S. Floyd, "A Proposal to add Explicit | |||
Proposal to add Explicit Congestion | Congestion Notification (ECN) to IP", RFC 2481, | |||
Notification (ECN) to IP", RFC 2481, | January 1999. | |||
January 1999. | ||||
[RFC2983] Black, D., "Differentiated Services | [RFC2983] Black, D., "Differentiated Services and Tunnels", | |||
and Tunnels", RFC 2983, October 2000. | RFC 2983, October 2000. | |||
[RFC3540] Spring, N., Wetherall, D., and D. | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
Ely, "Robust Explicit Congestion | Congestion Notification (ECN) Signaling with Nonces", | |||
Notification (ECN) Signaling with | RFC 3540, June 2003. | |||
Nonces", RFC 3540, June 2003. | ||||
[RFC4306] Kaufman, C., "Internet Key Exchange | [RFC4774] Floyd, S., "Specifying Alternate Semantics for the | |||
(IKEv2) Protocol", RFC 4306, | Explicit Congestion Notification (ECN) Field", BCP 124, | |||
December 2005. | RFC 4774, November 2006. | |||
[RFC4774] Floyd, S., "Specifying Alternate | [RFC5129] Davie, B., Briscoe, B., and J. Tay, "Explicit Congestion | |||
Semantics for the Explicit Congestion | Marking in MPLS", RFC 5129, January 2008. | |||
Notification (ECN) Field", BCP 124, | ||||
RFC 4774, November 2006. | ||||
[RFC5129] Davie, B., Briscoe, B., and J. Tay, | [RFC5559] Eardley, P., "Pre-Congestion Notification (PCN) | |||
"Explicit Congestion Marking in | Architecture", RFC 5559, June 2009. | |||
MPLS", RFC 5129, January 2008. | ||||
[RFC5559] Eardley, P., "Pre-Congestion | [RFC5659] Bocci, M. and S. Bryant, "An Architecture for Multi- | |||
Notification (PCN) Architecture", | Segment Pseudowire Emulation Edge-to-Edge", RFC 5659, | |||
RFC 5559, June 2009. | October 2009. | |||
[RFC5659] Bocci, M. and S. Bryant, "An | [RFC5670] Eardley, P., "Metering and Marking Behaviour of PCN- | |||
Architecture for Multi-Segment | Nodes", RFC 5670, November 2009. | |||
Pseudowire Emulation Edge-to-Edge", | ||||
RFC 5659, October 2009. | ||||
[RFC5670] Eardley, P., "Metering and Marking | [RFC5696] Moncaster, T., Briscoe, B., and M. Menth, "Baseline | |||
Behaviour of PCN-Nodes", RFC 5670, | Encoding and Transport of Pre-Congestion Information", | |||
November 2009. | RFC 5696, November 2009. | |||
[RFC5696] Moncaster, T., Briscoe, B., and M. | [RFC5996] Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen, | |||
Menth, "Baseline Encoding and | "Internet Key Exchange Protocol Version 2 (IKEv2)", | |||
Transport of Pre-Congestion | RFC 5996, September 2010. | |||
Information", RFC 5696, | ||||
November 2009. | ||||
[VCP] Xia, Y., Subramanian, L., Stoica, I., | [VCP] Xia, Y., Subramanian, L., Stoica, I., and S. Kalyanaraman, | |||
and S. Kalyanaraman, "One more bit is | "One more bit is enough", Proc. SIGCOMM'05, ACM | |||
enough", Proc. SIGCOMM'05, ACM | CCR 35(4)37--48, 2005, | |||
CCR 35(4)37--48, 2005, <http:// | <http://doi.acm.org/10.1145/1080091.1080098>. | |||
doi.acm.org/10.1145/1080091.1080098>. | ||||
Appendix A. Early ECN Tunnelling RFCs | Appendix A. Early ECN Tunnelling RFCs | |||
IP in IP tunnelling was originally defined in [RFC2003]. On | IP-in-IP tunnelling was originally defined in [RFC2003]. On | |||
encapsulation, the incoming header was copied to the outer and on | encapsulation, the incoming header was copied to the outer and on | |||
decapsulation the outer was simply discarded. Initially, IPsec | decapsulation, the outer was simply discarded. Initially, IPsec | |||
tunnelling [RFC2401] followed the same behaviour. | tunnelling [RFC2401] followed the same behaviour. | |||
When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 | When ECN was introduced experimentally in [RFC2481], legacy (RFC 2003 | |||
or RFC2401) tunnels would have discarded any congestion markings | or RFC 2401) tunnels would have discarded any congestion markings | |||
added to the outer header, so RFC2481 introduced rules for | added to the outer header, so RFC 2481 introduced rules for | |||
calculating the outgoing header from a combination of the inner and | calculating the outgoing header from a combination of the inner and | |||
outer on decapsulation. RC2481 also introduced a second mode for | outer on decapsulation. RFC 2481 also introduced a second mode for | |||
IPsec tunnels, which turned off ECN processing (Not-ECT) in the outer | IPsec tunnels, which turned off ECN processing (Not-ECT) in the outer | |||
header on encapsulation because an RFC2401 decapsulator would discard | header on encapsulation because an RFC 2401 decapsulator would | |||
the outer on decapsulation. For RFC2401 IPsec this had the side- | discard the outer on decapsulation. For RFC 2401 IPsec, this had the | |||
effect of completely blocking the covert channel. | side effect of completely blocking the covert channel. | |||
In RFC2481 the ECN field was defined as two separate bits. But when | In RFC 2481, the ECN field was defined as two separate bits. But | |||
ECN moved from the experimental to the standards track [RFC3168], the | when ECN moved from Experimental to Standards Track [RFC3168], the | |||
ECN field was redefined as four codepoints. This required a | ECN field was redefined as four codepoints. This required a | |||
different calculation of the ECN field from that used in RFC2481 on | different calculation of the ECN field from that used in RFC 2481 on | |||
decapsulation. RFC3168 also had two modes; a 'full functionality | decapsulation. RFC 3168 also had two modes; a 'full functionality | |||
mode' that restricted the covert channel as much as possible but | mode' that restricted the covert channel as much as possible but | |||
still allowed ECN to be used with IPsec, and another that completely | still allowed ECN to be used with IPsec, and another that completely | |||
turned off ECN processing across the tunnel. This 'limited | turned off ECN processing across the tunnel. This 'limited | |||
functionality mode' both offered a way for operators to completely | functionality mode' both offered a way for operators to completely | |||
block the covert channel and allowed an RFC3168 ingress to interwork | block the covert channel and allowed an RFC 3168 ingress to interwork | |||
with a legacy tunnel egress (RFC2481, RFC2401 or RFC2003). | with a legacy tunnel egress (RFC 2481, RFC 2401, or RFC 2003). | |||
The present specification includes a similar compatibility mode to | The present specification includes a similar compatibility mode to | |||
interwork safely with tunnels compliant with any of these three | interwork safely with tunnels compliant with any of these three | |||
earlier RFCs. However, unlike RFC3168, it is only a mode of the | earlier RFCs. However, unlike RFC 3168, it is only a mode of the | |||
ingress, as decapsulation behaviour is the same in either case. | ingress, as decapsulation behaviour is the same in either case. | |||
Appendix B. Design Constraints | Appendix B. Design Constraints | |||
Tunnel processing of a congestion notification field has to meet | Tunnel processing of a congestion notification field has to meet | |||
congestion control and management needs without creating new | congestion control and management needs without creating new | |||
information security vulnerabilities (if information security is | information security vulnerabilities (if information security is | |||
required). This appendix documents the analysis of the tradeoffs | required). This appendix documents the analysis of the trade-offs | |||
between these factors that led to the new encapsulation rules in | between these factors that led to the new encapsulation rules in | |||
Section 4.1. | Section 4.1. | |||
B.1. Security Constraints | B.1. Security Constraints | |||
Information security can be assured by using various end to end | Information security can be assured by using various end-to-end | |||
security solutions (including IPsec in transport mode [RFC4301]), but | security solutions (including IPsec in transport mode [RFC4301]), but | |||
a commonly used scenario involves the need to communicate between two | a commonly used scenario involves the need to communicate between two | |||
physically protected domains across the public Internet. In this | physically protected domains across the public Internet. In this | |||
case there are certain management advantages to using IPsec in tunnel | case, there are certain management advantages to using IPsec in | |||
mode solely across the publicly accessible part of the path. The | tunnel mode solely across the publicly accessible part of the path. | |||
path followed by a packet then crosses security 'domains'; the ones | The path followed by a packet then crosses security 'domains'; the | |||
protected by physical or other means before and after the tunnel and | ones protected by physical or other means before and after the tunnel | |||
the one protected by an IPsec tunnel across the otherwise unprotected | and the one protected by an IPsec tunnel across the otherwise | |||
domain. The scenario in Figure 5 will be used where endpoints 'A' | unprotected domain. The scenario in Figure 5 will be used where | |||
and 'B' communicate through a tunnel. The tunnel ingress 'I' and | endpoints 'A' and 'B' communicate through a tunnel. The tunnel | |||
egress 'E' are within physically protected edge domains, while the | ingress 'I' and egress 'E' are within physically protected edge | |||
tunnel spans an unprotected internetwork where there may be 'men in | domains, while the tunnel spans an unprotected internetwork where | |||
the middle', M. | there may be 'men in the middle', M. | |||
physically unprotected physically | physically unprotected physically | |||
<-protected domain-><--domain--><-protected domain-> | <-protected domain-><--domain--><-protected domain-> | |||
+------------------+ +------------------+ | +------------------+ +------------------+ | |||
| | M | | | | | M | | | |||
| A-------->I=========>==========>E-------->B | | | A-------->I=========>==========>E-------->B | | |||
| | | | | | | | | | |||
+------------------+ +------------------+ | +------------------+ +------------------+ | |||
<----IPsec secured----> | <----IPsec secured----> | |||
tunnel | tunnel | |||
skipping to change at page 37, line 29 | skipping to change at page 30, line 36 | |||
Figure 5: IPsec Tunnel Scenario | Figure 5: IPsec Tunnel Scenario | |||
IPsec encryption is typically used to prevent 'M' seeing messages | IPsec encryption is typically used to prevent 'M' seeing messages | |||
from 'A' to 'B'. IPsec authentication is used to prevent 'M' | from 'A' to 'B'. IPsec authentication is used to prevent 'M' | |||
masquerading as the sender of messages from 'A' to 'B' or altering | masquerading as the sender of messages from 'A' to 'B' or altering | |||
their contents. 'I' can use IPsec tunnel mode to allow 'A' to | their contents. 'I' can use IPsec tunnel mode to allow 'A' to | |||
communicate with 'B', but impose encryption to prevent 'A' leaking | communicate with 'B', but impose encryption to prevent 'A' leaking | |||
information to 'M'. Or 'E' can insist that 'I' uses tunnel mode | information to 'M'. Or 'E' can insist that 'I' uses tunnel mode | |||
authentication to prevent 'M' communicating information to 'B'. | authentication to prevent 'M' communicating information to 'B'. | |||
Mutable IP header fields such as the ECN field (as well as the TTL/ | Mutable IP header fields such as the ECN field (as well as the Time | |||
Hop Limit and DS fields) cannot be included in the cryptographic | to Live (TTL) / Hop Limit and DS fields) cannot be included in the | |||
calculations of IPsec. Therefore, if 'I' copies these mutable fields | cryptographic calculations of IPsec. Therefore, if 'I' copies these | |||
into the outer header that is exposed across the tunnel it will have | mutable fields into the outer header that is exposed across the | |||
allowed a covert channel from 'A' to M that bypasses its encryption | tunnel it will have allowed a covert channel from 'A' to 'M' that | |||
of the inner header. And if 'E' copies these fields from the outer | bypasses its encryption of the inner header. And if 'E' copies these | |||
header to the inner, even if it validates authentication from 'I', it | fields from the outer header to the outgoing, even if it validates | |||
will have allowed a covert channel from 'M' to 'B'. | authentication from 'I', it will have allowed a covert channel from | |||
'M' to 'B'. | ||||
ECN at the IP layer is designed to carry information about congestion | ECN at the IP layer is designed to carry information about congestion | |||
from a congested resource towards downstream nodes. Typically a | from a congested resource towards downstream nodes. Typically, a | |||
downstream transport might feed the information back somehow to the | downstream transport might feed the information back somehow to the | |||
point upstream of the congestion that can regulate the load on the | point upstream of the congestion that can regulate the load on the | |||
congested resource, but other actions are possible [RFC3168; section | congested resource, but other actions are possible [RFC3168], Section | |||
6]. In terms of the above unicast scenario, ECN effectively intends | 6. In terms of the above unicast scenario, ECN effectively intends | |||
to create an information channel (for congestion signalling) from 'M' | to create an information channel (for congestion signalling) from 'M' | |||
to 'B' (for 'B' to feed back to 'A'). Therefore the goals of IPsec | to 'B' (for 'B' to feed back to 'A'). Therefore, the goals of IPsec | |||
and ECN are mutually incompatible, requiring some compromise. | and ECN are mutually incompatible, requiring some compromise. | |||
With respect to using the DS or ECN fields as covert channels, | With respect to using the DS or ECN fields as covert channels, | |||
section 5.1.2 of RFC4301 says, "controls are provided to manage the | Section 5.1.2 of RFC 4301 says, "controls are provided to manage the | |||
bandwidth of this channel". Using the ECN processing rules of | bandwidth of this channel". Using the ECN processing rules of RFC | |||
RFC4301, the channel bandwidth is two bits per datagram from 'A' to | 4301, the channel bandwidth is two bits per datagram from 'A' to 'M' | |||
'M' and one bit per datagram from 'M' to 'A' (because 'E' limits the | and one bit per datagram from 'M' to 'B' (because 'E' limits the | |||
combinations of the 2-bit ECN field that it will copy). In both | combinations of the 2-bit ECN field that it will copy). In both | |||
cases the covert channel bandwidth is further reduced by noise from | cases, the covert channel bandwidth is further reduced by noise from | |||
any real congestion marking. RFC4301 implies that these covert | any real congestion marking. RFC 4301 implies that these covert | |||
channels are sufficiently limited to be considered a manageable | channels are sufficiently limited to be considered a manageable | |||
threat. However, with respect to the larger (6b) DS field, the same | threat. However, with respect to the larger (six-bit) DS field, the | |||
section of RFC4301 says not copying is the default, but a | same section of RFC 4301 says not copying is the default, but a | |||
configuration option can allow copying "to allow a local | configuration option can allow copying "to allow a local | |||
administrator to decide whether the covert channel provided by | administrator to decide whether the covert channel provided by | |||
copying these bits outweighs the benefits of copying". Of course, an | copying these bits outweighs the benefits of copying". Of course, an | |||
administrator considering copying of the DS field has to take into | administrator who plans to copy the DS field has to take into account | |||
account that it could be concatenated with the ECN field giving an 8b | that it could be concatenated with the ECN field, creating a covert | |||
per datagram covert channel. | channel with eight bits per datagram. | |||
For tunnelling the 6b Diffserv field two conceptual models have had | For tunnelling the six-bit Diffserv field, two conceptual models have | |||
to be defined so that administrators can trade off security against | had to be defined so that administrators can trade off security | |||
the needs of traffic conditioning [RFC2983]: | against the needs of traffic conditioning [RFC2983]: | |||
The uniform model: where the Diffserv field is preserved end-to-end | The uniform model: where the Diffserv field is preserved end-to-end | |||
by copying into the outer header on encapsulation and copying from | by copying into the outer header on encapsulation and copying from | |||
the outer header on decapsulation. | the outer header on decapsulation. | |||
The pipe model: where the outer header is independent of that in the | The pipe model: where the outer header is independent of that in the | |||
inner header so it hides the Diffserv field of the inner header | inner header so it hides the Diffserv field of the inner header | |||
from any interaction with nodes along the tunnel. | from any interaction with nodes along the tunnel. | |||
However, for ECN, the new IPsec security architecture in RFC4301 only | However, for ECN, the new IPsec security architecture in RFC 4301 | |||
standardised one tunnelling model equivalent to the uniform model. | only standardised one tunnelling model equivalent to the uniform | |||
It deemed that simplicity was more important than allowing | model. It deemed that simplicity was more important than allowing | |||
administrators the option of a tiny increment in security, especially | administrators the option of a tiny increment in security, especially | |||
given not copying congestion indications could seriously harm | given not copying congestion indications could seriously harm | |||
everyone's network service. | everyone's network service. | |||
B.2. Control Constraints | B.2. Control Constraints | |||
Congestion control requires that any congestion notification marked | Congestion control requires that any congestion notification marked | |||
into packets by a resource will be able to traverse a feedback loop | into packets by a resource will be able to traverse a feedback loop | |||
back to a function capable of controlling the load on that resource. | back to a function capable of controlling the load on that resource. | |||
To be precise, rather than calling this function the data source, it | To be precise, rather than calling this function the data source, it | |||
will be called the Load Regulator. This allows for exceptional cases | will be called the 'Load Regulator'. This allows for exceptional | |||
where load is not regulated by the data source, but usually the two | cases where load is not regulated by the data source, but usually the | |||
terms will be synonymous. Note the term "a function _capable of_ | two terms will be synonymous. Note the term "a function _capable of_ | |||
controlling the load" deliberately includes a source application that | controlling the load" deliberately includes a source application that | |||
doesn't actually control the load but ought to (e.g. an application | doesn't actually control the load but ought to (e.g., an application | |||
without congestion control that uses UDP). | without congestion control that uses UDP). | |||
A--->R--->I=========>M=========>E-------->B | A--->R--->I=========>M=========>E-------->B | |||
Figure 6: Simple Tunnel Scenario | Figure 6: Simple Tunnel Scenario | |||
A similar tunnelling scenario to the IPsec one just described will | A similar tunnelling scenario to the IPsec one just described will | |||
now be considered, but without the different security domains, | now be considered, but without the different security domains, | |||
because the focus now shifts to whether the control loop and | because the focus now shifts to whether the control loop and | |||
management monitoring work (Figure 6). If resources in the tunnel | management monitoring work (Figure 6). If resources in the tunnel | |||
are to be able to explicitly notify congestion and the feedback path | are to be able to explicitly notify congestion and the feedback path | |||
is from 'B' to 'A', it will certainly be necessary for 'E' to copy | is from 'B' to 'A', it will certainly be necessary for 'E' to copy | |||
any CE marking from the outer header to the inner header for onward | any CE marking from the outer header to the outgoing header for | |||
transmission to 'B', otherwise congestion notification from resources | onward transmission to 'B'; otherwise, congestion notification from | |||
like 'M' cannot be fed back to the Load Regulator ('A'). But it does | resources like 'M' cannot be fed back to the Load Regulator ('A'). | |||
not seem necessary for 'I' to copy CE markings from the inner to the | But it does not seem necessary for 'I' to copy CE markings from the | |||
outer header. For instance, if resource 'R' is congested, it can | incoming to the outer header. For instance, if resource 'R' is | |||
send congestion information to 'B' using the congestion field in the | congested, it can send congestion information to 'B' using the | |||
inner header without 'I' copying the congestion field into the outer | congestion field in the inner header without 'I' copying the | |||
header and 'E' copying it back to the inner header. 'E' can still | congestion field into the outer header and 'E' copying it back to the | |||
write any additional congestion marking introduced across the tunnel | outgoing header. 'E' can still write any additional congestion | |||
into the congestion field of the inner header. | marking introduced across the tunnel into the congestion field of the | |||
outgoing header. | ||||
All this shows that 'E' can preserve the control loop irrespective of | All this shows that 'E' can preserve the control loop irrespective of | |||
whether 'I' copies congestion notification into the outer header or | whether 'I' copies congestion notification into the outer header or | |||
resets it. | resets it. | |||
That is the situation for existing control arrangements but, because | That is the situation for existing control arrangements but, because | |||
copying reveals more information, it would open up possibilities for | copying reveals more information, it would open up possibilities for | |||
better control system designs. For instance, resetting CE marking on | better control system designs. For instance, resetting CE marking on | |||
encapsulation breaks the standards track PCN congestion marking | encapsulation breaks the Standards-Track PCN congestion marking | |||
scheme [RFC5670]. It ends up removing excessive amounts of traffic | scheme [RFC5670]. It ends up removing excessive amounts of traffic | |||
unnecessarily. Whereas copying CE markings at ingress leads to the | unnecessarily (Section 5.3.1). Whereas copying CE markings at | |||
correct control behaviour. | ingress leads to the correct control behaviour. | |||
B.3. Management Constraints | B.3. Management Constraints | |||
As well as control, there are also management constraints. | As well as control, there are also management constraints. | |||
Specifically, a management system may monitor congestion markings in | Specifically, a management system may monitor congestion markings in | |||
passing packets, perhaps at the border between networks as part of a | passing packets, perhaps at the border between networks as part of a | |||
service level agreement. For instance, monitors at the borders of | service level agreement. For instance, monitors at the borders of | |||
autonomous systems may need to measure how much congestion has | autonomous systems may need to measure how much congestion has | |||
accumulated so far along the path, perhaps to determine between them | accumulated so far along the path, perhaps to determine between them | |||
how much of the congestion is contributed by each domain. | how much of the congestion is contributed by each domain. | |||
In this document the baseline of congestion marking (or the | In this document, the baseline of congestion marking (or the | |||
Congestion Baseline) is defined as the source of the layer that | Congestion Baseline) is defined as the source of the layer that | |||
created (or most recently reset) the congestion notification field. | created (or most recently reset) the congestion notification field. | |||
When monitoring congestion it would be desirable if the Congestion | When monitoring congestion, it would be desirable if the Congestion | |||
Baseline did not depend on whether packets were tunnelled or not. | Baseline did not depend on whether or not packets were tunnelled. | |||
Given some tunnels cross domain borders (e.g. consider M in Figure 6 | Given some tunnels cross domain borders (e.g., consider 'M' in | |||
is monitoring a border), it would therefore be desirable for 'I' to | Figure 6 is monitoring a border), it would therefore be desirable for | |||
copy congestion accumulated so far into the outer headers, so that it | 'I' to copy congestion accumulated so far into the outer headers, so | |||
is exposed across the tunnel. | that it is exposed across the tunnel. | |||
For management purposes it might be useful for the tunnel egress to | For management purposes, it might be useful for the tunnel egress to | |||
be able to monitor whether congestion occurred across a tunnel or | be able to monitor whether congestion occurred across a tunnel or | |||
upstream of it. Superficially it appears that copying congestion | upstream of it. Superficially, it appears that copying congestion | |||
markings at the ingress would make this difficult, whereas it was | markings at the ingress would make this difficult, whereas it was | |||
straightforward when an RFC3168 ingress reset them. However, | straightforward when an RFC 3168 ingress reset them. However, | |||
Appendix C gives a simple and precise method for a tunnel egress to | Appendix C gives a simple and precise method for a tunnel egress to | |||
infer the congestion level introduced across a tunnel. It works | infer the congestion level introduced across a tunnel. It works | |||
irrespective of whether the ingress copies or resets congestion | irrespective of whether the ingress copies or resets congestion | |||
markings. | markings. | |||
Appendix C. Contribution to Congestion across a Tunnel | Appendix C. Contribution to Congestion across a Tunnel | |||
This specification mandates that a tunnel ingress determines the ECN | This specification mandates that a tunnel ingress determines the ECN | |||
field of each new outer tunnel header by copying the arriving header. | field of each new outer tunnel header by copying the arriving header. | |||
Concern has been expressed that this will make it difficult for the | Concern has been expressed that this will make it difficult for the | |||
tunnel egress to monitor congestion introduced only along a tunnel, | tunnel egress to monitor congestion introduced only along a tunnel, | |||
which is easy if the outer ECN field is reset at a tunnel ingress | which is easy if the outer ECN field is reset at a tunnel ingress | |||
(RFC3168 full functionality mode). However, in fact copying CE marks | (RFC 3168 full functionality mode). However, in fact copying CE | |||
at ingress will still make it easy for the egress to measure | marks at ingress will still make it easy for the egress to measure | |||
congestion introduced across a tunnel, as illustrated below. | congestion introduced across a tunnel, as illustrated below. | |||
Consider 100 packets measured at the egress. Say it measures that 30 | Consider 100 packets measured at the egress. Say it measures that 30 | |||
are CE marked in the inner and outer headers and 12 have additional | are CE marked in the inner and outer headers and 12 have additional | |||
CE marks in the outer but not the inner. This means packets arriving | CE marks in the outer but not the inner. This means packets arriving | |||
at the ingress had already experienced 30% congestion. However, it | at the ingress had already experienced 30% congestion. However, it | |||
does not mean there was 12% congestion across the tunnel. The | does not mean there was 12% congestion across the tunnel. The | |||
correct calculation of congestion across the tunnel is p_t = 12/ | correct calculation of congestion across the tunnel is p_t = 12/ | |||
(100-30) = 12/70 = 17%. This is easy for the egress to measure. It | (100-30) = 12/70 = 17%. This is easy for the egress to measure. It | |||
is simply the proportion of packets not marked in the inner header | is simply the proportion of packets not marked in the inner header | |||
(70) that have a CE marking in the outer header (12). This technique | (70) that have a CE marking in the outer header (12). This technique | |||
works whether the ingress copies or resets CE markings, so it can be | works whether the ingress copies or resets CE markings, so it can be | |||
used by an egress that is not sure which RFC the ingress complies | used by an egress that is not sure with which RFC the ingress | |||
with. | complies. | |||
Figure 7 illustrates this in a combinatorial probability diagram. | Figure 7 illustrates this in a combinatorial probability diagram. | |||
The square represents 100 packets. The 30% division along the bottom | The square represents 100 packets. The 30% division along the bottom | |||
represents marking before the ingress, and the p_t division up the | represents marking before the ingress, and the p_t division up the | |||
side represents marking introduced across the tunnel. | side represents marking introduced across the tunnel. | |||
^ outer header marking | ^ outer header marking | |||
| | | | |||
100% +-----+---------+ The large square | 100% +-----+---------+ The large square | |||
| | | represents 100 packets | | | | represents 100 packets | |||
skipping to change at page 41, line 21 | skipping to change at page 34, line 26 | |||
p_t + +---------+ = 12/70 | p_t + +---------+ = 12/70 | |||
| | 12 | = 17% | | | 12 | = 17% | |||
0 +-----+---------+---> | 0 +-----+---------+---> | |||
0 30% 100% inner header marking | 0 30% 100% inner header marking | |||
Figure 7: Tunnel Marking of Packets Already Marked at Ingress | Figure 7: Tunnel Marking of Packets Already Marked at Ingress | |||
Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) Outer | Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) Outer | |||
A packet with an ECT(1) inner and an ECT(0) outer should never arise | A packet with an ECT(1) inner and an ECT(0) outer should never arise | |||
from any known IETF protocol. Without giving a reason, RFC3168 and | from any known IETF protocol. Without giving a reason, RFC 3168 and | |||
RFC4301 both say the outer should be ignored when decapsulating such | RFC 4301 both say the outer should be ignored when decapsulating such | |||
a packet. This appendix explains why it was decided not to change | a packet. This appendix explains why it was decided not to change | |||
this advice. | this advice. | |||
In summary, ECT(0) always means 'not congested' and ECT(1) may imply | In summary, ECT(0) always means 'not congested' and ECT(1) may imply | |||
the same [RFC3168] or it may imply a higher severity congestion | the same [RFC3168] or it may imply a higher severity congestion | |||
signal [RFC4774], [I-D.ietf-pcn-3-in-1-encoding], depending on the | signal [RFC4774], [PCN3in1], depending on the transport in use. | |||
transport in use. Whether they mean the same or not, at the ingress | Whether or not they mean the same, at the ingress the outer should | |||
the outer should have started the same as the inner and only a broken | have started the same as the inner, and only a broken or compromised | |||
or compromised router could have changed the outer to ECT(0). | router could have changed the outer to ECT(0). | |||
The decapsulator can detect this anomaly. But the question is, | The decapsulator can detect this anomaly. But the question is, | |||
should it correct the anomaly by ignoring the outer, or should it | should it correct the anomaly by ignoring the outer, or should it | |||
reveal the anomaly to the end-to-end transport by forwarding the | reveal the anomaly to the end-to-end transport by forwarding the | |||
outer? | outer? | |||
On balance, it was decided that the decapsulator should correct the | On balance, it was decided that the decapsulator should correct the | |||
anomaly, but log the event and optionally raise an alarm. This is | anomaly, but log the event and optionally raise an alarm. This is | |||
the safe action if ECT(1) is being used as a more severe marking than | the safe action if ECT(1) is being used as a more severe marking than | |||
ECT(0), because it passes the more severe signal to the transport. | ECT(0), because it passes the more severe signal to the transport. | |||
skipping to change at page 42, line 23 | skipping to change at page 35, line 30 | |||
seems to require a similar compromise. However, because that case is | seems to require a similar compromise. However, because that case is | |||
reversed, no compromise is necessary; it is best to forward the outer | reversed, no compromise is necessary; it is best to forward the outer | |||
whether the transport expects the ECT(1) to mean a higher severity | whether the transport expects the ECT(1) to mean a higher severity | |||
than ECT(0) or the same severity. Forwarding the outer either | than ECT(0) or the same severity. Forwarding the outer either | |||
preserves a higher value (if it is higher) or it reveals an anomaly | preserves a higher value (if it is higher) or it reveals an anomaly | |||
to the transport (if the two ECT codepoints mean the same severity). | to the transport (if the two ECT codepoints mean the same severity). | |||
Appendix E. Open Issues | Appendix E. Open Issues | |||
The new decapsulation behaviour defined in Section 4.2 adds support | The new decapsulation behaviour defined in Section 4.2 adds support | |||
for propagation of 2 severity levels of congestion. However | for propagation of two severity levels of congestion. However, | |||
transports have no way to discover whether there are any legacy | transports have no way to discover whether there are any legacy | |||
tunnels on their path that will not propagate 2 severity levels. It | tunnels on their path that will not propagate two severity levels. | |||
would have been nice to add a feature for transports to check path | It would have been nice to add a feature for transports to check path | |||
support, but this remains an open issue that will have to be | support, but this remains an open issue that will have to be | |||
addressed in any future standards action to define an end-to-end | addressed in any future standards action to define an end-to-end | |||
scheme that requires 2-severity levels of congestion. PCN avoids | scheme that requires two severity levels of congestion. PCN avoids | |||
this problem because it is only for a controlled region, so all | this problem because it is only for a controlled region, so all | |||
legacy tunnels can be upgraded by the same operator that deploys PCN. | legacy tunnels can be upgraded by the same operator that deploys PCN. | |||
Author's Address | Author's Address | |||
Bob Briscoe | Bob Briscoe | |||
BT | BT | |||
B54/77, Adastral Park | B54/77, Adastral Park | |||
Martlesham Heath | Martlesham Heath | |||
Ipswich IP5 3RE | Ipswich IP5 3RE | |||
End of changes. 221 change blocks. | ||||
961 lines changed or deleted | 582 lines changed or added | |||
This html diff was produced by rfcdiff 1.40. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |